Research
The Future of Database Queries: Evaluating Text-to-SQL and Text-to-NoSQL with AI
Auzair
April 19, 2024
23 min read
Introduction
In the wake of ChatGPT and other large language models (LLMs) gaining prominence, the fascination with Retrieval Augmented Generation (RAG) — essentially conversing directly with your data — has skyrocketed. While the concept of querying databases using natural language is captivating, the practical implementation of such RAG applications poses significant challenges.
This article delves into the exciting realm of using LLMs to transform natural language into SQL and NoSQL queries. Imagine the ease of fetching and filtering data just by typing out your thoughts. However, as straightforward as this may sound, the process is fraught with complexities. LLMs, for all their intelligence, are still prone to errors and can sometimes produce inaccurate or fabricated information.
Despite these challenges, the focus of our discussion will not be on the pitfalls alone but rather on comparing two powerful methods of data interaction: Text-to-SQL versus Text-to-NoSQL. Which approach is more effective? Which offers more accurate results or lower latency? And, when dealing with simpler data formats like CSV or Excel files, which method should you prefer?
Join me as we explore these questions, offering insights and perhaps even answers on how best to leverage these groundbreaking technologies in real-world applications.
Text-to-SQL
The Text-to-SQL process involves providing an LLM with the schema of a database table, sometimes accompanied by an example row, to give it context for the data structure. This setup lets the model understand what kind of information is stored and how it is organized.
When a user poses a query in natural language, the LLM utilizes this contextual information to generate a corresponding SQL query. This query isn't just a direct transformation of text; it's an intelligent creation that considers the user's intent and the database's architecture.
Here's how it typically works in a practical setting:
1. Input Preparation: The user's natural language query is received.
2. Query Generation: The LLM processes this input, along with the provided schema and potentially an example row, to generate a SQL query.
3. Execution: The SQL query is executed within a database environment, such as SQLite in Python.
4. Output: The result of the query is returned to the user, completing the cycle from natural language to database response.
Below is a Python function that illustrates this process in action:
(Note: While there are several methods to extract database schemas, we'll skip that part for brevity in this discussion.)
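A minimal sketch of such a function, under stated assumptions: `llm` stands in for any callable that turns a prompt string into a SQL string (for instance a thin wrapper around a chat-completion API), and the data is assumed to already be loaded into a SQLite table.

```python
import sqlite3

def text_to_sql(question, schema, llm, conn):
    """Turn a natural-language question into SQL and execute it.

    `llm` is any callable mapping a prompt string to a SQL string --
    a placeholder, since this sketch does not fix a specific client.
    """
    prompt = (
        "Given this table schema:\n"
        f"{schema}\n\n"
        "Write one SQLite query that answers the question below. "
        "Return only the SQL, with no explanation.\n"
        f"Question: {question}"
    )
    query = llm(prompt).strip().rstrip(";")
    rows = conn.execute(query).fetchall()  # execution step
    return query, rows

# Usage with a stubbed model, so the full cycle can be seen end to end:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("The Shawshank Redemption", 9.3), ("The Godfather", 9.2)],
)
stub_llm = lambda _prompt: "SELECT title FROM movies ORDER BY rating DESC LIMIT 1"
query, rows = text_to_sql(
    "Which movie has the highest rating?",
    "movies(title TEXT, rating REAL)",
    stub_llm,
    conn,
)
```

In a real deployment the stub would be replaced by an actual LLM call, but the surrounding plumbing stays the same.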
Now, let's demonstrate the practicality of Text-to-SQL using a common dataset found in many examples, Movies.csv. To test the functionality, we'll pose a straightforward query to our system.
For example, calling the function with a question like "Which movie has the highest rating?" prompts the LLM to generate a simple ranking query (on the order of `SELECT title FROM movies ORDER BY rating DESC LIMIT 1`), which is then executed against the table.
As illustrated, the query successfully identifies 'The Shawshank Redemption' as the top-rated movie according to the dataset.
Text-to-NoSQL
Moving from SQL to NoSQL databases, the approach remains fundamentally similar, yet it adapts to the distinct structure and capabilities of NoSQL systems. In the case of MongoDB, one of the most powerful features is its aggregation pipeline, which allows for complex data processing and transformation directly within the database.
Similar to the Text-to-SQL process, we provide the LLM with an understanding of the database's structure. However, instead of generating SQL queries, the LLM now crafts MongoDB aggregation pipelines based on the user's natural language queries. These pipelines are sequences of data processing stages that transform, filter, and aggregate data efficiently.
The following is the function for Text-to-NoSQL:
Here we'll be using property_rentals.csv, another commonly available CSV dataset.
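A sketch along the same lines, under stated assumptions: `llm` is again any prompt-to-string callable, and `collection`, when supplied, is assumed to be a pymongo `Collection`. Asking the model for a JSON array keeps the pipeline machine-parseable.

```python
import json

def text_to_nosql(question, fields, llm, collection=None):
    """Ask the model for a MongoDB aggregation pipeline, optionally run it.

    `llm` and `collection` are placeholders, not part of a specific
    library's API; the pipeline is parsed from the model's JSON output.
    """
    prompt = (
        "Collection fields:\n"
        f"{fields}\n\n"
        "Return ONLY a JSON array that is a valid MongoDB aggregation "
        f"pipeline answering this question: {question}"
    )
    pipeline = json.loads(llm(prompt))  # stages as a list of dicts
    results = list(collection.aggregate(pipeline)) if collection else None
    return pipeline, results

# Stubbed example for a rentals dataset:
stub_llm = lambda _prompt: (
    '[{"$group": {"_id": "$neighborhood", "avg_rent": {"$avg": "$price"}}},'
    ' {"$sort": {"avg_rent": -1}}]'
)
pipeline, _ = text_to_nosql(
    "What is the average rent per neighborhood?",
    "neighborhood (str), price (number), bedrooms (int)",
    stub_llm,
)
```

Note that even this modest question already takes two pipeline stages (`$group` then `$sort`), which foreshadows the latency discussion below.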
For a question such as "What is the average rent per neighborhood?", the LLM might produce a pipeline with a `$group` stage computing an `$avg` over the rent field, followed by a `$sort` stage to rank the results.
This example illustrates how the LLM formulates a MongoDB aggregation pipeline.
Comparing Text-to-SQL and Text-to-NoSQL: Performance, Accuracy, and Practicality
Having explored both Text-to-SQL and Text-to-NoSQL techniques, the question arises: which approach is better? The answer varies depending on several factors, including performance, accuracy, and the specific use case.
Latency and Performance
In terms of response time, Text-to-SQL generally demonstrates faster query generation than Text-to-NoSQL with MongoDB's aggregation pipelines. The difference stems largely from the structure of NoSQL operations: even basic queries in MongoDB can require multiple pipeline stages, so the model must generate more, and more deeply nested, output than for an equivalent SQL query.
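A simple way to quantify this for your own setup is to time the generation step in isolation. The harness below is a generic sketch; the `generate` argument stands in for whichever LLM call you are measuring, and the stubs here exist only so the snippet runs on its own.

```python
import time

def mean_latency(generate, prompt, runs=5):
    """Average wall-clock seconds for `generate(prompt)` over `runs` calls."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        generate(prompt)
        timings.append(time.perf_counter() - start)
    return sum(timings) / runs

# Compare a (stubbed) SQL generator against a (stubbed) pipeline generator:
sql_time = mean_latency(lambda p: "SELECT 1", "highest-rated movie?")
nosql_time = mean_latency(lambda p: '[{"$match": {}}]', "highest-rated movie?")
```

With real models in place of the stubs, averaging over several runs smooths out per-request variance in API response times.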
Accuracy and Reliability
Accuracy is another critical factor where Text-to-SQL tends to outshine Text-to-NoSQL. SQL queries are often more straightforward and less prone to errors, particularly in scenarios involving complex calculations or multiple filtering steps. LLMs handling NoSQL queries can sometimes 'hallucinate' or generate inaccurate data manipulations, especially under complex querying conditions. This makes Text-to-SQL more reliable for precise data retrieval and operations that involve complex statistical calculations.
Practical Considerations
It's also essential to consider database schema complexity. For SQL databases with many interconnected tables, injecting the full schema into an LLM's prompt can be impractical because of the model's context-window limits. This can lead to failures or an incomplete understanding of the schema, which degrades the quality of the generated queries.
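One common mitigation is to prune the schema to tables that appear relevant to the question before building the prompt. The function below is a deliberately naive keyword-overlap heuristic, sketched here for illustration; embedding-based retrieval over schema descriptions is a more robust alternative.

```python
def prune_schema(question, tables):
    """Keep only tables whose name or columns overlap the question's words.

    `tables` maps table name -> list of column names.
    """
    words = {w.strip("?,.").lower() for w in question.split()}
    kept = {}
    for name, columns in tables.items():
        terms = {name.lower(), *(c.lower() for c in columns)}
        if terms & words:
            kept[name] = columns
    return kept

# Illustrative three-table schema; only the matching table survives:
tables = {
    "movies": ["title", "rating", "year"],
    "actors": ["name", "birth_year"],
    "studios": ["name", "founded"],
}
relevant = prune_schema("What is the highest rating of any movies?", tables)
```

Sending only the pruned subset keeps the prompt well under the token limit while preserving the tables the query is likely to need.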
On the other hand, when using formats like CSV for testing both SQL and NoSQL methods, the simplicity of the data format does not fully exploit the strengths and capabilities of either approach. While simpler, this does not reflect the performance and utility of these methods in handling more complex, real-world database structures.
Use Case Specificity
The choice between Text-to-SQL and Text-to-NoSQL ultimately depends on the specific use case:
- Text-to-SQL is preferable for environments where precise and accurate retrieval of structured data is critical, especially when dealing with statistical data or when the database schema is highly relational.
- Text-to-NoSQL may be more suitable where the data is hierarchical or loosely structured, or where database operations require more flexible and dynamic data manipulation capabilities.
Conclusion
Both Text-to-SQL and Text-to-NoSQL have their respective strengths and weaknesses. The decision on which to use should be guided by the specific requirements of the application, including the need for speed, accuracy, and the complexity of the data and queries involved. As technology and LLM capabilities evolve, these distinctions may blur, but for now, a careful evaluation of each method’s merits and limitations is essential for optimal implementation.