From Text to SQL: Create a Natural Language Database Query Tool

Learn how to build a tool that translates plain English questions into valid SQL queries using language models.

Posted on: 2026-03-16 by AI Assistant


Writing complex SQL queries can be tedious, especially for non-technical stakeholders who just want to know “How many users signed up last month?” By combining large language models (LLMs) with your database schema, you can build a tool that translates natural language directly into executable SQL.

In this tutorial, you will learn how to build a basic Text-to-SQL tool using Python and the Gemini API.

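Prerequisites

To follow along, you need Python 3 with the google-genai package installed (pip install google-genai) and a Gemini API key stored in the GEMINI_API_KEY environment variable. The sqlite3 module used for the database is part of Python's standard library.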

Building the Tool

The key to a successful Text-to-SQL tool is providing the LLM with the exact schema of your database so it understands the tables, columns, and relationships available.

1. Creating a Sample Database

Let’s create a simple database using SQLite:

import sqlite3

conn = sqlite3.connect("ecommerce.db")
cursor = conn.cursor()

cursor.execute('''
    CREATE TABLE IF NOT EXISTS users (
        id INTEGER PRIMARY KEY,
        name TEXT,
        signup_date DATE
    )
''')
cursor.execute('''
    CREATE TABLE IF NOT EXISTS orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        amount DECIMAL,
        order_date DATE,
        FOREIGN KEY(user_id) REFERENCES users(id)
    )
''')
# Insert some dummy data here...
conn.commit()
conn.close()  # Release the file handle once the tables are written
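
Since the model needs the exact schema, one option is to read it back from SQLite rather than typing it out by hand. The following is a minimal sketch of that idea, assuming a SQLite database; extract_schema is a hypothetical helper name, and the output layout (Table / Columns / Relations) is just one possible text format.

```python
import sqlite3


def extract_schema(conn: sqlite3.Connection) -> str:
    """Describe every user table as plain text (hypothetical helper)."""
    lines = []
    # sqlite_master lists all tables; skip SQLite's internal ones.
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        column_text = ", ".join(f"{col[1]} ({col[2]})" for col in columns)
        lines.append(f"Table: {table}")
        lines.append(f"Columns: {column_text}")
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall():
            lines.append(f"Relations: {table}.{fk[3]} references {fk[2]}.{fk[4]}")
        lines.append("")  # blank line between tables
    return "\n".join(lines).strip()
```

Calling extract_schema(sqlite3.connect("ecommerce.db")) on the database created above should yield a description of the users and orders tables, including the foreign key between them.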

2. The Translation Script

Now, let’s use the Gemini API to generate the SQL.

import os
import sqlite3
from google import genai

# Configure API key
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def get_database_schema(db_path):
    # Hard-coded for this tutorial, so db_path is unused here. In a real app,
    # you might dynamically extract this using PRAGMA statements.
    return """
    Table: users
    Columns: id (INTEGER), name (TEXT), signup_date (DATE)
    
    Table: orders
    Columns: id (INTEGER), user_id (INTEGER), amount (DECIMAL), order_date (DATE)
    Relations: orders.user_id references users.id
    """

def natural_language_to_sql(question, schema):
    prompt = f"""
    You are an expert SQL developer. Given the following SQLite database schema,
    write a single SQLite-compatible SELECT query that answers the user's question.
    Return ONLY the raw SQL query without any markdown formatting or explanations.
    
    Schema:
    {schema}
    
    Question: {question}
    """
    
    response = client.models.generate_content(
        model='gemini-2.5-pro',
        contents=prompt
    )
    # response.text may be None if the model returns no text, so fall back to "".
    return (response.text or "").strip()

if __name__ == "__main__":
    db_path = "ecommerce.db"
    schema = get_database_schema(db_path)
    
    question = "What is the total revenue from users who signed up in 2026?"
    
    print(f"Question: {question}\n")
    
    sql_query = natural_language_to_sql(question, schema)
    print(f"Generated SQL: \n{sql_query}\n")
    
    # Optional: Execute the query
    # conn = sqlite3.connect(db_path)
    # cursor = conn.cursor()
    # cursor.execute(sql_query)
    # print(cursor.fetchall())
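
Even when the prompt asks for raw SQL, a model can still wrap its answer in a Markdown code fence, which would make the execution step fail. A small cleanup pass is one way to guard against that. This is a sketch only; strip_code_fences is a hypothetical helper, and it handles just the common case of a single fence around the whole reply.

```python
import re

# Matches a reply that is entirely one fenced block, with an optional
# language tag such as "sql" after the opening backticks.
_FENCE_RE = re.compile(r"^```[a-zA-Z0-9_+-]*\s*\n(.*?)\n?```\s*$", re.DOTALL)


def strip_code_fences(text: str) -> str:
    """Return text without a surrounding Markdown code fence, if present."""
    cleaned = text.strip()
    match = _FENCE_RE.match(cleaned)
    if match:
        # Keep only what was inside the fence.
        cleaned = match.group(1).strip()
    return cleaned
```

In the script above, the result of natural_language_to_sql could be passed through strip_code_fences before it is printed or executed. Replies that contain no fence are returned unchanged apart from surrounding whitespace.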

Conclusion & Next Steps

You’ve built a foundational Text-to-SQL tool! It turns an everyday-language question into a SQL query that you can review and run against your database.

For your next steps, consider adding validation to ensure the generated SQL is safe (read-only) before executing it against a live database. You could also explore frameworks like LangChain, which have built-in SQL database chain utilities.
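
As a starting point for that validation, here is a minimal sketch that combines a strict textual check with SQLite's query_only pragma as a backstop. Both function names are hypothetical, the approach assumes SQLite, and it is not a complete defense: for a live database you would also want a read-only user or connection.

```python
import re
import sqlite3

# Only statements that begin with SELECT or WITH pass the textual check.
_READ_PREFIX = re.compile(r"(SELECT|WITH)\b", re.IGNORECASE)


def is_read_only_query(sql: str) -> bool:
    """Textual pre-check: exactly one statement starting with SELECT or WITH."""
    statement = sql.strip().rstrip(";").strip()
    # Reject empty input and anything containing a second statement. This is
    # deliberately strict: a ';' inside a string literal is rejected too.
    if not statement or ";" in statement:
        return False
    return bool(_READ_PREFIX.match(statement))


def run_read_only(conn: sqlite3.Connection, sql: str) -> list:
    """Run sql only if it passes the pre-check, with writes disabled in SQLite."""
    if not is_read_only_query(sql):
        raise ValueError(f"Refusing to run non-read-only SQL: {sql!r}")
    # Backstop: with query_only enabled, SQLite itself rejects write attempts,
    # such as a "WITH ... DELETE" statement that slips past the textual check.
    conn.execute("PRAGMA query_only = ON")
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.execute("PRAGMA query_only = OFF")
```

With this in place, the commented-out cursor.execute(sql_query) call in the script could become run_read_only(conn, sql_query). The textual check alone is not sufficient, which is why the pragma is kept as a second layer.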