2026-05-05 · 8 min read · Tanuj Garg

AI Agents That Write SQL: Patterns for Reliable Database Query Generation

AI & Automation · AI Agents · SQL · Databases · LLMs · Query Generation · System Design

Introduction

Every engineering team has a recurring problem: non-technical stakeholders need data from the database, and engineers become the bottleneck translating business questions into SQL queries.

AI agents that can generate and execute SQL queries against your database seem like the obvious solution. And they are—but only if you solve the hard parts: schema understanding, query validation, safe execution, and result interpretation.

A poorly designed query agent will generate invalid SQL, run expensive queries that tank your database, or return results that the user misinterprets. A well-designed one becomes a force multiplier for your entire organization.


Section 1: The Core Architecture of a Query Agent

A production-grade SQL query agent needs more than an LLM that outputs SQL strings. The architecture looks like this:

User Question → Schema Grounding → Query Generation → Validation → Execution → Result Formatting → Response

Each stage has specific responsibilities:

  • Schema Grounding: the agent reads the relevant portion of your database schema (tables, columns, types, relationships, constraints) and incorporates it into context.
  • Query Generation: the LLM generates SQL based on the question and the grounded schema.
  • Validation: a pre-execution check that catches syntax errors, unsafe operations, and expensive queries before they hit the database.
  • Execution: the query runs against the database with appropriate timeouts and resource limits.
  • Result Formatting: raw rows are translated into a human-readable answer with context.

Skip any of these stages and you'll have incidents.
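The stage boundaries above can be sketched as a typed pipeline. Everything here is illustrative: the stage names, the `QueryAgent` interface, and the `answer` function are assumptions about how you might wire the stages together, not a prescribed API.

```typescript
// One async function per pipeline stage; each stage's output feeds the next.
type Stage<I, O> = (input: I) => Promise<O>;

interface QueryAgent {
  groundSchema: Stage<string, { question: string; schema: string }>;
  generateQuery: Stage<{ question: string; schema: string }, string>;
  validate: Stage<string, string>; // returns the SQL unchanged, or throws
  execute: Stage<string, Record<string, unknown>[]>;
  formatResult: Stage<Record<string, unknown>[], string>;
}

// Run a question through every stage in order. Skipping a stage here is
// exactly the shortcut the article warns against.
async function answer(agent: QueryAgent, question: string): Promise<string> {
  const grounded = await agent.groundSchema(question);
  const sql = await agent.generateQuery(grounded);
  const validated = await agent.validate(sql);
  const rows = await agent.execute(validated);
  return agent.formatResult(rows);
}
```

Keeping each stage behind its own function makes it easy to swap implementations (e.g., a stricter validator) without touching the rest of the pipeline.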


Section 2: Schema Grounding Done Right

The single biggest determinant of query quality is how much schema context the agent has—and how it's presented.

Include the right schema information

Don't just dump CREATE TABLE statements. Include:

  • Table names and descriptions (what each table represents),
  • Column names, types, and descriptions,
  • Primary and foreign key relationships,
  • Indexes (so the agent can write queries that use them),
  • Example rows (2–3 rows per table to illustrate data shape and conventions),
  • Business logic constraints (e.g., "status column only takes values: active, pending, cancelled").
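One way to assemble that information is to render each table into a compact text block for the prompt. A minimal sketch, where the `TableSchema` shape, the `orders` table, and all column names are made up for illustration:

```typescript
interface TableSchema {
  name: string;
  description: string;
  columns: { name: string; type: string; note?: string }[];
  exampleRows: string[];
}

// Render one table into a prompt-friendly block: description, typed columns
// with business-logic notes, and a few example rows to show data shape.
function renderSchema(t: TableSchema): string {
  const cols = t.columns
    .map((c) => `  ${c.name} ${c.type}${c.note ? `  -- ${c.note}` : ""}`)
    .join("\n");
  const rows = t.exampleRows.map((r) => `  ${r}`).join("\n");
  return `Table: ${t.name} -- ${t.description}\n${cols}\nExample rows:\n${rows}`;
}
```

The notes double as the "business logic constraints" bullet: a `status` column's allowed values live right next to its type, where the model will see them.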

Manage context window budget

For databases with hundreds of tables, you can't include the full schema. Strategies:

  • Retrieval-based grounding: embed table descriptions and retrieve the top-k most relevant tables for each question using semantic search.
  • Lazy schema loading: start with table names only, let the agent request specific table schemas as needed.
  • Schema summaries: maintain a curated summary of your database schema that highlights the most commonly queried tables and their relationships.

// Retrieval-based schema grounding
async function getRelevantSchema(question: string): Promise<SchemaContext> {
  const questionEmbedding = await embed(question);
  const relevantTables = await vectorDb.search({
    vector: questionEmbedding,
    topK: 5,
    filter: { type: "table_schema" }
  });

  return buildSchemaContext(relevantTables);
}

Section 3: Query Validation Before Execution

Never let an LLM-generated query hit your database without validation. Implement these checks:

Syntax validation

Parse the SQL with a lightweight parser (like sql-parser or database-specific tools) to catch syntax errors before execution. This saves a round-trip to the database and gives the agent fast feedback for self-correction.

Safety validation

Block these query types entirely in most configurations:

  • DROP, TRUNCATE, ALTER, CREATE (DDL operations),
  • INSERT, UPDATE, DELETE (DML operations—unless you explicitly want a read-write agent with safeguards),
  • GRANT, REVOKE (permission changes).

For read-only agents, enforce that the query starts with SELECT or WITH (CTE).
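The read-only gate can be sketched with a keyword check. This is deliberately crude: regexes can be fooled by string literals and comments, so treat this as a fast first pass in front of the parser-based validation above, not a replacement for it. The function name is illustrative.

```typescript
// DDL, DML, and permission keywords that a read-only agent must never run.
const BLOCKED =
  /\b(DROP|TRUNCATE|ALTER|CREATE|INSERT|UPDATE|DELETE|GRANT|REVOKE)\b/i;

// True only if the statement starts with SELECT or WITH (CTE) and contains
// no blocked keyword. \b prevents false positives on names like created_at.
function isReadOnly(sql: string): boolean {
  const trimmed = sql.trim();
  return /^(SELECT|WITH)\b/i.test(trimmed) && !BLOCKED.test(trimmed);
}
```

Belt-and-suspenders still applies: run the agent's database user with read-only privileges so a query that slips past this check fails at the database anyway.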

Cost estimation

For databases that support it (Postgres with EXPLAIN), run a dry-run explain plan to estimate query cost. If the estimated cost exceeds a threshold, reject the query and ask the agent to simplify.

async function validateQuery(sql: string, db: Database): Promise<ValidationResult> {
  // 1. Syntax check
  const syntaxCheck = parseSQL(sql);
  if (!syntaxCheck.valid) return { error: syntaxCheck.error };

  // 2. Safety check
  if (containsWriteOperations(sql)) {
    return { error: "Write operations are not permitted" };
  }

  // 3. Cost estimation
  const explain = await db.query(`EXPLAIN ${sql}`);
  const estimatedCost = extractCost(explain);
  if (estimatedCost > MAX_COST_THRESHOLD) {
    return { error: "Query cost too high, please simplify" };
  }

  return { valid: true };
}

Section 4: Execution with Guardrails

Even validated queries can cause problems. Your execution layer needs guardrails:

Timeouts

Set a hard timeout on query execution (e.g., 30 seconds). If the query doesn't complete, terminate it. Long-running queries consume connection pool slots and database resources.
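A hard timeout can be sketched by racing the query promise against a timer. Note the caveat: `Promise.race` only abandons the client-side promise; it does not cancel the server-side query, so pair this with a database-level limit (in Postgres, `statement_timeout`) or an explicit cancel from your driver.

```typescript
// Reject if `work` hasn't settled within `ms` milliseconds. The timer is
// always cleared so a fast query doesn't leave a dangling timeout behind.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout>;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Query exceeded ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer!);
  }
}
```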

Row limits

Enforce a LIMIT clause if one isn't present. A query that returns 10 million rows will exhaust your agent's memory and slow down the database.
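Enforcing the LIMIT can be sketched as a rewrite step before execution. Like the read-only gate, this uses a naive regex check; a SQL parser would handle subqueries and trailing comments more reliably, and the default cap is an arbitrary illustration.

```typescript
const DEFAULT_LIMIT = 1000;

// Append a LIMIT clause when the statement doesn't already end in one.
// A trailing semicolon is stripped first so the clause lands in the
// right place.
function enforceLimit(sql: string, max: number = DEFAULT_LIMIT): string {
  const trimmed = sql.trim().replace(/;$/, "");
  return /\blimit\s+\d+\s*$/i.test(trimmed)
    ? trimmed
    : `${trimmed} LIMIT ${max}`;
}
```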

Connection pooling

Don't open a new connection per query. Use your existing connection pool, but consider a separate pool for agent queries so they don't compete with application traffic.

Read replicas

Route agent queries to a read replica (not the primary) when possible. This isolates analytical queries from transactional workloads.


Section 5: Handling Ambiguous Questions

Users ask ambiguous questions. "Show me revenue last month" could mean:

  • Last calendar month (April 2026),
  • Last 30 days from today,
  • The most recent complete month in your data,
  • Revenue recognized vs revenue booked.

A good query agent handles ambiguity by:

  1. Identifying missing context: the agent explicitly lists what assumptions it's making.
  2. Asking clarifying questions: for high-ambiguity questions, the agent asks the user before querying.
  3. Showing its work: the agent displays the generated SQL so the user can verify the interpretation before execution.

User: "Show me revenue last month"

Agent: I'll interpret "last month" as April 2026 (the most recent complete calendar month).
       Here's the query I'll run:

       SELECT SUM(amount) FROM transactions
       WHERE type = 'revenue'
       AND created_at >= '2026-04-01'
       AND created_at < '2026-05-01'

       Does this look correct? [Yes] [No, let me clarify]

Section 6: Result Interpretation, Not Just Retrieval

The job isn't done when the query returns rows. The agent needs to interpret the results:

  • Summarize findings: don't just return a table—explain what the data shows.
  • Highlight anomalies: if a metric is 10x higher or lower than expected, call it out.
  • Suggest follow-ups: "Revenue was down 30% in April. Would you like to break this down by product category?"
  • Handle empty results: if the query returns zero rows, explain possible reasons (wrong date range, filters too restrictive, data not yet ingested).
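The empty-result and summary cases above can be sketched as a small formatting step. The wording and the three-row preview are illustrative choices; in practice this string would feed back into the LLM for a fuller narrative summary.

```typescript
// Turn raw rows into an interpreted answer rather than a bare table.
function interpretResult(
  rows: Record<string, unknown>[],
  question: string
): string {
  if (rows.length === 0) {
    // Zero rows deserves an explanation, not an empty table.
    return (
      `No rows matched. Possible reasons: the date range may be wrong, ` +
      `filters may be too restrictive, or the data may not be ingested yet.`
    );
  }
  const preview = JSON.stringify(rows.slice(0, 3));
  return `Found ${rows.length} row(s) for "${question}". Preview: ${preview}`;
}
```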

Section 7: When to Use and When to Avoid

Good use cases

  • Ad-hoc analytics questions from product and business teams,
  • Debugging data issues ("Why does this user's account show zero transactions?"),
  • Generating reports that combine data from multiple tables,
  • Exploratory data analysis during feature development.

Bad use cases

  • High-frequency, repeated queries (build a proper dashboard instead),
  • Queries that require complex business logic (the agent won't know your domain nuances),
  • Real-time, latency-sensitive applications (query generation adds seconds of latency),
  • Regulatory or compliance reporting (human-verified queries only).

Conclusion

AI agents that generate SQL queries are a practical, high-impact application of LLMs—but they require careful engineering. Schema grounding, query validation, execution guardrails, and result interpretation are all necessary components of a production system.

Start with a read-only agent against a read replica. Add validation layers before expanding access. The goal isn't to replace data analysts—it's to handle the long tail of ad-hoc questions that don't warrant a dashboard or an analyst's time.


Need help building AI agents that safely interact with your data infrastructure?