Transactional AI Agents: Patterns for Database Consistency and Safe Rollbacks
Introduction
Read-only AI agents are relatively safe. They query data, summarize results, and return answers. The risks are low because they don't modify state.
Write-capable AI agents are a different story.
When an AI agent creates a user account, updates a payment status, or provisions infrastructure, it's modifying state in your database. If the agent makes a mistake, gets interrupted mid-operation, or encounters a partial failure, you're left with inconsistent data.
Traditional software handles this with transactions, ACID guarantees, and compensation logic. AI agents need the same—but adapted for their probabilistic, multi-step nature.
Section 1: The Consistency Problem with Agent Writes
AI agents fail differently than traditional software, and these failure modes create unique consistency challenges:
Partial completion
An agent tasked with "create user, assign role, send welcome email" might succeed at creating the user, fail at assigning the role, and never reach the email step. The user exists in an inconsistent state—created but not properly configured.
Cascading writes
Agents often perform multiple database operations in sequence. If operation 3 fails, operations 1 and 2 have already committed. Without rollback logic, the database is left partially updated.
Hallucinated operations
An agent might "think" it already created a record and skip the creation step, or "think" a record doesn't exist and create a duplicate. The agent's internal state doesn't always match the actual database state.
Retry storms
If an agent encounters an error and retries, it might double-execute write operations that aren't idempotent. Two duplicate records, two charges, two emails.
Section 2: Pattern 1 — Database Transactions for Atomic Operations
The first line of defense: wrap related database operations in a transaction. If any operation fails, the entire transaction rolls back.
async function agentCreateUserWithRole(
  db: Database,
  userData: UserData,
  role: string,
  runId: string
) {
  const result = await db.transaction(async (tx) => {
    // All operations inside this transaction succeed or fail together
    const user = await tx.users.create(userData);
    await tx.userRoles.create({ userId: user.id, role });
    await tx.auditLog.create({
      action: "user_created",
      userId: user.id,
      agentRunId: runId
    });
    return user;
  });

  // Only reached if the transaction commits successfully
  await sendWelcomeEmail(result);
  return result;
}
Limitations of transactions
Transactions only cover database operations. If your agent also calls external APIs (send email, call webhook, provision cloud resource), those aren't rolled back by a database transaction. For those, you need compensation logic (see Section 4).
Also, long-running agent tasks may hold transactions open too long, causing lock contention. Keep transactions short—prefer executing the transaction at the end of a planning phase rather than holding it open during agent reasoning.
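The plan-then-commit split can be sketched as follows. This is a minimal illustration, not the article's API: `WriteOp`, `planUserSetup`, and the in-memory `applyPlan` are hypothetical stand-ins, and in a real system the apply phase would run inside a short `db.transaction`.

```typescript
// Sketch: keep agent reasoning outside the transaction, then apply the
// accumulated plan in one short atomic step. An in-memory Map stands in
// for the database; applyPlan restores a snapshot on failure, the way a
// real transaction would roll back.

type WriteOp = { table: string; row: Record<string, unknown> };

// Phase 1: the agent reasons and accumulates a plan (no locks held)
function planUserSetup(user: { id: string; email: string }): WriteOp[] {
  return [
    { table: "users", row: user },
    { table: "user_roles", row: { userId: user.id, role: "member" } },
  ];
}

// Phase 2: apply the whole plan at once; only this loop needs a transaction
function applyPlan(
  store: Map<string, Record<string, unknown>[]>,
  plan: WriteOp[]
): void {
  const snapshot = new Map<string, Record<string, unknown>[]>();
  for (const [table, rows] of store) snapshot.set(table, [...rows]);
  try {
    for (const op of plan) {
      const rows = store.get(op.table) ?? [];
      rows.push(op.row);
      store.set(op.table, rows);
    }
  } catch (err) {
    // Restore the snapshot: all-or-nothing, like a rolled-back transaction
    store.clear();
    for (const [table, rows] of snapshot) store.set(table, rows);
    throw err;
  }
}
```

The key property is that the transaction's lifetime covers only the fast apply loop, never the slow model calls.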
Section 3: Pattern 2 — Idempotency Keys for Safe Retries
Agents retry operations when they encounter errors. Without idempotency, retries cause duplicate writes.
Every agent-driven write should include an idempotency key that uniquely identifies the operation:
async function agentCreatePayment(db: Database, amount: number, runId: string) {
  const idempotencyKey = `agent-run-${runId}-payment`;

  // Check if this operation was already completed
  const existing = await db.payments.findUnique({ idempotencyKey });
  if (existing) {
    return existing; // Already done, return existing record
  }

  // Execute the operation
  return await db.payments.create({
    amount,
    idempotencyKey,
    createdBy: "agent",
    agentRunId: runId
  });
}
The idempotency key should be deterministic: given the same task and run ID, the same key is generated. This ensures that retries don't create duplicates.
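Note that a read-then-create check has a window where two concurrent retries can both see "not found". A common hardening step is to back the key with a unique database constraint and treat a duplicate-key result as "already done". The sketch below models that with a hypothetical in-memory `PaymentTable`, whose Map gives the same first-writer-wins behavior a unique index provides.

```typescript
// Sketch: the unique index is the real guard; check-then-create alone can
// race under concurrent retries. `PaymentTable` is an illustrative stand-in.

interface Payment {
  idempotencyKey: string;
  amount: number;
}

class PaymentTable {
  private byKey = new Map<string, Payment>();

  // Insert, or return the existing row if the key was already used.
  // Against a real database this would be an INSERT that catches the
  // unique-violation error and re-reads by idempotencyKey.
  createIdempotent(payment: Payment): Payment {
    const existing = this.byKey.get(payment.idempotencyKey);
    if (existing) return existing;
    this.byKey.set(payment.idempotencyKey, payment);
    return payment;
  }
}
```

With this in place, a retry that re-sends the same key gets the original record back instead of creating a second charge.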
Section 4: Pattern 3 — Compensation Transactions (Sagas)
For multi-step operations that span databases and external services, use the Saga pattern. Each step has a corresponding compensation action that undoes it.
Step 1: Create user → Compensation: Delete user
Step 2: Assign role → Compensation: Remove role
Step 3: Send email → Compensation: Log rollback (emails can't be unsent)
type CompletedStep =
  | { type: "create_user"; data: { id: string } }
  | { type: "assign_role"; data: { userId: string } }
  | { type: "send_email"; data: { userId: string } };

async function createUserSaga(db: Database, userData: UserData) {
  const completedSteps: CompletedStep[] = [];
  try {
    // Step 1
    const user = await db.users.create(userData);
    completedSteps.push({ type: "create_user", data: user });

    // Step 2
    await db.userRoles.create({ userId: user.id, role: "member" });
    completedSteps.push({ type: "assign_role", data: { userId: user.id } });

    // Step 3 (external)
    await sendWelcomeEmail(user);
    completedSteps.push({ type: "send_email", data: { userId: user.id } });

    return user;
  } catch (error) {
    // Compensation: undo completed steps in reverse order
    await compensate(db, completedSteps.reverse());
    throw error;
  }
}

async function compensate(db: Database, steps: CompletedStep[]) {
  for (const step of steps) {
    try {
      switch (step.type) {
        case "create_user":
          await db.users.delete({ id: step.data.id });
          break;
        case "assign_role":
          await db.userRoles.delete({ userId: step.data.userId });
          break;
        case "send_email":
          // Can't unsend an email, but log the rollback
          await db.auditLog.create({
            action: "email_rollback_logged",
            userId: step.data.userId
          });
          break;
      }
    } catch (compError) {
      // Log but continue compensating remaining steps
      console.error("Compensation failed:", compError);
    }
  }
}
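One gap worth noting: an in-process `completedSteps` array disappears if the agent crashes mid-saga, leaving nothing to compensate from. A common hardening step is to persist each completed step durably. The sketch below uses an in-memory array as a hypothetical stand-in for a `saga_log` table that a recovery job could read after a crash.

```typescript
// Sketch: a durable saga log. Each completed step is recorded before the
// next one runs; a recovery job can later ask for the steps of an
// abandoned run, newest first, and feed them to the compensation logic.

interface SagaLogEntry {
  runId: string;
  step: string;
  status: "done" | "compensated";
}

class SagaLog {
  private entries: SagaLogEntry[] = [];

  record(runId: string, step: string): void {
    this.entries.push({ runId, step, status: "done" });
  }

  // Steps still marked "done" for a run, in reverse completion order,
  // ready for compensation.
  pendingFor(runId: string): string[] {
    return this.entries
      .filter((e) => e.runId === runId && e.status === "done")
      .map((e) => e.step)
      .reverse();
  }

  markCompensated(runId: string, step: string): void {
    const entry = this.entries.find(
      (e) => e.runId === runId && e.step === step
    );
    if (entry) entry.status = "compensated";
  }
}
```

In a real system the `record` call would share a transaction with the step it describes, so the log never claims a step that didn't commit.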
Section 5: Pattern 4 — Optimistic Locking for Concurrent Agents
If multiple agents (or an agent and a human user) might modify the same record concurrently, you need optimistic locking to prevent lost updates.
Add a version column to records that agents modify:
// Schema
{
  id: "uuid",
  data: "jsonb",
  version: "integer", // Incremented on each update
}
When the agent reads a record, it captures the version. When it writes, it includes a version check:
async function agentUpdateRecord(db: Database, id: string, newData: any) {
  // Read with current version
  const record = await db.records.findUnique({ id });

  // Agent does its reasoning...

  // Write with version check
  const result = await db.records.update({
    where: { id, version: record.version },
    data: { ...newData, version: record.version + 1 }
  });

  if (!result) {
    // Version mismatch: someone else updated the record
    throw new Error("Concurrent modification detected. Re-read and retry.");
  }
  return result;
}
If the version check fails, the agent should re-read the record, re-evaluate its plan, and retry the update.
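That re-read-and-retry loop can be sketched as follows. `VersionedStore` and `updateWithRetry` are illustrative stand-ins rather than part of the article's `Database` API: the store models a table with a `version` column, and the loop recomputes the new value from freshly read state on every attempt.

```typescript
// Sketch: optimistic-lock retry loop over an in-memory versioned table.

interface Rec {
  id: string;
  value: number;
  version: number;
}

class VersionedStore {
  private rows = new Map<string, Rec>();

  put(r: Rec): void {
    this.rows.set(r.id, r);
  }

  get(id: string): Rec | undefined {
    return this.rows.get(id);
  }

  // Compare-and-swap: fails when the stored version no longer matches
  updateIfVersion(id: string, expectedVersion: number, value: number): boolean {
    const r = this.rows.get(id);
    if (!r || r.version !== expectedVersion) return false;
    this.rows.set(id, { ...r, value, version: r.version + 1 });
    return true;
  }
}

function updateWithRetry(
  store: VersionedStore,
  id: string,
  compute: (current: number) => number,
  maxAttempts = 3
): boolean {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const rec = store.get(id);
    if (!rec) return false;
    // Recompute the new value from freshly read state each attempt
    if (store.updateIfVersion(id, rec.version, compute(rec.value))) return true;
  }
  return false;
}
```

Bounding the attempts matters for agents: an unbounded retry loop against a hotly contested record is its own incident.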
Section 6: Pattern 5 — Dry-Run Mode for High-Stakes Operations
For operations with significant impact (large data changes, financial transactions, infrastructure changes), implement a dry-run mode where the agent plans the changes but doesn't execute them.
async function agentBulkUpdateUsers(
  db: Database,
  filter: UserFilter,
  updates: Partial<User>,
  dryRun: boolean = true
) {
  // First, show what would be affected
  const affectedUsers = await db.users.findMany({ where: filter });
  const plan = {
    operation: "bulk_update",
    filter,
    updates,
    affectedCount: affectedUsers.length,
    sampleUsers: affectedUsers.slice(0, 5),
  };

  if (dryRun) {
    return { plan, executed: false, message: "Dry run complete. Set dryRun=false to execute." };
  }

  // Execute with a transaction
  const result = await db.transaction(async (tx) => {
    return await tx.users.updateMany({
      where: filter,
      data: updates
    });
  });

  return { plan, executed: true, result };
}
This gives humans visibility into what the agent intends to do before it does it.
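The resulting two-phase call flow looks like this in miniature. `bulkDeactivate` is a hypothetical in-memory stand-in for `agentBulkUpdateUsers`: the first call only produces a reviewable plan, and execution is a separate, explicit call made after approval.

```typescript
// Sketch: dry run produces a plan and touches nothing; execution is a
// second, deliberate call. `bulkDeactivate` is an illustrative stand-in.

interface User {
  id: string;
  active: boolean;
}

function bulkDeactivate(users: User[], dryRun: boolean) {
  const affected = users.filter((u) => u.active);
  const plan = { operation: "bulk_deactivate", affectedCount: affected.length };
  if (dryRun) {
    return { plan, executed: false }; // preview only, no writes
  }
  for (const u of affected) u.active = false; // the real mutation
  return { plan, executed: true };
}
```

The agent defaults to the dry-run path; flipping to execution is a human or policy decision, never something the model toggles on its own.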
Section 7: Audit Trails for Agent-Driven Changes
When agents modify data, you need an audit trail that captures:
- What was changed (before/after values),
- Why (the task the agent was executing),
- Which agent run made the change,
- When the change occurred.
async function agentUpdateWithAudit(
  db: Database,
  recordId: string,
  updates: any,
  runId: string
) {
  return await db.transaction(async (tx) => {
    const before = await tx.records.findUnique({ id: recordId });
    const updated = await tx.records.update({
      where: { id: recordId },
      data: updates
    });
    await tx.auditLog.create({
      entityType: "record",
      entityId: recordId,
      action: "agent_update",
      before: before,
      after: updated,
      agentRunId: runId,
      timestamp: new Date(),
    });
    return updated;
  });
}
This audit trail is essential for debugging agent mistakes and for compliance in regulated industries.
Conclusion
AI agents that write to databases need the same consistency patterns that reliable distributed systems use: transactions, idempotency, compensation logic, optimistic locking, and audit trails.
The difference is that agents add probabilistic failures to the mix. A transaction might fail because the agent's plan was wrong, not because of a database error. Compensation logic might need to handle cases where the agent's understanding of state doesn't match reality.
Build these patterns into your agent framework from the start. Agents that modify data without them will cause incidents that are difficult to diagnose and expensive to clean up.
Related Service: Backend System Scaling
Need help designing transactional AI agent systems with strong consistency guarantees?