2026-05-05 14 min read Tanuj Garg

Multi-Model Database Patterns for AI Systems: Combining Relational, Vector, and Document Stores

AI & Automation · #AI Systems #Databases #System Design #Vector Databases #PostgreSQL #Data Architecture

Introduction

AI systems have diverse data needs that no single database model serves well. You need relational tables for structured data (users, transactions, audit logs), vector stores for embeddings and similarity search, document stores for flexible content (articles, conversations, agent state), and graph databases for relationship traversal.

The mistake most teams make is trying to force all data into one database model. Postgres with JSONB and pgvector can handle many use cases, but it won't give you the write throughput of Cassandra, the vector search performance of Qdrant, or the relationship traversal of Neo4j.

The solution is a multi-model architecture—using multiple databases, each playing to its strengths, coordinated by your application layer into a cohesive system.


Section 1: The AI System Data Model Landscape

AI systems typically need four data models, each serving a distinct purpose:

Relational data (PostgreSQL, MySQL)

Structured data with clear schemas:

  • User accounts, authentication, billing,
  • Transaction records, audit logs,
  • Configuration, feature flags,
  • Relationships between entities (foreign keys).

Why relational: ACID transactions, referential integrity, mature tooling, SQL querying.

Vector data (Pinecone, Qdrant, Milvus)

High-dimensional vectors for similarity search:

  • Document embeddings for RAG,
  • Image embeddings for visual search,
  • Agent memory embeddings for recall,
  • Semantic search across content.

Why vector databases: optimized ANN algorithms, high-dimensional indexing, similarity search at scale.

Document data (MongoDB, DynamoDB)

Flexible schema data:

  • AI-generated content (articles, summaries, translations),
  • Conversation histories and agent state,
  • Unstructured metadata and annotations,
  • Rapidly evolving data structures.

Why document stores: schema flexibility, nested data structures, horizontal scaling.

Graph data (Neo4j, Amazon Neptune)

Relationship-centric data:

  • Knowledge graphs for domain knowledge,
  • Entity relationships for RAG,
  • Agent interaction graphs,
  • Dependency tracking for AI pipelines.

Why graph databases: relationship traversal, path finding, graph algorithms.
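The code sketches in the sections that follow lean on client objects like vectorDb and db without pinning down their shapes. The interfaces below are illustrative assumptions, not any specific driver's API; the brute-force in-memory vector client is the kind of stand-in that's handy for tests:

```typescript
// Illustrative client interface for the vector store used in later sketches
// (hypothetical shape, not any specific driver's API).
interface VectorRecord {
  id: string;
  values: number[];
  metadata?: Record<string, unknown>;
}

interface SearchResult {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

interface VectorDatabase {
  upsert(records: VectorRecord[]): Promise<void>;
  search(query: number[], topK: number): Promise<SearchResult[]>;
}

// Brute-force in-memory stand-in: cosine similarity over all records.
// Real vector databases replace this scan with ANN indexes (HNSW, IVF, etc.).
class InMemoryVectorDb implements VectorDatabase {
  private records = new Map<string, VectorRecord>();

  async upsert(records: VectorRecord[]): Promise<void> {
    for (const r of records) this.records.set(r.id, r);
  }

  async search(query: number[], topK: number): Promise<SearchResult[]> {
    const cosine = (a: number[], b: number[]) => {
      let dot = 0, na = 0, nb = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        na += a[i] ** 2;
        nb += b[i] ** 2;
      }
      return dot / (Math.sqrt(na) * Math.sqrt(nb));
    };
    return [...this.records.values()]
      .map(r => ({ id: r.id, score: cosine(query, r.values), metadata: r.metadata }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK);
  }
}
```

The real stores differ in plenty of details (batching, filters, namespaces), but this is the minimal contract the patterns below assume.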


Section 2: Pattern 1—The Unified Relational Core with Specialized Stores

Use a relational database as the system of record, with specialized databases for specific data types:

PostgreSQL (system of record)
    ├── Users, transactions, core entities
    ├── References to vector DB (vector_id)
    ├── References to document DB (document_id)
    └── References to graph DB (node_id)

Implementation

-- PostgreSQL: system of record with references
CREATE TABLE documents (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  title VARCHAR(255) NOT NULL,
  content_type VARCHAR(50),
  -- Reference to vector database
  vector_id VARCHAR(255),
  -- Reference to document store
  document_store_id VARCHAR(255),
  -- Reference to graph database
  graph_node_id VARCHAR(255),
  created_at TIMESTAMPTZ DEFAULT NOW()
);

// When inserting a document, write to all stores
async function createDocument(doc: DocumentInput): Promise<Document> {
  // 1. Generate embedding
  const embedding = await embed(doc.content);

  // 2. Store in vector database
  const vectorResult = await vectorDb.upsert([{
    id: `doc:${doc.id}`,
    values: embedding,
    metadata: { title: doc.title, type: doc.contentType }
  }]);

  // 3. Store full content in document store
  const docStoreResult = await documentDb.collection('documents').insertOne({
    title: doc.title,
    content: doc.content,
    metadata: doc.metadata
  });

  // 4. Create graph relationships
  const graphResult = await graphDb.query(`
    CREATE (d:Document {
      id: $docId,
      title: $title,
      type: $type
    })
    RETURN d
  `, { docId: doc.id, title: doc.title, type: doc.contentType });

  // 5. Store references in PostgreSQL
  const document = await db.query(`
    INSERT INTO documents (id, title, content_type, vector_id, document_store_id, graph_node_id)
    VALUES ($1, $2, $3, $4, $5, $6)
    RETURNING *
  `, [doc.id, doc.title, doc.contentType, `doc:${doc.id}`, docStoreResult.insertedId, graphResult[0].id]);

  return document[0];
}

Query pattern

async function getDocumentWithContext(docId: string): Promise<DocumentWithContext> {
  // 1. Get core data from PostgreSQL
  const [doc] = await db.query('SELECT * FROM documents WHERE id = $1', [docId]);

  // 2. Get similar documents from vector DB
  // (fetch one extra result: the nearest neighbor is the document itself)
  const docEmbedding = await vectorDb.fetch(doc.vector_id);
  const similarDocs = (await vectorDb.search(docEmbedding.values, 6))
    .filter(r => r.id !== doc.vector_id);

  // 3. Get full content from document store
  const fullContent = await documentDb.collection('documents').findOne({
    _id: doc.document_store_id
  });

  // 4. Get relationships from graph DB
  const related = await graphDb.query(`
    MATCH (d:Document {id: $docId})-[r:RELATED_TO]-(related)
    RETURN related
    LIMIT 10
  `, { docId });

  return { ...doc, content: fullContent.content, similarDocs, related };
}
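Deletion needs the same fan-out as creation. A sketch under the same assumptions (the client shapes here are hypothetical, not any specific driver's API): clean up the specialized stores first, so a partial failure can be retried while the PostgreSQL row still holds the references.

```typescript
// Hypothetical client shapes, mirroring the ones used above.
interface RelationalDb {
  query(sql: string, params: unknown[]): Promise<any[]>;
}
interface VectorDb { delete(ids: string[]): Promise<void>; }
interface DocumentDb {
  collection(name: string): { deleteOne(filter: object): Promise<void> };
}
interface GraphDb { query(cypher: string, params: object): Promise<unknown>; }

async function deleteDocument(
  db: RelationalDb,
  vectorDb: VectorDb,
  documentDb: DocumentDb,
  graphDb: GraphDb,
  docId: string
): Promise<void> {
  const [doc] = await db.query('SELECT * FROM documents WHERE id = $1', [docId]);
  if (!doc) return; // already deleted, nothing to do

  // 1. Remove from the specialized stores first.
  await vectorDb.delete([doc.vector_id]);
  await documentDb.collection('documents').deleteOne({ _id: doc.document_store_id });
  await graphDb.query('MATCH (d:Document {id: $docId}) DETACH DELETE d', { docId });

  // 2. Delete the system-of-record row last, so a retry after a
  //    mid-sequence failure can still re-read the references.
  await db.query('DELETE FROM documents WHERE id = $1', [docId]);
}
```

Ordering matters here: if the relational row went first, a crash halfway through would orphan the vector, document, and graph entries with no way to find them again.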

Section 3: Pattern 2—The Polyglot Persistence Layer

Abstract database access behind a repository layer that routes to the correct database:

interface DocumentRepository {
  findById(id: string): Promise<Document>;
  searchSimilar(query: string, topK: number): Promise<Document[]>;
  save(document: Document): Promise<void>;
}

class PolyglotDocumentRepository implements DocumentRepository {
  constructor(
    private relationalDb: PostgreSQL,
    private vectorDb: VectorDatabase,
    private documentDb: DocumentDatabase
  ) {}

  async findById(id: string): Promise<Document> {
    // Core metadata from relational
    const [meta] = await this.relationalDb.query(
      'SELECT * FROM documents WHERE id = $1', [id]
    );

    // Full content from document store
    const content = await this.documentDb.findById(meta.document_store_id);

    return { ...meta, content };
  }

  async searchSimilar(query: string, topK: number): Promise<Document[]> {
    // Use vector DB for similarity search
    const queryEmbedding = await embed(query);
    const results = await this.vectorDb.search(queryEmbedding, topK);

    // Fetch metadata for results
    const ids = results.map(r => r.metadata.docId);
    const metadata = await this.relationalDb.query(
      'SELECT * FROM documents WHERE id = ANY($1)', [ids]
    );

    return results.map(r => ({
      ...metadata.find(m => m.id === r.metadata.docId),
      similarity: r.score
    }));
  }
}

Benefit: application code doesn't know about multiple databases—the repository handles routing.
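That benefit shows up concretely in testing: because application code depends only on the interface, the polyglot implementation can be swapped for an in-memory one with no changes to callers. A minimal sketch, using a trimmed version of the interface (the names here are illustrative):

```typescript
interface Document { id: string; title: string; content?: string; }

// Trimmed repository interface, enough for this example.
interface DocumentRepository {
  findById(id: string): Promise<Document>;
  save(document: Document): Promise<void>;
}

// In-memory implementation: no databases at all, ideal for unit tests.
class InMemoryDocumentRepository implements DocumentRepository {
  private docs = new Map<string, Document>();

  async findById(id: string): Promise<Document> {
    const doc = this.docs.get(id);
    if (!doc) throw new Error(`document ${id} not found`);
    return doc;
  }

  async save(document: Document): Promise<void> {
    this.docs.set(document.id, document);
  }
}

// Application code only sees the interface, so swapping the polyglot
// implementation for the in-memory one requires no changes here.
async function renameDocument(
  repo: DocumentRepository,
  id: string,
  title: string
): Promise<void> {
  const doc = await repo.findById(id);
  await repo.save({ ...doc, title });
}
```

In production you wire in PolyglotDocumentRepository; in tests, the in-memory version. The business logic never knows the difference.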


Section 4: Pattern 3—Event-Driven Data Synchronization

When data changes in one database, propagate changes to others via events:

class DocumentSyncService {
  constructor(
    private db: PostgreSQL,
    private eventBus: EventBus,
    private vectorDb: VectorDatabase,
    private documentDb: DocumentDatabase,
    private graphDb: GraphDatabase
  ) {
    // Listen for document events
    this.eventBus.subscribe('document_created', this.handleDocumentCreated.bind(this));
    this.eventBus.subscribe('document_updated', this.handleDocumentUpdated.bind(this));
  }

  async createDocument(doc: DocumentInput): Promise<Document> {
    // Write to primary store (PostgreSQL) first
    const document = await this.db.query(`
      INSERT INTO documents (title, content_type)
      VALUES ($1, $2)
      RETURNING *
    `, [doc.title, doc.contentType]);

    // Publish event for other databases
    await this.eventBus.publish('document_created', {
      documentId: document[0].id,
      title: doc.title,
      content: doc.content,
      contentType: doc.contentType
    });

    return document[0];
  }

  private async handleDocumentCreated(event: DocumentCreatedEvent): Promise<void> {
    // Parallel writes to all databases
    await Promise.all([
      // Vector DB
      this.vectorDb.upsert([{
        id: `doc:${event.documentId}`,
        values: await embed(event.content),
        metadata: { docId: event.documentId, title: event.title }
      }]),
      // Document DB
      this.documentDb.collection('documents').insertOne({
        _id: event.documentId,
        content: event.content,
        metadata: { title: event.title, type: event.contentType }
      }),
      // Graph DB
      this.graphDb.query(`
        CREATE (d:Document {id: $docId, title: $title})
      `, { docId: event.documentId, title: event.title })
    ]);
  }
}

Benefit: loose coupling between databases, async propagation, fault tolerance (retry failed syncs).
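The "retry failed syncs" part deserves a sketch of its own. One simple approach (the attempt count and delays below are illustrative defaults, not recommendations) is to wrap each per-database sync in bounded retries with exponential backoff:

```typescript
// Retry an async operation up to maxAttempts times, doubling the
// delay between attempts. Rethrows the last error if all attempts fail.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise(resolve => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Wrapping each write in handleDocumentCreated with withRetries makes transient failures (a vector DB restart, a network blip) self-healing; persistent failures still surface as errors and can be routed to a dead-letter queue.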


Section 5: Pattern 4—The Data Mesh with Domain-Owned Databases

For large AI systems, let each domain team own their database choice:

AI System
├── Content Domain (owns: Document DB + Vector DB)
├── User Domain (owns: Relational DB)
├── Knowledge Domain (owns: Graph DB + Vector DB)
└── Agent Domain (owns: Relational DB + Document DB)

Implementation via service boundaries

// Content Service: owns document and vector stores
class ContentService {
  constructor(
    private documentDb: MongoDB,
    private vectorDb: Qdrant
  ) {}

  async ingestDocument(content: string): Promise<Document> {
    const embedding = await embed(content);

    const result = await this.documentDb.collection('docs').insertOne({
      content,
      embedding, // stored here too for simplicity
      createdAt: new Date()
    });
    const docId = result.insertedId.toString();

    await this.vectorDb.upsert([{
      id: docId,
      values: embedding,
      metadata: { source: 'content_service' }
    }]);

    return { id: docId, content };
  }
}

// Knowledge Service: owns graph and vector stores
class KnowledgeService {
  constructor(
    private graphDb: Neo4j,
    private vectorDb: Qdrant
  ) {}

  async addFact(fact: string): Promise<void> {
    const embedding = await embed(fact);

    // Add to knowledge graph
    await this.graphDb.query(`
      CREATE (f:Fact {text: $fact, embedding_id: $embId})
    `, { fact, embId: `fact:${hash(fact)}` });

    // Add to vector store
    await this.vectorDb.upsert([{
      id: `fact:${hash(fact)}`,
      values: embedding,
      metadata: { type: 'fact' }
    }]);
  }
}

Benefit: teams choose the best database for their domain, independent scaling and optimization.


Section 6: Transaction Management Across Databases

The hardest part of multi-model architectures is maintaining consistency across databases.

Commit to the system of record first (most specialized stores don't support two-phase commit)

async function createDocumentWithConsistency(doc: DocumentInput): Promise<void> {
  const client = await db.connect();

  try {
    await client.query('BEGIN');

    // 1. Write to PostgreSQL
    const result = await client.query(
      'INSERT INTO documents (title) VALUES ($1) RETURNING id',
      [doc.title]
    );
    const docId = result.rows[0].id;

    // 2. Prepare vector DB write (if supported)
    // Note: most vector DBs don't support 2PC, so we use a different strategy

    await client.query('COMMIT');

    // 3. Write to other databases AFTER commit (eventual consistency)
    await eventBus.publish('document_created', { docId, ...doc });

  } catch (error) {
    await client.query('ROLLBACK');
    throw error;
  } finally {
    client.release();
  }
}
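One gap in the commit-then-publish approach: if the process crashes after COMMIT but before the publish, the event is lost and the other stores never hear about the document. A transactional outbox closes that gap by writing the event as a row in the same transaction, with a background poller doing the publishing. This in-memory sketch shows the mechanics (the Outbox class is illustrative; in practice the outbox is a database table written inside the document's own transaction):

```typescript
interface OutboxRow {
  id: number;
  topic: string;
  payload: object;
  sentAt?: Date;
}

// In-memory stand-in for an outbox table.
class Outbox {
  private rows: OutboxRow[] = [];
  private nextId = 1;

  // In the real pattern this INSERT happens in the same transaction
  // as the document write, so the event cannot be lost after COMMIT.
  enqueue(topic: string, payload: object): void {
    this.rows.push({ id: this.nextId++, topic, payload });
  }

  // Background poller: publish unsent rows, marking each one sent only
  // after the broker accepts it. A crash mid-publish means a retry
  // on the next drain, not a lost event.
  async drain(
    publish: (topic: string, payload: object) => Promise<void>
  ): Promise<number> {
    let published = 0;
    for (const row of this.rows) {
      if (row.sentAt) continue;
      await publish(row.topic, row.payload);
      row.sentAt = new Date();
      published++;
    }
    return published;
  }
}
```

The trade-off is at-least-once delivery: a crash between publish and marking sent means the event goes out twice, so downstream sync handlers should be idempotent (upserts rather than inserts).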

Saga pattern for long-running transactions

class DocumentCreationSaga {
  async execute(doc: DocumentInput): Promise<void> {
    const sagaId = generateId();

    try {
      // Step 1: Create in relational DB
      const docId = await this.createInRelationalDb(doc, sagaId);

      // Step 2: Create in vector DB
      await this.createInVectorDb(docId, doc, sagaId);

      // Step 3: Create in document DB
      await this.createInDocumentDb(docId, doc, sagaId);

      // Step 4: Mark saga complete
      await this.markSagaComplete(sagaId);

    } catch (error) {
      await this.compensate(sagaId);
      throw error;
    }
  }

  private async compensate(sagaId: string): Promise<void> {
    const steps = await this.getSagaSteps(sagaId);

    // Undo completed steps in reverse order
    for (const step of steps.reverse()) {
      if (step.status === 'completed') {
        await this.undoStep(step);
      }
    }
  }
}
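The compensation logic above can be boiled down to a small, runnable sketch: each step records itself on completion, and on failure the completed steps are undone in reverse order (the names and shapes here are illustrative):

```typescript
interface SagaStep {
  name: string;
  run: () => Promise<void>;
  undo: () => Promise<void>;
}

// Run steps in order; on failure, compensate completed steps in reverse,
// then rethrow so the caller knows the saga did not complete.
async function runSaga(steps: SagaStep[]): Promise<void> {
  const completed: SagaStep[] = [];
  try {
    for (const step of steps) {
      await step.run();
      completed.push(step);
    }
  } catch (error) {
    for (const step of completed.reverse()) {
      await step.undo();
    }
    throw error;
  }
}
```

In the document-creation saga, the undo actions would be the deletes: remove the vector entry, remove the document-store record, remove the relational row. Compensations should themselves be idempotent, since a crash mid-compensation means they may run again.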

Section 7: When to Use Single vs. Multi-Model

Use a single database when:

  • Your data needs are simple (mostly relational with some JSON),
  • Your team is small and can't manage multiple databases,
  • You're early-stage and optimizing for speed of development,
  • Postgres with pgvector and JSONB meets your needs.

-- Postgres can handle many AI use cases
CREATE TABLE ai_content (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  content TEXT,
  metadata JSONB, -- flexible document-like data
  embedding VECTOR(1536), -- vector data
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Vector search (pgvector's <-> distance operator; $1 is the query embedding)
SELECT * FROM ai_content
ORDER BY embedding <-> $1
LIMIT 10;

-- JSON query
SELECT * FROM ai_content
WHERE metadata->>'type' = 'article';

Use multi-model when:

  • You need specialized capabilities (high-write vector DB, graph traversal),
  • Different data types have very different access patterns,
  • You need independent scaling of different data stores,
  • Your team has the operational maturity to manage multiple databases.

Conclusion

Multi-model database architectures for AI systems aren't about using more databases—they're about using the right database for each data type. Relational for structured data, vector for embeddings, document for flexible content, graph for relationships.

The patterns that work: start with a unified relational core (PostgreSQL) and add specialized stores as needed. Use repository patterns to abstract database complexity from application code. Consider event-driven synchronization for loose coupling. And for large systems, let domains own their database choices.

The AI systems that scale best are the ones where each piece of data lives in the database model that serves it best—not the one where you try to force everything into a single model.


Need help designing multi-model database architectures for your AI system?