Anthropic RAG Pipeline for Google Workspace SMB Email Knowledge Search

Ask questions in plain English and get answers with citations from your entire Google Workspace email and documents – no more manual searching.

The problem

Small and midsize business (SMB) teams lose hours digging through Gmail threads and Drive files for critical information. An AI knowledge agent that indexes Workspace content and answers questions with source links lets them ask once and check the answer against the cited thread or file.

Intro

This tutorial walks you through building an AI-powered knowledge search pipeline for Google Workspace that lets you ask questions in plain English and get answers with citations from your Gmail and Drive content. The pipeline ingests emails and documents nightly, generates embeddings with VoyageAI, stores the vectors in PostgreSQL via pgvector, and answers questions with Anthropic's Claude, adding streaming, semantic caching, session continuity, and cost telemetry. Quality gates from the REAA evaluation stack catch retrieval regressions before users notice them.
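To make the "semantic caching" idea concrete, here is a minimal in-memory sketch. It is not the tutorial's cache module, which is not shown on this page. The class name `SemanticCache`, the 0.95 default threshold, and the choice to pass embeddings in directly (rather than calling VoyageAI) are all assumptions made for illustration.

```typescript
// Hypothetical sketch of a semantic cache: reuse a stored answer when a new
// query's embedding is close enough to a previously answered query.
type CacheEntry<T> = { embedding: number[]; model: string; value: T };

// Cosine similarity of two equal-length vectors; returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

class SemanticCache<T> {
  private entries: CacheEntry<T>[] = [];
  private readonly threshold: number;

  // The 0.95 default is an assumed starting point; tune it on real queries.
  constructor(threshold = 0.95) {
    this.threshold = threshold;
  }

  store(embedding: number[], model: string, value: T): void {
    this.entries.push({ embedding, model, value });
  }

  // Returns the most similar entry for the same model if it meets the threshold.
  lookup(embedding: number[], model: string): { hit: boolean; value?: T } {
    let best: CacheEntry<T> | undefined;
    let bestScore = -Infinity;
    for (const entry of this.entries) {
      if (entry.model !== model) continue;
      const score = cosineSimilarity(entry.embedding, embedding);
      if (score > bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best !== undefined && bestScore >= this.threshold
      ? { hit: true, value: best.value }
      : { hit: false };
  }
}
```

Keying entries by model as well as by embedding avoids serving an answer generated by a different model. A production version would likely persist entries and expire them after each nightly ingestion, since the underlying content changes.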

Prerequisites

Node.js 22+ and pnpm 10+
A Google Cloud service account with Gmail readonly and Drive readonly scopes, plus domain-wide delegation configured in your Google Workspace Admin Console
An Anthropic API key (claude-sonnet-4-6 access)
A VoyageAI API key for embeddings
PostgreSQL 16+ with the pgvector extension installed
A Langfuse account (optional, for observability)
Basic familiarity with Next.js App Router, TypeScript, and PostgreSQL
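Several of these prerequisites end up as environment variables, so it can help to fail fast when one is missing. The sketch below is one way to do that under stated assumptions: `ANTHROPIC_API_KEY` and `VOYAGE_API_KEY` are the names used in the example test's mock, while `DATABASE_URL` and the helper name `loadEnv` are hypothetical and may differ from the tutorial's own configuration step.

```typescript
// Required variable names. DATABASE_URL is an assumed name for the
// PostgreSQL connection string; adjust to match your own configuration.
const REQUIRED_VARS = ["ANTHROPIC_API_KEY", "VOYAGE_API_KEY", "DATABASE_URL"] as const;
type RequiredVar = (typeof REQUIRED_VARS)[number];
type Env = Record<RequiredVar, string>;

// Validates a key/value source and returns a typed object,
// or throws one error listing every missing variable.
function loadEnv(source: Record<string, string | undefined>): Env {
  const missing = REQUIRED_VARS.filter((name) => {
    const value = source[name];
    return value === undefined || value.trim() === "";
  });
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  const env = {} as Env;
  for (const name of REQUIRED_VARS) {
    env[name] = source[name] as string;
  }
  return env;
}
```

In a Node.js process you would typically call `loadEnv(process.env)` once at startup. Reporting all missing names in a single error saves a round of fix-and-retry during setup.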

Step 1: Create the Project Scaffold

Start by creating the Next.js project and installing all dependencies.

terminal

npx create-next-app@latest anthropic-rag-workspace --typescript --app --eslint --use-pnpm
cd

Example artifact

A complete, working implementation of this recipe, downloadable as a zip or browsable file by file. Generated by our build pipeline and tested before publishing.

187 kB · 127 tests · 97.0% coverage · vitest passing

SHA-256: 77df3d77df48b791bd68fca890464b8ddf0d3dea552972fbb8a2d97cc6329582

import { getDb } from "../db/connection.js"
import { generateEmbedding, generateEmbeddings } from "./embedder.js"
import pgvector from 'pgvector'
import type { DocumentChunk } from "../types/index.js"

export function chunkText(text: string, opts?: { chunkSize?: number; overlap?: number }): string[] {
  const chunkSize = opts?.chunkSize ?? 1000
  const overlap = opts?.overlap ?? 200
  const chunks: string[] = []
  let start = 0
  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length)
    chunks.push(text.slice(start, end))
    if (end === text.length) break
    start = end - overlap
  }
  return chunks
}

export async function embedAndStore(
  content: string,
  meta: { sourceId: string; sourceType: string; chunkIndex: number; metadata: Record<string, unknown> }
): Promise<void> {
  const sql = getDb()
  const chunks = chunkText(content)
  const embeddings = await generateEmbeddings(chunks)
  for (let i = 0; i < chunks.length; i++) {
    const embeddingSql = pgvector.toSql(embeddings[i])
    const metaJson: string = JSON.stringify(meta.metadata)
    await sql`
      INSERT INTO chunks (source_id, source_type, chunk_index, text, embedding, metadata)
      VALUES (${meta.sourceId}, ${meta.sourceType}, ${meta.chunkIndex + i}, ${chunks[i]}, ${embeddingSql}::vector, ${metaJson}::jsonb)
    `
  }
}

export async function searchSimilar(queryText: string, limit?: number): Promise<DocumentChunk[]> {
  const sql = getDb()
  const n = limit ?? 10
  const embedding = await generateEmbedding(queryText)
  const embeddingSql = pgvector.toSql(embedding)
  const rows = await sql`
    SELECT id, source_id, source_type, text, embedding, metadata,
           1 - (embedding <=> ${embeddingSql}::vector) AS relevance
    FROM chunks
    ORDER BY embedding <=> ${embeddingSql}::vector
    LIMIT ${n}
  `
  function toStr(val: unknown): string {
    if (typeof val === "string") return val;
    if (typeof val === "number" || typeof val === "boolean") return String(val);
    return "";
  }
  return rows.map((row: Record<string, unknown>) => ({
    id: toStr(row.id),
    sourceId: toStr(row.source_id),
    sourceType: toStr(row.source_type) as "email" | "drive",
    text: toStr(row.text),
    embedding: Array.isArray(row.embedding) ? (row.embedding as number[]) : [],
    metadata: typeof row.metadata === "object" && row.metadata !== null ? (row.metadata as Record<string, unknown>) : {},
    relevance: typeof row.relevance === "number" ? row.relevance : 0,
  }))
}
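The chunker in the listing above slides a fixed-size window and steps back by `overlap` characters after each chunk. A small worked example makes the window arithmetic easier to see. The function below is a copy of that logic with one added guard, under the hypothetical name `chunkTextGuarded`: when `overlap` is greater than or equal to `chunkSize`, the window never advances, so this sketch rejects such options up front. The guard is an assumption about desirable behavior, not something the original listing does.

```typescript
// Copy of the sliding-window chunker with an added option check.
function chunkTextGuarded(
  text: string,
  opts?: { chunkSize?: number; overlap?: number }
): string[] {
  const chunkSize = opts?.chunkSize ?? 1000;
  const overlap = opts?.overlap ?? 200;
  // Added guard: without it, overlap >= chunkSize would loop forever.
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("expected chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break;
    // Step back so consecutive chunks share `overlap` characters.
    start = end - overlap;
  }
  return chunks;
}

// Windows start at 0, 3, and 6: each chunk repeats the last character
// of the previous one.
const demoChunks = chunkTextGuarded("abcdefghij", { chunkSize: 4, overlap: 1 });
// demoChunks is ["abcd", "defg", "ghij"]
```

With the defaults (1000 and 200), a 2,500-character document yields three chunks of 1000, 1000, and 900 characters, starting at offsets 0, 800, and 1600. In the search query, pgvector's `<=>` operator is cosine distance, so `1 - (embedding <=> query)` is the cosine similarity reported as `relevance`.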

import { describe, it, expect, vi } from "vitest";

const mockGetDb = vi.hoisted(() => vi.fn());
const mockGenerateEmbedding = vi.hoisted(() => vi.fn().mockResolvedValue([0.1, 0.2, 0.3]));
const mockGenerateEmbeddings = vi.hoisted(() => vi.fn());

vi.mock("../../src/config/env.js", () => ({
  env: { VOYAGE_API_KEY: "vo-test-key", ANTHROPIC_API_KEY: "sk-ant-test" },
}));
vi.mock("../../src/db/connection.js", () => ({
  getDb: mockGetDb,
}));
vi.mock("../../src/rag/embedder.js", () => ({
  generateEmbedding: mockGenerateEmbedding,
  generateEmbeddings: mockGenerateEmbeddings,
}));

import { checkCache, storeInCache } from "../../src/rag/cache.js";
import { initSessionManager, createChatSession, addTurn, getContext } from "../../src/services/session.js";
import { recordLlmCall, getCostSpans, resetCostSpans } from "../../src/services/cost-telemetry.js";

describe("chat flow integration", () => {
  it("cache miss and session round-trip work together", async () => {
    const cacheHit = await checkCache("test query", "claude-sonnet-4-6");
    expect(cacheHit.hit).toBe(false);

    const span = recordLlmCall("anthropic", "claude-sonnet-4-6", 50, 200);
    expect(span.inputTokens).toBe(50);
    expect(span.outputTokens).toBe(200);
    expect(span.costUsd).toBeGreaterThan(0);

    const manager = initSessionManager();
    const session = await createChatSession(manager, "user-1");
    const msg = await addTurn(manager, session.id, "user", "test query");
    expect(msg.role).toBe("user");
    expect(msg.content).toBe("test query");

    const ctx = await getContext(manager, session.id);
    expect(ctx).toBeDefined();
  });

  it("cache hit returns stored response", async () => {
    const response = { answer: "cached answer", citations: [] };
    await storeInCache("repeated query", response, "claude-sonnet-4-6");
    const result = await checkCache("repeated query", "claude-sonnet-4-6");
    expect(result.hit).toBe(true);
    expect(result.entry).toBeDefined();
  });

  it("cost telemetry accumulates across multiple llm calls", () => {
    resetCostSpans();
    recordLlmCall("anthropic", "claude-sonnet-4-6", 100, 50);
    recordLlmCall("anthropic", "claude-sonnet-4-6", 200, 100);
    const spans = getCostSpans();
    expect(spans).toHaveLength(2);
    expect(spans[0].inputTokens).toBe(100);
    expect(spans[1].inputTokens).toBe(200);
  });
});
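The test above exercises `recordLlmCall`, `getCostSpans`, and `resetCostSpans`, but the module behind them is not shown on this page. The sketch below is one plausible shape that would satisfy those assertions. The function names mirror the test's imports; everything else is an assumption, including the `CostSpan` fields beyond those the test checks, the `totalCostUsd` helper, and the per-million-token rates, which are illustrative placeholders rather than published pricing.

```typescript
type CostSpan = {
  provider: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
};

// Placeholder USD rates per million tokens. These numbers are assumptions
// for illustration; look up current pricing before relying on them.
const RATES_PER_MTOK: Record<string, { input: number; output: number }> = {
  "claude-sonnet-4-6": { input: 3, output: 15 },
};
const FALLBACK_RATE = { input: 1, output: 1 };

// Module-level store, matching the reset/get pattern the test relies on.
const spans: CostSpan[] = [];

function recordLlmCall(
  provider: string,
  model: string,
  inputTokens: number,
  outputTokens: number
): CostSpan {
  const rate = RATES_PER_MTOK[model] ?? FALLBACK_RATE;
  const costUsd = (inputTokens * rate.input + outputTokens * rate.output) / 1_000_000;
  const span: CostSpan = { provider, model, inputTokens, outputTokens, costUsd };
  spans.push(span);
  return span;
}

// Returns a copy so callers cannot mutate the recorded history.
function getCostSpans(): CostSpan[] {
  return [...spans];
}

function resetCostSpans(): void {
  spans.length = 0;
}

// Hypothetical helper: total spend across all recorded calls.
function totalCostUsd(): number {
  return spans.reduce((sum, span) => sum + span.costUsd, 0);
}
```

Because the spans live in module state, tests that assert on counts need to call `resetCostSpans()` first, as the third test case does.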

Intro

Prerequisites

Step 1: Create the Project Scaffold

Step 2: Configure Environment Variables

Step 3: Define Shared Domain Types

Step 4: Create the Environment Configuration

Step 5: Set Up the Database Connection

Step 6: Build the Content Parser

Step 7: Create the Gmail Ingestion

Step 8: Create the Drive Ingestion

Step 9: Build the Ingestion Orchestrator

Step 10: Create the VoyageAI Embedder

Step 11: Build the RAG Pipeline

Step 12: Add Semantic Caching

Step 13: Implement Session Continuity

Step 14: Add Cost Telemetry

Step 15: Configure Langfuse Observability

Step 16: Create the Chat API Route

Step 17: Build the Quality Gates

Step 18: Add the Chat UI

Step 19: Create a Chat Flow Integration Test

Next steps