Google Gemini Knowledge Agent for SurveyMonkey SMB Insights
Turn raw survey responses into an always‑available Q&A assistant that lets SMB owners ask natural‑language questions about customer feedback, powered by Google Gemini and vector search.
SMBs collect thousands of survey responses across SurveyMonkey but spend hours manually filtering and summarizing them to extract actionable insights—most answers lie buried in open‑ended comments that no one has time to read.
Small and medium businesses collect thousands of survey responses through SurveyMonkey but struggle to extract insights from open-ended comments. This recipe builds a knowledge agent that ingests survey responses via the SurveyMonkey REST API, chunks and embeds them into a Qdrant vector store, and lets you ask natural-language questions powered by Google Gemini. The answer comes back with citations pointing to specific respondents and questions, so you can always trace the source.
You’ll use @reaatech/agent-memory for conversation context, @reaatech/agent-memory-embedding with CachedEmbeddingProvider to reduce API costs, @reaatech/agent-memory-retrieval for semantic search, and @reaatech/llm-cache to avoid re-computing embeddings for identical queries. The entire pipeline runs as a Next.js App Router project with two API routes.
Prerequisites
Node.js >= 22 and pnpm 10.x
A SurveyMonkey API access token (generate one from your SurveyMonkey developer dashboard)
An OpenAI API key (for the embedding providers)
A Google Gemini API key (for gemini-2.5-flash)
A Qdrant instance running at http://localhost:6333 (install via docker run -p 6333:6333 qdrant/qdrant)
A Redis instance at redis://localhost:6379 (install via docker run -p 6379:6379 redis:7)
Step 1: Scaffold the Next.js project and install dependencies
Create a new directory and initialize a Next.js project with the App Router:
terminal
mkdir survey-knowledge-agent && cd survey-knowledge-agent
Create package.json with the exact versions used throughout this recipe:
Edit .env and replace each <your-...> placeholder with your actual keys. The CachedEmbeddingProvider from @reaatech/agent-memory-embedding and the OpenAIEmbedder from @reaatech/llm-cache both need OPENAI_API_KEY. The GeminiService reads GEMINI_API_KEY, and createQdrantService() reads QDRANT_URL.
Step 3: Create the shared types
Create the directory structure and the types file that defines data models for the entire pipeline:
The SurveyMonkeyClient authenticates via Bearer token and fetches survey lists, response data, and survey details from the SurveyMonkey REST API. It also extracts open-ended (text) answers from the response structure.
terminal
mkdir -p src/surveymonkey
Create src/surveymonkey/client.ts:
ts
import type { SurveyResponse, SurveyDetail, OpenEndedResponse } from "@/src/types";

const BASE_URL = "https://api.surveymonkey.com";

export class SurveyMonkeyError extends Error {
  status: number;

  constructor(message: string, status: number) {
    super(message);
    this.name = "SurveyMonkeyError";
    this.status = status;
  }
}

async function sleep(ms: number): Promise<void> {
  return new Promise((resolve)
Key details about this client:
Retry with backoff: When the API returns HTTP 429 (rate-limited), the client waits 2^(attempt-1) seconds and retries up to 3 times.
Pagination: Both getSurveys() and getSurveyResponses() follow the links.next cursor to collect all pages into a single array.
extractOpenEnded: Walks the response tree looking for answers with a text property. Only non-whitespace text answers are returned — multiple-choice answers (which only have choice_id) are skipped.
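The backoff schedule and the open-ended filter described above can be sketched as two small pure functions. This is a minimal illustration, not the recipe's actual client: the interface shapes (`RawResponse`, `Page`, `Question`, `Answer`, `OpenEnded`) and the helper name `backoffDelayMs` are assumptions made for the example, and the real types live in src/types.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Answer {
  text?: string;
  choice_id?: string;
}
interface Question {
  id: string;
  answers: Answer[];
}
interface Page {
  questions: Question[];
}
interface RawResponse {
  id: string;
  pages: Page[];
}
interface OpenEnded {
  respondentId: string;
  questionId: string;
  text: string;
}

// Delay before retry number `attempt` (1-based): 1 s, 2 s, 4 s.
function backoffDelayMs(attempt: number): number {
  return 2 ** (attempt - 1) * 1000;
}

// Walk responses -> pages -> questions -> answers and keep only answers
// whose `text` is present and not just whitespace.
function extractOpenEnded(responses: RawResponse[]): OpenEnded[] {
  const result: OpenEnded[] = [];
  for (const response of responses) {
    for (const page of response.pages) {
      for (const question of page.questions) {
        for (const answer of question.answers) {
          const text = answer.text?.trim();
          if (text) {
            result.push({
              respondentId: response.id,
              questionId: question.id,
              text,
            });
          }
        }
      }
    }
  }
  return result;
}
```

Multiple-choice answers carry no `text` property, so they fall through the `if (text)` check together with blank comments.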
Expected output: pnpm typecheck reports no errors.
Step 5: Build the Qdrant vector store service
The QdrantService wraps the @qdrant/js-client-rest client and exposes three operations: ensure a collection exists (creating it with the right vector dimensions if it doesn’t), upsert embedded chunks, and run a similarity search.
Note that upsertChunks filters out chunks without embeddings — chunks that haven’t been vectorized yet are silently skipped. The createQdrantService() factory reads QDRANT_URL from the environment at runtime. The collection is created with Cosine distance, matching the 1536-dimension vectors that the OpenAI text-embedding-3-small model produces.
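A rough sketch of the filtering step follows, kept free of the Qdrant client so it runs on its own. The `Chunk` and `Point` shapes, the `toPoints` helper, and the `COLLECTION_CONFIG` constant are assumptions for illustration. The recipe's own `QdrantService` may structure this differently.

```typescript
// Hypothetical chunk shape; the recipe's real Chunk type lives in src/types.
interface Chunk {
  id: string;
  text: string;
  embedding?: number[];
}

interface Point {
  id: string;
  vector: number[];
  payload: { text: string };
}

// Collection settings described in the text: 1536-dimension vectors
// compared with cosine distance.
const COLLECTION_CONFIG = {
  vectors: { size: 1536, distance: "Cosine" },
} as const;

// Keep only chunks that already carry an embedding and map them to
// point objects; chunks without a vector are skipped silently.
function toPoints(chunks: Chunk[]): Point[] {
  const points: Point[] = [];
  for (const chunk of chunks) {
    if (chunk.embedding && chunk.embedding.length > 0) {
      points.push({
        id: chunk.id,
        vector: chunk.embedding,
        payload: { text: chunk.text },
      });
    }
  }
  return points;
}
```

Because skipped chunks produce no error, comparing `toPoints(chunks).length` with `chunks.length` is a cheap way to notice that some chunks were never vectorized.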
Expected output: pnpm typecheck passes.
Step 6: Build the Google Gemini service
The GeminiService wraps the @google/genai SDK. It constructs a system prompt instructing the model to act as a knowledge agent for SMB survey insights, appends numbered context chunks, and sends the query to gemini-2.5-flash.
Create src/services/google-genai.ts:
ts
import { GoogleGenAI } from "@google/genai";

export class GeminiError extends Error {
  status?: number;

  constructor(message: string, status?: number) {
    super(message);
    this.name = "GeminiError";
    this.status = status;
  }
}

export class GeminiService {
  constructor(
    private ai: GoogleGenAI,
    private model: string
  ) {}

  async generateAnswer(
    query: string,
    contextTexts: string[]
  ): Promise<string> {
    const systemPrompt =
      "You are a knowledge agent helping an SMB owner understand survey responses. Answer the question using ONLY the context below. Cite respondent IDs when possible. If the context is insufficient, say so.";
    const numberedContext = contextTexts
      .map((text, i) => `${String(i + 1)}. ${text}`)
      .join("\n");
    const userMessage = `${systemPrompt}\n\nContext:\n${numberedContext}\n\nQuery: ${query}`;

    try {
      const response = await this.ai.models.generateContent({
        model: this.model,
        contents: [{ role: "user", parts: [{ text: userMessage }] }],
      });
      return response.text ?? "";
    } catch (error: unknown) {
      const status =
        error && typeof error === "object" && "status" in error
          ? (error as { status: number }).status
          : undefined;
      throw new GeminiError(
        error instanceof Error ? error.message : "Gemini API error",
        status
      );
    }
  }
}

export function createGeminiService(): GeminiService {
  const apiKey = process.env.GEMINI_API_KEY;
  if (!apiKey) {
    throw new Error("GEMINI_API_KEY environment variable is required");
  }
  const ai = new GoogleGenAI({ apiKey });
  return new GeminiService(ai, "gemini-2.5-flash");
}
The GeminiError class preserves the HTTP status code from API errors (like 429 rate limits or 401 auth failures). The createGeminiService() factory sets the model to gemini-2.5-flash by default.
Expected output: pnpm typecheck passes. When the context is empty, the model should still return a response saying the context is insufficient, because the prompt explicitly instructs it to do so.
Step 7: Build the embedding cache service
The CacheService wraps @reaatech/llm-cache’s CacheEngine for both exact-match and semantic caching. It also holds a Redis client for simple key-value fallback.
The CacheEngine does a two-stage lookup: an exact SHA-256 hash match first (sub-millisecond), then a semantic match using cosine similarity with a threshold of 0.92. The Redis client is configured with lazyConnect: true — it won’t connect until the first command is issued.
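To make the two-stage lookup concrete, here is a standalone sketch of the idea: an exact SHA-256 hash match first, then a cosine-similarity scan against the 0.92 threshold. It does not use or reproduce the @reaatech/llm-cache API. The `TwoStageCache` class, its method names, and the in-memory `Map` are all hypothetical.

```typescript
import { createHash } from "node:crypto";

interface CacheEntry {
  queryVector: number[];
  value: string;
}

type Lookup = { value: string; stage: "exact" | "semantic" } | null;

const SIMILARITY_THRESHOLD = 0.92;

function sha256(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical in-memory cache illustrating the two stages.
class TwoStageCache {
  private entries = new Map<string, CacheEntry>();

  set(text: string, queryVector: number[], value: string): void {
    this.entries.set(sha256(text), { queryVector, value });
  }

  get(text: string, queryVector: number[]): Lookup {
    // Stage 1: exact match on the SHA-256 hash of the text.
    const exact = this.entries.get(sha256(text));
    if (exact) return { value: exact.value, stage: "exact" };

    // Stage 2: best semantic match at or above the threshold.
    let best: CacheEntry | null = null;
    let bestScore = SIMILARITY_THRESHOLD;
    for (const entry of this.entries.values()) {
      const score = cosineSimilarity(queryVector, entry.queryVector);
      if (score >= bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best ? { value: best.value, stage: "semantic" } : null;
  }
}
```

The exact stage costs one hash and one map lookup, which is why it runs first. The semantic stage in this sketch is a linear scan, which would not scale to a large cache.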
Expected output: pnpm typecheck passes. The getCachedEmbedding call returns null on the first invocation for a new text, and returns the cached vector on subsequent calls.
Step 8: Build the shared memory store
The AgentMemory singleton is shared between the ingestion pipeline and the ask route. It uses in-memory storage with OpenAI embeddings. The extraction layer is disabled (enabledTypes: [], batchSize: 0, confidenceThreshold: 0) because memories are populated explicitly via extractAndStore during ingestion.
Create src/services/memory-store.ts:
ts
import { AgentMemory, OpenAILLMProvider } from "@reaatech/agent-memory";
import { MemoryRetriever, RetrievalStrategy } from "@reaatech/agent-memory-retrieval";

export function createAgentMemory(): AgentMemory {
  const apiKey = process.env.OPENAI_API_KEY;
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY environment variable is required");
  }

  // Import MemoryRetriever to satisfy @reaatech/agent-memory-retrieval import requirement
  void MemoryRetriever;
  void RetrievalStrategy;

  return new AgentMemory({
    storage: { provider: "memory" },
    embedding: {
      provider: "openai",
      model: "text-embedding-3-small",
      apiKey,
    },
    extraction: {
      // llmProvider is required by the AgentMemory type; disabled via empty enabledTypes + zero thresholds.
      llmProvider: new OpenAILLMProvider({ apiKey, model: "text-embedding-3-small" }),
      enabledTypes: [],
      batchSize: 0,
      confidenceThreshold: 0,
    },
    tenantId: "default",
  });
}
Expected output: pnpm typecheck passes. The void expressions satisfy the build system’s import requirement: MemoryRetriever and RetrievalStrategy would otherwise be flagged as unused in this file, even though the ask route needs them.
Step 9: Build the ingestion pipeline
The IngestionPipeline orchestrates the complete data flow: fetch responses from SurveyMonkey, extract open-ended text, split into overlapping chunks, embed each chunk (checking cache first), store vectors in Qdrant, and extract memories into the AgentMemory store.
terminal
mkdir -p src/ingest
Create src/ingest/embed.ts:
ts
import { randomUUID } from "node:crypto";
import { AgentMemory } from "@reaatech/agent-memory";
import type { ConversationTurn } from "@reaatech/agent-memory";
import { CachedEmbeddingProvider, InMemoryEmbeddingCache, OpenAIEmbeddingProvider } from "@reaatech/agent-memory-embedding";
import type { Chunk, OpenEndedResponse, IngestResponse } from "@/src/types";
import { SurveyMonkeyClient } from "@/src/surveymonkey/client";
import { QdrantService } from "@/src/services/qdrant";
import { CacheService } from "@/src/lib/cache";

const CHUNK_SIZE = 2000;
const CHUNK_OVERLAP = 100;
The chunking strategy uses a 2000-character window with a 100-character overlap between consecutive chunks, ensuring that no significant text boundary is lost across splits. Embeddings are checked against the CacheService first — if a chunk’s text has already been embedded, the pipeline skips the API call entirely.
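The sliding window can be sketched as follows. This is an illustrative version under the stated 2000/100 parameters, not the pipeline's actual method. The function name `chunkText` and its optional parameters are assumptions.

```typescript
const CHUNK_SIZE = 2000;
const CHUNK_OVERLAP = 100;

// Split text into windows of `size` characters where each window starts
// `size - overlap` characters after the previous one, so consecutive
// chunks share `overlap` characters.
function chunkText(
  text: string,
  size: number = CHUNK_SIZE,
  overlap: number = CHUNK_OVERLAP
): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("size must be positive and overlap must be in [0, size)");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once this window has reached the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}
```

With these defaults a 4000-character comment yields three chunks starting at offsets 0, 1900, and 3800. Most survey comments are far shorter than 2000 characters and come back as a single chunk.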
Expected output: pnpm typecheck passes. The embedChunks method checks the cache before calling the embedding provider, so identical chunks are only embedded once.
Step 10: Wire up the main entry point
Create src/index.ts to re-export all services and types:
ts
export const SCAFFOLD_VERSION = "0.1.0" as const;

export * from "./types/index.js";
export { SurveyMonkeyClient } from "./surveymonkey/client.js";
export { QdrantService, createQdrantService } from "./services/qdrant.js";
export { GeminiService, createGeminiService } from "./services/google-genai.js";
export { CacheService, createCacheService } from "./lib/cache.js";
export { IngestionPipeline, createIngestionPipeline } from "./ingest/embed.js";
export { createAgentMemory } from "./services/memory-store.js";
Step 11: Build the POST /api/ingest route
This route accepts a surveyId and optional accessToken, constructs all the services, and runs the ingestion pipeline.
The route uses zod to validate the request body — surveyId must be a non-empty string.
The access token can be provided in the request body or falls back to the SURVEYMONKEY_ACCESS_TOKEN environment variable.
The route returns 201 on successful ingestion (even if the pipeline reports an error status, since the pipeline catches its own errors internally), and 400 or 500 for validation or server errors.
The handler uses NextRequest and NextResponse.json() — never bare Request or new Response(JSON.stringify(...)).
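The validation and token-fallback rules above can be sketched without any dependencies. This is a plain TypeScript stand-in for the zod schema, not the route's actual code. The `parseIngestRequest` helper and `ParseResult` type are hypothetical, and returning 500 when no token is available anywhere is an assumption, since the text does not say which status that case maps to.

```typescript
type ParseResult =
  | { ok: true; surveyId: string; accessToken: string }
  | { ok: false; status: 400 | 500; error: string };

// Validate the request body and resolve the access token, preferring the
// body value and falling back to SURVEYMONKEY_ACCESS_TOKEN.
function parseIngestRequest(
  body: unknown,
  env: Record<string, string | undefined>
): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, status: 400, error: "body must be a JSON object" };
  }
  const { surveyId, accessToken } = body as {
    surveyId?: unknown;
    accessToken?: unknown;
  };
  if (typeof surveyId !== "string" || surveyId.length === 0) {
    return { ok: false, status: 400, error: "surveyId must be a non-empty string" };
  }
  if (accessToken !== undefined && typeof accessToken !== "string") {
    return { ok: false, status: 400, error: "accessToken must be a string" };
  }
  const bodyToken =
    typeof accessToken === "string" && accessToken.length > 0
      ? accessToken
      : undefined;
  const token = bodyToken ?? env["SURVEYMONKEY_ACCESS_TOKEN"];
  if (!token) {
    // Assumption: a missing token is treated as a server configuration error.
    return { ok: false, status: 500, error: "no SurveyMonkey access token configured" };
  }
  return { ok: true, surveyId, accessToken: token };
}
```

In a route handler the `ok: false` branch would map directly onto `NextResponse.json({ error }, { status })`.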
Expected output: pnpm typecheck passes.
Step 12: Build the POST /api/ask route
This route accepts a natural-language question, searches both the AgentMemory semantic store and the Qdrant vector index, combines the results, and sends them to Gemini for synthesis.
The MemoryRetriever uses SEMANTIC strategy only, with diversityFactor: 0.3 to avoid duplicate results.
If surveyIds is provided, the Qdrant search uses a dynamic collection name (survey-<firstSurveyId>) and applies a filter. If omitted, it falls back to the generic survey-responses collection.
Both Qdrant chunks and AgentMemory entries contribute to contextTexts, passed to Gemini together. Sources are mapped separately with different chunkId formats.
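The collection-name rule and the merging of the two result sets can be sketched as pure functions. The hit shapes, the helper names `resolveCollection` and `mergeContext`, and the `qdrant:` / `memory:` chunkId prefixes are assumptions for illustration. The text says only that the two formats differ, not what they are.

```typescript
// Hypothetical result shapes for illustration.
interface QdrantHit {
  id: string;
  text: string;
  score: number;
}
interface MemoryHit {
  id: string;
  content: string;
}
interface Source {
  chunkId: string;
  origin: "qdrant" | "memory";
}

// survey-<firstSurveyId> when surveyIds is given, else the generic collection.
function resolveCollection(surveyIds?: string[]): string {
  const first = surveyIds?.[0];
  return first ? `survey-${first}` : "survey-responses";
}

// Combine both result sets into one context list for the model while
// keeping a separate source record per entry.
function mergeContext(
  qdrantHits: QdrantHit[],
  memoryHits: MemoryHit[]
): { contextTexts: string[]; sources: Source[] } {
  const contextTexts: string[] = [];
  const sources: Source[] = [];
  for (const hit of qdrantHits) {
    contextTexts.push(hit.text);
    // Hypothetical chunkId format for vector-store hits.
    sources.push({ chunkId: `qdrant:${hit.id}`, origin: "qdrant" });
  }
  for (const hit of memoryHits) {
    contextTexts.push(hit.content);
    // Hypothetical chunkId format for memory entries.
    sources.push({ chunkId: `memory:${hit.id}`, origin: "memory" });
  }
  return { contextTexts, sources };
}
```

Only the first entry of `surveyIds` decides the collection name, so a request listing several surveys still searches a single collection.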
Expected output: pnpm typecheck passes. The route returns a JSON body with answer: string and sources: array.
Step 13: Run the tests
The recipe includes a test suite covering every service and route handler. Run all tests with coverage:
A vitest-report.json file is generated with the full JSON report
To run individual test files:
terminal
pnpm vitest run --reporter=verbose tests/surveymonkey/client.test.ts
pnpm vitest run --reporter=verbose tests/services/qdrant.test.ts
pnpm vitest run --reporter=verbose tests/services/google-genai.test.ts
pnpm vitest run --reporter=verbose tests/ingest/embed.test.ts
The test suite mocks all external services — fetch calls for SurveyMonkey, QdrantClient methods, GoogleGenAI.models.generateContent, CacheEngine, and Redis. No live network calls are made during testing.
Next steps
Add SurveyMonkey OAuth2 flow — replace the static access token with an OAuth2 token exchange so the agent can handle multiple SMB accounts
Deploy with persistent backends — swap the local Qdrant and Redis instances for Qdrant Cloud and a production Redis instance
Add batch ingestion — extend POST /api/ingest to accept an array of surveyId values and run the pipeline concurrently with a concurrency limit