Google Gemini Lead Intake for Salesforce SMB Sales

Automatically extract and qualify leads from email attachments and forms, pushing structured lead records directly to Salesforce.

google-gemini lead-intake salesforce nextjs document-parsing pii-scrubbing confidence-router agent-budget-engine

The problem

SMB sales teams lose leads buried in email attachments and messy web forms, relying on manual data entry that delays follow-up and lets opportunities slip.

Intro

In this tutorial you’ll build an automated lead intake pipeline that accepts files from email attachments and web forms, parses text from PDFs and images using OCR, extracts structured lead fields with Google Gemini, scrubs PII, classifies each lead by quality, and upserts qualified records into Salesforce. By the end you’ll have a running Next.js API with three endpoints, a full test suite with coverage, and a solid understanding of how to chain real AI packages — @google/genai, @reaatech/confidence-router, @reaatech/agent-budget-engine, @presidio-dev/hai-guardrails, and more — into an ingestion service you can extend.

Prerequisites

Node.js >= 22 with pnpm 10.x installed (pnpm --version should show 10.x)
A Google Gemini API key — get one at aistudio.google.com
An OpenAI API key — used by the LLM cache’s embedding model (text-embedding-3-small)
A Salesforce Developer Edition account (free at developer.salesforce.com) — you’ll need the username, password, and security token
A Langfuse project (free-tier at langfuse.com) for observability tracing
Familiarity with TypeScript and basic Next.js App Router conventions
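
The credentials listed above end up as environment variables, so it can help to check for them once at startup rather than discovering a missing key mid-pipeline. The following is a minimal sketch, not the tutorial's own configuration code: only GEMINI_API_KEY appears in the source, and every other variable name (as well as the missingEnv helper itself) is an assumption made for illustration.

```typescript
// Hypothetical startup check. Only GEMINI_API_KEY is taken from the tutorial's
// code; the remaining names are assumed and may differ in the real project.
const REQUIRED_ENV = [
  "GEMINI_API_KEY",
  "OPENAI_API_KEY",
  "SALESFORCE_USERNAME",
  "SALESFORCE_PASSWORD",
  "SALESFORCE_SECURITY_TOKEN",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
] as const;

/** Return the names of required variables that are unset or blank. */
export function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]?.trim());
}
```

Calling missingEnv(process.env) during boot and failing fast when the returned list is non-empty keeps configuration errors out of request handlers.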

Step 1: Scaffold the Next.js project

Create a fresh Next.js project with the App Router and TypeScript. Next.js 16 with create-next-app gives you the right starting shell — you’ll replace the default content as you go.

Example artifact

A complete, working implementation of this recipe — downloadable as a zip or browsable file by file. Generated by our build pipeline; tested with full coverage before publishing.

183 kB · 103 tests · 100.0% coverage · vitest passing

SHA-256: b08a11aa02c2d6472c8e313d9c1360086cb5b333477623dc4dc7c74ba047bd87

    // Gemini extraction client: wraps @google/genai and turns raw text into an ExtractedLead.
    import { GoogleGenAI } from "@google/genai";
    import { ExtractedLeadSchema } from "../../types/lead-schemas.js";
    import type { ExtractedLead } from "../../types/lead.js";
    import { GeminiError } from "../errors.js";

    // Narrow interface so tests can substitute a fake client.
    export interface GeminiClient {
      generateContent(
        prompt: string | string[],
        opts?: { model?: string },
      ): Promise<{ text?: string }>;
    }

    export function createGeminiClient(): GeminiClient {
      const apiKey = process.env.GEMINI_API_KEY;
      if (!apiKey) {
        throw new GeminiError("GEMINI_API_KEY is not set", 500);
      }
      const ai = new GoogleGenAI({ apiKey });

      return {
        async generateContent(prompt: string | string[], opts?: { model?: string }) {
          try {
            // The prompt is passed through unchanged, whether it is a string or a string array.
            const contents = prompt;
            const model = opts?.model ?? "gemini-2.5-flash";
            const response = await ai.models.generateContent({ model, contents });
            return { text: response.text };
          } catch (e: unknown) {
            // Re-throw error-shaped values as GeminiError, keeping the upstream status if present.
            if (e && typeof e === "object" && "name" in e && "message" in e) {
              const err = e as { name: string; message: string; status?: number };
              throw new GeminiError(`Gemini API error: ${err.message}`, err.status ?? 500);
            }
            throw new GeminiError("Unknown Gemini API error", 500);
          }
        },
      };
    }

    export async function extractLeadFromText(
      ai: GeminiClient,
      rawText: string,
    ): Promise<ExtractedLead> {
      const prompt = [
        "Extract lead information from the following text. Return ONLY valid JSON with these fields:",
        '{ "name": string | null, "company": string | null, "email": string | null, "phone": string | null, "needs": string | null }',
        "Text:",
        rawText,
      ];

      const response = await ai.generateContent(prompt, { model: "gemini-2.5-flash" });
      const text = response.text ?? "";

      let parsed: Record<string, string | null>;
      try {
        parsed = JSON.parse(text) as Record<string, string | null>;
      } catch {
        throw new GeminiError("Failed to parse Gemini response as JSON", 500);
      }

      // Convert JSON nulls to undefined, then validate against the schema.
      const lead = ExtractedLeadSchema.parse({
        name: parsed.name ?? undefined,
        company: parsed.company ?? undefined,
        email: parsed.email ?? undefined,
        phone: parsed.phone ?? undefined,
        needs: parsed.needs ?? undefined,
        source: "email_attachment",
        rawText,
      });
      return lead;
    }

    // Created at import time, so importing this module throws if GEMINI_API_KEY is unset.
    export const geminiClient: GeminiClient = createGeminiClient();
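
extractLeadFromText passes the model's text straight to JSON.parse, which fails if the reply is wrapped in a Markdown code fence even though the prompt asks for JSON only. As one possible hardening step, the sketch below strips such a fence before parsing. parseModelJson is a hypothetical helper, not part of the tutorial's source, and whether your model ever returns fenced output is an assumption worth verifying against real responses.

```typescript
// Hypothetical helper, not part of the original client code.
// The three-backtick marker is built at runtime so it never appears literally here.
const FENCE = "`".repeat(3);

/**
 * Strip an optional Markdown code fence (with or without a language tag)
 * from a model response, then parse what remains as JSON.
 * Throws a SyntaxError when the remaining text is not valid JSON.
 */
export function parseModelJson(text: string): unknown {
  let body = text.trim();
  const fenced =
    body.length >= FENCE.length * 2 && body.startsWith(FENCE) && body.endsWith(FENCE);
  if (fenced) {
    // Drop the opening fence line (including any language tag) and the closing fence.
    const firstNewline = body.indexOf("\n");
    body =
      firstNewline === -1
        ? ""
        : body.slice(firstNewline + 1, body.length - FENCE.length).trim();
  }
  return JSON.parse(body);
}
```

Inside extractLeadFromText this would stand in for the bare JSON.parse call, with the existing try/catch still converting any parse failure into a GeminiError.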

    // PII and prompt-injection guardrails built on @presidio-dev/hai-guardrails.
    import {
      injectionGuard,
      piiGuard,
      GuardrailsEngine,
      SelectionType,
    } from "@presidio-dev/hai-guardrails";
    import type { ExtractedLead } from "../../types/lead.js";
    import { InjectionDetectedError } from "../errors.js";

    export {
      GeminiError,
      InjectionDetectedError,
      BudgetExceededError,
      DocumentParseError,
    } from "../errors.js";

    export function buildGuardrailsEngine(): GuardrailsEngine {
      return new GuardrailsEngine({
        guards: [
          injectionGuard({ roles: ["user"] }, { mode: "heuristic", threshold: 0.7 }),
          piiGuard({ selection: SelectionType.All }),
        ],
      });
    }

    // Returns false as soon as any guard reports a message that did not pass.
    function allGuardsPassed(
      results: Awaited<ReturnType<GuardrailsEngine["run"]>>,
    ): boolean {
      for (const guardResult of results.messagesWithGuardResult) {
        for (const msg of guardResult.messages) {
          if (!msg.passed) return false;
        }
      }
      return true;
    }

    // Run the guards over one input; throw if any guard fails, otherwise return
    // the engine's version of the text (or the original if none is returned).
    export async function scrubInput(text: string): Promise<string> {
      const results = await guardrailsEngine().run([{ role: "user", content: text }]);
      if (!allGuardsPassed(results)) {
        throw new InjectionDetectedError();
      }
      return results.messages[0]?.content ?? text;
    }

    // Send each text field as its own message; results are matched back by index.
    export async function scrubLeadFields(lead: ExtractedLead): Promise<ExtractedLead> {
      const result = await guardrailsEngine().run([
        { role: "user", content: lead.name ?? "" },
        { role: "user", content: lead.company ?? "" },
        { role: "user", content: lead.email ?? "" },
        { role: "user", content: lead.phone ?? "" },
        { role: "user", content: lead.needs ?? "" },
        { role: "user", content: lead.notes ?? "" },
        { role: "user", content: lead.rawText },
      ]);

      // `||` keeps the original value when the engine returns an empty string or nothing.
      return {
        ...lead,
        name: result.messages[0]?.content || lead.name,
        company: result.messages[1]?.content || lead.company,
        email: result.messages[2]?.content || lead.email,
        phone: result.messages[3]?.content || lead.phone,
        needs: result.messages[4]?.content || lead.needs,
        notes: result.messages[5]?.content || lead.notes,
        rawText: result.messages[6]?.content || lead.rawText,
      };
    }

    // Lazily built singleton so the engine is constructed only on first use.
    let _guardrailsEngine: GuardrailsEngine | undefined;

    export function guardrailsEngine(): GuardrailsEngine {
      if (!_guardrailsEngine) {
        _guardrailsEngine = buildGuardrailsEngine();
      }
      return _guardrailsEngine;
    }

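scrubLeadFields relies on the message order and the result indices staying in sync by hand. One way to make that pairing harder to break is to drive both directions from a single ordered field list. The sketch below illustrates the idea without calling the guardrails library at all; LeadText, FIELD_ORDER, toGuardInputs, and mergeScrubbed are hypothetical names introduced here, and LeadText is only an assumed approximation of the text fields on ExtractedLead.

```typescript
// Illustrative only: these names are not part of the tutorial's source
// or of @presidio-dev/hai-guardrails.
interface LeadText {
  name?: string;
  company?: string;
  email?: string;
  phone?: string;
  needs?: string;
  notes?: string;
  rawText: string;
}

// Single source of truth for the order in which fields are sent to the guard.
const FIELD_ORDER = [
  "name", "company", "email", "phone", "needs", "notes", "rawText",
] as const;

/** Build the guard input: one string per field, empty when the field is missing. */
export function toGuardInputs(lead: LeadText): string[] {
  return FIELD_ORDER.map((field) => lead[field] ?? "");
}

/** Merge scrubbed strings back by position; empty or missing results keep the original. */
export function mergeScrubbed(
  lead: LeadText,
  scrubbed: readonly (string | undefined)[],
): LeadText {
  const merged: LeadText = { ...lead };
  FIELD_ORDER.forEach((field, i) => {
    const value = scrubbed[i];
    if (value) merged[field] = value;
  });
  return merged;
}
```

With this shape, adding a field means touching FIELD_ORDER once instead of editing two parallel lists of indices.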