Cohere MCP Server for SMB Research and Summarization
Expose Cohere's language models and search tools to AI agents via an MCP server, enabling automated research, summarization, and content generation for SMBs.
Small businesses need to integrate AI agents that can research topics, summarize documents, and answer complex queries, but building and maintaining API wrappers for Cohere's capabilities is time-consuming.
This tutorial walks you through building an MCP (Model Context Protocol) server that exposes Cohere’s language models and Tavily’s web search tools as composable AI agent tools. By the end, you’ll have a working server with seven MCP tools — summarization, chat, web search, content extraction, website crawling, PDF parsing, and automated research report generation — wrapped in auth, rate limiting, and OpenTelemetry observability.
Prerequisites
Node.js 22+ and pnpm installed (the project pins pnpm@10.0.0)
The project starts with a Next.js 16 scaffold (App Router) that provides the build toolchain — TypeScript, Vitest, ESLint, and pnpm workspaces. This shell is already on disk; your job is to build the server code inside it.
.env.example — placeholder environment variables (you’ll review it next)
app/layout.tsx and app/page.tsx
Install the dependencies:
terminal
pnpm install
Expected output: No errors. A node_modules/ directory and pnpm-lock.yaml exist.
Step 2: Configure environment variables
Open .env.example and review the entries. The Cohere V2 client auto-detects COHERE_API_KEY from the environment, while Tavily reads TAVILY_API_KEY directly. The MCP server uses API_KEY for client authentication.
The file should already contain:
env
# Env vars for cohere-mcp-server-for-smb-research-and-summarization
# Keep placeholders only — never commit real values.
NODE_ENV=development
PORT=8080

# Cohere API key (V2 client auto-detects COHERE_API_KEY from env)
COHERE_API_KEY=<your-cohere-api-key>

# Tavily API key for web search/extract/crawl
TAVILY_API_KEY=<your-tavily-api-key>

# MCP server auth (not required in dev)
API_KEY=<your-mcp-api-key>
AUTH_MODE=api-key
AUTH_BYPASS_IN_DEV=true

# Observability
LOG_LEVEL=info
OTEL_EXPORTER_OTLP_ENDPOINT=<your-otlp-endpoint>
OTEL_SERVICE_NAME=cohere-mcp-server

# Rate limiting and CORS
RATE_LIMIT_RPM=60
CORS_ORIGIN=*
IDEMPOTENCY_TTL_MS=300000
Now copy it to .env and fill in your real API keys:
terminal
cp .env.example .env
Expected output: Both .env.example and .env exist in your project root.
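To fail fast on a misconfigured .env, the variables above could be validated once at startup. The following is a minimal sketch, assuming a hypothetical loadConfig helper that is not part of the recipe. Only the variable names and default values come from .env.example, and the rule that the auth bypass applies only when NODE_ENV is development is an assumption.

```typescript
// Hypothetical helper: not part of the recipe's source tree.
export interface ServerConfig {
  port: number;
  rateLimitRpm: number;
  idempotencyTtlMs: number;
  authBypassInDev: boolean;
  tavilyApiKey: string;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer, falling back to a default when the variable is unset.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === '') return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Reject values that are missing or still look like <placeholders>.
function requiredSecret(env: Env, name: string): string {
  const raw = env[name];
  if (!raw || /^<.*>$/.test(raw)) {
    throw new Error(`${name} is missing or still a placeholder`);
  }
  return raw;
}

export function loadConfig(env: Env): ServerConfig {
  return {
    port: intFromEnv(env, 'PORT', 8080),
    rateLimitRpm: intFromEnv(env, 'RATE_LIMIT_RPM', 60),
    idempotencyTtlMs: intFromEnv(env, 'IDEMPOTENCY_TTL_MS', 300000),
    // Assumption: the bypass flag only takes effect in development.
    authBypassInDev: env['NODE_ENV'] === 'development' && env['AUTH_BYPASS_IN_DEV'] === 'true',
    tavilyApiKey: requiredSecret(env, 'TAVILY_API_KEY'),
  };
}
```

Calling loadConfig(process.env) right after dotenv has loaded would surface a forgotten placeholder before the first tool call rather than during it.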
Step 3: Create the Cohere client wrapper
Create src/lib/cohere.ts. This wrapper encapsulates the Cohere V2 SDK (cohere-ai) behind clean TypeScript interfaces. Its CohereClient class constructs the SDK client with new CohereClientV2({}), which picks up COHERE_API_KEY from the environment, so the tools never handle the key themselves.
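The tools in later steps call summarize() and chat() on this wrapper. One way a summarize method could be layered on top of a chat call is sketched below. This is an illustration under stated assumptions, not the recipe's implementation: the chat backend is injected as a plain function (ChatFn is a hypothetical type) so the example runs without the cohere-ai SDK, and the prompt wording is invented for the example.

```typescript
// Sketch only: ChatFn stands in for whatever the real wrapper calls on the SDK.
type Role = 'user' | 'system' | 'assistant';
interface ChatMessage { role: Role; content: string }
type ChatFn = (params: {
  messages: ChatMessage[];
  model?: string;
  temperature?: number;
}) => Promise<{ text: string }>;

interface SummarizeParams {
  text: string;
  model?: string;
  length?: 'short' | 'medium' | 'long';
  format?: 'paragraph' | 'bullets';
  temperature?: number;
}

// Turn the summarize options into a system instruction plus the user's text.
export function buildSummaryMessages(params: SummarizeParams): ChatMessage[] {
  const length = params.length ?? 'medium';
  const shape = params.format === 'bullets' ? 'a bulleted list' : 'a single paragraph';
  return [
    { role: 'system', content: `Summarize the user's text as ${shape}. Target length: ${length}.` },
    { role: 'user', content: params.text },
  ];
}

// Run the summary through the injected chat function and normalize whitespace.
export async function summarizeWith(
  chat: ChatFn,
  params: SummarizeParams,
): Promise<{ summary: string }> {
  const result = await chat({
    messages: buildSummaryMessages(params),
    model: params.model,
    temperature: params.temperature,
  });
  return { summary: result.text.trim() };
}
```

Injecting the chat function also makes the wrapper easy to unit test with a stub instead of a live API key.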
Each tool lives under src/tools/ and uses the defineTool() helper from @reaatech/mcp-server-tools. Tools are auto-discovered at startup — no manual registration needed.
Create src/tools/summarize.tool.ts:
ts
import { defineTool } from '@reaatech/mcp-server-tools';
import { z } from 'zod';
import { textContent, errorResponse } from '@reaatech/mcp-server-core';
import { CohereClient } from '../lib/cohere.js';
import { recordToolInvocation, recordError } from '@reaatech/mcp-server-observability';

export default defineTool({
  name: 'summarize',
  description: 'Summarize text using Cohere',
  inputSchema: z.object({
    text: z.string().describe('The text to summarize'),
    model: z.string().optional().describe('Cohere model ID'),
    length: z.enum(['short', 'medium', 'long']).optional().describe('Summary length'),
    format: z.enum(['paragraph', 'bullets']).optional().describe('Output format'),
    temperature: z.number().min(0).max(2).optional().describe('Generation temperature'),
  }),
  handler: async (args, _) => {
    const start = Date.now();
    const toolName = 'summarize';
    const client = new CohereClient();
    const result = await client
      .summarize({
        text: args.text as string,
        model: args.model as string | undefined,
        length: args.length as 'short' | 'medium' | 'long' | undefined,
        format: args.format as 'paragraph' | 'bullets' | undefined,
        temperature: args.temperature as number | undefined,
      })
      .catch((err: unknown) => {
        const message = err instanceof Error ? err.message : 'Summary failed';
        recordError({ errorType: 'tool_execution_failed', toolName });
        recordToolInvocation({ toolName, status: 'error', durationMs: Date.now() - start });
        return { error: message };
      });
    if ('error' in result) {
      return errorResponse((result as { error: string }).error);
    }
    recordToolInvocation({ toolName, status: 'success', durationMs: Date.now() - start });
    return { content: [textContent((result as { summary: string }).summary)] };
  },
});
Expected output: pnpm typecheck still passes.
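Every tool handler in this tutorial repeats the same pattern: start a timer, run the work, and record either a success or an error with the elapsed time. That pattern could be factored into a helper along these lines. This is a sketch, not part of the recipe: instrument and Recorder are hypothetical names, and the recorder callbacks stand in for recordToolInvocation and recordError so the example runs on its own. Unlike the handlers in the tutorial, it records success immediately after the work finishes rather than after formatting the response.

```typescript
// Hypothetical helper; the recorder stands in for the observability package.
interface Recorder {
  invocation(event: { toolName: string; status: 'success' | 'error'; durationMs: number }): void;
  error(event: { errorType: string; toolName: string }): void;
}

type Outcome<T> = { ok: true; value: T } | { ok: false; error: string };

export async function instrument<T>(
  toolName: string,
  fallbackMessage: string,
  recorder: Recorder,
  work: () => Promise<T>,
): Promise<Outcome<T>> {
  const start = Date.now();
  try {
    const value = await work();
    recorder.invocation({ toolName, status: 'success', durationMs: Date.now() - start });
    return { ok: true, value };
  } catch (err: unknown) {
    // Non-Error throwables fall back to a tool-specific message, as in the handlers above.
    const error = err instanceof Error ? err.message : fallbackMessage;
    recorder.error({ errorType: 'tool_execution_failed', toolName });
    recorder.invocation({ toolName, status: 'error', durationMs: Date.now() - start });
    return { ok: false, error };
  }
}
```

The tagged Outcome type replaces the 'error' in result check and the casts that follow it, since TypeScript can narrow on the ok flag.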
Step 7: Create the search-web MCP tool
Create src/tools/search-web.tool.ts. This tool wraps Tavily’s web search and formats results with titles, URLs, relevance scores, and content snippets.
ts
import { defineTool } from '@reaatech/mcp-server-tools';
import { z } from 'zod';
import { textContent, errorResponse } from '@reaatech/mcp-server-core';
import { TavilyClient } from '../lib/tavily.js';
import type { ToolResponse } from '@reaatech/mcp-server-core';
import { recordToolInvocation, recordError } from '@reaatech/mcp-server-observability';

export default defineTool({
  name: 'search-web',
  description: 'Search the web using Tavily search engine',
  inputSchema: z.object({
    query: z.string().describe('Search query'),
    maxResults: z.number().min(1).max(20).optional().describe('Maximum number of results'),
    includeAnswer: z.boolean().optional().describe('Include an answer in the response'),
    searchDepth: z.enum(['basic', 'advanced']).optional().describe('Search depth'),
    includeDomains: z.array(z.string()).optional().describe('Domains to include'),
    excludeDomains: z.array(z.string()).optional().describe('Domains to exclude'),
  }),
  handler: async (args, _): Promise<ToolResponse> => {
    const start = Date.now();
    const toolName = 'search-web';
    const result = await (async () => {
      const client = new TavilyClient({ apiKey: process.env.TAVILY_API_KEY ?? '' });
      return await client.search({
        query: args.query as string,
        maxResults: args.maxResults as number | undefined,
        includeAnswer: args.includeAnswer as boolean | undefined,
        searchDepth: args.searchDepth as 'basic' | 'advanced' | undefined,
        includeDomains: args.includeDomains as string[] | undefined,
        excludeDomains: args.excludeDomains as string[] | undefined,
      });
    })().catch((err: unknown) => {
      const message = err instanceof Error ? err.message : 'Search failed';
      recordError({ errorType: 'tool_execution_failed', toolName });
      recordToolInvocation({ toolName, status: 'error', durationMs: Date.now() - start });
      return { error: message };
    });
    if ('error' in result) {
      return errorResponse((result as { error: string }).error);
    }
    const data = result as {
      results: Array<{ title: string; url: string; content: string; score: number }>;
      answer?: string;
    };
    const formatted = data.results
      .map(
        (r, i) =>
          `### ${String(i + 1)}. ${r.title}\nURL: ${r.url}\nScore: ${String(r.score)}\n\n${r.content}`,
      )
      .join('\n\n');
    const answer = data.answer ? `**Answer:** ${data.answer}\n\n` : '';
    recordToolInvocation({ toolName, status: 'success', durationMs: Date.now() - start });
    return { content: [textContent(`${answer}${formatted}`)] };
  },
});
Expected output: pnpm typecheck passes.
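To see what an agent actually receives from search-web, the formatting step can be pulled out as a pure function and run on sample data. The sketch below mirrors the template strings in the handler; formatSearchResults and SearchHit are names introduced here for illustration, and the sample result is invented.

```typescript
interface SearchHit {
  title: string;
  url: string;
  content: string;
  score: number;
}

// Render hits as numbered Markdown sections, with the optional answer on top.
export function formatSearchResults(results: SearchHit[], answer?: string): string {
  const header = answer ? `**Answer:** ${answer}\n\n` : '';
  const body = results
    .map((r, i) => `### ${i + 1}. ${r.title}\nURL: ${r.url}\nScore: ${r.score}\n\n${r.content}`)
    .join('\n\n');
  return `${header}${body}`;
}

// Invented sample data, for illustration only.
const sampleHits: SearchHit[] = [
  { title: 'Example', url: 'https://example.com', content: 'Snippet', score: 0.9 },
];

console.log(formatSearchResults(sampleHits, 'A short answer'));
// **Answer:** A short answer
//
// ### 1. Example
// URL: https://example.com
// Score: 0.9
//
// Snippet
```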
Step 8: Create the extract MCP tool
Create src/tools/extract.tool.ts. This tool takes URLs and extracts raw web page content via Tavily.
ts
import { defineTool } from '@reaatech/mcp-server-tools';
import { z } from 'zod';
import { textContent, errorResponse } from '@reaatech/mcp-server-core';
import { TavilyClient } from '../lib/tavily.js';
import type { ToolResponse } from '@reaatech/mcp-server-core';
import { recordToolInvocation, recordError } from '@reaatech/mcp-server-observability';

export default defineTool({
  name: 'extract',
  description: 'Extract raw content from URLs using Tavily',
  inputSchema: z.object({
    urls: z.array(z.string().url()).min(1).max(20).describe('URLs to extract content from'),
  }),
  handler: async (args, _): Promise<ToolResponse> => {
    const start = Date.now();
    const toolName = 'extract';
    const result = await (async () => {
      const client = new TavilyClient({ apiKey: process.env.TAVILY_API_KEY ?? '' });
      return await client.extract({ urls: args.urls as string[] });
    })().catch((err: unknown) => {
      const message = err instanceof Error ? err.message : 'Extraction failed';
      recordError({ errorType: 'tool_execution_failed', toolName });
      recordToolInvocation({ toolName, status: 'error', durationMs: Date.now() - start });
      return { error: message };
    });
    if ('error' in result) {
      return errorResponse((result as { error: string }).error);
    }
    const data = result as {
      results: Array<{ url: string; rawContent: string }>;
      failedResults: Array<{ url: string; error: string }>;
    };
    const parts: string[] = [];
    for (const r of data.results) {
      parts.push(`=== ${r.url} ===\n${r.rawContent}`);
    }
    for (const f of data.failedResults) {
      parts.push(`=== ${f.url} ===\n[Failed to extract: ${f.error}]`);
    }
    recordToolInvocation({ toolName, status: 'success', durationMs: Date.now() - start });
    return { content: [textContent(parts.join('\n\n'))] };
  },
});
Expected output: pnpm typecheck passes.
Step 9: Create the crawl MCP tool
Create src/tools/crawl.tool.ts. This tool crawls a website starting from a given URL using Tavily.
ts
import { defineTool } from '@reaatech/mcp-server-tools';
import { z } from 'zod';
import { textContent, errorResponse } from '@reaatech/mcp-server-core';
import { TavilyClient } from '../lib/tavily.js';
import type { ToolResponse } from '@reaatech/mcp-server-core';
import { recordToolInvocation, recordError } from '@reaatech/mcp-server-observability';

export default defineTool({
  name: 'crawl',
  description: 'Crawl a website using Tavily',
  inputSchema: z.object({
    url: z.string().url().describe('Starting URL to crawl'),
    maxDepth: z.number().min(1).max(5).optional().describe('Maximum crawl depth'),
    limit: z.number().min(1).max(100).optional().describe('Maximum number of pages to crawl'),
    instructions: z.string().optional().describe('Crawling instructions'),
  }),
  handler: async (args, _): Promise<ToolResponse> => {
    const start = Date.now();
    const toolName = 'crawl';
    const result = await (async () => {
      const client = new TavilyClient({ apiKey: process.env.TAVILY_API_KEY ?? '' });
      return await client.crawl({
        url: args.url as string,
        maxDepth: args.maxDepth as number | undefined,
        limit: args.limit as number | undefined,
        instructions: args.instructions as string | undefined,
      });
    })().catch((err: unknown) => {
      const message = err instanceof Error ? err.message : 'Crawl failed';
      recordError({ errorType: 'tool_execution_failed', toolName });
      recordToolInvocation({ toolName, status: 'error', durationMs: Date.now() - start });
      return { error: message };
    });
    if ('error' in result) {
      return errorResponse((result as { error: string }).error);
    }
    const data = result as { results: Array<{ url: string; rawContent: string }> };
    const formatted = data.results
      .map(
        (r) =>
          `=== ${r.url} ===\n${r.rawContent.slice(0, 2000)}${r.rawContent.length > 2000 ? '...' : ''}`,
      )
      .join('\n\n');
    recordToolInvocation({ toolName, status: 'success', durationMs: Date.now() - start });
    return { content: [textContent(formatted)] };
  },
});
Expected output: pnpm typecheck passes.
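The crawl handler caps each page at 2,000 characters so that a large site does not flood the agent's context window. The same truncation rule can be expressed as a small, testable helper. This is a sketch with hypothetical names (truncate, formatCrawlPages); the 2,000-character default and the "..." marker are taken from the handler above.

```typescript
// Cut text to maxChars and append a marker only when something was removed.
export function truncate(text: string, maxChars = 2000, marker = '...'): string {
  return text.length > maxChars ? `${text.slice(0, maxChars)}${marker}` : text;
}

// Render crawled pages in the same "=== url ===" layout the crawl tool uses.
export function formatCrawlPages(
  pages: Array<{ url: string; rawContent: string }>,
  maxChars = 2000,
): string {
  return pages
    .map((p) => `=== ${p.url} ===\n${truncate(p.rawContent, maxChars)}`)
    .join('\n\n');
}
```

Making the limit a parameter would also let a caller trade detail for breadth, for example a smaller per-page budget when the crawl limit is high.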
Step 10: Create the chat MCP tool
Create src/tools/chat.tool.ts. This is the general-purpose Cohere chat tool. It accepts an optional system prompt and user message.
ts
import { defineTool } from '@reaatech/mcp-server-tools';
import { z } from 'zod';
import { textContent, errorResponse } from '@reaatech/mcp-server-core';
import { CohereClient } from '../lib/cohere.js';
import { recordToolInvocation, recordError } from '@reaatech/mcp-server-observability';

export default defineTool({
  name: 'chat',
  description: 'Chat with Cohere language model',
  inputSchema: z.object({
    message: z.string().describe('User message'),
    system: z.string().optional().describe('System prompt'),
    model: z.string().optional().describe('Cohere model ID'),
    temperature: z.number().min(0).max(2).optional().describe('Generation temperature'),
    maxTokens: z.number().min(1).max(8192).optional().describe('Maximum tokens to generate'),
  }),
  handler: async (args, _) => {
    const start = Date.now();
    const toolName = 'chat';
    const client = new CohereClient();
    const system = args.system as string | undefined;
    const messages: Array<{ role: 'user' | 'system' | 'assistant'; content: string }> = [];
    if (system) {
      messages.push({ role: 'system', content: system });
    }
    messages.push({ role: 'user', content: args.message as string });
    const result = await client
      .chat({
        messages,
        model: args.model as string | undefined,
        temperature: args.temperature as number | undefined,
        maxTokens: args.maxTokens as number | undefined,
      })
      .catch((err: unknown) => {
        const message = err instanceof Error ? err.message : 'Chat failed';
        recordError({ errorType: 'tool_execution_failed', toolName });
        recordToolInvocation({ toolName, status: 'error', durationMs: Date.now() - start });
        return { error: message };
      });
    if ('error' in result) {
      return errorResponse((result as { error: string }).error);
    }
    recordToolInvocation({ toolName, status: 'success', durationMs: Date.now() - start });
    return { content: [textContent((result as { text: string }).text)] };
  },
});
Expected output: pnpm typecheck passes.
Step 11: Create the parse-document MCP tool
Create src/tools/parse-document.tool.ts. This tool accepts a Base64-encoded PDF, decodes it, and returns extracted text with page count.
ts
import { defineTool } from '@reaatech/mcp-server-tools';
import { z } from 'zod';
import { textContent, errorResponse } from '@reaatech/mcp-server-core';
import { extractTextFromPdf } from '../lib/pdf.js';
import type { ToolResponse } from '@reaatech/mcp-server-core';
import { recordToolInvocation, recordError } from '@reaatech/mcp-server-observability';

export default defineTool({
  name: 'parse-document',
  description: 'Extract text from a Base64-encoded PDF document',
  inputSchema: z.object({
    content: z.string().describe('Base64-encoded PDF content'),
  }),
  handler: async (args, _): Promise<ToolResponse> => {
    const start = Date.now();
    const toolName = 'parse-document';
    const result = await (async () => {
      const buffer = Buffer.from(args.content as string, 'base64');
      return await extractTextFromPdf(buffer);
    })().catch((err: unknown) => {
      const message = err instanceof Error ? err.message : 'PDF parsing failed';
      recordError({ errorType: 'tool_execution_failed', toolName });
      recordToolInvocation({ toolName, status: 'error', durationMs: Date.now() - start });
      return { error: message };
    });
    if ('error' in result) {
      return errorResponse((result as { error: string }).error);
    }
    recordToolInvocation({ toolName, status: 'success', durationMs: Date.now() - start });
    return {
      content: [textContent(JSON.stringify({ text: result.text, numPages: result.numPages }))],
    };
  },
});
Expected output: pnpm typecheck passes.
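Node's Buffer.from(value, 'base64') does not throw on malformed input; it silently skips characters it cannot decode. A caller therefore may want to check that the decoded bytes look like a PDF before parsing. The sketch below shows the client-side encoding and one possible guard. Both helpers (encodePdf, decodePdf) are hypothetical and not part of the recipe; the check relies only on the "%PDF-" signature that PDF files begin with.

```typescript
import { Buffer } from 'node:buffer';

// Encode raw PDF bytes into the Base64 string the parse-document tool expects.
export function encodePdf(bytes: Uint8Array): string {
  return Buffer.from(bytes).toString('base64');
}

// Decode Base64 content and verify the "%PDF-" signature before parsing.
export function decodePdf(content: string): Buffer {
  const buffer = Buffer.from(content, 'base64');
  if (buffer.subarray(0, 5).toString('latin1') !== '%PDF-') {
    throw new Error('Content does not look like a PDF (missing %PDF- header)');
  }
  return buffer;
}
```

On the client, something like encodePdf(await readFile('report.pdf')) would produce the content argument; the file name here is only an example.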
Step 12: Create the generate-report MCP tool
Create src/tools/generate-report.tool.ts. This is the most sophisticated tool — it orchestrates Tavily search and extraction with Cohere chat to produce a synthesized research report. Concurrency is capped with p-limit.
ts
import { defineTool } from '@reaatech/mcp-server-tools';
import { z } from 'zod';
import { textContent, errorResponse } from '@reaatech/mcp-server-core';
import { TavilyClient } from '../lib/tavily.js';
import { CohereClient } from '../lib/cohere.js';
import pLimit from 'p-limit';
import type { ToolResponse } from '@reaatech/mcp-server-core';
import { recordToolInvocation, recordError } from '@reaatech/mcp-server-observability';

export default defineTool({
  name: 'generate-report',
  description: 'Research a topic via web search and generate a structured report using Cohere',
  inputSchema: z.
Expected output: pnpm typecheck passes.
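Capping concurrency matters here because a report may fan out into many extraction calls, and firing them all at once can trip upstream rate limits. The recipe uses the p-limit package for this. To show the idea without the dependency, here is a minimal stand-in written from scratch; createLimiter is a hypothetical name and this is not p-limit's actual implementation.

```typescript
// Minimal stand-in for a concurrency limiter, for illustration only.
export function createLimiter(concurrency: number) {
  let active = 0;
  const queue: Array<() => void> = [];

  // Start the next queued task if a slot is free.
  const next = (): void => {
    if (active >= concurrency) return;
    const run = queue.shift();
    if (run) run();
  };

  return function limit<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      queue.push(() => {
        active += 1;
        // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
        Promise.resolve()
          .then(task)
          .then(resolve, reject)
          .finally(() => {
            active -= 1;
            next();
          });
      });
      next();
    });
  };
}

// Usage sketch (fetchPage is hypothetical):
//   const limit = createLimiter(3);
//   const pages = await Promise.all(urls.map((url) => limit(() => fetchPage(url))));
```

Results come back in input order because Promise.all preserves the order of the promises it is given, regardless of which task finishes first.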
Step 13: Create the server entry point
Create src/index.ts. The createApp() function auto-discovers all tool files under src/tools/ and mounts auth middleware, rate limiting, and observability. The startServer() function listens on the configured PORT.
ts
import 'dotenv/config';
import { startServer } from '@reaatech/mcp-server-engine';
import { initObservability, logger } from '@reaatech/mcp-server-observability';

async function main(): Promise<void> {
  await initObservability();
  logger.info('Starting Cohere MCP Server...');
  await startServer();
}

main().catch((err: unknown) => {
  logger.error({ err }, 'Failed to start server');
  process.exit(1);
});

export { createApp, startServer } from '@reaatech/mcp-server-engine';
export { authMiddleware } from '@reaatech/mcp-server-auth';
Expected output: pnpm typecheck passes with no errors.
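The engine applies rate limiting for you, driven by RATE_LIMIT_RPM. To make the setting concrete, the sketch below shows one common way a requests-per-minute limit can work: a fixed window per client. This illustrates the concept only; it is not the algorithm used by @reaatech/mcp-server-engine, and FixedWindowRateLimiter is a hypothetical class.

```typescript
// Illustration only: a fixed-window, per-client request counter.
export class FixedWindowRateLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limitPerMinute: number;
  private readonly now: () => number;

  // The clock is injectable so the limiter can be tested without waiting.
  constructor(limitPerMinute: number, now: () => number = () => Date.now()) {
    this.limitPerMinute = limitPerMinute;
    this.now = now;
  }

  // Returns true when the request is allowed, false when the client is over its limit.
  allow(clientId: string): boolean {
    const t = this.now();
    const current = this.windows.get(clientId);
    if (!current || t - current.start >= 60_000) {
      // First request, or the previous window expired: open a new window.
      this.windows.set(clientId, { start: t, count: 1 });
      return true;
    }
    if (current.count >= this.limitPerMinute) return false;
    current.count += 1;
    return true;
  }
}
```

With RATE_LIMIT_RPM=60, a limiter like this would let each client make 60 calls in any one window and reject the 61st until the window rolls over.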
Step 14: Run the tests
The project includes a test suite. Run it to verify everything works end to end:
terminal
pnpm test
Expected output: All 76 tests pass across 12 test files.
The coverage report shows high coverage across source files — src/lib/pdf.ts and all tool files reach 100% on most metrics.
Next steps
Add a Tavily context tool — extend the server with a tool that wraps Tavily’s context API to retrieve relevant context snippets from the web for grounding LLM responses
Add streaming support — expose a Streamable HTTP transport alongside the default SSE transport so clients can stream Cohere chat responses token by token
Deploy with Docker — create a Dockerfile that builds and runs the server, accepting env vars at runtime for COHERE_API_KEY and TAVILY_API_KEY
: string }>;
  model?: string;
  temperature?: number;
  maxTokens?: number;
}

export interface SummarizeResult {
  summary: string;
  usage?: { inputTokens: number; outputTokens: number };
}

export interface ChatResult {
  text: string;
  usage?: { inputTokens: number; outputTokens: number };