@reaatech/media-pipeline-mcp-elevenlabs

Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

ElevenLabs provider for the media pipeline framework. Delivers high-quality text-to-speech synthesis with configurable voice selection, speaking speed, voice stability tuning, similarity boost, and style exaggeration. Supports multiple output formats and native audio-byte streaming.

Installation

terminal

npm install @reaatech/media-pipeline-mcp-elevenlabs
# or
pnpm add @reaatech/media-pipeline-mcp-elevenlabs

Feature Overview

High-quality TTS with eleven_monolingual_v1, eleven_multilingual_v2, and eleven_turbo_v2 models
Named voice selection (Rachel, Josh, Daniel, Charlotte) plus custom voice IDs
Fine-grained voice tuning: stability (0-1), similarity boost (0-1), style exaggeration (0-1)
Speaking speed control via SSML prosody tags
Multiple output formats: MP3, WAV, OGG, FLAC, AAC
Streaming support for TTS audio bytes (supportsStreaming)
Character-count-based cost estimation

Quick Start

typescript

import { writeFileSync } from "node:fs";
import { ElevenLabsProvider } from "@reaatech/media-pipeline-mcp-elevenlabs";

const provider = new ElevenLabsProvider({ apiKey: process.env.ELEVENLABS_API_KEY! });

// Synthesize speech with an explicit voice, speed, and model
const audio = await provider.execute({
  operation: "audio.tts",
  params: {
    text: "Welcome to our media pipeline. This audio was generated with ElevenLabs.",
    voice: "Rachel",
    speed: 1.0,
    model: "eleven_turbo_v2",
  },
  config: {},
});

// Save or pipe the audio (response_format defaults to mp3)
writeFileSync("output.mp3", audio.data);
console.log(`Generated ${audio.metadata.characterCount} chars in ${audio.metadata.duration}s`);

Supported Operations

Operation	Default Model	Description	Output Format
`audio.tts`	`eleven_monolingual_v1`	Text-to-speech with voice and parameter control	Audio bytes in `mp3`, `wav`, `ogg`, `flac`, or `aac`

Configuration Parameters

`audio.tts`

Parameter	Type	Default	Description
`text`	`string`	required	Text to convert to speech
`voice`	`string`	`Rachel`	Voice name (`Rachel`, `Josh`, `Daniel`, `Charlotte`) or custom voice ID
`speed`	`number`	`1.0`	Speaking rate multiplier (uses SSML prosody)
`model`	`string`	`eleven_monolingual_v1`	TTS model ID
`response_format`	`string`	`mp3`	Output audio format: `mp3`, `wav`, `ogg`, `flac`, `aac`
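
The `speed` row notes that the rate multiplier is applied through SSML prosody. The exact markup the provider emits is not documented here, so the following is only a sketch, assuming the multiplier is written as a percentage `rate` attribute. `applySpeed` and `escapeXml` are hypothetical helpers, not exports of this package.

```typescript
// Hypothetical helpers illustrating how a speed multiplier could map to SSML.

/** Escape characters that are special in XML so the text is safe inside SSML. */
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

/** Wrap text in a prosody tag when speed differs from 1.0 (assumed format). */
function applySpeed(text: string, speed: number = 1.0): string {
  if (!Number.isFinite(speed) || speed <= 0) {
    throw new RangeError(`speed must be a positive number, got ${speed}`);
  }
  // At the default rate there is nothing to change, so the text passes through.
  if (speed === 1.0) return text;
  const rate = `${Math.round(speed * 100)}%`;
  return `<prosody rate="${rate}">${escapeXml(text)}</prosody>`;
}
```

Leaving the text untouched at `1.0` keeps plain-text requests free of markup; only non-default speeds pay the cost of escaping and wrapping.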

Voice Tuning (internal defaults)

The provider applies these voice settings automatically on every request:

Parameter	Default	Description
`stability`	`0.5`	Voice stability (0 = more variable, 1 = more consistent)
`similarity_boost`	`0.75`	Speaker similarity to target voice (0-1)
`style`	`0.0`	Style exaggeration (0-1)
`use_speaker_boost`	`true`	Enhance speaker clarity
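
To show where these defaults might end up, here is a minimal sketch of a request payload. The `model_id` and `voice_settings` field names follow the public ElevenLabs text-to-speech REST API; `buildTtsBody` is a hypothetical helper, and how the provider actually assembles its request is an assumption.

```typescript
interface VoiceSettings {
  stability: number;
  similarity_boost: number;
  style: number;
  use_speaker_boost: boolean;
}

// Defaults copied from the table above.
const DEFAULT_VOICE_SETTINGS: Readonly<VoiceSettings> = {
  stability: 0.5,
  similarity_boost: 0.75,
  style: 0.0,
  use_speaker_boost: true,
};

/** Hypothetical helper: build a TTS request body with the default settings. */
function buildTtsBody(text: string, model: string = "eleven_monolingual_v1") {
  return {
    text,
    model_id: model,
    // Copy the defaults so callers cannot mutate the shared constant.
    voice_settings: { ...DEFAULT_VOICE_SETTINGS },
  };
}
```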

API Reference

`ElevenLabsProvider`

typescript

class ElevenLabsProvider extends MediaProvider {
  constructor(config: ElevenLabsProviderConfig)
 
  healthCheck(): Promise<ProviderHealth>
  estimateCost(input: ProviderInput): Promise<CostEstimate>
  execute(input: ProviderInput): Promise<ProviderOutput>
}

`ElevenLabsProviderConfig`

typescript

interface ElevenLabsProviderConfig {
  apiKey: string;
  voices?: {
    default?: string;
    [voiceName: string]: string | undefined;
  };
  model?: string;    // Default model ID
  timeout?: number;  // Request timeout in ms
}
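
As an illustration, a configuration with a custom voices map might look like the sketch below. The interface is repeated so the snippet stands alone; the voice name `narrator`, its ID, and the API key are placeholders, and the exact semantics of the `default` key are not documented here.

```typescript
interface ElevenLabsProviderConfig {
  apiKey: string;
  voices?: {
    default?: string;
    [voiceName: string]: string | undefined;
  };
  model?: string;
  timeout?: number;
}

const config: ElevenLabsProviderConfig = {
  apiKey: "YOUR_API_KEY", // placeholder; load from the environment in real code
  voices: {
    // Assumption: `default` names the voice to use when none is given.
    default: "narrator",
    // Hypothetical custom voice name mapped to a placeholder voice ID.
    narrator: "voice_narrator_placeholder",
  },
  model: "eleven_multilingual_v2",
  timeout: 30000, // milliseconds
};
```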

Factory Function

typescript

import { defineElevenLabsProvider } from "@reaatech/media-pipeline-mcp-elevenlabs";
 
const provider = defineElevenLabsProvider({ apiKey: process.env.ELEVENLABS_API_KEY! });

Voice Resolution Logic

Voice parameters are resolved in this order:

1. If a custom voices map is configured, the name is looked up there first.
2. If the value starts with voice_ or is exactly 20 characters, it’s treated as a raw voice ID.
3. If the name matches a built-in preset, that voice ID is used.
4. Otherwise, the provider falls back to "Rachel".
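
The resolution order above can be sketched roughly as follows. `resolveVoice` is a hypothetical stand-in for the provider's internal logic, and the preset IDs are placeholders because the real ones are not listed in this document.

```typescript
// Placeholder IDs: the real built-in voice IDs are not given in this README.
const RACHEL_ID = "preset-id-rachel";
const BUILT_IN_VOICES = new Map<string, string>([
  ["Rachel", RACHEL_ID],
  ["Josh", "preset-id-josh"],
  ["Daniel", "preset-id-daniel"],
  ["Charlotte", "preset-id-charlotte"],
]);

/** Hypothetical sketch of the four-step voice resolution. */
function resolveVoice(
  voice: string,
  custom: Record<string, string | undefined> = {},
): string {
  // 1. A configured custom voices map takes priority.
  if (Object.prototype.hasOwnProperty.call(custom, voice)) {
    const mapped = custom[voice];
    if (mapped !== undefined) return mapped;
  }
  // 2. Values that look like raw voice IDs are passed through unchanged.
  if (voice.startsWith("voice_") || voice.length === 20) return voice;
  // 3. Built-in preset names map to their voice IDs.
  const preset = BUILT_IN_VOICES.get(voice);
  if (preset !== undefined) return preset;
  // 4. Anything else falls back to Rachel.
  return RACHEL_ID;
}
```

One consequence of this order is that a custom map entry can shadow a built-in name such as `Josh`, and an unrecognized name never raises an error locally; it resolves to Rachel.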

Key Methods

Method	Returns	Description
`healthCheck()`	`Promise<ProviderHealth>`	Validates API key by fetching `/v1/voices` from the ElevenLabs API
`estimateCost(input)`	`Promise<CostEstimate>`	Estimates cost based on text character count × per-character rate
`execute(input)`	`Promise<ProviderOutput>`	Synthesizes audio and returns raw audio bytes with metadata

Non-Retryable Errors

The provider classifies these errors as non-retryable: authentication failed, invalid API key, permission denied, insufficient credits, voice not found, invalid voice ID.
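
A retry wrapper could use this classification to decide whether another attempt is worthwhile. The sketch below assumes the classification is done by case-insensitive substring matching on the error message; the provider may instead rely on status codes or error types, and `isRetryable` is a hypothetical name.

```typescript
// Messages the README lists as non-retryable, lowercased for matching.
const NON_RETRYABLE_PATTERNS: readonly string[] = [
  "authentication failed",
  "invalid api key",
  "permission denied",
  "insufficient credits",
  "voice not found",
  "invalid voice id",
];

/** Hypothetical helper: true when retrying the request might succeed. */
function isRetryable(error: unknown): boolean {
  const message = (error instanceof Error ? error.message : String(error)).toLowerCase();
  return !NON_RETRYABLE_PATTERNS.some((pattern) => message.includes(pattern));
}
```

Everything not on the list (timeouts, rate limits, transient network failures) is treated as retryable in this sketch.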

Cost Estimation

Per-Character Pricing

Model	Cost / Character
`eleven_turbo_v2`	$0.0002
`eleven_monolingual_v1`	$0.0003
`eleven_multilingual_v2`	$0.0005

Example Estimates

Text Length	Model	Est. Cost
100 chars	`eleven_turbo_v2`	$0.02
100 chars	`eleven_monolingual_v1`	$0.03
500 chars	`eleven_multilingual_v2`	$0.25
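
The estimates above are character count multiplied by the per-character rate. A minimal sketch of that arithmetic follows; `estimateCostUsd` is a hypothetical function, and counting characters with `text.length` (UTF-16 code units) is an assumption about how the provider counts.

```typescript
// Per-character rates in USD, copied from the pricing table.
const COST_PER_CHARACTER = new Map<string, number>([
  ["eleven_turbo_v2", 0.0002],
  ["eleven_monolingual_v1", 0.0003],
  ["eleven_multilingual_v2", 0.0005],
]);

/** Hypothetical helper: estimated cost in USD for synthesizing `text`. */
function estimateCostUsd(text: string, model: string = "eleven_monolingual_v1"): number {
  const rate = COST_PER_CHARACTER.get(model);
  if (rate === undefined) {
    throw new Error(`No pricing known for model: ${model}`);
  }
  // Assumption: one character is one UTF-16 code unit.
  return text.length * rate;
}

// 100 characters on the turbo model; format to cents to hide float noise.
console.log(estimateCostUsd("a".repeat(100), "eleven_turbo_v2").toFixed(2));
```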

Cache Configuration

The provider exposes a static cacheConfig that classifies each parameter as deterministic or non-deterministic.

Deterministic parameters: text, voice_id, voice, model, voice_settings

Non-deterministic parameters: (none)

The normalize() function trims and collapses whitespace in text, and preserves voice settings as-is. All parameters are deterministic, so identical text + voice + model combinations will produce matching cache keys.
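
The normalization described above can be sketched like this. Only the trimming and whitespace collapsing come from the text; the `TtsCacheParams` shape and the `cacheKey` helper are hypothetical, and the real key derivation is not documented here.

```typescript
interface TtsCacheParams {
  text: string;
  voice_id?: string;
  voice?: string;
  model?: string;
  voice_settings?: Record<string, number | boolean>;
}

/** Trim and collapse whitespace in text; leave every other field as-is. */
function normalize(params: TtsCacheParams): TtsCacheParams {
  return { ...params, text: params.text.trim().replace(/\s+/g, " ") };
}

/** Hypothetical key builder over the deterministic parameters. */
function cacheKey(params: TtsCacheParams): string {
  const n = normalize(params);
  // A fixed field order keeps the key independent of property insertion order.
  return JSON.stringify([
    n.text,
    n.voice_id ?? null,
    n.voice ?? null,
    n.model ?? null,
    n.voice_settings ?? null,
  ]);
}
```

With this approach, requests that differ only in stray spaces or line breaks share a cache entry, while any change to voice, model, or voice settings produces a different key.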

Health Check

The health check sends a GET request to https://api.elevenlabs.io/v1/voices using the xi-api-key header. It returns { healthy: true, latency: <ms> } on a 2xx response, or { healthy: false, error: "<message>" } on failure.
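
For readers who want to reproduce the check outside the provider, here is a rough sketch. The URL, header, and result shape come from the description above; `checkHealth`, the `FetchLike` type, and the `HTTP <status>` error text are assumptions, and the fetch function is injected so the sketch can run without network access.

```typescript
interface ProviderHealth {
  healthy: boolean;
  latency?: number; // milliseconds
  error?: string;
}

// Minimal subset of the fetch API that this sketch relies on.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number }>;

/** Hypothetical standalone version of the health check. */
async function checkHealth(apiKey: string, fetchImpl: FetchLike): Promise<ProviderHealth> {
  const started = Date.now();
  try {
    const res = await fetchImpl("https://api.elevenlabs.io/v1/voices", {
      method: "GET",
      headers: { "xi-api-key": apiKey },
    });
    // `ok` is true for 2xx responses only.
    if (!res.ok) return { healthy: false, error: `HTTP ${res.status}` };
    return { healthy: true, latency: Date.now() - started };
  } catch (err) {
    // Network failures and similar errors are reported, not rethrown.
    return { healthy: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

In Node 18+ or a browser, the global `fetch` can be passed as `fetchImpl`.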

Related Packages

@reaatech/media-pipeline-mcp-provider-core: Base provider class
@reaatech/media-pipeline-mcp-server: MCP server
@reaatech/media-pipeline-mcp-openai: Alternative TTS provider (TTS-1)

License

MIT
