reaatech/llm-cache

★ 0 · Last commit: Apr 30, 2026

These packages provide a semantic and exact-match caching layer for LLM interactions, with support for embedding-based similarity, model-aware fingerprinting, and cost tracking. Adopt them to reduce latency and API spend by serving cached responses for semantically equivalent prompts, backed by storage such as Redis, DynamoDB, or Qdrant. The architecture is modular: a core engine composes pluggable storage adapters, cost calculators, and observability utilities, and it can run either inside application code or as a standalone HTTP sidecar.
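To make "model-aware fingerprinting" concrete, here is a minimal sketch of one way an exact-match key could be derived so that the same prompt sent to two different models never shares a cache entry. The `fingerprint` function, the `FingerprintInput` type, and the choice of fields are assumptions for illustration, not the library's actual API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape: the fields the real library hashes are not documented here.
type FingerprintInput = { model: string; prompt: string; temperature?: number };

// Derive a stable exact-match key from the model, sampling settings, and prompt.
function fingerprint(input: FingerprintInput): string {
  // Collapse whitespace so trivially different prompts map to the same key.
  const normalized = input.prompt.trim().replace(/\s+/g, " ");
  // A JSON array gives an unambiguous, order-stable encoding of the parts.
  const canonical = JSON.stringify([input.model, input.temperature ?? null, normalized]);
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}
```

Including the model name in the hashed material is the design choice that matters here: it trades a lower hit rate for the guarantee that a response generated by one model is not served for another.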

Packages

7 packages

@reaatech/llm-cache

Provides a multi-stage caching engine for LLM responses that performs exact-match lookups and embedding-based semantic similarity searches. It exposes a `CacheEngine` class that requires pluggable storage adapters for both metadata and vector data, along with an embedder for semantic matching.

Status: awaiting publish
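The multi-stage lookup described for `@reaatech/llm-cache` can be sketched as an exact-match check followed by a cosine-similarity scan. This is a self-contained, in-memory illustration only: `TwoStageCache`, `toyEmbed`, and the `0.9` threshold are hypothetical names and values, and the real `CacheEngine` delegates storage and embedding to its pluggable adapters.

```typescript
// Hypothetical sketch; none of these names come from @reaatech/llm-cache.
type Entry = { prompt: string; response: string; vector: number[] };
type Hit = { response: string; stage: "exact" | "semantic" };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

class TwoStageCache {
  private readonly byPrompt = new Map<string, Entry>();
  private readonly entries: Entry[] = [];
  private readonly embed: (text: string) => number[];
  private readonly threshold: number;

  constructor(embed: (text: string) => number[], threshold = 0.9) {
    this.embed = embed;
    this.threshold = threshold;
  }

  set(prompt: string, response: string): void {
    const existing = this.byPrompt.get(prompt);
    if (existing) { existing.response = response; return; }
    const entry: Entry = { prompt, response, vector: this.embed(prompt) };
    this.byPrompt.set(prompt, entry);
    this.entries.push(entry);
  }

  get(prompt: string): Hit | undefined {
    // Stage 1: exact match on the raw prompt string.
    const exact = this.byPrompt.get(prompt);
    if (exact) return { response: exact.response, stage: "exact" };
    // Stage 2: linear scan for the most similar stored embedding.
    const query = this.embed(prompt);
    let best: Entry | undefined;
    let bestScore = -Infinity;
    for (let i = 0; i < this.entries.length; i++) {
      const score = cosine(query, this.entries[i].vector);
      if (score > bestScore) { bestScore = score; best = this.entries[i]; }
    }
    if (best && bestScore >= this.threshold) {
      return { response: best.response, stage: "semantic" };
    }
    return undefined;
  }
}

// Toy embedder for demonstration: counts of the letters a-z, ignoring case.
function toyEmbed(text: string): number[] {
  const v: number[] = new Array(26).fill(0);
  const lower = text.toLowerCase();
  for (let i = 0; i < lower.length; i++) {
    const idx = lower.charCodeAt(i) - 97;
    if (idx >= 0 && idx < 26) v[idx] += 1;
  }
  return v;
}
```

A production setup would replace the linear scan with a vector index and the toy embedder with a real embedding model; the ordering of the two stages is what keeps cheap exact hits from paying for an embedding call.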

@reaatech/llm-cache-adapters-dynamodb

Provides a DynamoDB storage adapter class for the `llm-cache` library, enabling persistent caching with native TTL support and GSI-based metadata querying. It implements the `StorageAdapter` interface and requires a pre-configured DynamoDB table with specific partition and sort key schemas.

Status: awaiting publish
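As a rough illustration of what native TTL support implies for the DynamoDB adapter, the sketch below builds a cache item with an expiry attribute. DynamoDB's TTL feature expects a Number attribute holding a Unix timestamp in seconds; everything else here (the `pk`/`sk` values, the `expiresAt` attribute name, and both helper functions) is hypothetical, since this listing does not give the required table schema.

```typescript
// Hypothetical item shape; the real key schema is not specified in this listing.
type CacheItem = { pk: string; sk: string; response: string; expiresAt: number };

// Build an item whose expiresAt attribute is a Unix timestamp in seconds,
// which is the format DynamoDB's TTL feature reads.
function buildItem(
  key: string,
  response: string,
  ttlSeconds: number,
  nowMs: number = Date.now(),
): CacheItem {
  return {
    pk: `cache#${key}`,
    sk: "entry",
    response,
    expiresAt: Math.floor(nowMs / 1000) + ttlSeconds,
  };
}

// Read-side guard: treat an item as expired once its timestamp has passed.
function isExpired(item: CacheItem, nowMs: number = Date.now()): boolean {
  return item.expiresAt <= Math.floor(nowMs / 1000);
}
```

The read-side check is worth keeping because DynamoDB removes expired items in the background rather than at the exact expiry time, so a lookup can still return an item whose TTL has already passed.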

@reaatech/llm-cache-adapters-qdrant

Provides a Qdrant vector database adapter for the `llm-cache` library, implementing the `VectorStorageAdapter` interface for semantic search and metadata filtering. It exposes a `QdrantAdapter` class that handles collection auto-provisioning, deterministic UUID-based point management, and hybrid vector-metadata queries.

Status: awaiting publish
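Deterministic UUID-based point management matters because Qdrant point IDs must be either unsigned integers or UUIDs, so an arbitrary cache key cannot be used directly. One way to get a stable ID is to hash the key and format the digest as a UUID. The sketch below does this with SHA-256 and marks the result as a version 8 (custom) UUID; `keyToUuid` is a hypothetical helper, and the `QdrantAdapter` may well use a different scheme such as a namespaced UUIDv5.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: map any cache key to a stable, UUID-formatted point ID.
function keyToUuid(key: string): string {
  // Copy the first 16 bytes of the digest so they can be modified safely.
  const bytes = Buffer.from(createHash("sha256").update(key, "utf8").digest().subarray(0, 16));
  bytes[6] = (bytes[6] & 0x0f) | 0x80; // version nibble 8: custom, hash-derived UUID
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // RFC variant bits (10xx)
  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}
```

Because the ID is a pure function of the key, writing the same cache entry twice overwrites one point instead of creating a duplicate, and a point can be deleted without first searching for it.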

@reaatech/llm-cache-adapters-redis

Provides a Redis storage adapter for the `llm-cache` library, enabling persistent key-value caching with support for batch operations and metadata-based invalidation. It exports a `RedisAdapter` class that implements the `StorageAdapter` interface for use with the `CacheEngine`.

Status: awaiting publish

@reaatech/llm-cache-cost-tracker

Calculates LLM request costs and cache savings using a built-in database of pricing for over 40 models. It provides a `CostCalculator` class that implements the `CostCalculatorLike` interface for direct integration with `@reaatech/llm-cache`.

Status: awaiting publish
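The arithmetic behind cost and savings tracking is simple enough to sketch. The example below assumes per-million-token pricing split into input and output rates, and treats each cache hit as saving the full cost of the request it replaced. The model names and prices are placeholders, and `requestCost` and `savings` are hypothetical helpers rather than methods of `CostCalculator`.

```typescript
// Placeholder prices in USD per million tokens; not real model pricing.
type Price = { inputPerMTok: number; outputPerMTok: number };

const PRICES: Record<string, Price> = {
  "example-small": { inputPerMTok: 1, outputPerMTok: 2 },
  "example-large": { inputPerMTok: 10, outputPerMTok: 30 },
};

type Usage = { model: string; inputTokens: number; outputTokens: number };

// Cost of one request: tokens times the per-token rate for each direction.
function requestCost(u: Usage): number {
  const price = PRICES[u.model];
  if (!price) throw new Error(`unknown model: ${u.model}`);
  return (u.inputTokens * price.inputPerMTok + u.outputTokens * price.outputPerMTok) / 1e6;
}

// Savings: the summed cost of every request that was answered from the cache.
function savings(cacheHits: Usage[]): number {
  return cacheHits.reduce((total, hit) => total + requestCost(hit), 0);
}
```

With these placeholder rates, a request to `example-small` using 1,000 input tokens and 500 output tokens costs (1000 × 1 + 500 × 2) / 1,000,000 = 0.002 USD, and that is also the amount saved each time the same request is served from the cache.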

@reaatech/llm-cache-observability

Provides structured NDJSON logging with automatic PII redaction and Prometheus-compatible metrics collection for the `llm-cache` library. It exposes `Logger` and `MetricsCollector` classes to handle request-scoped correlation IDs and telemetry exposition.

Status: awaiting publish
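For readers unfamiliar with the format, NDJSON logging means one JSON object per line. The sketch below shows the general shape of a redacting, correlation-aware log line. It only masks email addresses with a simple regular expression; `redact`, `toNdjsonLine`, and the field names are hypothetical, and the package's `Logger` presumably covers more PII categories than this.

```typescript
// Deliberately simple email pattern; real PII detection needs far more than this.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Recursively replace email addresses in strings, arrays, and plain objects.
function redact(value: unknown): unknown {
  if (typeof value === "string") return value.replace(EMAIL, "[REDACTED]");
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) out[key] = redact(source[key]);
    return out;
  }
  return value;
}

// Serialize one log record as a single newline-terminated JSON line.
function toNdjsonLine(
  correlationId: string,
  message: string,
  fields: Record<string, unknown> = {},
): string {
  const record = { ts: new Date().toISOString(), correlationId, message, ...fields };
  return JSON.stringify(redact(record)) + "\n";
}
```

Redacting the whole record just before serialization, rather than at each call site, means a caller cannot forget to mask a field, and carrying the same `correlationId` on every line lets all entries for one request be grouped later.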

@reaatech/llm-cache-server

Provides a REST API server for managing LLM cache operations, including semantic search and exact-match lookups. It exposes a configurable HTTP interface that supports Redis, DynamoDB, and Qdrant backends via environment variables.

Status: awaiting publish
