reaatech/llm-cache

Last commit: Jun 4, 2026

These packages give you a semantic caching layer for LLM calls. It returns a cached response when a prompt exactly matches an earlier one, or when it is semantically similar to an earlier one above a configurable cosine similarity threshold. You'd adopt them to reduce API costs and latency by avoiding redundant LLM calls, especially when users ask the same question in different phrasings. The system is built as a modular engine with pluggable storage adapters (Redis, DynamoDB, Qdrant) and optional cost tracking, observability, and HTTP server packages. These compose through well-defined interfaces rather than forming a monolithic service.
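
The two lookup tiers described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `TwoTierCache`, `exactKey`, and the `Entry` shape are hypothetical names, embeddings are passed in by the caller instead of coming from an embedder, and the semantic tier is a linear scan where a real vector store would use an index.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape for this sketch; not the library's actual types.
interface Entry {
  model: string;
  response: string;
  embedding: number[];
}

type Hit = { response: string; kind: "exact" | "semantic" };

// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// SHA-256 over model + prompt, so the same prompt on another model is a different key.
function exactKey(model: string, prompt: string): string {
  return createHash("sha256").update(JSON.stringify([model, prompt])).digest("hex");
}

class TwoTierCache {
  private readonly entries = new Map<string, Entry>();
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  set(model: string, prompt: string, embedding: number[], response: string): void {
    this.entries.set(exactKey(model, prompt), { model, response, embedding });
  }

  get(model: string, prompt: string, embedding: number[]): Hit | undefined {
    // Tier 1: exact match on the hashed key.
    const exact = this.entries.get(exactKey(model, prompt));
    if (exact) return { response: exact.response, kind: "exact" };

    // Tier 2: best same-model entry whose similarity reaches the threshold.
    let best: Entry | undefined;
    let bestScore = this.threshold;
    for (const entry of this.entries.values()) {
      if (entry.model !== model) continue;
      const score = cosineSimilarity(embedding, entry.embedding);
      if (score >= bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best ? { response: best.response, kind: "semantic" } : undefined;
  }
}
```

The threshold trades hit rate against the risk of returning an answer to a question that only looks similar, so it is worth tuning per use case rather than treating any one value as a default.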

Packages

7 packages

@reaatech/llm-cache

v0.1.0
A caching engine for LLM calls that provides both exact-match (SHA-256 hash) and semantic (cosine similarity on embeddings) cache lookups, with model-aware fingerprinting, use-case segmentation, and adaptive TTL. It exports a `CacheEngine` class that requires a `StorageAdapter` implementation (e.g., in-memory, Redis, or DynamoDB), a `VectorStorageAdapter` implementation (e.g., in-memory or Qdrant), and an `Embedder` for semantic matching.
Status: published (1 month ago)

@reaatech/llm-cache-adapters-dynamodb

v0.1.0
A DynamoDB storage adapter for `@reaatech/llm-cache` that persists exact-match cache entries with native TTL, GSI-backed metadata queries, and batch operations chunked to AWS limits. Exports a `DynamoDBAdapter` class implementing the `StorageAdapter` interface from `@reaatech/llm-cache`.
Status: published (1 month ago)
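
To illustrate what chunking batch operations to AWS limits involves, here is a small sketch. The limits are DynamoDB's documented ones (25 requests per `BatchWriteItem` call, 100 keys per `BatchGetItem` call); the helper names `chunk` and `expiresAt` are hypothetical and are not taken from the adapter.

```typescript
// DynamoDB's documented per-call limits.
const BATCH_WRITE_LIMIT = 25;
const BATCH_GET_LIMIT = 100;

// Split a list into consecutive slices of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// DynamoDB's native TTL reads expiry from a Number attribute in Unix epoch seconds.
function expiresAt(nowMs: number, ttlSeconds: number): number {
  return Math.floor(nowMs / 1000) + ttlSeconds;
}
```

Each chunk would then go out as one batch request. A production adapter also has to retry whatever the service returns as unprocessed, which this sketch leaves out.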

@reaatech/llm-cache-adapters-qdrant

v0.1.0
A Qdrant vector database adapter for `@reaatech/llm-cache` that implements the `VectorStorageAdapter` interface, providing HNSW approximate nearest neighbor search with metadata filtering and deterministic UUID-based point IDs.
Status: published (1 month ago)
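
Qdrant accepts point IDs that are either unsigned integers or UUIDs, so a cache key has to be mapped to one of those. One common way to get a deterministic UUID is a name-based UUID (version 5), sketched below. Whether the adapter uses version 5 specifically is an assumption, and `uuidV5` and `pointIdFor` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// The standard DNS namespace UUID, used here only as an example namespace.
const EXAMPLE_NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8";

// Name-based UUID (version 5): SHA-1 over namespace bytes followed by the name.
function uuidV5(name: string, namespace: string = EXAMPLE_NAMESPACE): string {
  const nsBytes = Buffer.from(namespace.replace(/-/g, ""), "hex");
  if (nsBytes.length !== 16) throw new Error("namespace must be a UUID");

  const hash = createHash("sha1").update(nsBytes).update(name, "utf8").digest();
  const bytes = Buffer.from(hash.subarray(0, 16));

  // Set the version nibble to 5 and the variant bits to 10xx.
  bytes.writeUInt8((bytes.readUInt8(6) & 0x0f) | 0x50, 6);
  bytes.writeUInt8((bytes.readUInt8(8) & 0x3f) | 0x80, 8);

  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}

// The same cache key always maps to the same point ID.
function pointIdFor(cacheKey: string): string {
  return uuidV5(cacheKey);
}
```

Because the ID is a pure function of the key, writing the same entry twice overwrites the existing point instead of creating a duplicate.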

@reaatech/llm-cache-adapters-redis

v0.1.0
A Redis storage adapter for the `@reaatech/llm-cache` library that implements the `StorageAdapter` interface, providing exact-match cache operations with automatic TTL via `SETEX`, batch operations, and metadata queries using `SCAN`.
Status: published (1 month ago)

@reaatech/llm-cache-cost-tracker

v0.1.0
A cost calculator and pricing database for LLM API usage, providing a `CostCalculator` class that computes per-request costs from token counts and model pricing, and tracks savings from cache hits. It ships with reference pricing for 40+ models across OpenAI, Anthropic, and Google, and implements the `CostCalculatorLike` interface for drop-in integration with `@reaatech/llm-cache`.
Status: published (1 month ago)
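
The arithmetic behind per-request cost and cache savings is simple enough to sketch. Everything here is illustrative: `ModelPricing`, `requestCost`, and `SavingsTracker` are hypothetical names rather than the package's `CostCalculator` API, and the prices in the usage note are made up, not real model pricing.

```typescript
// Hypothetical pricing shape: USD per one million tokens.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Cost of one request given its token counts.
function requestCost(
  pricing: ModelPricing,
  inputTokens: number,
  outputTokens: number,
): number {
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMillion +
    (outputTokens / 1_000_000) * pricing.outputPerMillion
  );
}

// Accumulates the cost of calls that were avoided by cache hits.
class SavingsTracker {
  private saved = 0;
  private hits = 0;

  recordHit(avoidedCost: number): void {
    this.saved += avoidedCost;
    this.hits += 1;
  }

  summary(): { hits: number; savedUsd: number } {
    return { hits: this.hits, savedUsd: this.saved };
  }
}
```

With made-up prices of 2 USD per million input tokens and 8 USD per million output tokens, a request with 1,000 input and 500 output tokens costs about 0.006 USD, and each cache hit for that request adds the same amount to the savings total.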

@reaatech/llm-cache-observability

v0.1.0
A structured JSON logger and Prometheus-compatible metrics collector for LLM cache operations, providing automatic PII redaction on 17 sensitive field names, correlation ID propagation via `child()`, and cardinality-protected counters and histograms with zero runtime dependencies.
Status: published (1 month ago)
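
Field-name-based redaction of the kind described here can be sketched as a recursive walk over the log payload. The four field names below are placeholders chosen for illustration; the package's actual list of 17 names is not given here, and `redact` is a hypothetical helper, not the logger's API.

```typescript
// Placeholder field names for this sketch, compared case-insensitively.
const SENSITIVE_FIELDS = new Set(["password", "apikey", "authorization", "token"]);

const REDACTED = "[REDACTED]";

// Returns a copy of plain JSON-like data with sensitive fields masked at any depth.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_FIELDS.has(key.toLowerCase()) ? REDACTED : redact(inner);
    }
    return out;
  }
  return value;
}
```

Matching on field names keeps the check cheap and predictable, but it will not catch sensitive values stored under unexpected keys or embedded in free-text messages.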

@reaatech/llm-cache-server

v0.1.0
An HTTP server wrapper for llm-cache that exposes a REST API for cache operations, Prometheus metrics, and health endpoints, configurable via environment variables for storage (memory, Redis, DynamoDB) and vector search (memory, Qdrant) backends. Exports `createApp()` returning an `App` object with an `http.Server`, cache engine instance, and `shutdown()` method, plus a `main()` convenience function for direct CLI or programmatic use.
Status: published (1 month ago)
