
@reaatech/agent-handoff-compression


Status: Pre-1.0 — APIs may change in minor versions. Pin to a specific version in production.

Context compression strategies for reducing conversation history before agent handoff. Includes three built-in compressors with configurable token budgets and a pluggable interface for custom implementations.

Installation

terminal
npm install @reaatech/agent-handoff-compression
# or
pnpm add @reaatech/agent-handoff-compression

Feature Overview

  • Three built-in strategies — hybrid, summary, and sliding-window compressors
  • Hybrid compression — sliding window + extractive summary + key fact extraction + entity detection + intent identification
  • Extractive summarization — sentence scoring by position weight, length penalty, and keyword bonus
  • Sliding window — most recent N messages within token budget, newest-first iteration
  • Fast token estimation — heuristic CJK/ASCII counter, no external tokenizer required
  • Pluggable — inject a custom TokenCounter (e.g. tiktoken) for your LLM’s exact tokenizer
  • Composable — each compressor implements ContextCompressor from @reaatech/agent-handoff
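The extractive-summarization bullet above can be sketched roughly as follows. This is an illustration of the general technique (position weight, length penalty, keyword bonus), not the library's actual implementation; the weights, thresholds, and function name are invented for the example.

```typescript
// Illustrative sketch of extractive sentence scoring. Weights and
// thresholds here are made up; the library's internals may differ.
function scoreSentence(
  sentence: string,
  index: number,
  total: number,
  keywords: string[],
): number {
  // Position weight: sentences near the start or end outrank those buried
  // in the middle (1.0 at the edges, 0.5 at the center).
  const pos = total > 1 ? index / (total - 1) : 0;
  const positionWeight = 0.5 + Math.abs(pos - 0.5);
  // Length penalty: very short or very long sentences are down-weighted.
  const words = sentence.split(/\s+/).filter(Boolean).length;
  const lengthPenalty = words < 5 || words > 40 ? 0.5 : 1;
  // Keyword bonus: +0.2 per matched keyword.
  const lower = sentence.toLowerCase();
  const bonus =
    keywords.filter((k) => lower.includes(k.toLowerCase())).length * 0.2;
  return positionWeight * lengthPenalty + bonus;
}
```

The top-scoring sentences are then kept, in original order, until the token budget is filled.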

Quick Start

typescript
import {
  HybridCompressor,
  SummaryCompressor,
  SlidingWindowCompressor,
  SimpleTokenCounter,
} from '@reaatech/agent-handoff-compression';
 
const compressor = new HybridCompressor(new SimpleTokenCounter());
 
const result = await compressor.compress(messages, {
  maxTokens: 2000,
  preserveRecentMessages: 3,
});
 
console.log(result.summary);
console.log(`Compression ratio: ${result.compressionRatio}`);
console.log(`Key facts: ${result.keyFacts.length}`);
console.log(`Intents detected: ${result.intents.length}`);

Exports

Compressors

Export | Strategy | Best For
HybridCompressor | Sliding window + summary + key facts + entities + intents | General purpose (default)
SummaryCompressor | Extractive summarization with sentence scoring | Long conversations needing a condensed overview
SlidingWindowCompressor | Most recent N messages within token budget | Chat-style agents where recency matters most
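The sliding-window strategy can be sketched as below: walk the messages newest-first, accumulating until the token budget is exhausted. The `Message` shape and the 4-characters-per-token estimate are assumptions for this standalone example, not the library's actual code.

```typescript
// Standalone sketch of the sliding-window idea (newest-first within a
// token budget). Message shape and token estimate are assumptions.
interface Message { role: string; content: string }

function slidingWindow(messages: Message[], maxTokens: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk from newest to oldest, stopping when the budget would overflow.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = Math.ceil(messages[i].content.length / 4);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]); // restore chronological order
    used += cost;
  }
  return kept;
}
```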

Base Classes & Interfaces

Export | Description
BaseCompressor | Abstract class with tokenCounter and estimateTokens()
SimpleTokenCounter | Fast heuristic: CJK chars ≈ 0.67 tokens, ASCII ≈ 0.25 tokens
TokenCounter | Interface: estimate(text: string): number
CompressionStrategy | Interface: name, compress(messages, options?)
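A custom TokenCounter only needs to satisfy the `estimate(text: string): number` interface above. The sketch below mimics the documented SimpleTokenCounter heuristic (CJK ≈ 0.67 tokens per char, other chars ≈ 0.25); the class name and the exact CJK code-point range are this example's assumptions, and you would swap in your model's real tokenizer (e.g. tiktoken) for exact counts.

```typescript
// Locally declared interface mirroring the documented shape.
interface TokenCounter { estimate(text: string): number }

class HeuristicCounter implements TokenCounter {
  estimate(text: string): number {
    let tokens = 0;
    for (const ch of text) {
      const code = ch.codePointAt(0)!;
      // Treat CJK punctuation, kana, and unified ideographs as "dense" chars.
      const isCjk = code >= 0x3000 && code <= 0x9fff;
      tokens += isCjk ? 0.67 : 0.25;
    }
    return Math.ceil(tokens);
  }
}
```

Pass it wherever the built-ins accept a counter, e.g. `new HybridCompressor(new HeuristicCounter())`.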

Compression Output (CompressedContext)

Field | Type | Description
summary | string | Condensed conversation narrative
keyFacts | KeyFact[] | Extracted facts with importance scores and source message IDs
intents | Intent[] | Detected user intents with confidence scores
entities | Entity[] | Extracted emails, phones, names, organizations
openItems | OpenItem[] | Pending questions and action items with priority
originalTokenCount | number | Token count of uncompressed input
compressedTokenCount | number | Token count of compressed output
compressionRatio | number | compressed / original (lower = better compression)
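To make the ratio field concrete: it is simply compressedTokenCount divided by originalTokenCount, so 0.25 means the output is a quarter the size of the input. A minimal helper (the function name and zero-guard are this example's choices):

```typescript
// compressionRatio = compressed / original; lower means stronger compression.
function compressionRatio(originalTokens: number, compressedTokens: number): number {
  return compressedTokens / Math.max(1, originalTokens); // guard against empty input
}
```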

Custom Compressors

Implement the ContextCompressor interface from @reaatech/agent-handoff:

typescript
import type { ContextCompressor, Message, CompressionOptions, CompressedContext } from '@reaatech/agent-handoff';
 
class LastNCompressor implements ContextCompressor {
  async compress(messages: Message[], options?: CompressionOptions): Promise<CompressedContext> {
    const limit = options?.preserveRecentMessages ?? 10;
    const recent = messages.slice(-limit);
 
    return {
      summary: recent.map((m) => `${m.role}: ${m.content}`).join('\n'),
      keyFacts: [],
      entities: [],
      intents: [],
      openItems: [],
      compressionMethod: 'last_n',
      // Message counts stand in for token counts in this simplified example;
      // a real implementation would use estimateTokens() on the content.
      originalTokenCount: messages.length,
      compressedTokenCount: recent.length,
      compressionRatio: recent.length / Math.max(1, messages.length),
    };
  }
 
  estimateTokens(text: string): number {
    return Math.ceil(text.length / 4);
  }
}
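The core of the compressor above boils down to a slice-and-join, shown here as a standalone function with a locally declared `Message` shape (the real type comes from @reaatech/agent-handoff):

```typescript
// Dependency-free version of the last-N summary logic, for illustration.
interface Message { role: string; content: string }

function lastNSummary(messages: Message[], limit = 10): string {
  const recent = messages.slice(-limit);
  return recent.map((m) => `${m.role}: ${m.content}`).join('\n');
}
```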

License

MIT