Summarization

As conversations grow longer, sending the entire history becomes impractical. Iris automatically condenses older messages into narrative summaries that preserve context, emotional dynamics, and unresolved threads.

Why Summarization Matters

Without summarization, you'd face a tradeoff:

  • Keep all messages: Context window fills up, costs increase, responses slow down
  • Drop old messages: Lose important context, Iris forgets what you discussed

Summarization offers a middle path: older messages are compressed into rich summaries that capture what matters, while recent messages stay in full detail.

How It Works

Summarization triggers when unsummarized messages exceed a threshold (default: 40). The process:

  1. Count unsummarized messages: Check how many messages haven't been included in a summary
  2. Trigger if above threshold: If count exceeds summarization.threshold, start summarization
  3. Select messages to summarize: Take older messages, leaving the most recent buffer messages untouched
  4. Generate summary: An LLM creates a structured summary capturing key information
  5. Mark messages as summarized: Link the summarized messages to the new summary

Recent messages (the "buffer") are kept in full detail so you can reference recent exchanges naturally.
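The trigger and selection steps above can be sketched in a few lines. This is an illustrative model in Python, not Iris's actual implementation: the function name `select_for_summary` is hypothetical, and the rule "summarize everything except the newest `buffer` messages" is an assumption based on the description above.

```python
from typing import List, Tuple


def select_for_summary(
    unsummarized: List[str],
    threshold: int = 40,  # mirrors summarization.threshold
    buffer: int = 35,     # mirrors summarization.buffer
) -> Tuple[List[str], List[str]]:
    """Split unsummarized messages (oldest first) into (to_summarize, kept).

    Assumption: once the count exceeds the threshold, everything except
    the newest `buffer` messages is handed to the summarizer.
    """
    if len(unsummarized) <= threshold:
        # Steps 1-2: not enough unsummarized messages, nothing to do.
        return [], list(unsummarized)
    # Step 3: keep the most recent `buffer` messages untouched.
    cut = max(len(unsummarized) - buffer, 0)
    return list(unsummarized[:cut]), list(unsummarized[cut:])


messages = [f"message {i}" for i in range(1, 42)]  # 41 unsummarized messages
to_summarize, kept = select_for_summary(messages)
print(len(to_summarize), len(kept))  # prints: 6 35
```

Under these assumptions, the 41st unsummarized message triggers a pass over the 6 oldest messages, and the 35 newest stay in full detail.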

What Gets Captured

Summaries aren't just text excerpts; they're structured documents that capture multiple dimensions of the conversation:

Narrative Summary

A 150-300 word narrative that tells the story of the conversation segment. This is what gets injected into the system prompt.

Example:

"The user discussed their ongoing Laravel project, expressing frustration with performance issues in the API layer. We explored several optimization strategies including query caching and eager loading. The conversation shifted to their upcoming vacation plans, and they mentioned needing to hand off the project to a colleague named Marcus. The user seemed stressed about the timeline but optimistic about the technical solutions we discussed."

Emotional Markers

Key emotional moments with intensity scores (0.0-1.0):

json
[
  {"moment": "Expressed frustration with API performance", "intensity": 0.7},
  {"moment": "Relief when caching solution clicked", "intensity": 0.6},
  {"moment": "Excitement about vacation plans", "intensity": 0.5}
]
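If you read these markers outside of Iris, a small check that every intensity falls in the documented 0.0-1.0 range can catch malformed data early. The sketch below is a hypothetical helper (`load_markers` is not part of Iris) that parses the example above and sorts it by intensity.

```python
import json

RAW_MARKERS = """[
  {"moment": "Expressed frustration with API performance", "intensity": 0.7},
  {"moment": "Relief when caching solution clicked", "intensity": 0.6},
  {"moment": "Excitement about vacation plans", "intensity": 0.5}
]"""


def load_markers(raw_json: str) -> list:
    """Parse emotional markers and return them strongest-first."""
    markers = json.loads(raw_json)
    for marker in markers:
        # Intensity scores are documented as 0.0-1.0.
        if not 0.0 <= marker["intensity"] <= 1.0:
            raise ValueError(f"intensity out of range: {marker!r}")
    return sorted(markers, key=lambda m: m["intensity"], reverse=True)


strongest = load_markers(RAW_MARKERS)[0]
print(strongest["moment"])  # prints: Expressed frustration with API performance
```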

Thread Tracking

Unresolved threads (topics that came up but weren't concluded):

  • "Performance testing before handoff"
  • "Meeting with Marcus about the project"

Resolved threads (topics that reached a conclusion):

  • "Caching strategy for API endpoints"
  • "Vacation dates confirmed"

Relationship Dynamics

How trust and rapport evolved during this segment:

  • Formality level changes
  • Building understanding
  • Areas of strong agreement or disagreement

Key Facts

Important information learned during this segment that might warrant memory extraction:

  • "Works with a colleague named Marcus"
  • "Has vacation planned soon"
  • "API performance is a current priority"
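Taken together, the dimensions above suggest a record shaped roughly like the one below. This is a sketch for orientation only: apart from `previous_summary_id`, which appears later on this page, the field names are assumptions and may not match the real column names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ConversationSummary:
    """Hypothetical shape of one summary; field names are illustrative."""

    narrative: str  # the 150-300 word story of the segment
    emotional_markers: List[Dict[str, Any]] = field(default_factory=list)
    unresolved_threads: List[str] = field(default_factory=list)
    resolved_threads: List[str] = field(default_factory=list)
    relationship_dynamics: List[str] = field(default_factory=list)
    key_facts: List[str] = field(default_factory=list)
    previous_summary_id: Optional[int] = None  # link to the prior summary

    def narrative_word_count(self) -> int:
        """Count whitespace-separated words in the narrative."""
        return len(self.narrative.split())


summary = ConversationSummary(
    narrative="The user discussed their ongoing Laravel project.",
    unresolved_threads=["Performance testing before handoff"],
    resolved_threads=["Vacation dates confirmed"],
    key_facts=["Works with a colleague named Marcus"],
)
print(summary.narrative_word_count())  # prints: 7
```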

Summary Chaining

Summaries form a chain, with each referencing its predecessor via previous_summary_id. This creates continuity across long-running conversations.

Summary 1 → Summary 2 → Summary 3 (most recent)
    ↑           ↑           ↑
  Links to    Links to    Injected
  nothing     #1          into context

Each summary includes a narrative thread: a bridging sentence that connects to the previous summary:

"Continuing from our discussion about the API refactoring project..."

This helps Iris maintain conversational continuity even when the full history isn't available.
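The chain behaves like a singly linked list that you walk backwards from the newest summary. A minimal sketch, assuming summaries are represented as a plain mapping from id to `previous_summary_id` (the `walk_chain` helper and the mapping are illustrative, not Iris's API):

```python
from typing import Dict, List, Optional


def walk_chain(links: Dict[int, Optional[int]], latest_id: int) -> List[int]:
    """Follow previous_summary_id links and return ids oldest-first."""
    chain: List[int] = []
    seen = set()
    current: Optional[int] = latest_id
    while current is not None:
        if current in seen:
            # Defensive check: a well-formed chain never loops.
            raise ValueError(f"cycle detected at summary {current}")
        seen.add(current)
        chain.append(current)
        current = links[current]
    chain.reverse()
    return chain


# Summary 1 links to nothing, 2 links to 1, 3 links to 2.
links = {1: None, 2: 1, 3: 2}
print(walk_chain(links, latest_id=3))  # prints: [1, 2, 3]
```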

When Summaries Are Used

Up to 3 recent summaries are included in the system prompt for each request. They appear in the context after recalled memories, providing:

  • Historical context from previous sessions
  • Emotional continuity (Iris remembers how conversations felt)
  • Thread awareness (Iris knows what was left unresolved)
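As a rough illustration of the "up to 3 recent summaries" rule, the sketch below picks the newest narratives and joins them into one block of text. The function name, the oldest-first ordering, and the blank-line separator are all assumptions; the actual prompt layout may differ.

```python
from typing import List


def build_summary_context(narratives_oldest_first: List[str], limit: int = 3) -> str:
    """Join the `limit` most recent narratives, oldest of those first."""
    if limit <= 0:
        return ""
    recent = narratives_oldest_first[-limit:]
    return "\n\n".join(recent)


narratives = ["Summary 1", "Summary 2", "Summary 3", "Summary 4"]
print(build_summary_context(narratives))
```

With four stored summaries, only the last three are included, so the oldest one drops out of the prompt.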

Configuration

| Setting | Default | Description |
| --- | --- | --- |
| summarization.threshold | 40 | Unsummarized messages to trigger |
| summarization.buffer | 35 | Recent messages to keep in full |
| summarization.keep_recent | 50 | Messages to keep active |
| summarization.timeout | 120 | API timeout in seconds |
| summarization.model | claude-sonnet-4-5 | Model for generating summaries |

Understanding the Settings

  • threshold: How many unsummarized messages accumulate before triggering. Lower = more frequent summarization.
  • buffer: How many recent messages to preserve in full detail. These won't be summarized until the next trigger.
  • keep_recent: Total messages to keep active in the conversation. Older messages are marked as summarized.

WARNING

Very aggressive summarization (a low threshold) may lose nuance, because messages are condensed sooner. The defaults balance context preservation with token efficiency.

Viewing Summaries

Summaries are stored in the conversation_summaries table. You can explore them via:

bash
php artisan tinker
>>> User::first()->conversationSummaries()->latest()->first()

Each summary includes all the structured fields (emotional markers, threads, etc.) as JSON columns.