Summarization

As conversations grow longer, sending the entire history becomes impractical. Iris automatically condenses older messages into narrative summaries that preserve context, emotional dynamics, and unresolved threads.

Why Summarization Matters

Without summarization, you'd face a tradeoff:

Keep all messages: Context window fills up, costs increase, responses slow down
Drop old messages: Lose important context, Iris forgets what you discussed

Summarization offers a middle path: older messages are compressed into rich summaries that capture what matters, while recent messages stay in full detail.

How It Works

Summarization triggers when the number of unsummarized messages exceeds a threshold (default: 40). The process:

1. Count unsummarized messages: Check how many messages haven't been included in a summary yet
2. Trigger if above threshold: If the count exceeds `summarization.threshold`, start summarization
3. Select messages to summarize: Take the older messages, leaving the most recent `summarization.buffer` messages untouched
4. Generate summary: An LLM creates a structured summary capturing key information
5. Mark messages as summarized: Link the summarized messages to the new summary

Recent messages (the "buffer") are kept in full detail so you can reference recent exchanges naturally.
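The trigger and selection logic can be sketched in a few lines. This is a minimal, language-neutral illustration in Python, not Iris's actual implementation; the function name `select_for_summary` and the assumption that messages arrive ordered oldest to newest are hypothetical.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def select_for_summary(unsummarized: Sequence[T], threshold: int = 40, buffer: int = 35) -> list[T]:
    """Return the messages to summarize (oldest first), or [] if not triggered.

    `unsummarized` is assumed to be ordered oldest to newest.
    """
    # Count, and trigger only when the count exceeds the threshold.
    if len(unsummarized) <= threshold:
        return []
    # Keep the newest `buffer` messages in full; everything older is summarized.
    cutoff = max(0, len(unsummarized) - buffer)
    return list(unsummarized[:cutoff])


# Small illustrative values: 6 messages exceed a threshold of 5,
# so everything except the newest 3 is selected.
messages = ["m1", "m2", "m3", "m4", "m5", "m6"]
selected = select_for_summary(messages, threshold=5, buffer=3)
print(selected)  # ['m1', 'm2', 'm3']
```

The selected messages would then be passed to the LLM and linked to the resulting summary; those two steps depend on Iris internals and are left out of the sketch.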

What Gets Captured

Summaries aren't just text excerpts; they're structured documents that capture multiple dimensions of the conversation:

Narrative Summary

A 150-300 word narrative that tells the story of the conversation segment. This is what gets injected into the system prompt.

Example:

"The user discussed their ongoing Laravel project, expressing frustration with performance issues in the API layer. We explored several optimization strategies including query caching and eager loading. The conversation shifted to their upcoming vacation plans, and they mentioned needing to hand off the project to a colleague named Marcus. The user seemed stressed about the timeline but optimistic about the technical solutions we discussed."

Emotional Markers

Key emotional moments with intensity scores (0.0-1.0):

```json
[
  {"moment": "Expressed frustration with API performance", "intensity": 0.7},
  {"moment": "Relief when caching solution clicked", "intensity": 0.6},
  {"moment": "Excitement about vacation plans", "intensity": 0.5}
]
```
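As an illustration of how a consumer might read these markers, the sketch below parses the JSON, checks that each intensity falls in the documented 0.0-1.0 range, and lists the strongest moments first. The function name `strongest_moments` and the `minimum` cutoff are hypothetical; the source does not describe any filtering step.

```python
import json

MARKERS_JSON = """
[
  {"moment": "Expressed frustration with API performance", "intensity": 0.7},
  {"moment": "Relief when caching solution clicked", "intensity": 0.6},
  {"moment": "Excitement about vacation plans", "intensity": 0.5}
]
"""


def strongest_moments(markers_json: str, minimum: float = 0.6) -> list[str]:
    """Return moments with intensity >= minimum, strongest first."""
    markers = json.loads(markers_json)
    for marker in markers:
        # Intensity scores are documented as lying between 0.0 and 1.0.
        if not 0.0 <= marker["intensity"] <= 1.0:
            raise ValueError(f"intensity out of range: {marker['intensity']}")
    strong = [m for m in markers if m["intensity"] >= minimum]
    strong.sort(key=lambda m: m["intensity"], reverse=True)
    return [m["moment"] for m in strong]


moments = strongest_moments(MARKERS_JSON)
print(moments)
# ['Expressed frustration with API performance', 'Relief when caching solution clicked']
```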

Thread Tracking

Unresolved threads are topics that came up but weren't concluded:

"Performance testing before handoff"
"Meeting with Marcus about the project"

Resolved threads are topics that reached a conclusion:

"Caching strategy for API endpoints"
"Vacation dates confirmed"

Relationship Dynamics

How trust and rapport evolved during this segment:

Formality level changes
Building understanding
Areas of strong agreement or disagreement

Key Facts

Important information learned during this segment that might warrant memory extraction:

"Works with a colleague named Marcus"
"Has vacation planned soon"
"API performance is a current priority"

Summary Chaining

Summaries form a chain, with each referencing its predecessor via `previous_summary_id`. This creates continuity across long-running conversations.

Summary 1 → Summary 2 → Summary 3 (most recent)
    ↑           ↑           ↑
  Links to    Links to    Injected
  nothing     #1          into context

Each summary includes a narrative thread, a bridging sentence that connects to the previous summary:

"Continuing from our discussion about the API refactoring project..."

This helps Iris maintain conversational continuity even when the full history isn't available.
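The chain can be walked by following `previous_summary_id` from the newest summary back to the first. The sketch below is an assumption-laden illustration: it represents summaries as an in-memory dictionary keyed by id, and the function name `chain_from` is hypothetical. Only the `previous_summary_id` field comes from the description above.

```python
from typing import Optional


def chain_from(latest_id: int, summaries: dict[int, dict]) -> list[int]:
    """Follow previous_summary_id links and return summary ids, oldest first."""
    ordered: list[int] = []
    seen: set[int] = set()
    current: Optional[int] = latest_id
    while current is not None:
        # Guard against a corrupted chain that loops back on itself.
        if current in seen:
            raise ValueError(f"cycle detected at summary {current}")
        seen.add(current)
        ordered.append(current)
        current = summaries[current]["previous_summary_id"]
    ordered.reverse()
    return ordered


# Mirrors the diagram: summary 1 links to nothing, 2 links to 1, 3 links to 2.
summaries = {
    1: {"previous_summary_id": None},
    2: {"previous_summary_id": 1},
    3: {"previous_summary_id": 2},
}
chain = chain_from(3, summaries)
print(chain)  # [1, 2, 3]
```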

When Summaries Are Used

Up to 3 recent summaries are included in the system prompt for each request. They appear in the context after recalled memories, providing:

Historical context from previous sessions
Emotional continuity (Iris remembers how conversations felt)
Thread awareness (Iris knows what was left unresolved)
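A rough sketch of that assembly step follows. It keeps at most three of the most recent narratives and places them after the recalled memories, as described above. The function name `build_context`, the section labels, and the oldest-to-newest ordering inside the section are all assumptions made for illustration; the actual prompt layout is not specified here.

```python
def build_context(
    recalled_memories: list[str],
    narratives_oldest_first: list[str],
    max_summaries: int = 3,
) -> str:
    """Assemble a context block: recalled memories first, then recent summaries."""
    # Keep only the newest `max_summaries` narratives.
    recent = narratives_oldest_first[-max_summaries:] if max_summaries > 0 else []
    sections: list[str] = []
    if recalled_memories:
        lines = "\n".join(f"- {memory}" for memory in recalled_memories)
        sections.append("Recalled memories:\n" + lines)
    if recent:
        sections.append("Conversation summaries:\n" + "\n\n".join(recent))
    return "\n\n".join(sections)


context = build_context(
    ["Works with a colleague named Marcus"],
    ["summary one", "summary two", "summary three", "summary four"],
)
print(context)
```

With four stored narratives, only the last three reach the prompt; the oldest one is still reachable through the summary chain but is not injected.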

Configuration

| Setting | Default | Description |
| --- | --- | --- |
| `summarization.threshold` | 40 | Number of unsummarized messages that triggers summarization |
| `summarization.buffer` | 35 | Number of recent messages kept in full detail |
| `summarization.timeout` | 120 | API timeout in seconds |
| `summarization.model` | claude-sonnet-4-5 | Model used to generate summaries |

Understanding the Settings

`threshold`: How many unsummarized messages accumulate before summarization triggers. A lower value means more frequent summarization.
`buffer`: How many recent messages to preserve in full detail. These won't be summarized until a later trigger moves them out of the buffer.
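The two settings interact: the gap between them determines how many messages each summary covers. The arithmetic below is a sketch under one stated assumption, namely that the count is checked after every new message, so summarization fires at exactly `threshold + 1` unsummarized messages. The function name `messages_per_summary` is hypothetical.

```python
def messages_per_summary(threshold: int = 40, buffer: int = 35) -> int:
    """Messages covered by each summary if the check runs after every message."""
    if threshold < 0 or buffer < 0:
        raise ValueError("threshold and buffer must be non-negative")
    if buffer > threshold:
        # At the trigger point every message would still be inside the buffer.
        raise ValueError("buffer must not exceed threshold")
    # Trigger fires at threshold + 1 messages; the newest `buffer` are kept.
    return threshold + 1 - buffer


print(messages_per_summary())        # 6
print(messages_per_summary(40, 10))  # 31
```

Under that assumption the defaults produce a new summary roughly every 6 messages, each covering 6 messages. Widening the gap (a higher threshold or a smaller buffer) yields fewer, larger summaries.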

WARNING

Very aggressive summarization (low threshold) may lose nuance. The defaults balance context preservation with token efficiency.

Viewing Summaries

Summaries are stored in the `conversation_summaries` table. You can explore them via:

```bash
# Open Laravel's interactive shell, then fetch the newest summary for the first user
php artisan tinker
>>> User::first()->conversationSummaries()->latest()->first()
```

Each summary includes all the structured fields (emotional markers, threads, etc.) as JSON columns.
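To make the shape of such a record concrete, here is a language-neutral sketch in Python that decodes one row into a typed object. Apart from `previous_summary_id`, every column name is a guess based on the fields described in this page, and the sketch assumes the JSON columns arrive as raw strings (a Laravel model with casts would typically hand back arrays already).

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ConversationSummary:
    narrative: str
    emotional_markers: list[dict[str, Any]]
    unresolved_threads: list[str]
    resolved_threads: list[str]
    relationship_dynamics: list[str]
    key_facts: list[str]
    previous_summary_id: Optional[int]

    @classmethod
    def from_row(cls, row: dict[str, Any]) -> "ConversationSummary":
        """Decode the JSON-encoded columns of one (hypothetical) table row."""

        def decode(column: str) -> list:
            raw = row.get(column)
            # Treat a missing or NULL column as an empty list.
            return json.loads(raw) if raw else []

        return cls(
            narrative=row["narrative"],
            emotional_markers=decode("emotional_markers"),
            unresolved_threads=decode("unresolved_threads"),
            resolved_threads=decode("resolved_threads"),
            relationship_dynamics=decode("relationship_dynamics"),
            key_facts=decode("key_facts"),
            previous_summary_id=row.get("previous_summary_id"),
        )


row = {
    "narrative": "The user discussed their ongoing Laravel project...",
    "emotional_markers": '[{"moment": "Expressed frustration with API performance", "intensity": 0.7}]',
    "unresolved_threads": '["Performance testing before handoff"]',
    "resolved_threads": '["Caching strategy for API endpoints"]',
    "relationship_dynamics": None,
    "key_facts": '["Works with a colleague named Marcus"]',
    "previous_summary_id": None,
}
summary = ConversationSummary.from_row(row)
print(summary.unresolved_threads)  # ['Performance testing before handoff']
```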
