
Memory Extraction

Memory extraction automatically identifies and stores important information from your conversations. Rather than requiring you to explicitly tell Iris what to remember, it analyzes your interactions and captures facts worth preserving.

How It Works

Extraction runs as a background job after every N messages, where N is the extraction.threshold setting (default: 10):

  1. Gather context: The job collects up to 50 recent messages, including both user and assistant turns
  2. Analyze for memorable content: An LLM reviews the conversation and identifies facts worth preserving
  3. Check for duplicates: Candidate memories are compared against existing memories to avoid storing redundant information
  4. Create memories: Each memory is stored with content, type, category, tags, and vector embedding
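The trigger and context-gathering steps above can be sketched as follows. This is a minimal illustration in Python, not Iris's actual implementation: the names `Message`, `should_extract`, and `gather_context` are hypothetical, and only the numbers (a threshold of 10 and a window of 50 messages) come from this page.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "user" or "assistant"; both kinds of turn are included
    content: str


def should_extract(messages_since_last_run: int, threshold: int = 10) -> bool:
    """Return True once at least `threshold` new messages have accumulated."""
    return messages_since_last_run >= threshold


def gather_context(history: list[Message], limit: int = 50) -> list[Message]:
    """Return up to `limit` of the most recent messages, oldest first."""
    if limit <= 0:
        return []
    return history[-limit:]
```

A caller would check `should_extract(...)` after each new message and, when it returns True, pass the result of `gather_context(...)` to the analysis step.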

The extraction prompt instructs the model to be selective, favoring quality over quantity. Each memory should be self-contained and useful on its own, not dependent on conversation context.

NOTE

Extraction runs via the queue worker. Make sure php artisan horizon is running.

What Gets Extracted

Good Candidates

| Type | Examples |
| --- | --- |
| Personal details | Name, location, family members, birthday |
| Preferences | Likes dark mode, prefers morning meetings, vegetarian |
| Goals | Wants to learn piano, training for a marathon |
| Skills | Experienced PHP developer, fluent in Spanish |
| Relationships | Sarah is their manager, works with John on projects |
| Events | Starting a new job next month, vacation planned for July |
| Context | Works remotely, has a 2-hour commute |

What Gets Skipped

  • Transient context: "I'm looking at the code right now" - not useful long-term
  • Already known: "I've been doing PHP for 8 years" when that memory already exists - not stored again
  • Trivial details: "I had coffee this morning" - unless there's a pattern
  • Conversation mechanics: "Thanks for your help" - not about the user

Quality vs Quantity Tradeoff

The max_memories setting (default: 6) intentionally limits how many memories can be created per extraction. This forces selectivity: the model must choose the most valuable facts to preserve.

Increasing this limit captures more information but may dilute quality. The model might start storing borderline-useful facts that clutter memory retrieval later.
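One way to picture the cap is as a top-N selection. The sketch below assumes each candidate carries a numeric importance score, which is an illustrative assumption: this page does not say how candidates are ranked, and the limit may simply be enforced by the extraction prompt. The name `cap_memories` is hypothetical.

```python
def cap_memories(
    candidates: list[tuple[str, float]],
    max_memories: int = 6,
) -> list[str]:
    """Keep the `max_memories` highest-scoring candidates.

    Each candidate is a (text, score) pair; the score is a hypothetical
    importance value used only for this illustration.
    """
    if max_memories <= 0:
        return []
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [text for text, _score in ranked[:max_memories]]
```

With a cap of 6, a seventh borderline candidate is dropped however many facts the conversation contained, which is the selectivity the default is meant to enforce.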

TIP

If important information seems to be missed, first check if extraction is running (Horizon dashboard at /horizon). If it is, consider lowering extraction.threshold to run more frequently rather than raising max_memories.

Examples

Example 1: Professional Context

Conversation:

"I'm working on a Laravel project for a healthcare startup. We're building a patient portal that needs to be HIPAA compliant. I've been doing PHP for about 8 years now."

Extracted memories:

  • "Works at a healthcare startup building a patient portal" (type: fact, category: professional)
  • "Has 8 years of PHP development experience" (type: skill, category: professional)
  • "Working on HIPAA-compliant software" (type: context, category: professional)

Example 2: Personal Preferences

Conversation:

"I really prefer having my meetings in the morning when I'm fresh. After lunch I'm usually in deep focus mode and don't want to be interrupted."

Extracted memories:

  • "Prefers meetings in the morning when they feel fresh" (type: preference, category: preferences)
  • "Reserves afternoons for deep focus work" (type: habit, category: professional)

Example 3: Relationship Information

Conversation:

"My manager Sarah wants me to lead the API redesign project. I'll be working with the backend team on this."

Extracted memories:

  • "Sarah is their manager" (type: relationship, category: professional)
  • "Leading the API redesign project" (type: goal, category: professional)
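As a rough picture of what a stored memory might look like, the sketch below models Example 3 as records with the five fields listed under How It Works (content, type, category, tags, embedding). The `Memory` class and the tag values are illustrative assumptions; the actual schema is not shown on this page.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    content: str                    # self-contained statement of the fact
    type: str                       # e.g. "relationship", "goal", "skill"
    category: str                   # e.g. "professional", "preferences"
    tags: list[str] = field(default_factory=list)         # illustrative values
    embedding: list[float] = field(default_factory=list)  # filled in at storage time


memories = [
    Memory(
        content="Sarah is their manager",
        type="relationship",
        category="professional",
        tags=["sarah", "manager"],
    ),
    Memory(
        content="Leading the API redesign project",
        type="goal",
        category="professional",
        tags=["api", "project"],
    ),
]
```

Note that each `content` string reads correctly without the original conversation, which is the self-contained property the extraction prompt asks for.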

Preventing Duplicates

Before storing a new memory, extraction checks for semantic similarity with existing memories. If a very similar memory already exists, the new one is skipped or the existing one is updated.

This prevents accumulation of near-duplicate memories like:

  • "Has 8 years of PHP experience"
  • "Experienced PHP developer for 8 years"
  • "Been doing PHP development for about 8 years"

The duplicate detection uses the same embedding comparison as memory retrieval, with a high similarity threshold to catch only true duplicates.
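A similarity check of this kind is commonly done with cosine similarity between embedding vectors. The sketch below is an assumption-laden illustration: the page does not name the similarity metric or the threshold value, so cosine similarity and the `0.92` default are placeholders, and `is_duplicate` is a hypothetical name.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_duplicate(
    new_embedding: list[float],
    existing_embeddings: list[list[float]],
    threshold: float = 0.92,  # placeholder; the real threshold is not documented here
) -> bool:
    """True if any existing memory is at least `threshold` similar to the new one."""
    return any(
        cosine_similarity(new_embedding, existing) >= threshold
        for existing in existing_embeddings
    )
```

A high threshold keeps distinct but related facts (for example, PHP experience and Laravel experience) from being merged, at the cost of occasionally letting a reworded duplicate through.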

Configuration

| Setting | Default | Description |
| --- | --- | --- |
| extraction.threshold | 10 | Messages between extractions |
| extraction.max_memories | 6 | Maximum memories per extraction |
| extraction.timeout | 120 | API timeout in seconds |
| extraction.model | claude-sonnet-4-5 | Model for analysis |

WARNING

Higher max_memories limits may lead to lower-quality memories. The default encourages selectivity.

Monitoring Extraction

To see if extraction is working:

  1. Check the queue: The Horizon dashboard at /horizon should show ExtractMemories jobs processing
  2. View memories: The Memories page shows when memories were created
  3. Check logs: Failed extractions log errors to storage/logs/laravel.log
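For the log check, a small filter can narrow a large log file down to the relevant lines. The sketch below assumes two things not stated on this page: that error entries use Laravel's default `channel.ERROR:` line format, and that a failed extraction mentions the job name `ExtractMemories` on the same line. The function name is hypothetical.

```python
def failed_extraction_lines(log_text: str, job_name: str = "ExtractMemories") -> list[str]:
    """Return log lines that look like errors mentioning the extraction job.

    Assumes Laravel's default line format ("[timestamp] env.ERROR: message")
    and that the job name appears on the error line itself.
    """
    return [
        line
        for line in log_text.splitlines()
        if ".ERROR:" in line and job_name in line
    ]
```

To use it, read storage/logs/laravel.log into a string and pass it in; an empty result suggests no logged extraction failures under these assumptions.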