Memory Consolidation

Over time, extraction creates many related memories about similar topics. Consolidation merges semantically similar memories into denser, more useful representations.

Why Consolidation Matters

  • Reduces redundancy - Multiple similar memories merged into one
  • Improves coherence - Related facts combined into comprehensive memories
  • Increases quality - LLM review improves wording and accuracy

How It Works

Consolidation uses a two-phase job architecture for efficient parallel processing:

Phase 1: Cluster Building

  1. Find memories whose pairwise similarity meets the threshold (≥0.80 by default)
  2. Group them into clusters using greedy clustering
  3. Dispatch one job per cluster
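
The Phase 1 steps can be sketched in plain PHP. This is a minimal illustration and not Iris's actual implementation: the function names `cosineSimilarity` and `buildClusters`, the use of cosine similarity over embedding vectors, and the exact greedy strategy (seed a cluster with the first unassigned memory, then add every later memory similar enough to the seed) are all assumptions. The default arguments mirror the documented threshold and cluster size settings.

```php
<?php
declare(strict_types=1);

/** Cosine similarity between two equal-length vectors. */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0; // a zero vector is treated as similar to nothing
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Hypothetical greedy clustering over memory embeddings.
 *
 * @param array<int, float[]> $embeddings memory ID => embedding vector
 * @return int[][] clusters of memory IDs
 */
function buildClusters(
    array $embeddings,
    float $threshold = 0.80,
    int $maxSize = 5,
    int $minSize = 2
): array {
    $clusters = [];
    $assigned = [];

    foreach ($embeddings as $seedId => $seedVector) {
        if (isset($assigned[$seedId])) {
            continue; // already part of an earlier cluster
        }
        $cluster = [$seedId];
        foreach ($embeddings as $otherId => $otherVector) {
            if (count($cluster) >= $maxSize) {
                break; // cluster is full
            }
            if ($otherId === $seedId || isset($assigned[$otherId])) {
                continue;
            }
            if (cosineSimilarity($seedVector, $otherVector) >= $threshold) {
                $cluster[] = $otherId;
            }
        }
        // Only keep groups large enough to be worth a cluster job.
        if (count($cluster) >= $minSize) {
            foreach ($cluster as $id) {
                $assigned[$id] = true;
            }
            $clusters[] = $cluster;
        }
    }
    return $clusters;
}

// Toy two-dimensional embeddings: memories 1, 2 and 4 point the same way.
$embeddings = [
    1 => [1.0, 0.0],
    2 => [0.9, 0.1],
    3 => [0.0, 1.0],
    4 => [0.95, 0.05],
];
$clusters = buildClusters($embeddings); // [[1, 2, 4]]
```

Memory 3 stays unclustered because no other memory is similar enough to it, so no job would be dispatched for it.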

Phase 2: Cluster Processing

  1. Each cluster job asks the LLM if memories should merge
  2. If approved, create consolidated memory and soft-delete originals
  3. If rejected, memories remain separate

The LLM may keep memories separate if they contain contradictory information, if they represent different time periods, or if merging would lose important nuance.

Example

Input memories:

  • "Has 8 years of PHP experience"
  • "Works primarily with Laravel framework"
  • "Experienced with PHP development"

Result:

  • "Experienced PHP developer with 8+ years of experience, primarily working with the Laravel framework"
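
The approve-or-reject flow of Phase 2, applied to the example above, might look roughly like the following. This is a sketch under stated assumptions: `processCluster`, the array shape of a memory, the `deleted_at` soft-delete marker, and the `$review` callable standing in for the LLM call are all hypothetical names, not part of Iris's API.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical Phase 2 handler for one cluster.
 *
 * @param array<int, array{content: string, deleted_at: ?string}> $memories keyed by memory ID
 * @param callable $review stand-in for the LLM; returns ['merge' => bool, 'content' => ?string]
 * @return array{originals: array, consolidated: ?array}
 */
function processCluster(array $memories, callable $review, string $now): array
{
    $decision = $review(array_column($memories, 'content'));

    if (!$decision['merge']) {
        // Rejected: the memories remain separate and untouched.
        return ['originals' => $memories, 'consolidated' => null];
    }

    // Approved: soft-delete the originals instead of removing them.
    foreach ($memories as $id => $memory) {
        $memories[$id]['deleted_at'] = $now;
    }

    return [
        'originals'    => $memories,
        'consolidated' => ['content' => $decision['content'], 'deleted_at' => null],
    ];
}

$cluster = [
    10 => ['content' => 'Has 8 years of PHP experience', 'deleted_at' => null],
    11 => ['content' => 'Works primarily with Laravel framework', 'deleted_at' => null],
    12 => ['content' => 'Experienced with PHP development', 'deleted_at' => null],
];

// A fake reviewer that always approves with a fixed merged text.
$approve = fn (array $contents): array => [
    'merge'   => true,
    'content' => 'Experienced PHP developer with 8+ years of experience, '
        . 'primarily working with the Laravel framework',
];

$result = processCluster($cluster, $approve, '2025-01-01 03:00:00');
```

Passing the reviewer in as a callable keeps the merge bookkeeping testable without a real LLM call.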

Generation Tracking

Each memory carries a generation number that records how many rounds of consolidation produced it. This enables organic memory evolution while preventing runaway over-consolidation.

How Generations Work

  • Generation 0: Original memories from extraction or manual creation
  • Generation 1: First consolidation (merging original memories)
  • Generation 2+: Re-consolidation of already-consolidated memories
  • Generation 5: Maximum - these memories won't be re-consolidated

When memories are consolidated, the new memory's generation is calculated as max(source generations) + 1.
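
As a worked illustration of that rule, here is a minimal sketch. The function name `nextGeneration` is hypothetical and only the `max(...) + 1` formula comes from the description above.

```php
<?php
declare(strict_types=1);

/**
 * Generation of a newly consolidated memory.
 *
 * @param int[] $sourceGenerations generations of the memories being merged
 */
function nextGeneration(array $sourceGenerations): int
{
    if ($sourceGenerations === []) {
        throw new InvalidArgumentException('At least one source memory is required.');
    }
    return max($sourceGenerations) + 1;
}

$fromOriginals = nextGeneration([0, 0, 0]); // 1: first consolidation
$mixed         = nextGeneration([1, 0]);    // 2: the highest source generation wins
$reMerged      = nextGeneration([1, 1]);    // 2: re-consolidation of two Gen 1 memories
```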

Re-Consolidation

Unlike many memory systems that only consolidate original memories, Iris can re-consolidate already-merged memories. This allows memories to evolve naturally over time as more related information is gathered.

For example, a Gen 1 memory about "prefers TypeScript" might later merge with another Gen 1 memory about "values static typing" to form a richer Gen 2 memory.

IMPORTANT

The LLM applies extra scrutiny when re-consolidating. High-generation memories already pack dense information, so only truly redundant memories are merged.

Generation Limits

Generation 5 is the hard limit. Memories reaching this generation are excluded from future consolidation to prevent:

  • Over-abstraction losing important details
  • Runaway consolidation chains
  • Memory content becoming too generic
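
The limit amounts to a simple filter on the generation number. The sketch below assumes each memory is an array with a `generation` key and uses a hypothetical helper name, `eligibleForConsolidation`. A real implementation would more likely express this as a database query.

```php
<?php
declare(strict_types=1);

/**
 * Keep only memories that are still below the generation limit.
 *
 * @param array<int, array{generation: int}> $memories keyed by memory ID
 * @return array<int, array{generation: int}>
 */
function eligibleForConsolidation(array $memories, int $maxGeneration = 5): array
{
    return array_filter(
        $memories,
        fn (array $memory): bool => $memory['generation'] < $maxGeneration
    );
}

$memories = [
    1 => ['generation' => 0],
    2 => ['generation' => 4],
    3 => ['generation' => 5], // at the hard limit, excluded
];

$eligible = eligibleForConsolidation($memories); // keeps memories 1 and 2
```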

Running Consolidation

By default, consolidation dispatches jobs to the queue for parallel processing:

bash
# Queue batch for all users (default behavior)
php artisan iris:consolidate-memories

# Full sweep - process ALL memories, ignore days filter
php artisan iris:consolidate-memories --full

# Run synchronously (useful for debugging)
php artisan iris:consolidate-memories --sync

# Preview without changes (forces sync)
php artisan iris:consolidate-memories --dry-run

# Single user
php artisan iris:consolidate-memories --user=1

# Custom threshold for full sweep
php artisan iris:consolidate-memories --full --threshold=0.75

Monitoring Health

Check the health of your memory consolidation system:

bash
# View consolidation statistics
php artisan iris:memory-health

# Filter by specific user
php artisan iris:memory-health --user=1

The command shows:

  • Generation distribution - Count of memories at each generation
  • Original vs consolidated - Balance between new and merged memories
  • Consolidation ratio - Percentage of memories that are consolidation results
  • At max generation - Memories that won't be re-consolidated
  • Eligible count - Memories available for future consolidation

Example output:

Memory Consolidation Health

Generation                                                      Count
Gen 0 (original) ............................................... 657
Gen 1 ............................................................ 444

Total memories ................................................. 1101
Original memories ............................................... 657
Consolidation results ........................................... 444
At max generation ................................................. 0
Consolidation ratio ........................................... 40.3%

Eligible for consolidation ..................................... 1101
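
The figures in this output follow from the generation counts alone. The sketch below reproduces them. `summarizeHealth` is a hypothetical helper, and treating "eligible" as "everything below the maximum generation" is an assumption that happens to match the sample output.

```php
<?php
declare(strict_types=1);

/**
 * Derive health figures from per-generation counts.
 *
 * @param array<int, int> $countsByGeneration generation => number of memories
 */
function summarizeHealth(array $countsByGeneration, int $maxGeneration = 5): array
{
    $total        = array_sum($countsByGeneration);
    $original     = $countsByGeneration[0] ?? 0;
    $consolidated = $total - $original;

    $atMax = 0;
    foreach ($countsByGeneration as $generation => $count) {
        if ($generation >= $maxGeneration) {
            $atMax += $count;
        }
    }

    return [
        'total'             => $total,
        'original'          => $original,
        'consolidated'      => $consolidated,
        'at_max_generation' => $atMax,
        // Share of memories that are consolidation results, in percent.
        'ratio_percent'     => $total > 0 ? round($consolidated / $total * 100, 1) : 0.0,
        'eligible'          => $total - $atMax,
    ];
}

$stats = summarizeHealth([0 => 657, 1 => 444]);
// total 1101, consolidated 444, ratio 40.3 (444 / 1101), eligible 1101
```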

Processing Batches

When running in queue mode, the command returns a batch ID:

bash
php artisan iris:consolidate-memories
# Dispatched batch: 9c3b5f2a-...

# Process jobs
php artisan queue:work

# Retry failed jobs in a batch
php artisan queue:retry-batch 9c3b5f2a-...

Use Laravel Horizon to monitor batch progress in a UI.

Scheduled Runs

Iris automatically schedules consolidation in bootstrap/app.php:

php
use Illuminate\Console\Scheduling\Schedule; // provides Schedule::SUNDAY

// Daily incremental at 3:00 AM
$schedule->command('iris:consolidate-memories')
    ->dailyAt('03:00')
    ->withoutOverlapping()
    ->runInBackground();

// Weekly full sweep on Sundays at 4:00 AM
$schedule->command('iris:consolidate-memories --full')
    ->weeklyOn(Schedule::SUNDAY, '04:00')
    ->withoutOverlapping()
    ->runInBackground();

Job Architecture

Consolidation uses two job types for efficient processing:

ConsolidateUserMemories

Handles Phase 1 for a single user:

  • Builds memory clusters (fast, no LLM calls)
  • Dispatches ConsolidateMemoryCluster jobs for each cluster
  • 60 second timeout

ConsolidateMemoryCluster

Handles Phase 2 for a single cluster:

  • Reconstructs cluster from memory IDs
  • Calls LLM to review and decide on merging
  • Executes consolidation if approved
  • 120 second timeout
  • Rate-limited to prevent API overload

This architecture prevents timeout issues with large memory sets by processing clusters independently.

Configuration

Setting                             Default  Description
consolidation.similarity_threshold  0.80     Minimum similarity to cluster
consolidation.days_lookback         3        Days of memories to consider for daily runs
consolidation.max_cluster_size      5        Max memories per cluster
consolidation.min_cluster_size      2        Min memories to form a cluster
consolidation.max_generation        5        Maximum consolidation generation
consolidation.jobs_per_minute       10       Rate limit for queued jobs

A higher threshold (0.85+) is more conservative and merges fewer memories. A lower threshold (0.75) merges more aggressively but may lose nuance.

TIP

Use --full for a weekly sweep to catch memories that didn't cluster during daily runs because they fell outside the days_lookback window.

Rate Limiting

Consolidation jobs are rate-limited to prevent overwhelming LLM APIs:

  1. Preventive: Jobs throttled to jobs_per_minute limit
  2. Reactive: If rate limited by the API, jobs automatically retry after the limit resets using Prism's resetsAt timing

Adjust CONSOLIDATION_JOBS_PER_MINUTE in your .env based on your API tier.
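
The two mechanisms can be illustrated with a small, framework-free sketch. Everything here is an assumption made for illustration: the `PerMinuteThrottle` class, its fixed one-minute window, the `retryDelay` helper, and the use of Unix timestamps for the reset time. Iris's real jobs presumably rely on Laravel's queue facilities and on the reset time reported by Prism.

```php
<?php
declare(strict_types=1);

/** Preventive limit: allow at most N job starts per one-minute window. */
final class PerMinuteThrottle
{
    /** @var array<int, int> window start (Unix time) => jobs started in that window */
    private array $windows = [];

    public function __construct(private int $jobsPerMinute)
    {
    }

    /** Returns 0 if a job may start now, else the seconds until the next window. */
    public function secondsUntilAllowed(int $now): int
    {
        $windowStart = $now - ($now % 60);
        $used = $this->windows[$windowStart] ?? 0;

        if ($used < $this->jobsPerMinute) {
            $this->windows[$windowStart] = $used + 1;
            return 0;
        }
        return $windowStart + 60 - $now;
    }
}

/** Reactive limit: wait until the API's reported reset time, never a negative delay. */
function retryDelay(int $now, int $resetsAt): int
{
    return max(0, $resetsAt - $now);
}

$throttle = new PerMinuteThrottle(2);
$first  = $throttle->secondsUntilAllowed(120); // 0: runs immediately
$second = $throttle->secondsUntilAllowed(125); // 0: still within the limit
$third  = $throttle->secondsUntilAllowed(130); // 50: released until the window ends at 180
$delay  = retryDelay(100, 145);                // 45: seconds until the API limit resets
```

In a queue worker, a non-zero result would typically translate into releasing the job back onto the queue with that delay instead of failing it.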

NOTE

Only the ConsolidateMemoryCluster jobs are rate-limited since they make LLM calls. The parent ConsolidateUserMemories jobs run without throttling.