Memory Consolidation
Over time, extraction creates many related memories about similar topics. Consolidation merges semantically similar memories into denser, more useful representations.
Why Consolidation Matters
- Reduces redundancy - Multiple similar memories merged into one
- Improves coherence - Related facts combined into comprehensive memories
- Increases quality - LLM review improves wording and accuracy
How It Works
Consolidation uses a two-phase job architecture for efficient parallel processing:
Phase 1: Cluster Building
- Find memories with high similarity (≥0.80)
- Group them into clusters using greedy clustering
- Dispatch one job per cluster
Phase 2: Cluster Processing
- Each cluster job asks the LLM if memories should merge
- If approved, create a consolidated memory and soft-delete the originals
- If rejected, memories remain separate
The LLM may keep memories separate if they contain contradictory information, represent different time periods, or would lose important nuance when merged.
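The cluster-building phase can be sketched in plain PHP. This is a minimal illustration, not Iris's actual implementation: it assumes each memory has an embedding vector, that similarity is cosine similarity, and that each cluster grows around a seed memory. The function names `cosineSimilarity` and `buildClusters` are hypothetical; the default values mirror the thresholds and cluster sizes documented on this page.

```php
<?php
// Hypothetical sketch of Phase 1: greedy clustering over embedding vectors.

/** Cosine similarity between two equal-length numeric vectors. */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0; // A zero vector is treated as similar to nothing
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Each unassigned memory seeds a cluster and absorbs other unassigned
 * memories whose similarity to the seed meets the threshold.
 *
 * @param array<int, float[]> $embeddings memory ID => embedding vector
 * @return array<int, int[]> list of clusters, each a list of memory IDs
 */
function buildClusters(array $embeddings, float $threshold = 0.80, int $minSize = 2, int $maxSize = 5): array
{
    $clusters = [];
    $assigned = [];
    foreach ($embeddings as $seedId => $seedVector) {
        if (isset($assigned[$seedId])) {
            continue;
        }
        $cluster = [$seedId];
        foreach ($embeddings as $otherId => $otherVector) {
            if (count($cluster) >= $maxSize) {
                break; // Cluster is full
            }
            if ($otherId === $seedId || isset($assigned[$otherId])) {
                continue;
            }
            if (cosineSimilarity($seedVector, $otherVector) >= $threshold) {
                $cluster[] = $otherId;
            }
        }
        // Only groups that reach the minimum size become clusters
        if (count($cluster) >= $minSize) {
            foreach ($cluster as $id) {
                $assigned[$id] = true;
            }
            $clusters[] = $cluster;
        }
    }
    return $clusters;
}

// Toy two-dimensional embeddings; real embeddings have many more dimensions.
$embeddings = [
    1 => [1.0, 0.0],
    2 => [0.9, 0.1],
    3 => [0.0, 1.0],
    4 => [0.95, 0.05],
];
$clusters = buildClusters($embeddings);
echo json_encode($clusters), PHP_EOL; // [[1,2,4]]
```

Memory 3 points in a different direction, so it never reaches the minimum cluster size and is left alone. In the architecture described above, each resulting cluster would then be handed to its own Phase 2 job.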
Example
Input memories:
- "Has 8 years of PHP experience"
- "Works primarily with Laravel framework"
- "Experienced with PHP development"
Result:
- "Experienced PHP developer with 8+ years of experience, primarily working with the Laravel framework"
Generation Tracking
Generation tracking records how many times a memory has been merged. This enables organic memory evolution while preventing runaway over-consolidation.
How Generations Work
- Generation 0: Original memories from extraction or manual creation
- Generation 1: First consolidation (merging original memories)
- Generation 2+: Re-consolidation of already-consolidated memories
- Generation 5: Maximum - these memories won't be re-consolidated
When memories are consolidated, the new memory's generation is calculated as max(source generations) + 1.
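The generation rule is small enough to show directly. The helper name `nextGeneration` is hypothetical and only illustrates the formula stated above:

```php
<?php
/**
 * Generation of a newly consolidated memory: max(source generations) + 1.
 *
 * @param int[] $sourceGenerations generations of the memories being merged
 */
function nextGeneration(array $sourceGenerations): int
{
    if ($sourceGenerations === []) {
        throw new InvalidArgumentException('At least one source memory is required.');
    }
    return max($sourceGenerations) + 1;
}

echo nextGeneration([0, 0, 0]), PHP_EOL; // 1 (three originals merged)
echo nextGeneration([1, 0]), PHP_EOL;    // 2 (a Gen 1 memory merged with an original)
```

Taking the maximum rather than the sum means the generation reflects the depth of the longest merge chain, not the number of memories involved.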
Re-Consolidation
Unlike many memory systems that only consolidate original memories, Iris can re-consolidate already-merged memories. This allows memories to evolve naturally over time as more related information is gathered.
For example, a Gen 1 memory about "prefers TypeScript" might later merge with another Gen 1 memory about "values static typing" to form a richer Gen 2 memory.
IMPORTANT
The LLM applies extra scrutiny when re-consolidating. High-generation memories already pack dense information, so only truly redundant memories are merged.
Generation Limits
Generation 5 is the hard limit. Memories reaching this generation are excluded from future consolidation to prevent:
- Over-abstraction losing important details
- Runaway consolidation chains
- Memory content becoming too generic
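As a rough sketch of how the limit plays out, the filter below keeps only memories that can still take part in consolidation. The array shape and the function name `isEligibleForConsolidation` are assumptions made for illustration; only the limit of 5 comes from this page.

```php
<?php
// Mirrors the documented default for consolidation.max_generation.
const MAX_GENERATION = 5;

/** A memory can be consolidated only while it is below the generation limit. */
function isEligibleForConsolidation(int $generation, int $maxGeneration = MAX_GENERATION): bool
{
    return $generation < $maxGeneration;
}

// Hypothetical memory records; real records would carry more fields.
$memories = [
    ['id' => 1, 'generation' => 0],
    ['id' => 2, 'generation' => 4],
    ['id' => 3, 'generation' => 5],
];

$eligible = array_filter(
    $memories,
    fn (array $memory): bool => isEligibleForConsolidation($memory['generation'])
);
$eligibleIds = array_column($eligible, 'id');

echo implode(', ', $eligibleIds), PHP_EOL; // 1, 2
```

A Gen 4 memory is still eligible because merging it produces a Gen 5 result, which is then excluded from further runs.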
Running Consolidation
By default, consolidation dispatches jobs to the queue for parallel processing:
bash
# Queue batch for all users (default behavior)
php artisan iris:consolidate-memories
# Full sweep - process ALL memories, ignore days filter
php artisan iris:consolidate-memories --full
# Run synchronously (useful for debugging)
php artisan iris:consolidate-memories --sync
# Preview without changes (forces sync)
php artisan iris:consolidate-memories --dry-run
# Single user
php artisan iris:consolidate-memories --user=1
# Custom threshold for full sweep
php artisan iris:consolidate-memories --full --threshold=0.75
Monitoring Health
Check the health of your memory consolidation system:
bash
# View consolidation statistics
php artisan iris:memory-health
# Filter by specific user
php artisan iris:memory-health --user=1
The command shows:
- Generation distribution - Count of memories at each generation
- Original vs consolidated - Balance between new and merged memories
- Consolidation ratio - Percentage of memories that are consolidation results
- At max generation - Memories that won't be re-consolidated
- Eligible count - Memories available for future consolidation
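The consolidation ratio can be reproduced from the generation distribution. The sketch below assumes the ratio is the share of memories with a generation above 0, which agrees with the sample figures in this section's example output (444 of 1101 memories). The function name `consolidationRatio` is hypothetical.

```php
<?php
/**
 * Percentage of memories that are consolidation results (generation > 0).
 *
 * @param array<int, int> $countsByGeneration generation => number of memories
 */
function consolidationRatio(array $countsByGeneration): float
{
    $total = array_sum($countsByGeneration);
    if ($total === 0) {
        return 0.0; // No memories, so nothing has been consolidated
    }
    $original = $countsByGeneration[0] ?? 0;
    return round(($total - $original) / $total * 100, 1);
}

// Counts taken from the example output in this section.
$ratio = consolidationRatio([0 => 657, 1 => 444]);
echo $ratio, '%', PHP_EOL; // 40.3%
```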
Example output:
Memory Consolidation Health
Generation Count
Gen 0 (original) ............................................... 657
Gen 1 ............................................................ 444
Total memories ................................................. 1101
Original memories ............................................... 657
Consolidation results ........................................... 444
At max generation ................................................. 0
Consolidation ratio ........................................... 40.3%
Eligible for consolidation ..................................... 1101
Processing Batches
When running in queue mode, the command returns a batch ID:
bash
php artisan iris:consolidate-memories
# Dispatched batch: 9c3b5f2a-...
# Process jobs
php artisan queue:work
# Retry failed jobs in a batch
php artisan queue:retry-batch 9c3b5f2a-...
Use Laravel Horizon for monitoring batch progress in a UI.
Scheduled Runs
Iris automatically schedules consolidation in bootstrap/app.php:
php
// Daily incremental at 3:00 AM
$schedule->command('iris:consolidate-memories')
    ->dailyAt('03:00')
    ->withoutOverlapping()
    ->runInBackground();
// Weekly full sweep on Sundays at 4:00 AM
$schedule->command('iris:consolidate-memories --full')
    ->weeklyOn(Schedule::SUNDAY, '04:00')
    ->withoutOverlapping()
    ->runInBackground();
Job Architecture
Consolidation uses two job types for efficient processing:
ConsolidateUserMemories
Handles Phase 1 for a single user:
- Builds memory clusters (fast, no LLM calls)
- Dispatches ConsolidateMemoryCluster jobs for each cluster
- 60 second timeout
ConsolidateMemoryCluster
Handles Phase 2 for a single cluster:
- Reconstructs cluster from memory IDs
- Calls LLM to review and decide on merging
- Executes consolidation if approved
- 120 second timeout
- Rate-limited to prevent API overload
This architecture prevents timeout issues with large memory sets by processing clusters independently.
Configuration
| Setting | Default | Description |
|---|---|---|
| consolidation.similarity_threshold | 0.80 | Minimum similarity to cluster |
| consolidation.days_lookback | 3 | Days of memories to consider for daily runs |
| consolidation.max_cluster_size | 5 | Max memories per cluster |
| consolidation.min_cluster_size | 2 | Min memories to form a cluster |
| consolidation.max_generation | 5 | Maximum consolidation generation |
| consolidation.jobs_per_minute | 10 | Rate limit for queued jobs |
A higher threshold (0.85+) is more conservative: fewer memories qualify for clustering. A lower threshold (0.75) merges more aggressively but may lose nuance.
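To get a feel for the threshold, the sketch below counts which memory pairs would qualify for clustering at three settings. The similarity scores are made up for illustration and the helper `pairsAtThreshold` is hypothetical.

```php
<?php
// Hypothetical pairwise similarity scores between four memories (A to D).
$pairSimilarities = [
    'A-B' => 0.91,
    'A-C' => 0.82,
    'B-D' => 0.77,
    'C-D' => 0.70,
];

/** Returns the pairs whose similarity meets or exceeds the threshold. */
function pairsAtThreshold(array $similarities, float $threshold): array
{
    return array_keys(
        array_filter($similarities, fn (float $score): bool => $score >= $threshold)
    );
}

foreach ([0.75, 0.80, 0.85] as $threshold) {
    printf("%.2f: %s\n", $threshold, implode(', ', pairsAtThreshold($pairSimilarities, $threshold)));
}
// 0.75: A-B, A-C, B-D
// 0.80: A-B, A-C
// 0.85: A-B
```

Raising the threshold from 0.75 to 0.85 cuts the candidate pairs from three to one in this toy data, which is the trade-off between aggressive merging and preserved nuance.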
TIP
Use --full for a weekly sweep to catch memories that didn't cluster during daily runs due to the time filter.
Rate Limiting
Consolidation jobs are rate-limited to prevent overwhelming LLM APIs:
- Preventive: Jobs are throttled to the jobs_per_minute limit
- Reactive: If rate limited by the API, jobs automatically retry after the limit resets, using Prism's resetsAt timing
Adjust CONSOLIDATION_JOBS_PER_MINUTE in your .env based on your API tier.
NOTE
Only the ConsolidateMemoryCluster jobs are rate-limited since they make LLM calls. The parent ConsolidateUserMemories jobs run without throttling.
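The arithmetic behind both strategies can be sketched without any framework code. This is not how Iris or Prism implement throttling; the function names are hypothetical, and the reset time is assumed to be available as a Unix timestamp.

```php
<?php
/** Preventive: average spacing in seconds between jobs for a per-minute limit. */
function secondsBetweenJobs(int $jobsPerMinute): float
{
    if ($jobsPerMinute <= 0) {
        throw new InvalidArgumentException('jobsPerMinute must be positive.');
    }
    return 60 / $jobsPerMinute;
}

/** Reactive: seconds to wait before retrying, given the API's reset time. */
function retryDelaySeconds(int $resetsAt, int $now): int
{
    // Never return a negative delay if the reset time has already passed
    return max(0, $resetsAt - $now);
}

$spacing = secondsBetweenJobs(10);                    // 6.0 seconds at the default limit
$delay = retryDelaySeconds(1700000030, 1700000000);   // 30 seconds until the limit resets
$noWait = retryDelaySeconds(1700000000, 1700000030);  // 0, the limit has already reset

printf("spacing=%.1fs delay=%ds noWait=%ds\n", $spacing, $delay, $noWait);
// spacing=6.0s delay=30s noWait=0s
```

In a Laravel job, a delay like this would typically be passed to the job's `release()` method so the job returns to the queue instead of failing.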