Background Jobs
Iris processes memory extraction, conversation summarization, and memory consolidation in the background so your chat experience stays fast and responsive. Here's how the pieces fit together.
Why Background Jobs?
When you send a message, you want a quick response. But extracting memories and generating summaries involves LLM calls that can take several seconds. By running these as background jobs:
- Chat stays fast - Responses stream immediately
- Processing is reliable - Jobs retry on failure
- Load is distributed - Jobs run when capacity is available
Running Horizon
Iris uses Laravel Horizon for queue management:
bash
# For development (included in composer dev)
php artisan horizon
Horizon provides:
- Dashboard at /horizon for monitoring
- Automatic process management
- Job metrics and failure tracking
IMPORTANT
Without Horizon running, chat messages won't process, memories won't be extracted, and conversations won't be summarized.
Chat Processing
Chat messages are processed asynchronously, enabling reliable delivery even when connections are unstable.
How It Works
- When you send a message, a job is queued immediately
- The job processes your request, executing tools as needed
- Response events broadcast to your browser via WebSockets
- If your connection drops, events are stored for replay
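The replay step can be sketched as a sequence-numbered event log. This is a minimal illustration, not Iris's implementation: the `EventLog` class, its method names, and the in-memory storage are all assumptions made for the example.

```php
<?php

/**
 * Hypothetical in-memory event log. How Iris actually stores events
 * is not described in this guide.
 */
final class EventLog
{
    /** @var array<int, string> events keyed by sequence number */
    private array $events = [];
    private int $nextSeq = 1;

    /** Store an event and return the sequence number assigned to it. */
    public function append(string $payload): int
    {
        $seq = $this->nextSeq++;
        $this->events[$seq] = $payload;

        return $seq;
    }

    /**
     * Return every event newer than the last one the client saw.
     *
     * @return array<int, string>
     */
    public function replayAfter(int $lastSeenSeq): array
    {
        return array_filter(
            $this->events,
            static fn (int $seq): bool => $seq > $lastSeenSeq,
            ARRAY_FILTER_USE_KEY
        );
    }
}

$log = new EventLog();
$log->append('token: Hello');
$log->append('token: world');
$log->append('done');

// A client that dropped after event 1 catches up on events 2 and 3.
$missed = $log->replayAfter(1);
```

Keying events by a monotonically increasing number lets a reconnecting client ask for "everything after N" without the server tracking per-client state.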
Reliability Features
| Feature | Benefit |
|---|---|
| Automatic retries | Failed requests retry with exponential backoff |
| Event storage | Reconnecting clients catch up automatically |
| Graceful stops | Users can stop generation mid-stream |
Automatic Jobs
These jobs dispatch automatically after Iris processes a response, based on configurable thresholds.
ExtractMemories
Pulls meaningful information from recent conversations and stores it as searchable memories with embeddings.
| Setting | Default | Description |
|---|---|---|
| Trigger | Every 10 messages | Configurable via iris.extraction.threshold |
| Max per run | 6 memories | Prevents over-extraction from a single batch |
| Timeout | 120 seconds | Maximum LLM call duration |
The job builds context from up to 50 recent messages, uses an LLM to identify what's worth remembering, then creates memory records with vector embeddings for semantic search.
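The three limits above (trigger threshold, context window, per-run cap) can be sketched as plain functions. The function names and the idea of counting messages since the last run are assumptions for illustration; only the default values come from the table.

```php
<?php

/** Hypothetical helper: true once enough new messages have accumulated. */
function shouldExtract(int $messagesSinceLastRun, int $threshold = 10): bool
{
    return $messagesSinceLastRun >= $threshold;
}

/**
 * Hypothetical helper: the most recent $limit messages, oldest first.
 * Returns the whole list when it is shorter than $limit.
 */
function contextWindow(array $messages, int $limit = 50): array
{
    return array_slice($messages, -$limit);
}

/** Hypothetical helper: keep at most $max candidate memories, in order. */
function capMemories(array $candidates, int $max = 6): array
{
    return array_slice($candidates, 0, $max);
}

$messages = range(1, 60);            // stand-in for 60 stored messages
$context  = contextWindow($messages); // messages 11..60
$kept     = capMemories(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']);
```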
See Memory Extraction for details on what gets extracted.
SummarizeConversation
Creates narrative summaries of older conversations, capturing emotional context and relationship dynamics.
| Setting | Default | Description |
|---|---|---|
| Trigger | 40+ unsummarized messages | After keeping 35 recent messages as buffer |
| Timeout | 120 seconds | Maximum LLM call duration |
Summaries chain together via previous_summary_id, maintaining conversational continuity across sessions. Each summary includes emotional markers, resolved/unresolved threads, and relationship dynamics.
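To illustrate the chaining, here is a small sketch that walks previous_summary_id links from the newest summary back to the first. The row shape (`id`, `previous_summary_id`, `text`) and the function name are assumptions; only the previous_summary_id field is named in this guide.

```php
<?php

/**
 * Walk a summary chain from newest to oldest via previous_summary_id,
 * then return the texts in chronological order.
 *
 * @param array<int, array{id:int, previous_summary_id:?int, text:string}> $summariesById
 * @return list<string>
 */
function summaryHistory(array $summariesById, int $latestId): array
{
    $texts = [];
    $seen = [];
    $id = $latestId;

    while ($id !== null && isset($summariesById[$id]) && !isset($seen[$id])) {
        $seen[$id] = true; // guard against accidental cycles
        $texts[] = $summariesById[$id]['text'];
        $id = $summariesById[$id]['previous_summary_id'];
    }

    return array_reverse($texts);
}

$summaries = [
    1 => ['id' => 1, 'previous_summary_id' => null, 'text' => 'First week'],
    2 => ['id' => 2, 'previous_summary_id' => 1, 'text' => 'Second week'],
    3 => ['id' => 3, 'previous_summary_id' => 2, 'text' => 'Third week'],
];

$history = summaryHistory($summaries, 3);
```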
See Summarization for details on what summaries capture.
Scheduled Jobs
Memory Consolidation
Consolidation merges semantically similar memories into denser, more useful representations. It runs on two schedules:
- Daily (3 AM) - Processes memories from the last 3 days
- Weekly (Sunday 4 AM) - Full sweep of all memories
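The difference between the two schedules comes down to which memories are in scope. A rough sketch, assuming a hypothetical `consolidationCutoff` helper and that scope is decided by creation time:

```php
<?php

/**
 * Hypothetical helper: the earliest creation time a memory may have to be
 * included in a run. Null means "no lower bound" (the weekly full sweep).
 */
function consolidationCutoff(string $schedule, DateTimeImmutable $now): ?DateTimeImmutable
{
    return match ($schedule) {
        'daily'  => $now->sub(new DateInterval('P3D')), // last 3 days
        'weekly' => null,                                // all memories
        default  => throw new InvalidArgumentException("Unknown schedule: {$schedule}"),
    };
}

$now    = new DateTimeImmutable('2025-01-10 03:00:00', new DateTimeZone('UTC'));
$cutoff = consolidationCutoff('daily', $now);
```

In a Laravel application, timings like these are usually expressed with the scheduler's `dailyAt('03:00')` and `weeklyOn(0, '04:00')` methods. Whether Iris registers its schedule that way is not stated here.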
Consolidation uses a two-phase job architecture:
- ConsolidateUserMemories - Builds clusters of similar memories
- ConsolidateMemoryCluster - Processes each cluster with LLM review
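The two phases can be pictured with a toy example. This guide does not say how clusters are built, so the greedy cosine-similarity grouping, the 0.85 threshold, the function names, and the rule that single-memory clusters skip review are all assumptions made for illustration.

```php
<?php

/** Cosine similarity between two equal-length, non-zero vectors. */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Phase 1 (sketch): greedily add each memory to the first cluster whose
 * seed (first member) is similar enough, otherwise start a new cluster.
 *
 * @param array<int, list<float>> $embeddingsById memory id => embedding
 * @return list<list<int>> clusters of memory ids
 */
function buildClusters(array $embeddingsById, float $threshold = 0.85): array
{
    $clusters = [];
    foreach ($embeddingsById as $id => $embedding) {
        $placed = false;
        foreach ($clusters as $index => $cluster) {
            $seedId = $cluster[0];
            if (cosineSimilarity($embeddingsById[$seedId], $embedding) >= $threshold) {
                $clusters[$index][] = $id;
                $placed = true;
                break;
            }
        }
        if (!$placed) {
            $clusters[] = [$id];
        }
    }

    return $clusters;
}

/** Phase 2 (sketch): only clusters with two or more members need review. */
function clustersToReview(array $clusters): array
{
    return array_values(
        array_filter($clusters, static fn (array $c): bool => count($c) > 1)
    );
}

$embeddings = [
    1 => [1.0, 0.0],
    2 => [0.9, 0.1], // close to memory 1
    3 => [0.0, 1.0], // unrelated
];

$clusters = buildClusters($embeddings);
$review   = clustersToReview($clusters);
```

Splitting the work this way means each cluster can be queued as its own job, so one slow or rate-limited LLM call does not block the rest.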
See Memory Consolidation for the full details on how it works, generation tracking, and command options.
Monitoring Jobs
Horizon Dashboard
Access the Horizon dashboard at /horizon to monitor:
- Job throughput and processing times
- Failed job details and stack traces
- Queue lengths and wait times
- Memory and process health
Queue Commands
bash
# Check failed jobs
php artisan queue:failed
# Retry failed jobs
php artisan queue:retry all
Logging
Jobs log their activity to Laravel's default log channel. Check storage/logs/laravel.log for:
- Job start/completion timestamps
- Extraction and summarization results
- Error details for failed jobs
Failure Handling
Automatic Retries
Jobs are configured with retry policies:
| Job | Retries | Backoff |
|---|---|---|
| ProcessChatRequest | 3 | Exponential |
| ExtractMemories | 3 | Exponential |
| SummarizeConversation | 3 | Exponential |
| ConsolidateMemoryCluster | Time-based (2 hours) | Rate-limit aware |
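For readers unfamiliar with exponential backoff, the sketch below shows how the wait grows across three retries. The 5-second base and 300-second cap are illustrative assumptions, not Iris's configured values, and `backoffSchedule` is a hypothetical helper.

```php
<?php

/**
 * Hypothetical backoff schedule: the delay doubles on each retry,
 * capped at $maxSeconds.
 *
 * @return list<int> seconds to wait before retry 1, 2, 3, ...
 */
function backoffSchedule(int $retries, int $baseSeconds = 5, int $maxSeconds = 300): array
{
    $delays = [];
    for ($attempt = 0; $attempt < $retries; $attempt++) {
        $delays[] = min($maxSeconds, $baseSeconds * (2 ** $attempt));
    }

    return $delays;
}

$delays = backoffSchedule(3);
```

Laravel queued jobs can return an array like this from a `backoff()` method to control the delay before each retry.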
Rate Limit Handling
Consolidation jobs handle API rate limits specially:
- Jobs release back to queue when rate limited
- Automatic retry when the limit resets
- Uses Prism's resetsAt timing for precise retry scheduling
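The release-and-retry behaviour boils down to computing how long to wait until the limit resets. A minimal sketch, assuming the reset time is available as a Unix timestamp and using a hypothetical `releaseDelay` helper with a one-second safety margin:

```php
<?php

/**
 * Seconds to wait before retrying a rate-limited job.
 * $resetsAt is assumed to be a Unix timestamp taken from the rate-limit
 * information. The margin avoids retrying a moment too early.
 */
function releaseDelay(int $resetsAt, int $now, int $marginSeconds = 1): int
{
    return max(0, $resetsAt - $now) + $marginSeconds;
}

$now     = 1_700_000_000;
$delay   = releaseDelay($now + 42, $now); // limit resets in 42 seconds
$expired = releaseDelay($now - 10, $now); // limit has already reset
```

Inside a Laravel job, a delay like this would typically be passed to `$this->release($delay)`, which puts the job back on the queue without counting it as a failure.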
Manual Intervention
If jobs are persistently failing:
bash
# View failed jobs
php artisan queue:failed
# Retry specific job
php artisan queue:retry <id>
# Clear failed jobs
php artisan queue:flush
Configuration
Key settings in config/iris.php:
php
'extraction' => [
    'threshold' => 10,      // Messages between extractions
    'max_memories' => 6,    // Max memories per run
    'timeout' => 120,       // API timeout in seconds
    'model' => 'claude-sonnet-4-5',
],
'summarization' => [
    'threshold' => 40,      // Unsummarized messages to trigger
    'keep_recent' => 50,    // Messages to keep in active conversation
    'buffer' => 35,         // Buffer before summarizing old messages
    'timeout' => 120,       // API timeout in seconds
    'model' => 'claude-sonnet-4-5',
],
'consolidation' => [
    'jobs_per_minute' => 10, // Rate limit for LLM calls
    // ... other settings
],