# Background Jobs
Iris processes memory extraction, conversation summarization, and memory consolidation in the background so your chat experience stays fast and responsive. Here's how the pieces fit together.
## Why Background Jobs?
When you send a message, you want a quick response. But extracting memories and generating summaries involves LLM calls that can take several seconds. By running these as background jobs:
- Chat stays fast - Responses stream immediately
- Processing is reliable - Jobs retry on failure
- Load is distributed - Jobs run when capacity is available
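In Laravel terms, "running in the background" means dispatching a queued job instead of doing the work inline. A minimal sketch of the pattern (the surrounding controller code is illustrative, not Iris's actual source):

```php
// Queue the heavy LLM work and return to the user immediately.
// ExtractMemories is one of the jobs described below.
ExtractMemories::dispatch($conversation);

// The HTTP response streams right away; Horizon workers pick the
// job up and run it when capacity is available.
```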
## Running Horizon
Iris uses Laravel Horizon for queue management:
```bash
# For development (included in composer dev)
php artisan horizon
```

Horizon provides:
- Dashboard at `/horizon` for monitoring
- Automatic process management
- Job metrics and failure tracking
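Horizon's process management is driven by supervisor definitions in `config/horizon.php`. A representative fragment using Horizon's standard options (the values here are illustrative defaults, not Iris's shipped configuration):

```php
'environments' => [
    'production' => [
        'supervisor-1' => [
            'connection' => 'redis',  // Horizon requires the redis queue driver
            'queue' => ['default'],
            'balance' => 'auto',      // Scale workers across queues automatically
            'maxProcesses' => 4,
            'tries' => 3,
            'timeout' => 120,         // Should cover the longest job's timeout
        ],
    ],
],
```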
> **IMPORTANT**
>
> Without Horizon running, chat messages won't process, memories won't be extracted, and conversations won't be summarized.
## Chat Processing
Chat messages are processed asynchronously, enabling reliable delivery even when connections are unstable.
### How It Works
1. When you send a message, a job is queued immediately
2. The job processes your request, executing tools as needed
3. Response events broadcast to your browser via WebSockets
4. If your connection drops, events are stored for replay
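Steps 3 and 4 rest on Laravel's event broadcasting: the job emits events onto a private channel that the browser subscribes to. A sketch of what such an event could look like (the event and channel names are made up for illustration; Iris's actual event classes aren't shown in this doc):

```php
use Illuminate\Broadcasting\Channel;
use Illuminate\Broadcasting\PrivateChannel;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;

// Hypothetical event: one streamed token of the assistant's reply.
class ChatTokenStreamed implements ShouldBroadcast
{
    public function __construct(
        public int $conversationId,
        public string $token,
    ) {}

    public function broadcastOn(): Channel
    {
        // Only the conversation's owner can listen on this channel.
        return new PrivateChannel("conversation.{$this->conversationId}");
    }
}
```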
### Reliability Features
| Feature | Benefit |
|---|---|
| Automatic retries | Failed requests retry with exponential backoff |
| Event storage | Reconnecting clients catch up automatically |
| Graceful stops | Users can stop generation mid-stream |
## Automatic Jobs
These jobs dispatch automatically after Iris processes a response, based on configurable thresholds.
### ExtractMemories
Pulls meaningful information from recent conversations and stores it as searchable memories with embeddings.
| Setting | Default | Description |
|---|---|---|
| Trigger | Every 10 messages | Configurable via `iris.extraction.threshold` |
| Max per run | 6 memories | Prevents over-extraction from a single batch |
| Timeout | 120 seconds | Maximum LLM call duration |
The job builds context from up to 50 recent messages, uses an LLM to identify what's worth remembering, then creates memory records with vector embeddings for semantic search.
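The trigger logic amounts to a message-count check against the configured threshold before dispatching. A sketch of the idea (the tracking column and query shape are assumptions, not Iris's actual code):

```php
// After Iris finishes a response, decide whether to extract memories.
$unprocessed = $conversation->messages()
    ->where('id', '>', $conversation->last_extracted_message_id ?? 0)
    ->count();

if ($unprocessed >= config('iris.extraction.threshold')) { // default: 10
    ExtractMemories::dispatch($conversation);
}
```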
See Memory Extraction for details on what gets extracted.
### SummarizeConversation
Creates narrative summaries of older conversations, capturing emotional context and relationship dynamics.
| Setting | Default | Description |
|---|---|---|
| Trigger | 40+ unsummarized messages | After keeping 35 recent messages as buffer |
| Timeout | 120 seconds | Maximum LLM call duration |
Summaries chain together via `previous_summary_id`, maintaining conversational continuity across sessions. Each summary includes emotional markers, resolved/unresolved threads, and relationship dynamics.
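The threshold and buffer interact: summarization only fires once 40 or more messages lack a summary, and even then the 35 most recent messages stay out of the summary so active context remains verbatim. Roughly (the field names here are illustrative):

```php
$unsummarized = $conversation->messages()
    ->whereNull('summary_id')
    ->count();

if ($unsummarized >= config('iris.summarization.threshold')) { // default: 40
    // Summarize everything except the newest `buffer` messages (default: 35),
    // chaining the new summary to the last one via previous_summary_id.
    SummarizeConversation::dispatch($conversation);
}
```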
See Summarization for details on what summaries capture.
## Scheduled Jobs
### Memory Consolidation
Consolidation merges semantically similar memories into denser, more useful representations. It runs on two schedules:
- Daily (3 AM) - Processes memories from the last 3 days
- Weekly (Sunday 4 AM) - Full sweep of all memories
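In Laravel, the two schedules would be registered with the task scheduler. A sketch of what that registration could look like (the constructor argument is an assumption; Iris's actual scheduling code isn't shown here):

```php
use Illuminate\Console\Scheduling\Schedule;

// Daily pass over memories from the last 3 days.
$schedule->job(new ConsolidateUserMemories(days: 3))
    ->dailyAt('03:00');

// Weekly full sweep of all memories.
$schedule->job(new ConsolidateUserMemories())
    ->weeklyOn(0, '04:00'); // 0 = Sunday
```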
Consolidation uses a two-phase job architecture:
1. `ConsolidateUserMemories` - Builds clusters of similar memories
2. `ConsolidateMemoryCluster` - Processes each cluster with LLM review
See Memory Consolidation for the full details on how it works, generation tracking, and command options.
## Agent Daemon
Delegated sub-agent tasks are processed by the `iris:agent` daemon — a separate long-running process that operates independently of Horizon.
### Why a Daemon?
Sub-agent tasks can run for several minutes and involve streaming LLM responses with real-time tool call broadcasting. This doesn't fit well into a traditional queue worker model. The daemon provides:
- Sequential processing — one task at a time, preventing resource exhaustion
- Inline rate limit handling — sleeps and retries when API limits are hit, rather than releasing back to a queue
- Graceful shutdown — responds to `SIGINT`/`SIGTERM` signals, finishing the current task before stopping
- Exponential backoff — when idle, polling frequency decreases to reduce overhead
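The daemon's core loop combines signal handling with idle backoff. A stripped-down sketch in plain PHP of how those pieces fit together (the `nextPendingTask`/`runTask` helpers are stand-ins, not the actual `iris:agent` implementation):

```php
pcntl_async_signals(true);

$running = true;
pcntl_signal(SIGTERM, function () use (&$running) { $running = false; });
pcntl_signal(SIGINT,  function () use (&$running) { $running = false; });

$sleep = 1; // seconds; grows while idle
while ($running) {
    $task = nextPendingTask(); // stand-in: fetch one pending sub-agent task

    if ($task) {
        runTask($task); // sequential: one task at a time
        $sleep = 1;     // reset backoff after real work
    } else {
        sleep($sleep);
        $sleep = min($sleep * 2, 30); // exponential backoff, capped
    }
}
// The loop only exits between tasks, so the current task
// finishes before shutdown.
```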
### Running the Daemon
```bash
# Included automatically in composer dev
php artisan iris:agent
```

The daemon polls for pending tasks, processes them using `AgentTaskRunner`, and broadcasts progress events via WebSockets. Task results are delivered as proactive messages in the chat.
> **IMPORTANT**
>
> The agent daemon is not managed by Horizon and won't appear in the Horizon dashboard. Monitor it through your process manager or terminal output.
See Task Delegation for feature details and configuration.
## Monitoring Jobs
### Horizon Dashboard
Access the Horizon dashboard at `/horizon` to monitor:
- Job throughput and processing times
- Failed job details and stack traces
- Queue lengths and wait times
- Memory and process health
### Queue Commands
```bash
# Check failed jobs
php artisan queue:failed

# Retry all failed jobs
php artisan queue:retry all
```

### Logging
Jobs log their activity to Laravel's default log channel. Check `storage/logs/laravel.log` for:
- Job start/completion timestamps
- Extraction and summarization results
- Error details for failed jobs
## Failure Handling
### Automatic Retries
Jobs are configured with retry policies:
| Job | Retries | Backoff |
|---|---|---|
| `ProcessChatRequest` | 3 | Exponential |
| `ExtractMemories` | 3 | Exponential |
| `SummarizeConversation` | 3 | Exponential |
| `ConsolidateMemoryCluster` | Time-based (2 hours) | Rate-limit aware |
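In Laravel, these policies live on the job classes themselves. A sketch of how the exponential-backoff jobs are typically configured (the values mirror the table above; this is not Iris's verbatim source):

```php
use Illuminate\Contracts\Queue\ShouldQueue;

class ExtractMemories implements ShouldQueue
{
    public int $tries = 3;     // Total attempts before the job is marked failed
    public int $timeout = 120; // Matches iris.extraction.timeout

    // Wait 30s, then 60s, then 120s between attempts.
    public function backoff(): array
    {
        return [30, 60, 120];
    }
}
```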
### Rate Limit Handling
Consolidation jobs handle API rate limits specially:
- Jobs release back to queue when rate limited
- Automatic retry when the limit resets
- Uses Prism's `resetsAt` timing for precise retry scheduling
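Inside the job's `handle` method, releasing back to the queue with a provider-supplied reset time looks roughly like this (the exception shape is an assumption based on Prism's documented rate-limit handling, not Iris's exact code):

```php
try {
    $response = $this->callLlm(); // stand-in for the actual Prism call
} catch (PrismRateLimitedException $e) {
    // Find the latest reset time among the limits that were hit.
    $resetsAt = collect($e->rateLimits)
        ->pluck('resetsAt')
        ->filter()
        ->max();

    // Put the job back on the queue; it won't run again
    // until the provider's limit clears.
    $this->release($resetsAt ? now()->diffInSeconds($resetsAt) : 60);
    return;
}
```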
### Manual Intervention
If jobs are persistently failing:
```bash
# View failed jobs
php artisan queue:failed

# Retry specific job
php artisan queue:retry <id>

# Clear failed jobs
php artisan queue:flush
```

## Configuration
Key settings in `config/iris.php`:
```php
'extraction' => [
    'threshold' => 10,       // Messages between extractions
    'max_memories' => 6,     // Max memories per run
    'timeout' => 120,        // API timeout in seconds
    'model' => 'claude-sonnet-4-5',
],

'summarization' => [
    'threshold' => 40,       // Unsummarized messages to trigger
    'buffer' => 35,          // Recent messages kept out of the summary
    'timeout' => 120,        // API timeout in seconds
    'model' => 'claude-sonnet-4-5',
],

'consolidation' => [
    'jobs_per_minute' => 10, // Rate limit for LLM calls
    // ... other settings
],
```