Background Jobs

Iris processes memory extraction, conversation summarization, and memory consolidation in the background so your chat experience stays fast and responsive. Here's how the pieces fit together.

Why Background Jobs?

When you send a message, you want a quick response. But extracting memories and generating summaries involve LLM calls that can take several seconds. Running these as background jobs means:

  • Chat stays fast - Responses stream immediately
  • Processing is reliable - Jobs retry on failure
  • Load is distributed - Jobs run when capacity is available

Running Horizon

Iris uses Laravel Horizon for queue management:

```bash
# For development (included in composer dev)
php artisan horizon
```

Horizon provides:

  • Dashboard at /horizon for monitoring
  • Automatic process management
  • Job metrics and failure tracking

IMPORTANT

Without Horizon running, chat messages won't process, memories won't be extracted, and conversations won't be summarized.

Chat Processing

Chat messages are processed asynchronously, enabling reliable delivery even when connections are unstable.

How It Works

  1. When you send a message, a job is queued immediately
  2. The job processes your request, executing tools as needed
  3. Response events broadcast to your browser via WebSockets
  4. If your connection drops, events are stored for replay
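
The replay step above can be pictured as a numbered log of events that a reconnecting client reads from its last seen position. This is a minimal in-memory sketch, not Iris's implementation: the `EventLog` class and its method names are hypothetical, and a real system would persist events instead of holding them in an array.

```php
<?php

// Hypothetical in-memory event log; Iris's actual storage is not shown in this guide.
final class EventLog
{
    /** @var array<int, array{id: int, payload: string}> */
    private array $events = [];
    private int $nextId = 1;

    // Store an event and return the sequence id assigned to it.
    public function append(string $payload): int
    {
        $id = $this->nextId++;
        $this->events[] = ['id' => $id, 'payload' => $payload];

        return $id;
    }

    // Return every event the client has not seen yet, in order.
    public function replayAfter(int $lastSeenId): array
    {
        return array_values(array_filter(
            $this->events,
            fn (array $event): bool => $event['id'] > $lastSeenId
        ));
    }
}

$log = new EventLog();
$log->append('token: Hello');
$log->append('token: world');
$log->append('done');

// A client that disconnected after event 1 asks for what it missed.
$missed = $log->replayAfter(1);
```

Tracking a sequence id per event keeps replay idempotent: a client can ask for "everything after N" as often as it needs to without receiving duplicates.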

Reliability Features

| Feature | Benefit |
| --- | --- |
| Automatic retries | Failed requests retry with exponential backoff |
| Event storage | Reconnecting clients catch up automatically |
| Graceful stops | Users can stop generation mid-stream |

Automatic Jobs

These jobs dispatch automatically after Iris processes a response, based on configurable thresholds.

ExtractMemories

Pulls meaningful information from recent conversations and stores it as searchable memories with embeddings.

| Setting | Default | Description |
| --- | --- | --- |
| Trigger | Every 10 messages | Configurable via iris.extraction.threshold |
| Max per run | 6 memories | Prevents over-extraction from a single batch |
| Timeout | 120 seconds | Maximum LLM call duration |

The job builds context from up to 50 recent messages, uses an LLM to identify what's worth remembering, then creates memory records with vector embeddings for semantic search.
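
To make the numbers above concrete, here is a rough sketch of the three limits: the trigger threshold, the 50-message context window, and the per-run cap. The function names are hypothetical and the logic is an assumption about how such limits are typically applied, not a copy of the ExtractMemories job.

```php
<?php

// Hypothetical helpers; names and signatures are illustrative only.

// True once enough new messages have accumulated since the last extraction.
function shouldExtract(int $messagesSinceLastExtraction, int $threshold = 10): bool
{
    return $messagesSinceLastExtraction >= $threshold;
}

// Keep only the most recent messages as LLM context (50 per the text above).
function buildContext(array $messages, int $limit = 50): array
{
    return array_slice($messages, -$limit);
}

// Cap the number of memories kept from a single run (6 by default).
function capMemories(array $candidates, int $max = 6): array
{
    return array_slice($candidates, 0, $max);
}

// 120 placeholder messages, numbered 1..120.
$messages = range(1, 120);
$context = buildContext($messages);
```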

See Memory Extraction for details on what gets extracted.

SummarizeConversation

Creates narrative summaries of older conversations, capturing emotional context and relationship dynamics.

| Setting | Default | Description |
| --- | --- | --- |
| Trigger | 40+ unsummarized messages | After keeping 35 recent messages as buffer |
| Timeout | 120 seconds | Maximum LLM call duration |

Summaries chain together via previous_summary_id, maintaining conversational continuity across sessions. Each summary includes emotional markers, resolved/unresolved threads, and relationship dynamics.
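
The chaining can be pictured as a linked list walked from the newest summary back to the oldest. In the sketch below only the previous_summary_id field comes from this page; the array shape, the `text` field, and the `summaryChain` function are hypothetical stand-ins for whatever the real model looks like.

```php
<?php

// Hypothetical summary rows keyed by id; only previous_summary_id comes from the docs.
$summaries = [
    1 => ['id' => 1, 'previous_summary_id' => null, 'text' => 'First week'],
    2 => ['id' => 2, 'previous_summary_id' => 1, 'text' => 'Second week'],
    3 => ['id' => 3, 'previous_summary_id' => 2, 'text' => 'Third week'],
];

// Walk backwards from the newest summary, then reverse to get oldest-first order.
function summaryChain(array $summaries, int $latestId): array
{
    $chain = [];
    $seen = [];
    $currentId = $latestId;

    while ($currentId !== null && isset($summaries[$currentId]) && !isset($seen[$currentId])) {
        $seen[$currentId] = true; // guard against accidental cycles
        $chain[] = $summaries[$currentId];
        $currentId = $summaries[$currentId]['previous_summary_id'];
    }

    return array_reverse($chain);
}

$chain = summaryChain($summaries, 3);
$texts = array_column($chain, 'text');
```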

See Summarization for details on what summaries capture.

Scheduled Jobs

Memory Consolidation

Consolidation merges semantically similar memories into denser, more useful representations. It runs on two schedules:

  • Daily (3 AM) - Processes memories from the last 3 days
  • Weekly (Sunday 4 AM) - Full sweep of all memories

Consolidation uses a two-phase job architecture:

  1. ConsolidateUserMemories - Builds clusters of similar memories
  2. ConsolidateMemoryCluster - Processes each cluster with LLM review
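
As a rough picture of the first phase, the sketch below groups memories whose embeddings are close by cosine similarity. This page does not describe Iris's actual clustering algorithm, so treat everything here as an assumption: the greedy seed-based strategy, the 0.9 threshold, the rule that single-member clusters are skipped, and the function names are all illustrative.

```php
<?php

// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0; // a zero vector has no direction to compare
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Greedy clustering (assumed strategy): each memory joins the first cluster
// whose seed (its first member) is similar enough, else it starts a new one.
function buildClusters(array $embeddings, float $threshold = 0.9): array
{
    $clusters = [];

    foreach ($embeddings as $id => $vector) {
        $placed = false;

        foreach ($clusters as $index => $members) {
            $seed = $embeddings[$members[0]];
            if (cosineSimilarity($seed, $vector) >= $threshold) {
                $clusters[$index][] = $id;
                $placed = true;
                break;
            }
        }

        if (!$placed) {
            $clusters[] = [$id];
        }
    }

    // Assumption: only clusters with two or more members need a review pass.
    return array_values(array_filter(
        $clusters,
        fn (array $members): bool => count($members) > 1
    ));
}

// Toy two-dimensional embeddings keyed by a hypothetical memory id.
$embeddings = [
    'm1' => [1.0, 0.0],
    'm2' => [0.99, 0.05],
    'm3' => [0.0, 1.0],
    'm4' => [0.98, 0.1],
];

$clusters = buildClusters($embeddings);
```

Each resulting cluster would then be handed to the second phase as its own job, which keeps every LLM review small and independently retryable.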

See Memory Consolidation for the full details on how it works, generation tracking, and command options.

Monitoring Jobs

Horizon Dashboard

Access the Horizon dashboard at /horizon to monitor:

  • Job throughput and processing times
  • Failed job details and stack traces
  • Queue lengths and wait times
  • Memory and process health

Queue Commands

```bash
# Check failed jobs
php artisan queue:failed

# Retry failed jobs
php artisan queue:retry all
```

Logging

Jobs log their activity to Laravel's default log channel. Check storage/logs/laravel.log for:

  • Job start/completion timestamps
  • Extraction and summarization results
  • Error details for failed jobs

Failure Handling

Automatic Retries

Jobs are configured with retry policies:

| Job | Retries | Backoff |
| --- | --- | --- |
| ProcessChatRequest | 3 | Exponential |
| ExtractMemories | 3 | Exponential |
| SummarizeConversation | 3 | Exponential |
| ConsolidateMemoryCluster | Time-based (2 hours) | Rate-limit aware |
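
For a feel of what exponential backoff looks like across three retries, here is a small sketch. The 5-second base delay, the 300-second cap, and the `backoffDelays` helper are illustrative assumptions; the delays Iris actually uses are not listed on this page.

```php
<?php

// Hypothetical helper: the base delay and cap are illustrative, not Iris's values.
function backoffDelays(int $retries, int $baseSeconds = 5, int $maxSeconds = 300): array
{
    $delays = [];

    for ($attempt = 0; $attempt < $retries; $attempt++) {
        // Double the wait on each attempt, capped at $maxSeconds.
        $delays[] = min($baseSeconds * (2 ** $attempt), $maxSeconds);
    }

    return $delays;
}

// Delays in seconds before the first, second, and third retry.
$delays = backoffDelays(3);
```

In a Laravel job, an array like this could be returned from the job's `backoff()` method so the framework spaces out retries accordingly.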

Rate Limit Handling

Consolidation jobs handle API rate limits specially:

  • Jobs release back to queue when rate limited
  • Automatic retry when the limit resets
  • Uses Prism's resetsAt timing for precise retry scheduling
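
The timing logic behind these bullets can be sketched as "wait until the reset time, plus a little padding". The helper below is a hypothetical illustration that takes the reset time as a plain `DateTimeImmutable`; how Iris reads that value from Prism is not shown here, and the one-second padding is an assumption.

```php
<?php

// Hypothetical helper: seconds to wait before retrying a rate-limited job.
function secondsUntilReset(
    DateTimeImmutable $resetsAt,
    DateTimeImmutable $now,
    int $paddingSeconds = 1
): int {
    $remaining = $resetsAt->getTimestamp() - $now->getTimestamp();

    // Never return a negative delay; pad slightly so the retry lands after the reset.
    return max(0, $remaining) + $paddingSeconds;
}

$now = new DateTimeImmutable('2025-01-01 12:00:00', new DateTimeZone('UTC'));
$resetsAt = $now->modify('+42 seconds');

$delay = secondsUntilReset($resetsAt, $now);
```

Inside a queued Laravel job, a delay like this could be passed to `$this->release($delay)` to put the job back on the queue until the limit has reset.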

Manual Intervention

If jobs are persistently failing:

```bash
# View failed jobs
php artisan queue:failed

# Retry a specific job
php artisan queue:retry <id>

# Delete all failed jobs (this cannot be undone)
php artisan queue:flush
```

Configuration

Key settings in config/iris.php:

```php
'extraction' => [
    'threshold' => 10,     // Messages between extractions
    'max_memories' => 6,   // Max memories per run
    'timeout' => 120,      // API timeout in seconds
    'model' => 'claude-sonnet-4-5',
],

'summarization' => [
    'threshold' => 40,     // Unsummarized messages to trigger
    'keep_recent' => 50,   // Messages to keep in active conversation
    'buffer' => 35,        // Buffer before summarizing old messages
    'timeout' => 120,      // API timeout in seconds
    'model' => 'claude-sonnet-4-5',
],

'consolidation' => [
    'jobs_per_minute' => 10,  // Rate limit for LLM calls
    // ... other settings
],
```