Chat Interface
The chat interface orchestrates the full conversation flow: assembling context, streaming responses, executing tools, and persisting messages. Understanding this flow helps you see how all of Iris's components work together.
Request Flow
When you send a message, Iris processes it through several stages before streaming a response:
1. Context Recall
The ContextRetriever fetches relevant context using two complementary systems:
- Truths: Stable, core facts about you (pinned Truths always included, others ranked by relevance)
- Memories: Semantic search finds memories related to the current conversation
This happens first because context influences how Iris responds.
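The two-part recall step can be sketched as follows. This is a hypothetical illustration of the behavior described above (pinned Truths always included, the rest ranked by relevance), not Iris's actual `ContextRetriever` API:

```typescript
// Hypothetical sketch: pinned Truths are always included; remaining Truths
// and semantically matched Memories are ranked by a relevance score.
type Truth = { text: string; pinned: boolean; relevance: number };
type Memory = { text: string; relevance: number };

function recallContext(truths: Truth[], memories: Memory[], memoryLimit: number): string[] {
  // Pinned Truths bypass ranking entirely.
  const pinned = truths.filter(t => t.pinned).map(t => t.text);
  // Everything else is ordered by relevance to the current conversation.
  const ranked = truths
    .filter(t => !t.pinned)
    .sort((a, b) => b.relevance - a.relevance)
    .map(t => t.text);
  const related = [...memories]
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, memoryLimit)
    .map(m => m.text);
  return [...pinned, ...ranked, ...related];
}
```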
2. Context Assembly
Multiple context sources are gathered in parallel:
- Recent conversation history (up to 50 messages)
- Conversation summaries (up to 3 recent summaries)
- Calendar events (next 7 days, if connected)
- The current date and time
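Because these sources are independent, they can be fetched concurrently. A minimal sketch of the parallel gathering, with illustrative fetcher names and the limits listed above:

```typescript
// Sketch: all context sources resolve concurrently via Promise.all rather
// than one after another. Fetcher names are illustrative.
type AssembledContext = {
  history: string[];
  summaries: string[];
  events: string[];
  now: string;
};

async function assembleContext(
  fetchHistory: () => Promise<string[]>,
  fetchSummaries: () => Promise<string[]>,
  fetchEvents: () => Promise<string[]>,
): Promise<AssembledContext> {
  const [history, summaries, events] = await Promise.all([
    fetchHistory(),
    fetchSummaries(),
    fetchEvents(),
  ]);
  return {
    history: history.slice(-50),      // up to 50 recent messages
    summaries: summaries.slice(0, 3), // up to 3 recent summaries
    events,                           // next 7 days, if connected
    now: new Date().toISOString(),    // current date and time
  };
}
```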
3. System Prompt Building
The SystemPromptBuilder assembles a personalized prompt by rendering each registered prompt class in order. The result includes Iris's identity, recalled memories, summaries, calendar context, and temporal information.
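A minimal sketch of this builder pattern, assuming a simple `render()` interface per section (the interface is an assumption, not Iris's actual API):

```typescript
// Each registered prompt section renders in order; the results are joined
// into one system prompt. Empty sections (e.g. no calendar connected) are skipped.
interface PromptSection {
  render(): string;
}

class SystemPromptBuilder {
  private sections: PromptSection[] = [];

  register(section: PromptSection): this {
    this.sections.push(section);
    return this;
  }

  build(): string {
    return this.sections
      .map(s => s.render())
      .filter(rendered => rendered.length > 0)
      .join("\n\n");
  }
}
```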
4. Process and Stream
The request is queued for asynchronous processing via Prism PHP. Response events stream to your browser via WebSockets in real-time, including text chunks and tool calls. If Iris decides to use a tool, execution happens server-side before continuing the response.
This architecture provides reliable delivery with automatic recovery from connection drops.
5. Persist and Process
After the response completes:
- The conversation is saved to the database
- Token usage is recorded for monitoring
- Background jobs are dispatched (memory extraction, summarization)
Agentic Behavior
Iris operates as an agent, meaning it can use tools and iterate multiple times before providing a final response. The agent.max_steps configuration (default: 30) limits how many iterations can occur.
This enables complex, multi-step tasks:
- Search and ask: Search memories, find nothing relevant, then ask a clarifying question
- Create and remember: Create a calendar event, then store a memory about why it was scheduled
- Generate and describe: Generate an image, then describe what was created
Each tool invocation counts as one step. Storing a single memory takes one step; a complex request might use 5-10 steps as Iris searches, creates, and confirms.
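The step-limited loop can be sketched as follows. This is a hypothetical illustration of the iteration described above, not Iris's agent implementation:

```typescript
// Hypothetical agent loop: each tool invocation consumes one step, and
// iteration stops at maxSteps (agent.max_steps, default 30).
type ModelTurn =
  | { kind: "tool"; name: string; run: () => string }
  | { kind: "final"; text: string };

function runAgent(turns: ModelTurn[], maxSteps = 30): { text: string; steps: number } {
  let steps = 0;
  for (const turn of turns) {
    if (turn.kind === "final") return { text: turn.text, steps };
    if (steps >= maxSteps) break; // step limit reached before a final answer
    turn.run();                   // tool execution happens server-side
    steps += 1;
  }
  return { text: "(stopped: step limit reached)", steps };
}
```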
NOTE
Tool calls stream to the frontend in real-time, so users see what Iris is doing as it works.
Features
Text-to-Speech
Assistant messages include a play button that reads the message aloud using ElevenLabs. Click once to play, click again to stop. This feature requires setup; see Text-to-Speech for configuration.
Image Support
Users can upload images for multi-modal conversations. Images are attached to messages and sent to Claude as part of the request. Common use cases include:
- Asking questions about screenshots
- Getting feedback on designs
- Extracting information from photos
Retry
If a response fails (network error, API timeout), users can retry the last message. This re-runs the full request flow with the same input.
Streaming
All responses stream in real-time via WebSockets using Laravel Reverb. The stream includes:
| Event Type | Content |
|---|---|
| Text chunks | Partial response text as it's generated |
| Tool calls | When Iris invokes a tool |
| Tool results | What the tool returned |
| Provider tools | Activity from Anthropic's built-in tools |
| Artifacts | Generated content like images |
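One way to model these event types on the client is a discriminated union. The field names here are illustrative, not Iris's actual wire format:

```typescript
// Illustrative model of the stream's event types; field names are assumptions.
type StreamEvent =
  | { type: "text_chunk"; seq: number; delta: string }
  | { type: "tool_call"; seq: number; tool: string; args: unknown }
  | { type: "tool_result"; seq: number; tool: string; result: unknown }
  | { type: "provider_tool"; seq: number; activity: string }
  | { type: "artifact"; seq: number; kind: "image" | "other"; url: string };

// Text chunks accumulate into the visible message as they arrive;
// other event types update the UI without touching the message text.
function appendText(message: string, event: StreamEvent): string {
  return event.type === "text_chunk" ? message + event.delta : message;
}
```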
Connection Recovery
If your connection drops briefly, you won't miss any events:
- Events are temporarily stored in Redis for replay
- Clients automatically catch up when reconnecting
- Sequence numbers ensure no duplicates
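The server-side half of this recovery can be sketched as a per-stream replay buffer. Iris uses Redis for this; the in-memory class below is only an illustration of the same idea:

```typescript
// Sketch of a replay buffer: events are kept with monotonically increasing
// sequence numbers so a reconnecting client can request everything after
// the last sequence number it saw, without receiving duplicates.
class ReplayBuffer {
  private events: { seq: number; payload: string }[] = [];
  private nextSeq = 1;

  publish(payload: string): number {
    const seq = this.nextSeq++;
    this.events.push({ seq, payload });
    return seq;
  }

  // What a client fetches on reconnect to catch up.
  after(lastSeq: number): { seq: number; payload: string }[] {
    return this.events.filter(e => e.seq > lastSeq);
  }
}
```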
Stopping Streams
Users can stop a stream mid-generation. The stop is graceful: partial responses are preserved and the conversation state remains consistent.
Request Architecture
When you send a message, Iris processes it asynchronously:
- Accept & Queue: Your message is received and a background job is dispatched
- Process: The job builds context, executes the agent, and generates a response
- Broadcast: Response events stream to your browser via WebSockets in real-time
- Persist: The conversation is saved and background jobs (extraction, summarization) are queued
This architecture enables:
- Reliable delivery: If your connection drops, events are stored and replayed when you reconnect
- Graceful retries: Failed requests automatically retry with exponential backoff
- Non-blocking responses: Your browser remains responsive while processing happens server-side
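For the retry behavior, an exponential backoff schedule looks like the sketch below. The base delay and cap are assumptions for illustration, not Iris's actual configuration:

```typescript
// Illustrative exponential backoff: each retry doubles the previous delay,
// capped at a maximum. Base delay and cap here are assumed values.
function backoffDelays(attempts: number, baseMs = 1000, capMs = 30000): number[] {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(baseMs * 2 ** i, capMs),
  );
}
```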
Frontend Integration
The frontend uses Laravel Echo for WebSocket connections. A custom hook manages connection state, event sequencing, and automatic replay on reconnection.
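A condensed, hypothetical sketch of what such a hook tracks: the last applied sequence number, plus replay handling on reconnect. The subscriber interface below stands in for the Laravel Echo wiring; all names are illustrative:

```typescript
// Sketch: the same apply path handles both live events and replayed ones,
// so buffered events delivered after a reconnect are applied at most once.
type StreamFrame = { seq: number; payload: string };

function createStreamSubscriber(onApply: (payload: string) => void) {
  let lastSeq = 0;
  const apply = (e: StreamFrame) => {
    if (e.seq <= lastSeq) return; // already applied before the disconnect
    lastSeq = e.seq;
    onApply(e.payload);
  };
  return {
    onEvent: apply,
    // Called with the server's buffered events when the WebSocket reconnects.
    replay(buffered: StreamFrame[]) {
      buffered.forEach(apply);
    },
    lastSeq: () => lastSeq,
  };
}
```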