Chat Interface

The chat interface orchestrates the full conversation flow: assembling context, streaming responses, executing tools, and persisting messages. Understanding this flow helps you see how all of Iris's components work together.

Request Flow

When you send a message, Iris processes it through several stages before streaming a response:

1. Context Recall

The ContextRetriever fetches relevant context using two complementary systems:

  • Truths: Stable, core facts about you (pinned Truths always included, others ranked by relevance)
  • Memories: Semantic search finds memories related to the current conversation

This happens first because context influences how Iris responds.
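The two-system recall above can be sketched as follows. This is an illustrative TypeScript sketch, not Iris's actual PHP `ContextRetriever`; the field names, scoring threshold, and limit are assumptions.

```typescript
// Hypothetical shapes for the two recall systems; the real backend is PHP.
interface Truth { text: string; pinned: boolean; relevance: number }
interface Memory { text: string; similarity: number }

function recallContext(truths: Truth[], memories: Memory[], limit = 5) {
  // Pinned Truths are always included; the rest compete on relevance.
  const pinned = truths.filter(t => t.pinned);
  const ranked = truths
    .filter(t => !t.pinned)
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, Math.max(0, limit - pinned.length));
  // Memories come from semantic search, already scored against the conversation.
  const relevant = memories
    .filter(m => m.similarity > 0.5) // assumed cutoff, for illustration only
    .sort((a, b) => b.similarity - a.similarity);
  return { truths: [...pinned, ...ranked], memories: relevant };
}
```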

2. Context Assembly

Multiple context sources are gathered in parallel:

  • Recent conversation history (up to 50 messages)
  • Conversation summaries (up to 3 recent summaries)
  • Calendar events (next 7 days, if connected)
  • The current date and time
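Gathering these sources in parallel might look like the following sketch. The fetcher names are invented for illustration; only the limits (50 messages, 3 summaries, 7 days) come from the list above.

```typescript
// Illustrative parallel context assembly; the real implementation is PHP.
async function assembleContext(fetchers: {
  history: () => Promise<string[]>;
  summaries: () => Promise<string[]>;
  calendar: () => Promise<string[]>;
}) {
  const [history, summaries, calendar] = await Promise.all([
    fetchers.history(),   // recent conversation history
    fetchers.summaries(), // recent conversation summaries
    fetchers.calendar(),  // upcoming events, if connected
  ]);
  return {
    history: history.slice(-50),      // up to 50 messages
    summaries: summaries.slice(0, 3), // up to 3 summaries
    calendar,                         // next 7 days
    now: new Date().toISOString(),    // current date and time
  };
}
```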

3. System Prompt Building

The SystemPromptBuilder assembles a personalized prompt by rendering each registered prompt class in order. The result includes Iris's identity, recalled memories, summaries, calendar context, and temporal information.
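The "render each registered prompt class in order" pattern can be sketched like this. The real SystemPromptBuilder is a PHP class; this TypeScript version only illustrates the ordering and composition idea.

```typescript
// Each prompt class contributes one rendered block to the system prompt.
interface PromptSection { render(): string }

class SystemPromptBuilder {
  private sections: PromptSection[] = [];

  // Sections are rendered in registration order.
  register(section: PromptSection): this {
    this.sections.push(section);
    return this;
  }

  build(): string {
    return this.sections
      .map(s => s.render())
      .filter(block => block.length > 0) // assumed: empty sections are dropped
      .join('\n\n');
  }
}
```

In Iris's case the registered sections would cover identity, recalled memories, summaries, calendar context, and temporal information, in that order.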

4. Process and Stream

The request is queued for asynchronous processing via Prism PHP. Response events stream to your browser via WebSockets in real-time, including text chunks and tool calls. If Iris decides to use a tool, execution happens server-side before continuing the response.

This architecture provides reliable delivery with automatic recovery from connection drops.

5. Persist and Process

After the response completes:

  • The conversation is saved to the database
  • Token usage is recorded for monitoring
  • Background jobs are dispatched (memory extraction, summarization)

Agentic Behavior

Iris operates as an agent, meaning it can use tools and iterate multiple times before providing a final response. The agent.max_steps configuration (default: 30) limits how many iterations can occur.

This enables complex, multi-step tasks:

  • Search and ask: Search memories, find nothing relevant, then ask a clarifying question
  • Create and remember: Create a calendar event, then store a memory about why it was scheduled
  • Generate and describe: Generate an image, then describe what was created

Each tool invocation counts as one step. Storing a single memory takes one step, while a complex request might use 5-10 steps as Iris searches, creates, and confirms.
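A step-limited agent loop of this kind can be sketched as below. The model and tool interfaces are invented for illustration; only the `maxSteps` default of 30 comes from the `agent.max_steps` configuration described above.

```typescript
// Minimal sketch of a step-limited agent loop.
type Turn = { toolCall?: { name: string; args: unknown }; text?: string };

async function runAgent(
  model: (history: Turn[]) => Promise<Turn>,
  tools: Record<string, (args: unknown) => Promise<string>>,
  maxSteps = 30, // agent.max_steps
): Promise<string> {
  const history: Turn[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const turn = await model(history);
    history.push(turn);
    if (!turn.toolCall) return turn.text ?? ''; // final answer: stop iterating
    // Each tool invocation consumes one step; its result feeds the next turn.
    const result = await tools[turn.toolCall.name](turn.toolCall.args);
    history.push({ text: result });
  }
  throw new Error('agent.max_steps exceeded');
}
```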

NOTE

Tool calls stream to the frontend in real-time, so users see what Iris is doing as it works.

Features

Text-to-Speech

Assistant messages include a play button that reads the message aloud using ElevenLabs. Click once to play, click again to stop. This feature requires setup; see Text-to-Speech for configuration.

Image Support

Users can upload images for multi-modal conversations. Images are attached to messages and sent to Claude as part of the request. Common use cases include:

  • Asking questions about screenshots
  • Getting feedback on designs
  • Extracting information from photos

Retry

If a response fails (network error, API timeout), users can retry the last message. This re-runs the full request flow with the same input.

Streaming

All responses stream in real-time via WebSockets using Laravel Reverb. The stream includes:

  • Text chunks: Partial response text as it's generated
  • Tool calls: When Iris invokes a tool
  • Tool results: What the tool returned
  • Provider tools: Activity from Anthropic's built-in tools
  • Artifacts: Generated content like images
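On the client, the event types listed above can be modeled as a discriminated union. The field names here are illustrative, not Iris's actual wire format.

```typescript
// Assumed event shapes; each event carries a sequence number for ordering.
type StreamEvent =
  | { type: 'text_chunk'; sequence: number; delta: string }
  | { type: 'tool_call'; sequence: number; name: string; args: unknown }
  | { type: 'tool_result'; sequence: number; name: string; output: string }
  | { type: 'provider_tool'; sequence: number; activity: string }
  | { type: 'artifact'; sequence: number; kind: 'image'; url: string };

function applyEvent(text: string, event: StreamEvent): string {
  // Only text chunks extend the visible response; other events drive UI state.
  return event.type === 'text_chunk' ? text + event.delta : text;
}
```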

Connection Recovery

If your connection drops briefly, you won't miss any events:

  • Events are temporarily stored in Redis for replay
  • Clients automatically catch up when reconnecting
  • Sequence numbers ensure no duplicates
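The sequence-number bookkeeping described above might look like this on the client. This is a sketch under the assumption that each event carries a monotonically increasing sequence number; the real hook may differ.

```typescript
// Tracks the last event seen so replayed events are deduplicated on reconnect.
class EventSequencer {
  private lastSeen = 0;

  // Returns true if the event is new; duplicates from replay are dropped.
  accept(sequence: number): boolean {
    if (sequence <= this.lastSeen) return false;
    this.lastSeen = sequence;
    return true;
  }

  // On reconnect, ask the server to replay everything after the last seen event.
  replayFrom(): number {
    return this.lastSeen + 1;
  }
}
```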

Stopping Streams

Users can stop a stream mid-generation. The stop is graceful—partial responses are preserved and the conversation state remains consistent.
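A graceful stop of this kind can be sketched with the standard AbortController: generation halts, but text that already streamed is kept. The streaming source here is simulated, not Iris's actual stream.

```typescript
// Consume a stream until it ends or a stop is requested, preserving partial text.
async function streamWithStop(
  chunks: AsyncIterable<string>,
  signal: AbortSignal,
): Promise<{ text: string; stopped: boolean }> {
  let text = '';
  for await (const chunk of chunks) {
    if (signal.aborted) return { text, stopped: true }; // partial response preserved
    text += chunk;
  }
  return { text, stopped: false };
}
```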

Request Architecture

When you send a message, Iris processes it asynchronously:

  1. Accept & Queue: Your message is received and a background job is dispatched
  2. Process: The job builds context, executes the agent, and generates a response
  3. Broadcast: Response events stream to your browser via WebSockets in real-time
  4. Persist: The conversation is saved and background jobs (extraction, summarization) are queued

This architecture enables:

  • Reliable delivery: If your connection drops, events are stored and replayed when you reconnect
  • Graceful retries: Failed requests automatically retry with exponential backoff
  • Non-blocking responses: Your browser remains responsive while processing happens server-side
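For the graceful-retries point, an exponential backoff schedule can be computed as below. The base delay, factor, and cap are illustrative assumptions; the actual values are queue configuration not shown in this document.

```typescript
// Delay before each retry attempt grows exponentially, up to a cap.
function backoffDelays(attempts: number, baseMs = 1000, factor = 2, capMs = 60_000): number[] {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(baseMs * factor ** i, capMs),
  );
}
```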

Frontend Integration

The frontend uses Laravel Echo for WebSocket connections. A custom hook manages connection state, event sequencing, and automatic replay on reconnection.