Agent Lifecycle

How Daydreams agents process information and execute tasks.

The core of the Daydreams framework is the agent's execution lifecycle. This loop manages how an agent receives input, reasons with an LLM, performs actions, and handles results. Understanding this flow is crucial for building and debugging agents.

Let's trace the lifecycle of a typical request:

1. Input Reception

  • Source: An input source (like Discord, Telegram, CLI) is configured via an extension.
  • Subscription: The input definition within the extension (e.g., discord:message) uses a subscribe method to listen for external events.
  • Trigger: When the external service (e.g., Discord API) emits an event (like a new message), the subscribe callback is triggered.
  • Invocation: This callback usually invokes agent.send(context, args, data), providing:
    • The target context (e.g., discordChannelContext).
    • args to identify the specific context instance (e.g., { channelId: '...' }).
    • The input data (e.g., user message content).
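
The wiring described above can be sketched as follows. This is a minimal illustration, not the actual Daydreams API: FakeChatService, ChatMessage, SendFn, and subscribeToMessages are hypothetical names, a plain string stands in for the context object, and returning an unsubscribe function from the subscription is an assumption made for this sketch.

```typescript
// Hypothetical stand-in for an external service such as a chat API.
interface ChatMessage {
  channelId: string;
  text: string;
}

class FakeChatService {
  private listeners: Array<(msg: ChatMessage) => void> = [];

  // Registers a listener and returns a function that removes it again.
  onMessage(listener: (msg: ChatMessage) => void): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  emit(msg: ChatMessage): void {
    for (const listener of this.listeners) listener(msg);
  }
}

// Simplified shape of agent.send(context, args, data) as described above.
// Assumption: the context is reduced to a string label here.
type SendFn = (
  context: string,
  args: { channelId: string },
  data: { text: string }
) => void;

// Sketch of an input's subscribe step: forward each external event to send.
function subscribeToMessages(service: FakeChatService, send: SendFn): () => void {
  return service.onMessage((msg) => {
    send("discordChannelContext", { channelId: msg.channelId }, { text: msg.text });
  });
}

// Usage: record what the agent would receive.
const received: Array<{ context: string; channelId: string; text: string }> = [];
const recordingSend: SendFn = (context, args, data) => {
  received.push({ context, channelId: args.channelId, text: data.text });
};

const chatService = new FakeChatService();
const unsubscribe = subscribeToMessages(chatService, recordingSend);
chatService.emit({ channelId: "general", text: "hello" });
unsubscribe();
chatService.emit({ channelId: "general", text: "ignored after unsubscribe" });
```

Only the first message reaches recordingSend, because the second is emitted after the subscription has been torn down.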

2. agent.send

  • Log Creation: Creates an InputRef object (a type of Log) containing the input details (type, content, timestamp) and marks it as processed: false.
  • Run Initiation: Calls agent.run, passing the context details and the new InputRef as part of the initial processing chain.
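
As a rough sketch of these two steps, the snippet below wraps an input in a log entry and hands it to a run function. The field names follow the description above (type, content, timestamp, processed), but the exact InputRef shape, the RunRequest type, and the createInputRef and sendInput helpers are assumptions for illustration. The run function is synchronous here only to keep the example short; in the framework it is asynchronous.

```typescript
// Simplified, hypothetical shape of the InputRef log described above.
interface InputRef {
  ref: "input";
  type: string;
  content: unknown;
  timestamp: number;
  processed: boolean;
}

// Hypothetical bundle of what run needs: the context, its args, and the initial chain.
interface RunRequest {
  context: string;
  args: Record<string, string>;
  chain: InputRef[];
}

// Builds the log entry for a new input; processed starts as false.
function createInputRef(type: string, content: unknown, now: number = Date.now()): InputRef {
  return { ref: "input", type, content, timestamp: now, processed: false };
}

// Stand-in for agent.send: wrap the input in a log, then delegate to run.
function sendInput(
  run: (request: RunRequest) => InputRef[],
  context: string,
  args: Record<string, string>,
  input: { type: string; data: unknown }
): InputRef[] {
  const inputRef = createInputRef(input.type, input.data);
  return run({ context, args, chain: [inputRef] });
}

// Usage with a trivial run function that just echoes the initial chain.
const echoRun = (request: RunRequest): InputRef[] => request.chain;
const initialChain = sendInput(
  echoRun,
  "discordChannelContext",
  { channelId: "general" },
  { type: "discord:message", data: "hello" }
);
```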

3. agent.run

  • Context Initialization: Retrieves or creates the ContextState for the given context type and arguments. It also retrieves or creates the associated WorkingMemory for this specific run.
  • Concurrency Check: Checks if this specific context (ctxId) is already processing in the contextsRunning map.
    • If yes: Pushes the new InputRef onto the existing run's stream handler (push) and returns the promise associated with the ongoing run.
    • If no: Proceeds to set up a new run.
  • Stream Handler Setup: Calls createContextStreamHandler. This critical function sets up the state management for this run, including:
    • The state object (tracking steps, logs, actions, outputs, errors, calls, etc.).
    • The handler function that will process parsed XML tokens from the LLM's response stream.
    • The push function to add logs (Log objects) to the processing pipeline.
  • Tracking: Adds the new run state to the contextsRunning map.
  • Start Run: Calls state.start(). This prepares the initial context state (prepareContext) by loading available actions, outputs, etc., and creates the first StepRef log.
  • Step Loop: Enters the main processing loop (while ((maxSteps = getMaxSteps()) >= state.step)), which iterates through reasoning steps.
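
The concurrency check is the part of this sequence that is easiest to get wrong, so here is a minimal sketch of it. The contextsRunning map and the push function come from the description above. LogEntry, RunningState, runContext, and the work callback are hypothetical simplifications that stand in for the stream handler and step loop.

```typescript
// Hypothetical, simplified log and run-state shapes.
interface LogEntry {
  ref: string;
  content: string;
}

interface RunningState {
  push: (log: LogEntry) => void;
  promise: Promise<LogEntry[]>;
}

// One entry per context that is currently processing, keyed by ctxId.
const contextsRunning = new Map<string, RunningState>();

// Sketch of the check in agent.run: join an ongoing run or start a new one.
function runContext(
  ctxId: string,
  input: LogEntry,
  work: (chain: LogEntry[]) => Promise<void>
): Promise<LogEntry[]> {
  const existing = contextsRunning.get(ctxId);
  if (existing) {
    // Already running: feed the new input into that run and share its promise.
    existing.push(input);
    return existing.promise;
  }

  const chain: LogEntry[] = [input];
  const promise = (async () => {
    try {
      await work(chain); // stands in for the step loop
      return chain;
    } finally {
      // Release the context so the next input starts a fresh run.
      contextsRunning.delete(ctxId);
    }
  })();

  contextsRunning.set(ctxId, {
    push: (log) => {
      chain.push(log);
    },
    promise,
  });
  return promise;
}

// Usage: the second input arrives while the first run is still pending.
const noopWork = async (): Promise<void> => {};
const firstRun = runContext("discord:general", { ref: "input", content: "hello" }, noopWork);
const secondRun = runContext("discord:general", { ref: "input", content: "are you there?" }, noopWork);
```

Both calls return the same promise, and the second input ends up in the chain of the first run instead of starting a competing run for the same context.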

4. Inside the Step Loop

Each iteration represents one turn of the agent's reasoning cycle:

  • Prepare State: Calls prepareContext again (or state.nextStep(), which internally calls prepare). This refreshes the available actions, outputs, and context data based on the current WorkingMemory, including results from the previous step.
  • Prompt Generation:
    • formatPromptSections gathers the prepared actions, outputs, context states, and WorkingMemory logs (both processed and unprocessed).
    • Various format* functions convert these objects into standardized XML strings using the xml helper and formatXml.
    • render injects these XML strings into the main promptTemplate.
  • LLM Call:
    • The runGenerate task is enqueued via the taskRunner.
    • It sends the fully formatted XML prompt to the configured LLM.
    • It returns a text stream (stream) of the LLM's response and a promise for the complete text (getTextResponse).
  • Stream Processing:
    • handleStream consumes the LLM's text stream.
    • It uses xmlStreamParser to parse the incoming text into XML tokens (StartTag, EndTag, TextContent) based on recognized tags.
    • handleStream reconstructs logical StackElement objects from these tokens.
    • For each StackElement, the handler function (created by createContextStreamHandler) is invoked.
  • Handling Parsed Elements:
    • This handler uses getOrCreateRef to create or update partial Log entries (like Thought, ActionCall, OutputRef) based on the incoming StackElement data.
    • When an element is fully parsed (el.done is true), it calls handlePushLog.
  • Processing Completed Logs:
    • This is the core logic reacting to the LLM's parsed output. Based on the Log object's ref type:
      • thought: Logs the reasoning step, calls the onThinking handler if provided.
      • input: Calls handleInput for schema validation, custom processing, and potential episodic memory query.
      • output: Calls handleOutputStream -> handleOutput. Validates against schema, runs the output's handler, formats result, adds OutputRef to memory.
      • action_call: Calls handleActionCallStream -> prepareActionCall to parse args and resolve templates. Pushes execution logic (handleActionCall -> runAction task) onto state.calls. The ActionResult is added back via handlePushLog.
      • action_result: Logs the result. Optionally calls generateEpisode.
    • Notifies subscribers (onLogStream).
    • Saves the updated WorkingMemory.
  • Action Synchronization: After the LLM response stream is fully processed, agent.run waits for all action promises in state.calls to settle.
  • Loop Continuation: Checks state.shouldContinue(). If yes, increments state.step and loops.

5. Run Completion

  • Exit Loop: The loop exits when state.shouldContinue() returns false or when state.step exceeds the maximum number of steps.
  • Cleanup: Marks any remaining logs as processed.
  • Hooks: Calls onRun hooks defined in the active contexts.
  • Save State: Saves the final state of all involved contexts.
  • Release: Removes the run (ctxId) from the contextsRunning map.
  • Return: Resolves the original promise returned by agent.run with the complete array of Log objects generated during the run (state.chain).
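
The completion steps can be sketched in the same simplified style. FinishedLog, ActiveContext, completeRun, and the save callback are hypothetical. Placing the release step in a finally block is a design choice of this sketch (so a failing hook cannot leave the context marked as running), not something the description above specifies. The real steps are asynchronous; they are synchronous here for brevity.

```typescript
// Hypothetical, simplified shapes for the end of a run.
interface FinishedLog {
  ref: string;
  processed: boolean;
}

interface ActiveContext {
  id: string;
  onRun?: (chain: FinishedLog[]) => void;
}

// Sketch of run completion: cleanup, hooks, save, release, return.
function completeRun(
  ctxId: string,
  chain: FinishedLog[],
  contexts: ActiveContext[],
  running: Map<string, unknown>,
  save: (contextId: string) => void
): FinishedLog[] {
  try {
    // Cleanup: mark any remaining logs as processed.
    for (const log of chain) log.processed = true;
    // Hooks: call onRun on every active context that defines it.
    for (const ctx of contexts) ctx.onRun?.(chain);
    // Save State: persist each involved context.
    for (const ctx of contexts) save(ctx.id);
  } finally {
    // Release: the context can accept a new run from here on.
    running.delete(ctxId);
  }
  return chain;
}

// Usage with in-memory stand-ins for the hook and the store.
const runningContexts = new Map<string, unknown>([["discord:general", {}]]);
const savedIds: string[] = [];
let hookCalls = 0;

const finalChain = completeRun(
  "discord:general",
  [
    { ref: "input", processed: false },
    { ref: "output", processed: true },
  ],
  [{ id: "discord:general", onRun: () => { hookCalls++; } }],
  runningContexts,
  (contextId) => { savedIds.push(contextId); }
);
```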

This detailed cycle illustrates how Daydreams agents iteratively perceive (inputs, results), reason (LLM prompt/response), and act (outputs, actions), using streaming and asynchronous processing to handle complex interactions efficiently.
