Agent Lifecycle
How Daydreams agents process information and execute tasks.
The core of the Daydreams framework is the agent's execution lifecycle. This loop manages how an agent receives input, reasons with an LLM, performs actions, and handles results. Understanding this flow is crucial for building and debugging agents.
Let's trace the lifecycle of a typical request:
1. Input Reception
- Source: An input source (like Discord, Telegram, CLI) is configured via an extension.
- Subscription: The `input` definition within the extension (e.g., `discord:message`) uses a `subscribe` method to listen for external events.
- Trigger: When the external service (e.g., Discord API) emits an event (like a new message), the `subscribe` callback is triggered.
- Invocation: This callback usually invokes `agent.send(context, args, data)`, providing:
  - The target `context` (e.g., `discordChannelContext`).
  - `args` to identify the specific context instance (e.g., `{ channelId: '...' }`).
  - The input `data` (e.g., user message content).
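
The subscription flow above can be sketched in a few lines. This is a minimal illustration, not the Daydreams API: `FakeChatService`, `messageInput`, the `Send` type, and the `chatChannelContext` name are all hypothetical stand-ins for an external service, an input definition, and `agent.send`.

```typescript
// Hypothetical, simplified types; not the actual Daydreams definitions.
type ChannelArgs = { channelId: string };
type MessageData = { user: string; text: string };

// Stand-in for agent.send(context, args, data).
type Send = (context: string, args: ChannelArgs, data: MessageData) => void;

// A minimal in-memory event source playing the role of the external service.
class FakeChatService {
  private listeners: Array<(channelId: string, msg: MessageData) => void> = [];

  onMessage(listener: (channelId: string, msg: MessageData) => void): () => void {
    this.listeners.push(listener);
    // Return an unsubscribe function so the input can be torn down.
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  emit(channelId: string, msg: MessageData): void {
    for (const listener of this.listeners) listener(channelId, msg);
  }
}

// The input definition: subscribe wires external events to send.
const messageInput = {
  type: "chat:message",
  subscribe(send: Send, service: FakeChatService): () => void {
    return service.onMessage((channelId, msg) => {
      // Context name, args identifying the instance, then the input data.
      send("chatChannelContext", { channelId }, msg);
    });
  },
};

// Usage: record what would be forwarded to the agent.
const received: Array<{ context: string; args: ChannelArgs; data: MessageData }> = [];
const chatService = new FakeChatService();
const unsubscribe = messageInput.subscribe(
  (context, args, data) => received.push({ context, args, data }),
  chatService,
);
chatService.emit("general", { user: "alice", text: "hello" });
unsubscribe();
chatService.emit("general", { user: "bob", text: "ignored after unsubscribe" });
```

Returning an unsubscribe function from `subscribe` is a design choice of this sketch; it keeps teardown next to setup.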
2. agent.send
- Log Creation: Creates an `InputRef` object (a type of `Log`) containing the input details (type, content, timestamp) and marks it as `processed: false`.
- Run Initiation: Calls `agent.run`, passing the context details and the new `InputRef` as part of the initial processing `chain`.
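
As a rough sketch of these two steps, assuming simplified shapes: the `InputRef` fields below follow the description above, while `createSend`, `RunOptions`, the `{ type, data }` input object, and the injected `run` and `now` parameters are hypothetical and exist only to keep the example self-contained.

```typescript
// Hypothetical, simplified shapes; the real InputRef likely carries more fields.
interface InputRef {
  ref: "input";
  type: string;
  content: unknown;
  timestamp: number;
  processed: boolean;
}

interface RunOptions<Args> {
  context: string;
  args: Args;
  chain: InputRef[];
}

// Stand-in for agent.run.
type Run<Args> = (options: RunOptions<Args>) => Promise<InputRef[]>;

function createSend<Args>(run: Run<Args>, now: () => number = Date.now) {
  return function send(
    context: string,
    args: Args,
    input: { type: string; data: unknown },
  ): Promise<InputRef[]> {
    // 1. Log Creation: wrap the raw input in a log entry, not yet processed.
    const inputRef: InputRef = {
      ref: "input",
      type: input.type,
      content: input.data,
      timestamp: now(),
      processed: false,
    };
    // 2. Run Initiation: hand the new log over as the initial chain.
    return run({ context, args, chain: [inputRef] });
  };
}

// Usage with a stub run that records the options it receives.
const seenRuns: RunOptions<{ channelId: string }>[] = [];
const sendToAgent = createSend<{ channelId: string }>(async (options) => {
  seenRuns.push(options);
  return options.chain;
}, () => 1700000000000);

const pendingRun = sendToAgent("chatChannelContext", { channelId: "general" }, {
  type: "chat:message",
  data: { text: "hello" },
});
```

Injecting `run` and the clock makes the function easy to exercise in isolation; the real `agent.send` is a method on the agent.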
3. agent.run
- Context Initialization: Retrieves or creates the `ContextState` for the given context type and arguments. It also retrieves or creates the associated `WorkingMemory` for this specific run.
- Concurrency Check: Checks whether this specific context (`ctxId`) is already processing in the `contextsRunning` map.
  - If yes: Pushes the new `InputRef` onto the existing run's stream handler (`push`) and returns the promise associated with the ongoing run.
  - If no: Proceeds to set up a new run.
- Stream Handler Setup: Calls `createContextStreamHandler`. This critical function sets up the state management for this run, including:
  - The `state` object (tracking steps, logs, actions, outputs, errors, calls, etc.).
  - The `handler` function that will process parsed XML tokens from the LLM's response stream.
  - The `push` function to add logs (`Log` objects) to the processing pipeline.
- Tracking: Adds the new run state to the `contextsRunning` map.
- Start Run: Calls `state.start()`. This prepares the initial context state (`prepareContext`) by loading available actions, outputs, etc., and creates the first `StepRef` log.
- Step Loop: Enters the main processing loop (`while ((maxSteps = getMaxSteps()) >= state.step)`), which iterates through reasoning steps.
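
The concurrency check and step loop can be sketched as follows. This is a simplified model under stated assumptions, not the framework's implementation: `RunState`, `RunningEntry`, `Think` (standing in for one LLM step), `stepLoop`, and the fixed `maxSteps` default are hypothetical.

```typescript
// Hypothetical, simplified stand-ins for the structures described above.
interface Log {
  ref: string;
  content: string;
}

interface RunState {
  step: number;
  chain: Log[];
}

interface RunningEntry {
  push: (log: Log) => void;
  done: Promise<Log[]>;
}

const contextsRunning = new Map<string, RunningEntry>();

// Stands in for one reasoning step; resolves to true to request another step.
type Think = (state: RunState) => Promise<boolean>;

async function stepLoop(ctxId: string, state: RunState, think: Think, maxSteps: number): Promise<Log[]> {
  try {
    // Stop at the step limit or when no further step is requested.
    while (maxSteps >= state.step) {
      const shouldContinue = await think(state);
      if (!shouldContinue) break;
      state.step++;
    }
    return state.chain;
  } finally {
    contextsRunning.delete(ctxId); // release the context
  }
}

function run(ctxId: string, input: Log, think: Think, maxSteps = 5): Promise<Log[]> {
  const running = contextsRunning.get(ctxId);
  if (running) {
    // Concurrency check: join the ongoing run instead of starting a second one.
    running.push(input);
    return running.done;
  }
  const state: RunState = { step: 1, chain: [input] };
  // Defer the loop by one microtask so the map entry exists before any step runs.
  const done = Promise.resolve().then(() => stepLoop(ctxId, state, think, maxSteps));
  contextsRunning.set(ctxId, { push: (log) => { state.chain.push(log); }, done });
  return done;
}

// Usage: two sends to the same context share one run and one promise.
const thinkTwice: Think = async (state) => {
  state.chain.push({ ref: "thought", content: `step ${state.step}` });
  return state.step < 2;
};
const firstRun = run("chat:general", { ref: "input", content: "hi" }, thinkTwice);
const secondRun = run("chat:general", { ref: "input", content: "again" }, thinkTwice);
```

Note that `run` is deliberately not an `async` function: an `async` wrapper would return a fresh promise on every call, so the second caller would not receive the same promise as the first.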
4. Inside the Step Loop
Each iteration represents one turn of the agent's reasoning cycle:
- Prepare State: Calls `prepareContext` again (or `state.nextStep()`, which internally calls `prepare`). This refreshes the available actions, outputs, and context data based on the current `WorkingMemory`, including results from the previous step.
- Prompt Generation:
  - `formatPromptSections` gathers the prepared actions, outputs, context states, and `WorkingMemory` logs (both processed and unprocessed).
  - Various `format*` functions convert these objects into standardized XML strings using the `xml` helper and `formatXml`.
  - `render` injects these XML strings into the main `promptTemplate`.
- LLM Call:
  - The `runGenerate` task is enqueued via the `taskRunner`.
  - It sends the fully formatted XML prompt to the configured LLM.
  - It returns a text stream (`stream`) of the LLM's response and a promise for the complete text (`getTextResponse`).
- Stream Processing:
  - `handleStream` consumes the LLM's text stream.
  - It uses `xmlStreamParser` to parse the incoming text into XML tokens (`StartTag`, `EndTag`, `TextContent`) based on recognized tags.
  - `handleStream` reconstructs logical `StackElement` objects from these tokens.
  - For each `StackElement`, the `handler` function (created by `createContextStreamHandler`) is invoked.
- Handling Parsed Elements:
  - This `handler` uses `getOrCreateRef` to create or update partial `Log` entries (like `Thought`, `ActionCall`, `OutputRef`) based on the incoming `StackElement` data.
  - When an element is fully parsed (`el.done` is true), it calls `handlePushLog`.
- Processing Completed Logs:
  - This is the core logic reacting to the LLM's parsed output. Based on the `Log` object's `ref` type:
    - `thought`: Logs the reasoning step, calls the `onThinking` handler if provided.
    - `input`: Calls `handleInput` for schema validation, custom processing, and potential episodic memory query.
    - `output`: Calls `handleOutputStream` -> `handleOutput`. Validates against schema, runs the output's `handler`, formats result, adds `OutputRef` to memory.
    - `action_call`: Calls `handleActionCallStream` -> `prepareActionCall` to parse args and resolve templates. Pushes execution logic (`handleActionCall` -> `runAction` task) onto `state.calls`. The `ActionResult` is added back via `handlePushLog`.
    - `action_result`: Logs the result. Optionally calls `generateEpisode`.
  - Notifies subscribers (`onLogStream`).
  - Saves the updated `WorkingMemory`.
- Action Synchronization: After the LLM response stream is fully processed, `agent.run` waits for all action promises in `state.calls` to settle.
- Loop Continuation: Checks `state.shouldContinue()`. If yes, increments `state.step` and loops.
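
To make one step more concrete, here is a heavily simplified sketch of stream parsing, element handling, and action synchronization. Everything in it is an assumption made for illustration: `parseElements` is a regex-based stand-in for `xmlStreamParser` and `handleStream` (no attributes, no nesting, only completed elements), and `ParsedElement`, `StepState`, `createHandler`, `runStep`, and the `name:args` content format for action calls are hypothetical.

```typescript
// Hypothetical, simplified model of one step; names are illustrative only.
interface ParsedElement {
  tag: string;
  content: string;
}

// Buffers chunks and reports an element once its closing tag has arrived.
function parseElements(chunks: string[], onElement: (el: ParsedElement) => void): void {
  let buffer = "";
  const pattern = /<(\w+)>([\s\S]*?)<\/\1>/;
  for (const chunk of chunks) {
    buffer += chunk;
    let match = pattern.exec(buffer);
    while (match !== null) {
      onElement({ tag: match[1], content: match[2].trim() });
      buffer = buffer.slice(match.index + match[0].length);
      match = pattern.exec(buffer);
    }
  }
}

type ActionFn = (args: string) => Promise<string>;

interface StepState {
  logs: ParsedElement[];
  calls: Promise<string>[];
}

function createHandler(state: StepState, actions: Map<string, ActionFn>) {
  return function handler(el: ParsedElement): void {
    state.logs.push(el);
    if (el.tag !== "action_call") return;
    // The "name:args" content format is an assumption of this sketch.
    const [name, args = ""] = el.content.split(":", 2);
    const action = actions.get(name);
    const call = action ? action(args) : Promise.reject(new Error(`unknown action: ${name}`));
    // Start the action without blocking the stream; log its result when done.
    state.calls.push(
      call.then((result) => {
        state.logs.push({ tag: "action_result", content: result });
        return result;
      }),
    );
  };
}

async function runStep(chunks: string[], actions: Map<string, ActionFn>) {
  const state: StepState = { logs: [], calls: [] };
  parseElements(chunks, createHandler(state, actions));
  // Action synchronization: wait for every call to settle, success or failure.
  const settled = await Promise.allSettled(state.calls);
  const failed = settled.filter((s) => s.status === "rejected").length;
  return { logs: state.logs, failed };
}

// Usage: the response arrives in arbitrary chunks that split tags mid-way.
const demoActions = new Map<string, ActionFn>([
  ["getWeather", async (city) => `sunny in ${city}`],
]);
const demoChunks = ["<thought>check wea", "ther</thought><action_", "call>getWeather:Paris</action_call>"];
const stepResult = runStep(demoChunks, demoActions);
```

`Promise.allSettled` is used here because the text says the run waits for the calls to settle; with `Promise.all`, a single failing action would reject the whole step before the others finish.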
5. Run Completion
- Exit Loop: The loop exits once `state.step` exceeds the step limit or `state.shouldContinue()` returns false.
- Cleanup: Marks any remaining logs as processed.
- Hooks: Calls `onRun` hooks defined in the active contexts.
- Save State: Saves the final state of all involved contexts.
- Release: Removes the run (`ctxId`) from the `contextsRunning` map.
- Return: Resolves the original promise returned by `agent.run` with the complete array of `Log` objects generated during the run (`state.chain`).
This detailed cycle illustrates how Daydreams agents iteratively perceive (inputs, results), reason (LLM prompt/response), and act (outputs, actions), using streaming and asynchronous processing to handle complex interactions efficiently.