Actions
Defining agent capabilities and task execution.
Actions represent the capabilities of a Daydreams agent – the specific tasks it can perform in response to inputs or its internal reasoning. They are the primary way agents interact with external systems, APIs, or execute complex logic.
Examples of actions could include:
- Sending a tweet.
- Fetching data from an API.
- Executing a smart contract transaction.
- Querying a database.
- Writing to a file.
Defining an Action
Actions are defined using the action helper function exported from @daydreamsai/core.
Key Parameters:
- name (string): Unique identifier used in <action_call name="...">.
- description / instructions (string, optional): Help the LLM understand the action's purpose and usage. Included in the <available-actions> section of the prompt.
- schema (Zod Schema, optional): Defines and validates arguments passed by the LLM. Arguments are parsed from the JSON content within the <action_call> tag.
- handler (Function): The core logic. Receives validated args (if schema is defined) and the ActionCallContext. The return value is wrapped in an ActionResult.
- returns (Zod Schema, optional): Documents the expected return shape of the handler.
- format (Function, optional): Customizes how the ActionResult is logged or displayed.
- memory (Memory, optional): Allows associating persistent state specifically with this action across multiple calls.
- enabled (Function, optional): Dynamically determines if the action should be available to the LLM based on the current context.
- retry (boolean | number | Function, optional): Configures automatic retries if the handler throws an error.
- onError (Function, optional): Custom logic to execute if the handler fails (after retries).
- context (Context, optional): Restricts the action to be available only when a specific context type is active.
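To make the shape of these parameters concrete, here is a minimal, self-contained sketch. It does not import @daydreamsai/core: defineAction, the Schema type, and the simplified ctx object are hypothetical stand-ins for the real action helper, a Zod schema, and the ActionCallContext, so the example can run on its own. The handler is kept synchronous for brevity.

```typescript
// Hypothetical stand-in for a Zod schema: anything with a parse() that
// returns typed data or throws on invalid input.
type Schema<T> = { parse(input: unknown): T };

// Hypothetical, reduced version of the options described above.
interface ActionDef<Args, Result> {
  name: string;
  description?: string;
  schema?: Schema<Args>;
  handler: (args: Args, ctx: { callId: string }) => Result;
  enabled?: (ctx: { callId: string }) => boolean;
  retry?: boolean | number;
}

// Identity helper used only for type inference (an assumption about how
// such a helper could behave, not the library's implementation).
function defineAction<Args, Result>(
  def: ActionDef<Args, Result>
): ActionDef<Args, Result> {
  return def;
}

// Hand-written validator standing in for a Zod object schema with one
// string field named "city".
const weatherArgs: Schema<{ city: string }> = {
  parse(input) {
    if (typeof input !== "object" || input === null) {
      throw new Error("args must be an object");
    }
    const city = (input as { city?: unknown }).city;
    if (typeof city !== "string" || city.length === 0) {
      throw new Error("city must be a non-empty string");
    }
    return { city };
  },
};

const getWeather = defineAction({
  name: "getWeather",
  description: "Returns a canned forecast for a city.",
  schema: weatherArgs,
  retry: 2,
  enabled: () => true,
  handler: (args, ctx) => ({
    city: args.city,
    forecast: "sunny",
    callId: ctx.callId,
  }),
});

// Validate raw arguments first, then hand the typed result to the handler.
const weatherInput = getWeather.schema!.parse({ city: "Lisbon" });
const weatherOutput = getWeather.handler(weatherInput, { callId: "call-1" });
console.log(weatherOutput);
```

Keeping validation in the schema means the handler only ever sees arguments of the declared shape, which is the division of labour the schema and handler parameters describe.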
ActionCallContext
The handler function receives a context object (ctx), the ActionCallContext, with useful properties.
LLM Interaction
- Availability: Enabled actions are presented to the LLM within the <available-actions> tag in the prompt, including their name, description, instructions, and argument schema.
- Invocation: The LLM requests an action by including an <action_call name="actionName">...</action_call> tag in its response stream. The content inside the tag should be a JSON object matching the action's schema.
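The invocation format can be illustrated with a small parser. This is a sketch only: extractActionCalls is a hypothetical helper that scans a complete response string with a regular expression, whereas the framework itself processes the response as a stream. It shows how the action name and the JSON arguments are carried by the tag.

```typescript
interface ParsedActionCall {
  name: string;   // value of the name attribute
  args: unknown;  // JSON content of the tag, not yet validated
}

// Hypothetical helper: finds every <action_call name="...">...</action_call>
// in a finished response. JSON.parse throws on malformed content; a real
// parser would need to handle that case.
function extractActionCalls(response: string): ParsedActionCall[] {
  const pattern = /<action_call name="([^"]+)">([\s\S]*?)<\/action_call>/g;
  const calls: ParsedActionCall[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(response)) !== null) {
    calls.push({ name: match[1], args: JSON.parse(match[2]) });
  }
  return calls;
}

const reply =
  'Let me check. <action_call name="getWeather">{"city":"Lisbon"}</action_call>';
console.log(extractActionCalls(reply));
```

The arguments are left as unknown on purpose: they only become trusted values once they have been validated against the action's schema.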
Execution Flow
- Parsing: When the framework parses an <action_call> from the LLM stream (handleActionCallStream in streaming.ts), it identifies the action by name.
- Argument Handling: prepareActionCall (handlers.ts) parses the JSON content inside the tag.
- Template Resolution: It resolves any template variables (e.g., {{calls[0].id}}, {{shortTermMemory.someKey}}) within the parsed arguments against the current run's state.
- Validation: If a schema is defined for the action, the resolved arguments are validated against it.
- Execution: handleActionCall (handlers.ts) enqueues the actual execution using the runAction task (tasks/index.ts) via the TaskRunner.
- Handler Invocation: The runAction task executes the action's handler function with the validated arguments and the ActionCallContext.
- Result: The return value of the handler is wrapped in an ActionResult object.
- Feedback: The ActionResult is pushed back into the processing loop (handlePushLog in streaming.ts), making the result available to the agent (and potentially the LLM in the next step's prompt).
Actions are the fundamental mechanism for agents to interact with the world and perform tasks beyond simple text generation.