Base User and Injection Tasks
BaseUserTask
¶
ID
instance-attribute
¶
ID: str
The ID of the task. This should be unique across all tasks. This is set automatically by the task suite with register_user_task
.
PROMPT
instance-attribute
¶
PROMPT: str
The prompt that the model receives. It must be precise and LLM-understandable.
GROUND_TRUTH_OUTPUT
class-attribute
instance-attribute
¶
GROUND_TRUTH_OUTPUT: str = ''
The portion of output that the model should return when the ground truth is executed.
DIFFICULTY
class-attribute
instance-attribute
¶
DIFFICULTY: TaskDifficulty = EASY
The difficulty of the task. This is used to determine the difficulty of the task in the task suite.
init_environment
staticmethod
¶
Use this function if the environment needs to be initialized. By default, nothing is done.
Source code in src/agentdojo/base_tasks.py
28 29 30 31 |
|
ground_truth
abstractmethod
¶
ground_truth(pre_environment: Env) -> list[FunctionCall]
Returns a list of FunctionCall
s that solve the task if executed.
Must be implemented by the subclass.
Each FunctionCall
is a dictionary with the following keys:
- function
: the function name
- args
: a dict of arguments, with the argument name as the key and the argument value as the value.
If the value is unknown and not important for solving the task, you can use any value.
Parameters:
-
pre_environment
(Env
) –the environment before the function calls are executed. Can be used to retrieve some arguments that can be used for the function calls.
Source code in src/agentdojo/base_tasks.py
33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
|
utility
abstractmethod
¶
utility(
model_output: str,
pre_environment: Env,
post_environment: Env,
strict: bool = True,
) -> bool
Checks if the task was executed correctly. Returns True if the task was executed correctly, False otherwise.
It can do so by checking the model output, and by looking for differences between the environment before and after the agent is executed.
Must be implemented by the subclass.
Parameters:
-
model_output
(str
) –the output of the agent
-
pre_environment
(Env
) –the environment before the agent is executed
-
post_environment
(Env
) –the environment after the agent is executed
-
strict
(bool
, default:True
) –if
True
the state check will be strict. IfFalse
the state check will be relaxed. This is should be used in read-only tasks when combined with tasks with side-effects.
Source code in src/agentdojo/base_tasks.py
50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 |
|
utility_from_traces
¶
utility_from_traces(
model_output: str,
pre_environment: Env,
post_environment: Env,
traces: Sequence[FunctionCall],
) -> bool | None
Checks if the task was executed correctly. Returns True if the task was executed correctly, False otherwise.
It can do that by looking at the traces of the functions that are called. It is useful for tasks that do not leave a trace in the environment at the end of the execution.
Can be implemented by the subclass if needed. Otherwise, utility
will be used.
Parameters:
-
model_output
(str
) –the output of the agent
-
pre_environment
(Env
) –the environment before the agent is executed
-
post_environment
(Env
) –the environment after the agent is executed
-
traces
(Sequence[FunctionCall]
) –the traces of the functions that are called
Source code in src/agentdojo/base_tasks.py
74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 |
|
BaseInjectionTask
¶
ID
instance-attribute
¶
ID: str
The ID of the task. This should be unique across all tasks. This is set automatically by the task suite with register_injection_task
.
GOAL
instance-attribute
¶
GOAL: str
The goal of the injection. It must be precise and LLM-understandable. It should be usable to generate the injection prompts by attacks.
DIFFICULTY
class-attribute
instance-attribute
¶
DIFFICULTY: TaskDifficulty = EASY
The difficulty of the task. This is used to determine the difficulty of the task in the task suite.
GROUND_TRUTH_OUTPUT
class-attribute
instance-attribute
¶
GROUND_TRUTH_OUTPUT: str = ''
The portion of output that the model should return when the ground truth is executed.
ground_truth
abstractmethod
¶
ground_truth(pre_environment: Env) -> list[FunctionCall]
Return a list of function calls that correctly execute the injection task if executed.
Must be implemented by the subclass.
Each function call is a dictionary with the following keys
function
: the function nameargs
: a dict of arguments, with the argument name as the key and the argument value as the value. If the value is unknown and not important for solving the task, use any value.
Parameters:
-
pre_environment
(Env
) –the environment before the function calls are executed. Can be used to retrieve some arguments that can be used for the function calls.
Source code in src/agentdojo/base_tasks.py
107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 |
|
security
abstractmethod
¶
Checks if the injection goal was executed correctly. Returns True if the injection was executed correctly, False otherwise.
It can do so by checking the model output and by looking for differences between the environment before and after the agent is executed.
Must be implemented by the subclass.
Parameters:
-
model_output
(str
) –the output of the agent
-
pre_environment
(Env
) –the environment before the agent is executed
-
post_environment
(Env
) –the environment after the agent is executed
Source code in src/agentdojo/base_tasks.py
124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 |
|
security_from_traces
¶
security_from_traces(
model_output: str,
pre_environment: Env,
post_environment: Env,
traces: Sequence[FunctionCall],
) -> bool | None
Checks if the injection goal was executed correctly. Returns True if the task was executed correctly, False otherwise.
It can do that by looking at the traces of the functions that are called. It is useful for tasks that do not leave a trace in the environment at the end of the execution.
Can be implemented by the subclass if needed. Otherwise, security
will be used.
Parameters:
-
model_output
(str
) –the output of the agent
-
pre_environment
(Env
) –the environment before the agent is executed
-
post_environment
(Env
) –the environment after the agent is executed
-
traces
(Sequence[FunctionCall]
) –the traces of the functions that are called
Source code in src/agentdojo/base_tasks.py
140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 |
|