Skip to main content

hackagent.router.tracking.tracker

Goal-based result tracking for attack techniques.

This module provides the main Tracker class which creates one Result per goal/datapoint and accumulates traces for each interaction during the attack. This addresses the issue of having too many Results (one per LLM call) with only 1-2 traces each.

Architecture: Attack → Tracker → Result per goal → Multiple Traces per Result

Each attack creates one Result per goal/datapoint via Tracker, then accumulates traces for each interaction during the attack.

For step-level tracking (pipeline steps like "Generation", "Evaluation"), use the StepTracker class from step.py instead.

Context Objects

@dataclass
class Context()

Context for tracking a single goal's attack execution.

increment_sequence

def increment_sequence() -> int

Atomically increment and return the next sequence number.

elapsed_s

@property
def elapsed_s() -> float

Wall-clock seconds since context creation (or until finalized).

Tracker Objects

class Tracker()

Tracks attack execution on a per-goal basis.

Creates one Result per goal, with multiple Traces capturing:

  • Attack attempts (prompts sent, responses received)
  • Intermediate steps (judge evaluations, refinements)
  • Final evaluation status

This provides better organization of results where each Result represents a complete attack attempt on a single goal/datapoint.

Attributes:

  • client - Authenticated client for API calls
  • run_id - Server-side run record ID
  • logger - Logger instance

Example:

>>> tracker = Tracker(client=client, run_id=run_id) >>> >>> for goal in goals: ... # Create result for this goal ... goal_ctx = tracker.create_goal_result(goal, goal_index=i) ... ... # Add traces for each attack attempt ... for iteration in range(n_iterations): ... response = query_target(prompt) ... tracker.add_interaction_trace( ... goal_ctx, ... request={"prompt": prompt}, ... response={"content": response}, ... step_name="Attack Attempt" ... ) ... ... # Finalize with evaluation ... tracker.finalize_goal( ... goal_ctx, ... success=is_success, ... evaluation_notes="Attack succeeded with score 10/10" ... )

__init__

def __init__(backend: StorageBackend,
run_id: str,
logger: Optional[logging.Logger] = None,
attack_type: Optional[str] = None,
category_classifier_config: Optional[Dict[str, Any]] = None,
event_bus: Optional[Any] = None)

Initialize tracker.

Arguments:

  • client - Authenticated client for API calls
  • run_id - Server-side run record ID
  • logger - Optional logger instance
  • attack_type - Optional attack type identifier for metadata
  • event_bus - Optional :class:hackagent.cli.tui.events.TUIEventBus. When provided, the tracker emits structured events (goal_started, goal_finalized, evaluation, ...) so the TUI can render execution live without parsing logs.

is_enabled

@property
def is_enabled() -> bool

Check if tracking is enabled (has backend and run_id).

create_goal_result

def create_goal_result(
goal: str,
goal_index: int,
initial_metadata: Optional[Dict[str, Any]] = None) -> Context

Create a Result record for a goal and return its tracking context.

Arguments:

  • goal - The goal/datapoint text
  • goal_index - Index of this goal in the batch
  • initial_metadata - Optional initial metadata to store

Returns:

Context for tracking this goal's attack execution

add_interaction_trace

def add_interaction_trace(ctx: Context,
request: Dict[str, Any],
response: Any,
step_name: str = "Agent Interaction",
step_type: StepTypeEnum = StepTypeEnum.OTHER,
metadata: Optional[Dict[str, Any]] = None) -> None

Add a trace for an agent interaction (request/response pair).

Arguments:

  • ctx - Context from create_goal_result
  • request - Request data sent to the agent
  • response - Response received from the agent
  • step_name - Human-readable step name
  • step_type - Type of step for categorization
  • metadata - Optional additional metadata

add_evaluation_trace

def add_evaluation_trace(ctx: Context,
evaluation_result: Any,
score: Optional[float] = None,
explanation: Optional[str] = None,
evaluator_name: Optional[str] = None,
metadata: Optional[Dict[str, Any]] = None) -> None

Add a trace for an evaluation step.

Arguments:

  • ctx - Context from create_goal_result
  • evaluation_result - Result from the evaluator
  • score - Optional numeric score
  • explanation - Optional explanation text
  • evaluator_name - Name of the evaluator used
  • metadata - Optional additional metadata

add_custom_trace

def add_custom_trace(ctx: Context,
step_name: str,
content: Dict[str, Any],
step_type: StepTypeEnum = StepTypeEnum.OTHER) -> None

Add a custom trace with arbitrary content.

Arguments:

  • ctx - Context from create_goal_result
  • step_name - Human-readable step name
  • content - Trace content dictionary
  • step_type - Type of step for categorization

finalize_goal

def finalize_goal(ctx: Context,
success: bool,
evaluation_notes: Optional[str] = None,
final_metadata: Optional[Dict[str, Any]] = None,
evaluation_status: Optional[Any] = None) -> bool

Finalize a goal's result with evaluation status.

Arguments:

  • ctx - Context from create_goal_result
  • success - Whether the attack was successful
  • evaluation_notes - Optional evaluation notes
  • final_metadata - Optional final metadata to merge

Returns:

True if update was successful, False otherwise

get_goal_context

def get_goal_context(goal_index: int) -> Optional[Context]

Get the Context for a specific goal index.

get_goal_context_by_goal

def get_goal_context_by_goal(goal: str) -> Optional[Context]

Get the Context for a specific goal string.

Searches all contexts to find one matching the goal text. Use this when you have the goal string but not the index.

Arguments:

  • goal - The goal text to find

Returns:

Context if found, None otherwise

get_result_id

def get_result_id(goal_index: int) -> Optional[str]

Get the result ID for a specific goal index.

get_all_contexts

def get_all_contexts() -> Dict[int, Context]

Get all goal contexts.

get_summary

def get_summary() -> Dict[str, Any]

Get summary statistics for all tracked goals.

track_goal

@contextmanager
def track_goal(goal: str,
goal_index: int,
initial_metadata: Optional[Dict[str, Any]] = None)

Context manager for tracking a single goal's attack execution.

Creates result on entry, yields context for adding traces, and auto-finalizes on exit (with failure status if exception occurs).

Arguments:

  • goal - The goal/datapoint text
  • goal_index - Index of this goal
  • initial_metadata - Optional initial metadata

Yields:

Context for adding traces during execution

Example:

>>> with tracker.track_goal(goal, i) as ctx: ... response = attack(goal) ... tracker.add_interaction_trace(ctx, request, response) ... # Finalize manually or let context manager handle it