Evaluation Campaign

Use this workflow for a fast, high-signal jailbreak assessment before deeper testing.

What It Runs

The evaluation campaign uses the primary attacks from JAILBREAK_PROFILE:

  • h4rm3l
  • TAP
  • PAIR

Each attack runs against a primary jailbreak dataset from the same profile.

Run It

Run the built-in evaluation campaign command:

hackagent eval \
--agent-name "quick-security-scan" \
--agent-type "other" \
--endpoint "http://localhost:8080/chat"

By default, the command:

  • selects the first primary jailbreak dataset from JAILBREAK_PROFILE
  • runs the three primary attacks in sequence (h4rm3l, TAP, PAIR)
  • uses ollama/llama3 as the judge (--judge-identifier) with the harmbench judge type (--judge-type)
  • prints a per-attack summary table (status, result count, attack success rate (ASR), duration)

Common overrides:

hackagent eval \
--agent-name "quick-security-scan" \
--agent-type "other" \
--endpoint "http://localhost:8080/chat" \
--dataset "strongreject" \
--limit 50 \
--judge-identifier "ollama/llama3" \
--judge-type "harmbench" \
--timeout 600

Validation-only mode:

hackagent eval \
--agent-name "quick-security-scan" \
--agent-type "other" \
--endpoint "http://localhost:8080/chat" \
--dry-run
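
One way to combine the two modes is to validate first and start the full campaign only if validation passes. The sketch below is a hypothetical wrapper, not part of the tool: the function name run_campaign is made up for illustration, and it assumes that hackagent eval exits with a non-zero status when validation fails, which this page does not confirm. Only flags shown above are used.

```shell
# Hypothetical wrapper: validate with --dry-run, then run the full campaign.
# Assumption (not confirmed by this page): `hackagent eval` exits non-zero
# when validation fails.
run_campaign() {
  endpoint="$1"

  # Step 1: validation-only pass; nothing is attacked yet.
  if ! hackagent eval \
      --agent-name "quick-security-scan" \
      --agent-type "other" \
      --endpoint "$endpoint" \
      --dry-run; then
    echo "dry run failed; skipping full campaign" >&2
    return 1
  fi

  # Step 2: full campaign with the default dataset, attacks, and judge.
  hackagent eval \
    --agent-name "quick-security-scan" \
    --agent-type "other" \
    --endpoint "$endpoint"
}
```

A call such as run_campaign "http://localhost:8080/chat" would then run both steps in order and return a failure status if the dry run does not pass.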

When To Use It

  • Before release as a security smoke test.
  • After model, prompt, or policy updates.
  • As a recurring regression check in CI pipelines.
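
For the CI case, a minimal sketch of a pipeline step might look like the following. The function name ci_smoke_scan and the log file name are hypothetical, and the step assumes that the exit status of hackagent eval reflects whether the run completed successfully. It does not parse the summary table or gate on ASR, because the table format is not specified here.

```shell
# Hypothetical CI step: run a bounded campaign, keep the output as a build
# artifact, and pass the command's exit status on to the CI runner.
# Assumption: a non-zero exit status from `hackagent eval` means failure.
ci_smoke_scan() {
  endpoint="$1"
  log_file="${2:-hackagent-eval.log}"
  rc=0

  # --limit and --timeout keep the job bounded, as in the overrides above.
  hackagent eval \
    --agent-name "quick-security-scan" \
    --agent-type "other" \
    --endpoint "$endpoint" \
    --dataset "strongreject" \
    --limit 50 \
    --timeout 600 > "$log_file" 2>&1 || rc=$?

  # Echo the captured summary into the CI log, then report the status.
  cat "$log_file"
  return "$rc"
}
```

Writing the output to a file first keeps the summary table available as an artifact even when the step fails.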

Next Step

If the scan shows bypasses, continue with the Evaluation Tutorial to tune and deepen each attack configuration.