Discover vulnerabilities in your AI agents before attackers do.
What is HackAgent?
HackAgent is a comprehensive Python SDK and CLI designed to help security researchers, developers, and AI safety practitioners evaluate and strengthen the security of AI agents.

Interactive TUI with real-time attack progress and beautiful visualizations
As AI agents become more powerful and autonomous, they face unique security challenges that traditional testing tools can't address:
| Threat | Description |
|---|---|
| Prompt Injection | Malicious inputs that hijack agent behavior |
| Jailbreaking | Bypassing safety guardrails and content filters |
| Goal Hijacking | Manipulating agents to pursue unintended objectives |
| Tool Misuse | Exploiting agent capabilities for unauthorized actions |
HackAgent automates testing for these vulnerabilities using research-backed attack techniques, helping you identify and fix security issues before they're exploited in the real world.
Get Started Now
python3 -m venv .venv
source .venv/bin/activate
pip install git+https://github.com/AISecurityLab/HackAgent.gitQuestions? Join our community discussions or email us at ais@ai4i.it
Architecture
HackAgent is built with a modular architecture that makes it easy to test any AI agent:
| Component | Description |
|---|---|
| Attack Engine | Orchestrates attacks using AdvPrefix, AutoDAN-Turbo, PAIR, TAP, FlipAttack, BoN, h4rm3l, CipherChat, PAP, and Baseline |
| Generator | LLM that creates adversarial prompts to test the target agent |
| Judge | LLM that evaluates whether attacks successfully bypassed safety measures |
| Target Agent | Your AI agent being tested (supports multiple frameworks) |
| Datasets | Pre-built benchmark presets plus custom HuggingFace/file datasets |
Supported Frameworks
Responsible Use
HackAgent is designed for authorized security testing only. Always obtain explicit permission before testing any AI system.