hackagent.datasets.providers.file
File-based dataset provider for loading goals from local files.
FileDatasetProvider Objects
class FileDatasetProvider(DatasetProvider)
Dataset provider for local files (JSON, JSONL, CSV).
This provider loads goals from local JSON, JSONL, CSV, or plain-text files (one goal per line).
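To make the file shapes concrete, the sketch below writes one small sample per format and reads it back with the standard library only. It does not use or reproduce FileDatasetProvider: `read_goals` is a hypothetical helper, and choosing the parser from the file extension is an assumption, not documented behavior.

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import List


def read_goals(path: Path, goal_field: str = "goal", encoding: str = "utf-8") -> List[str]:
    """Hypothetical reader; dispatching on the file extension is an assumption."""
    suffix = path.suffix.lower()
    if suffix == ".json":
        # JSON: a top-level array of objects, one object per goal.
        with path.open(encoding=encoding) as fh:
            return [str(record[goal_field]) for record in json.load(fh)]
    if suffix == ".jsonl":
        # JSONL: one JSON object per non-empty line.
        with path.open(encoding=encoding) as fh:
            return [str(json.loads(line)[goal_field]) for line in fh if line.strip()]
    if suffix == ".csv":
        # CSV: a header row names the columns; goal_field selects one of them.
        with path.open(encoding=encoding, newline="") as fh:
            return [row[goal_field] for row in csv.DictReader(fh)]
    # Anything else is treated as plain text: one goal per non-empty line.
    with path.open(encoding=encoding) as fh:
        return [line.strip() for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "goals.json").write_text(
        json.dumps([{"objective": "Summarize the report"}, {"objective": "List the risks"}]),
        encoding="utf-8",
    )
    (root / "goals.csv").write_text("prompt\nTranslate to French\nDraft a reply\n", encoding="utf-8")
    (root / "goals.txt").write_text("Explain the error\nSuggest a fix\n", encoding="utf-8")

    json_goals = read_goals(root / "goals.json", goal_field="objective")
    csv_goals = read_goals(root / "goals.csv", goal_field="prompt")
    txt_goals = read_goals(root / "goals.txt")
```

The field names `objective` and `prompt` mirror the values passed as `goal_field` in the provider examples; any other column or key name would work the same way in this sketch.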
Example:
# JSON file with an array of objects
provider = FileDatasetProvider({
    "path": "./goals.json",
    "goal_field": "objective",
})
goals = provider.load_goals()

# CSV file
provider = FileDatasetProvider({
    "path": "./goals.csv",
    "goal_field": "prompt",
})
goals = provider.load_goals()

# Plain text file (one goal per line)
provider = FileDatasetProvider({
    "path": "./goals.txt",
})
goals = provider.load_goals()
__init__
def __init__(config: Dict[str, Any])
Initialize the file dataset provider.
Arguments:
config: Configuration dictionary with keys:
- path (str): Path to the file
- goal_field (str, optional): Field name for JSON/CSV (default: "goal")
- encoding (str, optional): File encoding (default: "utf-8")
- fallback_fields (list, optional): Alternative fields if goal_field not found
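The interaction between goal_field and fallback_fields can be pictured with a short sketch. `resolve_goal` is a hypothetical helper written for illustration; the lookup order (primary field first, then fallbacks in the order given) and the handling of empty values are assumptions, not the provider's confirmed behavior.

```python
from typing import Any, Dict, List, Optional


def resolve_goal(
    record: Dict[str, Any],
    goal_field: str = "goal",
    fallback_fields: Optional[List[str]] = None,
) -> Optional[str]:
    """Hypothetical helper: pick the goal text from one JSON/CSV record."""
    # Try the primary field first, then each fallback in the order given.
    for field in [goal_field, *(fallback_fields or [])]:
        value = record.get(field)
        if value is not None and str(value).strip():
            return str(value).strip()
    # No candidate field held usable text (assumed outcome: no goal).
    return None


records = [
    {"objective": "Summarize the report"},  # matched by goal_field
    {"prompt": "Translate to French"},      # matched by the first fallback
    {"id": 3},                              # no matching field at all
]
resolved = [
    resolve_goal(record, goal_field="objective", fallback_fields=["prompt", "goal"])
    for record in records
]
```

Under these assumptions a configuration such as `{"path": "./goals.json", "goal_field": "objective", "fallback_fields": ["prompt", "goal"]}` would tolerate records that use different key names for the same information.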
load_goals
def load_goals(limit: Optional[int] = None,
               shuffle: bool = False,
               seed: Optional[int] = None,
               **kwargs) -> List[str]
Load goals from the file.
Arguments:
limit: Maximum number of goals to return.
shuffle: Whether to shuffle records before selecting.
seed: Random seed for shuffling.
**kwargs: Additional arguments (unused).
Returns:
List of goal strings.
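The documented order of operations (shuffle first, then apply limit) can be sketched in a few lines. `select_goals` is a hypothetical stand-in, not the library's code, and using a private `random.Random(seed)` instance is an assumption about how the seed might be applied.

```python
import random
from typing import List, Optional


def select_goals(
    goals: List[str],
    limit: Optional[int] = None,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> List[str]:
    """Hypothetical sketch of the limit/shuffle/seed semantics."""
    selected = list(goals)  # copy so the caller's list is left untouched
    if shuffle:
        # A dedicated generator keeps the global random state unchanged.
        random.Random(seed).shuffle(selected)
    if limit is not None:
        # The limit is applied after shuffling, so it samples the whole file.
        selected = selected[:limit]
    return selected


all_goals = ["goal A", "goal B", "goal C", "goal D", "goal E"]
first_two = select_goals(all_goals, limit=2)
sample_one = select_goals(all_goals, limit=3, shuffle=True, seed=42)
sample_two = select_goals(all_goals, limit=3, shuffle=True, seed=42)
```

In this sketch, repeating a call with the same seed yields the same selection, which is the usual reason to expose a seed parameter alongside shuffle.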
get_metadata
def get_metadata() -> Dict[str, Any]
Return metadata about the loaded dataset.