hackagent.datasets.providers.file

File-based dataset provider for loading goals from local files.

FileDatasetProvider Objects

class FileDatasetProvider(DatasetProvider)

Dataset provider that loads goals from local files: JSON, JSONL, CSV, or plain text (one goal per line).

Example:

    # JSON file with array of objects
    provider = FileDatasetProvider({
        "path": "./goals.json",
        "goal_field": "objective",
    })
    goals = provider.load_goals()

    # CSV file
    provider = FileDatasetProvider({
        "path": "./goals.csv",
        "goal_field": "prompt",
    })
    goals = provider.load_goals()

    # Plain text file (one goal per line)
    provider = FileDatasetProvider({
        "path": "./goals.txt",
    })
    goals = provider.load_goals()
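To make the input shapes in the examples above concrete, the following is a minimal sketch that writes small sample files and parses them with the standard library only. The `read_records` helper and its dispatch on file extension are assumptions made for illustration; they are not taken from the provider's source, which may detect formats differently.

```python
import csv
import io
import json
import tempfile
from pathlib import Path
from typing import Any, List


def read_records(path: Path, encoding: str = "utf-8") -> List[Any]:
    """Hypothetical reader (not the library's code): dispatch on extension."""
    text = path.read_text(encoding=encoding)
    suffix = path.suffix.lower()
    if suffix == ".json":
        # Expects a top-level array of objects.
        return json.loads(text)
    if suffix == ".jsonl":
        # One JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if suffix == ".csv":
        # The header row supplies the field names.
        return list(csv.DictReader(io.StringIO(text)))
    # Anything else is treated as plain text, one goal per line.
    return [line.strip() for line in text.splitlines() if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "goals.json").write_text(
        '[{"objective": "Goal A"}, {"objective": "Goal B"}]', encoding="utf-8"
    )
    (root / "goals.csv").write_text("id,prompt\n1,Goal A\n2,Goal B\n", encoding="utf-8")
    (root / "goals.txt").write_text("Goal A\nGoal B\n", encoding="utf-8")

    json_records = read_records(root / "goals.json")
    csv_records = read_records(root / "goals.csv")
    txt_records = read_records(root / "goals.txt")
```

In this sketch the JSON and CSV readers return one dictionary per record, which is why those formats need a `goal_field`, while the plain text reader already returns goal strings.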

__init__

def __init__(config: Dict[str, Any])

Initialize the file dataset provider.

Arguments:

  • config - Configuration dictionary with keys:
    • path (str): Path to the file
    • goal_field (str, optional): Name of the field that holds the goal text in JSON/CSV records (default: "goal")
    • encoding (str, optional): File encoding (default: "utf-8")
    • fallback_fields (list, optional): Alternative field names to try when goal_field is not found
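The interplay between `goal_field` and `fallback_fields` can be sketched as below. This is an illustration under stated assumptions, not the provider's implementation: the `extract_goal` helper is hypothetical, and both the lookup order (primary field first, then fallbacks in list order) and the choice to skip records with no matching field are guesses about reasonable behavior.

```python
from typing import Any, Dict, List, Optional


def extract_goal(
    record: Dict[str, Any],
    goal_field: str = "goal",
    fallback_fields: Optional[List[str]] = None,
) -> Optional[str]:
    """Hypothetical helper: return the first non-empty candidate field."""
    for field in [goal_field, *(fallback_fields or [])]:
        value = record.get(field)
        if value not in (None, ""):
            return str(value)
    # No candidate field is present in this record.
    return None


# A config using all four documented keys.
config = {
    "path": "./goals.json",
    "goal_field": "objective",
    "encoding": "utf-8",
    "fallback_fields": ["prompt", "goal"],
}

records = [{"objective": "Goal A"}, {"prompt": "Goal B"}, {"unrelated": "x"}]

goals = []
for record in records:
    goal = extract_goal(record, config["goal_field"], config["fallback_fields"])
    if goal is not None:  # assumption: unmatched records are skipped
        goals.append(goal)
```

Here the first record matches `goal_field`, the second is rescued by the `"prompt"` fallback, and the third contributes nothing.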

load_goals

def load_goals(limit: Optional[int] = None,
               shuffle: bool = False,
               seed: Optional[int] = None,
               **kwargs) -> List[str]

Load goals from the file.

Arguments:

  • limit - Maximum number of goals to return.
  • shuffle - Whether to shuffle records before selecting.
  • seed - Random seed for shuffling.
  • **kwargs - Additional arguments (unused).
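The arguments above suggest that shuffling happens before the limit is applied, so a limited, shuffled call returns a random sample rather than a reordering of the first few records. A minimal sketch of that ordering follows; `select_goals` is a hypothetical stand-in, and the use of a private `random.Random(seed)` instance is an assumption, not something the documentation confirms.

```python
import random
from typing import List, Optional


def select_goals(
    goals: List[str],
    limit: Optional[int] = None,
    shuffle: bool = False,
    seed: Optional[int] = None,
) -> List[str]:
    """Hypothetical stand-in: shuffle first, then truncate to `limit`."""
    selected = list(goals)  # copy so the caller's list is left untouched
    if shuffle:
        # A dedicated Random instance keeps the global RNG state unchanged.
        random.Random(seed).shuffle(selected)
    if limit is not None:
        selected = selected[:limit]
    return selected


all_goals = [f"goal-{i}" for i in range(10)]

first_three = select_goals(all_goals, limit=3)
sample_a = select_goals(all_goals, limit=3, shuffle=True, seed=42)
sample_b = select_goals(all_goals, limit=3, shuffle=True, seed=42)
```

With a fixed `seed`, repeated calls yield the same selection, which is useful when a run over a sampled subset of goals needs to be reproducible.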

Returns:

List of goal strings.

get_metadata

def get_metadata() -> Dict[str, Any]

Return metadata about the loaded dataset.