Python SDK
pip install ironlabs
Node.js SDK
npm install ironlabs
When to use AgentOpt
- Automated prompt engineering — replace manual trial-and-error with a data-driven optimization loop
- Agent quality improvement — boost task accuracy without changing your agent’s code structure
- Benchmark-driven development — optimize against your own evaluation function and dataset
- Model-specific tuning — find the best system prompt for a specific target model
Prerequisites
Before you start, make sure you have:
- An IronLabs API key from the Settings page
- A ZIP bundle containing agent.py, eval.py, and dataset.json, hosted at a publicly accessible URL
- A minimum of 10 rows in your dataset
Installation
Install the SDK for your language with the pip or npm command listed at the top of this page.
Initialize the client
Set your API key as an environment variable.
Running an Optimization
Prepare your ZIP bundle
AgentOpt requires three files packed into a single ZIP:
| File | Purpose |
|---|---|
| agent.py | Your agent: defines run_batch(inputs, api_key) and uses EDITABLE/FIXED markers |
| eval.py | Scoring function: defines score(expected, predicted) -> float in [0, 1] |
| dataset.json | Array of {"input": str, "answer": str} objects (minimum 10 rows) |
Pack the three files into a ZIP and host it at a publicly accessible URL.
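As a rough illustration of the packing step, the sketch below builds the bundle with Python's standard zipfile module. It assumes the three files sit at the root of the archive, which this page does not confirm, and `build_bundle` is a hypothetical helper name, not part of the SDK. Hosting the resulting file at a public URL is left to whatever storage you already use.

```python
import zipfile
from pathlib import Path

# The three files AgentOpt expects in the bundle (from the table above).
REQUIRED_FILES = ("agent.py", "eval.py", "dataset.json")


def build_bundle(source_dir, bundle_path):
    """Pack the three required files into a ZIP (hypothetical helper).

    Assumption: files are stored at the archive root, not in a subfolder.
    """
    source = Path(source_dir)
    missing = [name for name in REQUIRED_FILES if not (source / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing required files: {', '.join(missing)}")

    with zipfile.ZipFile(bundle_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in REQUIRED_FILES:
            # arcname keeps only the file name, dropping the local directory path.
            zf.write(source / name, arcname=name)
    return Path(bundle_path)
```

Calling `build_bundle(".", "bundle.zip")` from the directory that holds the three files would produce a bundle ready to upload.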
agent.py
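A minimal sketch of how an agent.py might be laid out. Only the `run_batch(inputs, api_key)` signature and the existence of EDITABLE/FIXED markers come from this page. The exact marker syntax is not shown here, so the comment markers below are placeholders, and `call_model` is a hypothetical stand-in for a real LLM call.

```python
# --- EDITABLE (placeholder marker syntax; check the real format) ---
# Assumption: the system prompt lives in the section AgentOpt rewrites.
SYSTEM_PROMPT = "You are a careful assistant. Answer concisely."
# --- END EDITABLE ---


# --- FIXED (placeholder marker syntax; check the real format) ---
def call_model(system_prompt, user_input, api_key):
    """Hypothetical stand-in for an LLM call.

    It returns the stripped input so the sketch runs offline; a real agent
    would send system_prompt and user_input to the target model here.
    """
    return user_input.strip()


def run_batch(inputs, api_key):
    """Return one prediction per input, in the same order as the inputs."""
    return [call_model(SYSTEM_PROMPT, text, api_key) for text in inputs]
# --- END FIXED ---
```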
The EDITABLE section is what AgentOpt rewrites each iteration. The FIXED section defines the interface contract and is never modified.
eval.py
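A minimal sketch of an eval.py scoring function, assuming case-insensitive exact match is an acceptable metric for your task. Only the `score(expected, predicted) -> float` signature and the [0, 1] range come from this page; the matching rule is an illustrative choice.

```python
def score(expected: str, predicted: str) -> float:
    """Return 1.0 for a case-insensitive exact match, otherwise 0.0.

    Assumption: answers are short strings where surrounding whitespace
    and letter case should not affect the result.
    """
    return 1.0 if expected.strip().lower() == predicted.strip().lower() else 0.0
```

For free-form answers, a graded metric that still returns a value in [0, 1] (for example, token overlap) may give the optimizer a smoother signal than a strict 0/1 match.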
Must define a score function that returns a float between 0.0 and 1.0.
dataset.json
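The sketch below generates and checks a dataset in the documented shape: a JSON array of objects with string `input` and `answer` fields, at least 10 rows. The arithmetic questions are made-up sample data, and `validate_dataset` is a hypothetical helper, not part of the SDK.

```python
import json


def validate_dataset(rows, min_rows=10):
    """Check the documented shape: a list of {"input": str, "answer": str}."""
    if not isinstance(rows, list) or len(rows) < min_rows:
        raise ValueError(f"dataset must be a list with at least {min_rows} rows")
    for index, row in enumerate(rows):
        if not isinstance(row, dict):
            raise ValueError(f"row {index} is not an object")
        if not isinstance(row.get("input"), str) or not isinstance(row.get("answer"), str):
            raise ValueError(f"row {index} needs string 'input' and 'answer' fields")
    return rows


# Made-up sample data: ten simple arithmetic questions.
rows = [{"input": f"What is {n} + {n}?", "answer": str(2 * n)} for n in range(1, 11)]

# Serialize after validation; write this string to dataset.json in your bundle.
dataset_json = json.dumps(validate_dataset(rows), indent=2)
```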
A JSON array of input/answer pairs (minimum 10 rows).
Submit the optimization job
Pass the ZIP URL, target model, and number of iterations to start the job.
Parameters:
| Parameter | Required | Default | Description |
|---|---|---|---|
| input_url | Yes | none | Public URL to your ZIP bundle |
| target_model | Yes | none | OpenRouter model string to optimize for |
| n_iterations | No | 15 | Number of optimization iterations (1–50) |
| overall_timeout_seconds | No | 3600 | Total job timeout in seconds (300–7200) |
| llm_call_timeout_seconds | No | 300 | Timeout per LLM call in seconds (30–600) |
| sandbox_timeout_seconds | No | 600 | Timeout per sandbox benchmark run in seconds (60–1800) |
Response:
Monitor progress
Poll get_status() every 30 seconds. The response includes live per-iteration progress once the job starts running.
Status values:
| Status | Description |
|---|---|
| queued | Job is waiting to start |
| running | Optimization is active |
| completed | Optimization finished successfully |
| interrupted | Job timed out or was cancelled |
| failed | Internal error; check the error_message field |
AgentOpt-specific status fields:
| Field | Description |
|---|---|
| current_iteration | Latest iteration completed (0 = baseline) |
| best_score | Best score seen so far (0.0–1.0) |
| baseline_score | Score before any optimization |
| n_iterations | Total iterations requested |
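The polling pattern can be sketched as below. Because the SDK's client object and the exact signature of get_status() are not shown on this page, the sketch takes any zero-argument callable and assumes it returns a dict with a `status` key plus the fields listed above. `wait_for_job` is a hypothetical helper name.

```python
import time

# Statuses after which the job will not change again (from the status table).
TERMINAL_STATUSES = {"completed", "interrupted", "failed"}


def wait_for_job(get_status, poll_interval=30.0, max_polls=240, sleep=time.sleep):
    """Poll until the job reaches a terminal status, then return that response.

    Assumption: get_status is a zero-argument callable returning a dict.
    240 polls at 30 seconds covers the 7200-second maximum overall timeout.
    """
    for _ in range(max_polls):
        response = get_status()
        if response["status"] in TERMINAL_STATUSES:
            return response
        # Progress fields may be absent while the job is still queued.
        print(
            f"status={response['status']} "
            f"iteration={response.get('current_iteration')}/{response.get('n_iterations')} "
            f"best_score={response.get('best_score')}"
        )
        sleep(poll_interval)
    raise TimeoutError("job did not reach a terminal status within the polling budget")
```

With a real client you would pass something like `lambda: client.get_status(job_id)`, where `client` and `job_id` are placeholders for whatever the SDK actually provides.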
Get results
Retrieve the optimized prompt and performance metrics once the job completes.
Response:
Result fields:
| Field | Description |
|---|---|
| optimized_prompt | Best system prompt found across all iterations |
| original_prompt | System prompt from your original agent.py |
| train_score | Score on the 70% training split |
| test_score | Score on the 30% held-out test split |
| iterations_run | Total iterations executed |
| iterations_kept | Iterations where score improved |
| agent_code_url | Public URL to download the best agent.py |
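One way to read these fields together is to compare the train and test scores and see how many iterations were kept. The sketch below assumes the result arrives as a dict keyed by the field names in the table. `summarize_result` is a hypothetical helper, and the sample values are invented for illustration.

```python
def summarize_result(result):
    """Derive a few sanity-check numbers from the documented result fields.

    Assumption: result is a dict keyed by the field names in the table above.
    """
    iterations_run = result["iterations_run"]
    keep_rate = result["iterations_kept"] / iterations_run if iterations_run else 0.0
    return {
        # A large positive gap suggests the prompt overfits the training split.
        "generalization_gap": round(result["train_score"] - result["test_score"], 4),
        "keep_rate": round(keep_rate, 4),
        "prompt_changed": result["optimized_prompt"] != result["original_prompt"],
    }


# Invented sample values, for illustration only.
example = {
    "optimized_prompt": "Answer with a single number.",
    "original_prompt": "You are a helpful assistant.",
    "train_score": 0.9,
    "test_score": 0.8,
    "iterations_run": 15,
    "iterations_kept": 3,
}
summary = summarize_result(example)
```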
Complete example
View full end-to-end example
How AgentOpt works
AgentOpt runs a closed optimization loop:
1. Baseline: runs your original agent.py on the dataset to establish a starting score
2. Propose: Claude reads the current system prompt and proposes an improved version
3. Benchmark: the proposed variant runs in an isolated sandbox against the dataset
4. Accept or reject: improvements are kept; regressions are discarded
5. Repeat: steps 2–4 repeat for n_iterations iterations
The best agent.py (with the winning system prompt embedded) is returned at the end.
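The loop described above resembles a simple greedy search, which can be sketched as follows. This is an illustration of the accept-or-reject idea, not AgentOpt's actual implementation: `propose` and `benchmark` are hypothetical callables standing in for the Claude proposal step and the sandboxed benchmark run, and treating a tie as "not an improvement" is an assumption.

```python
def optimize(baseline_prompt, propose, benchmark, n_iterations=15):
    """Greedy accept-or-reject loop over system prompts (illustrative sketch).

    propose(prompt) -> new prompt; benchmark(prompt) -> score in [0, 1].
    Returns (best_prompt, best_score, iterations_kept).
    """
    best_prompt = baseline_prompt
    best_score = benchmark(baseline_prompt)  # Step 1: baseline
    kept = 0
    for _ in range(n_iterations):            # Step 5: repeat
        candidate = propose(best_prompt)     # Step 2: propose
        candidate_score = benchmark(candidate)  # Step 3: benchmark
        if candidate_score > best_score:     # Step 4: keep strict improvements only
            best_prompt, best_score = candidate, candidate_score
            kept += 1
    return best_prompt, best_score, kept
```

For example, with a toy `propose` that appends "!" and a toy `benchmark` that rewards longer prompts, every iteration is kept; with a benchmark that never improves on the baseline, the original prompt is returned unchanged.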
Error handling
| Error | Cause | Fix |
|---|---|---|
| 422 agentopt requires input_url | Missing input_url in request | Upload ZIP and set the URL |
| 422 agentopt requires target_models | Empty target models list | Add at least one model string |
| 422 n_iterations must be 1–50 | Iterations out of range | Use a value between 1 and 50 |
| 422 overall_timeout_seconds must be 300–7200 | Timeout out of range | Use a value in range |
| status: interrupted | Job timed out | Check error_message; retry with higher overall_timeout_seconds |
| status: failed | Internal error | Check error_message; verify agent.py runs locally without error |
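Several of the 422 errors above can be caught before submitting. The sketch below checks a parameter dict against the documented required fields and ranges. `validate_job_params` is a hypothetical client-side helper, and its messages are illustrative rather than the API's exact error text.

```python
# Allowed ranges taken from the parameter table on this page.
RANGES = {
    "n_iterations": (1, 50),
    "overall_timeout_seconds": (300, 7200),
    "llm_call_timeout_seconds": (30, 600),
    "sandbox_timeout_seconds": (60, 1800),
}


def validate_job_params(params):
    """Return a list of problems; an empty list means the params look valid."""
    errors = []
    for required in ("input_url", "target_model"):
        if not params.get(required):
            errors.append(f"{required} is required")
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        # Optional parameters are only checked when they are supplied.
        if value is not None and not (low <= value <= high):
            errors.append(f"{name} must be {low}-{high}")
    return errors
```

Running this check locally gives faster feedback than waiting for the API to reject the request.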