
Custom Router Skill

End-to-end workflow: dataset → train → poll → infer. A custom router is an XGBoost classifier that learns which LLM each request should be routed to. It is trained on embeddings of the input text, with the correct_models labels from your evaluation data as targets.

Trigger

Activate when user says things like:
  • “train a router model”
  • “create a custom LLM router”
  • “run router training on my dataset”
  • “use my trained router for inference”
  • /custom-router

Prerequisites

Ask for these if not already provided:
| Item | Where to get it |
|---|---|
| IRONA_API_KEY | User’s bearer token (env var or prompted) |
| Data_URLs | One or more public JSON URLs hosting training data |
| dataset.json | Local file to upload (if no hosted URL yet) |

Data Format

Training data must be a publicly accessible JSON file with a problems array. Each problem needs three fields:
{
  "problems": [
    {
      "problem_key": "unique_id",
      "problem": "input text / question / task",
      "correct_models": ["openai/gpt-4o", "anthropic/claude-3-haiku"]
    }
  ]
}
Field rules:
  • problem_key — unique string identifier per problem
  • problem — the raw input text the router will learn to classify
  • correct_models — list of model strings that answered this problem correctly
  • Minimum 10 problems required; 100+ recommended for quality routing
Reference example: examples/sample_dataset.json (C++ coding problems with real model labels)
Validation rules (from src/ml_tasks.py:138-141):
  • Problems missing any of problem_key, problem, or correct_models are silently filtered
  • Problems where no model is correct are dropped as “unsolvable”
  • Problems where only one model ever succeeds add limited signal — include diverse problems
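The rules above can be checked locally before uploading. The following is a minimal sketch that mirrors the documented filtering; it is not the code in src/ml_tasks.py, and the function name validate_dataset is hypothetical. The duplicate-key check is an extra local check: the source says problem_key must be unique but does not say how the service treats duplicates.

```python
REQUIRED_FIELDS = ("problem_key", "problem", "correct_models")
MIN_PROBLEMS = 10  # minimum stated in the field rules above


def validate_dataset(data):
    """Return (kept, dropped); dropped is a list of (index, reason) pairs."""
    kept, dropped, seen_keys = [], [], set()
    for i, p in enumerate(data.get("problems", [])):
        missing = [f for f in REQUIRED_FIELDS if f not in p]
        if missing:
            # The service silently filters these; report them locally instead.
            dropped.append((i, "missing " + ", ".join(missing)))
        elif not p["correct_models"]:
            dropped.append((i, "no correct model (unsolvable)"))
        elif p["problem_key"] in seen_keys:
            # Assumption: duplicates are unwanted; server behavior is not documented.
            dropped.append((i, "duplicate problem_key"))
        else:
            seen_keys.add(p["problem_key"])
            kept.append(p)
    return kept, dropped


# Small inline sample so the sketch runs without a dataset file.
sample = {"problems": [
    {"problem_key": "p1", "problem": "Reverse a string", "correct_models": ["openai/gpt-4o"]},
    {"problem_key": "p2", "problem": "Sum a list"},                      # missing field
    {"problem_key": "p3", "problem": "Hard one", "correct_models": []},  # unsolvable
]}
kept, dropped = validate_dataset(sample)
for index, reason in dropped:
    print(f"problem {index}: {reason}")
print(f"{len(kept)} valid problems (minimum {MIN_PROBLEMS})")
```

To check a real file, load it with json.load and pass the result to validate_dataset; if fewer than 10 problems are kept, the dataset is below the stated minimum.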

Step 1 — Host the dataset

# Requires boto3 and these environment variables: CF_ENDPOINT_URL, CF_ACCESS_KEY_ID,
# CF_SECRET_ACCESS_KEY, CF_BUCKET_NAME, CF_ACCOUNT_ID.
R2_KEY="router-training/$(python3 -c 'import uuid; print(uuid.uuid4().hex)')/data.json"

# Upload dataset.json and capture its public URL as DATA_URL (used in Step 2).
DATA_URL=$(R2_KEY="$R2_KEY" python3 - <<'EOF'
import boto3, os
from botocore.config import Config
s3 = boto3.client("s3", endpoint_url=os.environ["CF_ENDPOINT_URL"],
    aws_access_key_id=os.environ["CF_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["CF_SECRET_ACCESS_KEY"],
    config=Config(signature_version="s3v4"), region_name="auto")
bucket = os.environ["CF_BUCKET_NAME"]
key = os.environ["R2_KEY"]
s3.upload_file("dataset.json", bucket, key)
print(f"https://{bucket}.{os.environ['CF_ACCOUNT_ID']}.r2.dev/{key}")
EOF
)
echo "Data URL: $DATA_URL"
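Because the training service downloads the dataset itself, it can help to confirm that the URL is readable without credentials before submitting. A minimal sketch using only the standard library; check_public_dataset is a hypothetical helper, and it only checks that the JSON has a top-level problems array.

```python
import json
import urllib.request


def check_public_dataset(url, timeout=30):
    """Fetch url anonymously and return the number of entries in `problems`."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        data = json.load(resp)
    problems = data.get("problems") if isinstance(data, dict) else None
    if not isinstance(problems, list):
        raise ValueError("JSON has no top-level 'problems' array")
    return len(problems)
```

Calling check_public_dataset with the URL printed in Step 1 should return the problem count; an HTTP error or a ValueError here suggests the training job would also fail to load the data.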

Step 2 — Submit training job

API_KEY="${IRONA_API_KEY:?Set IRONA_API_KEY}"
BASE_URL="https://irona-ai--train.modal.run"

JOB=$(curl -s -X POST "$BASE_URL" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d "{\"Data_URLs\": [\"$DATA_URL\"]}")

echo "$JOB" | python3 -m json.tool
JOB_ID=$(echo "$JOB" | python3 -c "import json,sys; print(json.load(sys.stdin)['training_job_id'])")
echo "Job ID: $JOB_ID"
Request body:
| Field | Type | Notes |
|---|---|---|
| Data_URLs | string[] | One or more publicly accessible JSON URLs |
Response:
{"training_job_id": "uuid", "status": "queued", "version": "x.x.x"}
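The same submission can be made from Python. The sketch below builds the request with json.dumps so that URLs containing quotes or backslashes are escaped correctly, which a hand-assembled shell string does not guarantee. The function names are hypothetical; the endpoint, headers, and field names are taken from the request and response shown above.

```python
import json
import urllib.request

TRAIN_URL = "https://irona-ai--train.modal.run"  # endpoint from Step 2


def build_train_request(data_urls, api_key):
    """Build (but do not send) the POST request that submits a training job."""
    if not data_urls:
        raise ValueError("Data_URLs must contain at least one URL")
    body = json.dumps({"Data_URLs": list(data_urls)}).encode("utf-8")
    return urllib.request.Request(
        TRAIN_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def submit_training_job(data_urls, api_key, timeout=30):
    """Send the request and return training_job_id from the JSON response."""
    request = build_train_request(data_urls, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["training_job_id"]
```

Keeping request construction separate from sending makes the payload easy to inspect or test without touching the network.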

Step 3 — Poll status until complete

STATUS_URL="https://irona-ai--taskstatus.modal.run"

while true; do
  S=$(curl -s "$STATUS_URL/$JOB_ID" \
    -H "Authorization: Bearer $API_KEY")

  STATE=$(echo "$S" | python3 -c "import json,sys; print(json.load(sys.stdin).get('status',''))")
  MODEL_ID=$(echo "$S" | python3 -c "import json,sys; d=json.load(sys.stdin); print(d.get('model_id') or '—')")

  echo "[$(date +%H:%M:%S)] status=$STATE  model_id=$MODEL_ID"

  case "$STATE" in completed|failed) break ;; esac
  sleep 30
done
Status values: queued → running → completed | failed
Status response fields:
| Field | Description |
|---|---|
| training_job_id | UUID of the training job |
| status | Current job state |
| model_id | ID of trained model (set when completed) |
| error_message | Error details if failed |
| started_at | ISO 8601 timestamp |
| completed_at | ISO 8601 timestamp |
| training_config | Hyperparameters and timing metrics |
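The shell loop above polls indefinitely. A variant with an overall deadline might look like the sketch below. It assumes a caller-supplied fetch_status callable that returns the status response as a dict; the function name and the default interval and timeout are illustrative choices, not service limits.

```python
import time


def wait_for_job(fetch_status, interval=30.0, timeout=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the job finishes; return model_id on success.

    Raises RuntimeError if the job fails and TimeoutError if it is still
    queued or running once `timeout` seconds have passed.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        state = status.get("status", "")
        if state == "completed":
            return status["model_id"]
        if state == "failed":
            raise RuntimeError(status.get("error_message") or "training failed")
        if clock() >= deadline:
            raise TimeoutError(f"job still {state!r} after {timeout} seconds")
        sleep(interval)
```

Passing sleep and clock as parameters keeps the loop testable with fake status responses and no real waiting.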

Step 4 — Get model details (optional)

MODELS_URL="https://irona-ai--models.modal.run"

curl -s "$MODELS_URL/$MODEL_ID" \
  -H "Authorization: Bearer $API_KEY" | python3 -m json.tool

Step 5 — Run inference

Once training completes and you have a model_id, route inputs through the trained router:
INFER_URL="https://irona-ai--infer.modal.run"

RESULT=$(curl -s -X POST "$INFER_URL" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d "{
    \"model_id\": \"$MODEL_ID\",
    \"inputs\": [
      \"What is 2 + 2?\",
      \"Write a merge sort in Python\"
    ]
  }")

echo "$RESULT" | python3 -c "
import json, sys
data = json.load(sys.stdin)
preds = data.get('predictions', [])
for i, p in enumerate(preds):
    print(f'Input {i}: route to → {p}')
"
Inference request:
| Field | Type | Notes |
|---|---|---|
| model_id | string | ID from completed training job |
| inputs | string[] | Raw text inputs to route (max 1000) |
| warmup | bool | Optional; pre-warms GPU without running inference |
Inference response:
{"predictions": ["openai/gpt-4o-mini", "anthropic/claude-3-opus"], "version": "x.x.x"}
Each prediction is the recommended model string for the corresponding input.
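Since a request accepts at most 1000 inputs, larger workloads need to be split into batches. The sketch below shows one way to do that and to pair each input with its prediction. It assumes a caller-supplied infer callable that sends one batch to the inference endpoint and returns the predictions list; chunk_inputs and route_all are hypothetical names.

```python
MAX_INPUTS_PER_REQUEST = 1000  # limit from the inference request table


def chunk_inputs(inputs, size=MAX_INPUTS_PER_REQUEST):
    """Yield consecutive slices of at most `size` inputs."""
    for start in range(0, len(inputs), size):
        yield inputs[start:start + size]


def route_all(inputs, infer):
    """Call infer(batch) for each batch and pair inputs with predictions."""
    routed = []
    for batch in chunk_inputs(list(inputs)):
        predictions = infer(batch)
        # Predictions are positional, so a length mismatch would misalign pairs.
        if len(predictions) != len(batch):
            raise ValueError("prediction count does not match input count")
        routed.extend(zip(batch, predictions))
    return routed
```

The result is a list of (input, model) tuples in the original input order.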

Complete end-to-end script

#!/usr/bin/env bash
set -euo pipefail

API_KEY="${IRONA_API_KEY:?Set IRONA_API_KEY}"
DATA_URL="${1:?Usage: $0 <data_url> [\"input1\" \"input2\" ...]}"
shift
INPUTS=("$@")

BASE_URL="https://irona-ai--train.modal.run"
STATUS_URL="https://irona-ai--taskstatus.modal.run"
INFER_URL="https://irona-ai--infer.modal.run"

# Submit
echo "Submitting training job..."
JOB=$(curl -s -X POST "$BASE_URL" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d "{\"Data_URLs\": [\"$DATA_URL\"]}")
echo "$JOB" | python3 -m json.tool
JOB_ID=$(echo "$JOB" | python3 -c "import json,sys; print(json.load(sys.stdin)['training_job_id'])")
echo "Job ID: $JOB_ID"

# Poll
echo "Polling..."
while true; do
  S=$(curl -s "$STATUS_URL/$JOB_ID" -H "Authorization: Bearer $API_KEY")
  STATE=$(echo "$S" | python3 -c "import json,sys; print(json.load(sys.stdin).get('status',''))")
  MODEL_ID=$(echo "$S" | python3 -c "import json,sys; d=json.load(sys.stdin); print(d.get('model_id') or '')")
  echo "[$(date +%H:%M:%S)] $STATE  model_id=$MODEL_ID"
  case "$STATE" in completed|failed) break ;; esac
  sleep 30
done

[[ "$STATE" != "completed" ]] && { echo "Training failed"; exit 1; }

# Infer
if [[ ${#INPUTS[@]} -gt 0 ]]; then
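  # No "--" separator here: after -c, Python passes every remaining argument to sys.argv,
  # so a literal "--" would be sent as an extra inference input.
  INPUTS_JSON=$(python3 -c "import json,sys; print(json.dumps(sys.argv[1:]))" "${INPUTS[@]}")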
  echo ""
  echo "Running inference..."
  curl -s -X POST "$INFER_URL" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $API_KEY" \
    -d "{\"model_id\": \"$MODEL_ID\", \"inputs\": $INPUTS_JSON}" | python3 -c "
import json, sys
data = json.load(sys.stdin)
for i, p in enumerate(data.get('predictions', [])):
    print(f'Input {i}: → {p}')
"
fi
Usage:
IRONA_API_KEY=sk_... bash run_router.sh \
  "https://<your-public-json-url>" \
  "Write a binary search in C++" \
  "What is the capital of France?"

Error handling

| Error | Cause | Fix |
|---|---|---|
| 400 Missing 'Data_URLs' | Body missing Data_URLs key | Ensure JSON body has "Data_URLs": [...] |
| 400 No valid 'problems' found | All problems failed validation | Check each problem has problem_key, problem, correct_models |
| 400 R2 download failed | URL not reachable or R2 key missing | Verify URL is publicly accessible |
| 401 Unauthorized | Invalid/missing IRONA_API_KEY | Check bearer token |
| status: failed | Internal error during training | Check error_message; ensure dataset has ≥10 valid problems |
| 400 Missing 'inputs' (infer) | Inference body missing inputs | Pass "inputs": [...] |
| 404 Model not found (infer) | Wrong model_id or model inactive | Use model_id from completed training job status |
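The shell examples use curl -s, which does not stop on HTTP error statuses, so a 400 or 401 typically shows up later as a JSON parsing failure. A Python client can surface the status and a hint directly, as in this sketch. The hints are condensed from the table above; treating the error body as plain text is an assumption, since the error response format is not documented here, and both function names are hypothetical.

```python
import json
import urllib.error
import urllib.request

# Hints condensed from the error table, keyed by HTTP status code.
HINTS = {
    400: "check the JSON body (Data_URLs or inputs) and that each data URL is publicly reachable",
    401: "check that IRONA_API_KEY holds a valid bearer token",
    404: "use the model_id reported by a completed training job",
}


def explain_http_error(code, body):
    """Return a one-line message combining the server text with a hint."""
    hint = HINTS.get(code, "no documented hint for this status")
    return f"HTTP {code}: {body.strip()} ({hint})"


def call_api(request, timeout=30):
    """Send a urllib request; raise RuntimeError with a hint on HTTP errors."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # Assumption: the error body is readable text (plain or JSON).
        body = err.read().decode("utf-8", errors="replace")
        raise RuntimeError(explain_http_error(err.code, body)) from err
```

A job that ends with status: failed is not an HTTP error; for that case, read error_message from the status response instead.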

Reference examples

Pre-built example in examples/:
| File | Description |
|---|---|
| examples/sample_dataset.json | 12 C++ coding problems with real model routing labels (gemini, gpt, qwen, etc.) |