Custom Router - IronLabs Docs

IronLabs Custom Router lets you train a personalized model selection system on your own data. The router learns from your examples to automatically pick the most appropriate AI model for each task — improving accuracy and reducing costs.

Python SDK

pip install ironlabs

Node.js SDK

npm install ironlabs

When to use Custom Router

Cost optimization — route simple tasks to cheaper models, reserve powerful ones for complex requests
Domain specialization — match prompts to models that excel in your domain (code, legal, creative writing, etc.)
Multi-model pipelines — let the router decide which model handles each stage instead of hardcoding choices
Replace guesswork — use a data-driven system trained on your own examples instead of manual model selection

Prerequisites

Before you start, make sure you have:

An IronLabs API key from the Settings page
A training data file hosted at a publicly accessible URL (GitHub, S3, CDN, etc.)

Installation

Install the SDK for your language:

Python:

pip install ironlabs

Node.js:

npm install ironlabs

Initialize the client

Set your API key as an environment variable:

export IRONLABS_API_KEY="your_api_key_here"

Then initialize the trainer in your code:

from ironlabs import RouterTrainer

trainer = RouterTrainer()

The client automatically picks up IRONLABS_API_KEY from your environment — no need to pass it explicitly.

Training a Custom Router

Prepare training data

Training data maps each prompt (`problem`) to the models that handle it best (`correct_models`). Supported formats are JSON and CSV. Host your file at any publicly accessible URL (GitHub, S3, CDN, etc.).

[
  {
    "problem": "Write a simple hello world function",
    "correct_models": ["openai/gpt-4o-mini", "anthropic/claude-3-5-haiku-20241022"]
  },
  {
    "problem": "Explain quantum computing in detail",
    "correct_models": ["openai/gpt-4o", "anthropic/claude-3-5-sonnet-20241022"]
  }
]

You can pass multiple file URLs to combine datasets from different sources.
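Before uploading, it can help to check a dataset locally. The following is a minimal sketch, not part of the IronLabs SDK: `validate_records` is a hypothetical helper, and its rules (non-empty `problem` string, non-empty list of model names) are assumptions inferred from the example above rather than documented service requirements.

```python
import json


def validate_records(records):
    """Return a list of human-readable issues found in the training records.

    Hypothetical helper: the checks below are assumptions based on the
    example format, not rules confirmed by the service.
    """
    errors = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            errors.append(f"record {i}: expected an object")
            continue
        problem = rec.get("problem")
        if not isinstance(problem, str) or not problem.strip():
            errors.append(f"record {i}: 'problem' must be a non-empty string")
        models = rec.get("correct_models")
        if not isinstance(models, list) or not models:
            errors.append(f"record {i}: 'correct_models' must be a non-empty list")
        elif not all(isinstance(m, str) and m for m in models):
            errors.append(f"record {i}: every model name must be a non-empty string")
    return errors


records = [
    {
        "problem": "Write a simple hello world function",
        "correct_models": ["openai/gpt-4o-mini", "anthropic/claude-3-5-haiku-20241022"],
    },
    {
        "problem": "Explain quantum computing in detail",
        "correct_models": ["openai/gpt-4o", "anthropic/claude-3-5-sonnet-20241022"],
    },
]

errors = validate_records(records)
if errors:
    raise ValueError("; ".join(errors))

# Serialize to the JSON array layout shown above, ready to upload to your host.
payload = json.dumps(records, indent=2)
print(payload)
```

Running the check before hosting the file means a malformed record is caught locally instead of surfacing as a failed training job.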

Start training

Pass one or more data URLs to kick off a training job.

import time
from ironlabs import RouterTrainer

trainer = RouterTrainer()

data_urls = ["https://example.com/path/to/your/training_data.json"]

training_info = trainer.fit(data_urls)
job_id = training_info.get("training_job_id")
print(f"Training job started. Job ID: {job_id}")

Response:

{
  "training_job_id": "abc123-def456-ghi789",
  "status": "queued",
  "version": "v1.0"
}

Check training status

Poll until the job reaches completed or failed. Training typically takes a few minutes.

while True:
    status_info = trainer.get_status()
    status = status_info.get("status")
    print(f"Current status: {status}")

    if status == "completed":
        model_id = status_info.get("model_id")
        print(f"Training completed! Model ID: {model_id}")
        break
    elif status == "failed":
        print("Training failed.")
        break

    time.sleep(10)

Response:

{
  "training_job_id": "abc123-def456-ghi789",
  "status": "completed",
  "model_id": "model-xyz-top1_0.8542",
  "started_at": "2026-02-04T10:30:00Z",
  "completed_at": "2026-02-04T10:35:00Z",
  "training_config": {
    "timing": {
      "training_time_seconds": 45.2,
      "embedding_time_seconds": 120.5,
      "total_time_seconds": 300.0
    }
  }
}

Status	Description
`queued`	Job is waiting to start
`running`	Training is in progress
`completed`	Training finished successfully
`failed`	Training encountered an error
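The polling loop above runs until the job reaches a terminal status. A bounded variant might look like the sketch below. `wait_for_training` is a hypothetical helper, not an SDK function, and the default 30-minute timeout is an arbitrary assumption. It accepts any callable that returns a status dictionary, so `trainer.get_status` could be passed directly; the demo uses a stand-in so the snippet runs without a live job.

```python
import time

# Terminal statuses taken from the status table above.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_training(get_status, poll_seconds=10, timeout_seconds=1800):
    """Poll get_status() until a terminal status is reached or the timeout expires.

    Hypothetical helper; the timeout default is an assumption, not a
    documented limit.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status_info = get_status()
        if status_info.get("status") in TERMINAL_STATUSES:
            return status_info
        if time.monotonic() >= deadline:
            raise TimeoutError("training did not finish before the timeout")
        time.sleep(poll_seconds)


# Stand-in for trainer.get_status so the example runs offline.
_fake_statuses = iter([
    {"status": "queued"},
    {"status": "running"},
    {"status": "completed", "model_id": "model-xyz-top1_0.8542"},
])

result = wait_for_training(lambda: next(_fake_statuses), poll_seconds=0)
print(result["status"], result.get("model_id"))
```

With a real trainer the call would be `wait_for_training(trainer.get_status)`, and the caller decides how to handle a `failed` result or a `TimeoutError`.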

Get model details

Retrieve metadata and performance metrics for your trained model.

details = trainer.get_model_details()
print(f"Model details: {details}")

Response:

{
  "model_id": "model-xyz-top1_0.8542",
  "version": "v1.0",
  "status": "active",
  "embedding_model": "Qwen/Qwen3-Embedding-4B",
  "num_classes": 8,
  "created_at": "2026-02-04T10:35:00Z",
  "metrics": {
    "top1_hit_rate": 0.8542,
    "top3_hit_rate": 0.9521,
    "accuracy": 0.8542
  }
}
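One possible use of these metrics is a simple quality gate before a router goes into production. The sketch below assumes the response shape shown above. `meets_quality_bar` and its thresholds are hypothetical, so pick values that suit your own workload.

```python
def meets_quality_bar(details, min_top1=0.8, min_top3=0.9):
    """Return True if the model's hit rates reach the given thresholds.

    Hypothetical helper; the thresholds are illustrative assumptions.
    Missing metrics are treated as 0.0 so an incomplete response fails the gate.
    """
    metrics = details.get("metrics", {})
    top1 = metrics.get("top1_hit_rate", 0.0)
    top3 = metrics.get("top3_hit_rate", 0.0)
    return top1 >= min_top1 and top3 >= min_top3


# Sample taken from the response above.
sample_details = {
    "model_id": "model-xyz-top1_0.8542",
    "metrics": {"top1_hit_rate": 0.8542, "top3_hit_rate": 0.9521, "accuracy": 0.8542},
}

print(meets_quality_bar(sample_details))
```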

Run inference

Use your trained router to select the best model for new prompts.

inputs = [
    "Write a Python function to calculate fibonacci numbers",
    "How to implement a binary search tree in JavaScript?",
    "What is the time complexity of quicksort?"
]

predictions = trainer.predict(inputs)

for i, pred in enumerate(predictions.get("predictions", [])):
    print(f"Input: {inputs[i]}")
    print(f"Recommended Model: {pred.get('top_model')}")
    print(f"Confidence: {pred.get('top_prob'):.0%}")
    print("-" * 40)

Response:

{
  "predictions": [
    {
      "top_model": "openai/gpt-4o-mini",
      "top_prob": 0.92,
      "models": [
        { "model": "openai/gpt-4o-mini", "confidence": 0.92, "rank": 1 },
        { "model": "anthropic/claude-3-5-haiku-20241022", "confidence": 0.06, "rank": 2 }
      ]
    }
  ],
  "version": "v1.0"
}

Each prediction includes:

top_model — the recommended model for this input
top_prob — confidence score (0–1)
models — full ranked list with individual confidence scores
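A common pattern is to trust the router only above some confidence level and otherwise fall back to a default model. The sketch below works on the response shape shown above. `choose_model`, the 0.6 threshold, and the fallback model are all illustrative assumptions rather than SDK behavior, and the second prediction in the sample is made up to show the fallback path.

```python
def choose_model(prediction, min_confidence=0.6, fallback_model="openai/gpt-4o"):
    """Return the router's top model if it is confident enough, else the fallback.

    Hypothetical helper; threshold and fallback are assumptions.
    """
    top_model = prediction.get("top_model")
    top_prob = prediction.get("top_prob")
    if top_model is None or top_prob is None or top_prob < min_confidence:
        return fallback_model
    return top_model


response = {
    "predictions": [
        # From the sample response above.
        {"top_model": "openai/gpt-4o-mini", "top_prob": 0.92},
        # Hypothetical low-confidence prediction to exercise the fallback.
        {"top_model": "anthropic/claude-3-5-haiku-20241022", "top_prob": 0.41},
    ]
}

chosen = [choose_model(pred) for pred in response["predictions"]]
print(chosen)
```

Falling back to a stronger model on low confidence trades some cost for safety. Falling back to a cheaper one does the opposite, so the right default depends on which error is more expensive for you.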

Complete example


import time
from ironlabs import RouterTrainer

def main():
    trainer = RouterTrainer()

    # 1. Start training
    data_urls = ["https://example.com/path/to/your/training_data.json"]
    training_info = trainer.fit(data_urls)
    job_id = training_info.get("training_job_id")
    print(f"Training job started. Job ID: {job_id}")

    # 2. Poll until complete
    while True:
        status_info = trainer.get_status()
        status = status_info.get("status")
        print(f"Current status: {status}")

        if status == "completed":
            model_id = status_info.get("model_id")
            print(f"Training completed! Model ID: {model_id}")
            break
        elif status == "failed":
            print("Training failed.")
            return

        time.sleep(10)

    # 3. Inspect model
    details = trainer.get_model_details()
    print(f"Model details: {details}")

    # 4. Run inference
    test_inputs = [
        "Write a Python function to calculate fibonacci numbers",
        "How to implement a binary search tree in JavaScript?",
        "What is the time complexity of quicksort?"
    ]

    predictions = trainer.predict(test_inputs)

    for i, pred in enumerate(predictions.get("predictions", [])):
        print(f"Input: {test_inputs[i]}")
        print(f"Recommended Model: {pred.get('top_model')}")
        print(f"Confidence: {pred.get('top_prob'):.0%}")
        print("-" * 40)

if __name__ == "__main__":
    main()

Loading an existing model

Reuse a previously trained model without going through training again.

from ironlabs import RouterTrainer

trainer = RouterTrainer()
trainer.model_id = "model-xyz-top1_0.8542"

predictions = trainer.predict(["What is the time complexity of quicksort?"])
print(f"Recommended model: {predictions['predictions'][0]['top_model']}")

Batch processing

The predict endpoint supports up to 500 inputs per request. For larger datasets, split into chunks.

from ironlabs import RouterTrainer

trainer = RouterTrainer()
trainer.model_id = "model-xyz-top1_0.8542"

large_dataset = [f"Test query {i}" for i in range(1200)]
BATCH_SIZE = 500

all_predictions = []
for i in range(0, len(large_dataset), BATCH_SIZE):
    batch = large_dataset[i:i + BATCH_SIZE]
    result = trainer.predict(batch)
    all_predictions.extend(result.get("predictions", []))
    print(f"Processed {min(i + BATCH_SIZE, len(large_dataset))}/{len(large_dataset)} inputs")

print(f"Total predictions: {len(all_predictions)}")

Supported data formats

Format	Description
JSON	Array of objects with `problem` and `correct_models` fields
CSV	Columns for `problem` and `correct_models` (comma-separated model names in a single cell)
Multiple files	Pass multiple URLs in the `data_urls` list given to `trainer.fit` to combine datasets
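Because the CSV format puts several comma-separated model names in one cell, that cell has to be quoted. Python's `csv` module handles this automatically. The sketch below assumes a header row named `problem,correct_models` and no spaces between model names; the table above does not spell out either detail, so verify against a small test file first.

```python
import csv
import io

records = [
    {
        "problem": "Write a simple hello world function",
        "correct_models": ["openai/gpt-4o-mini", "anthropic/claude-3-5-haiku-20241022"],
    },
    {
        "problem": "Explain quantum computing in detail",
        "correct_models": ["openai/gpt-4o", "anthropic/claude-3-5-sonnet-20241022"],
    },
]

buffer = io.StringIO()
writer = csv.writer(buffer)  # default quoting wraps any field that contains a comma

# Assumed header names, matching the JSON field names.
writer.writerow(["problem", "correct_models"])
for rec in records:
    # Join the model names into a single cell; the writer quotes it as needed.
    writer.writerow([rec["problem"], ",".join(rec["correct_models"])])

csv_text = buffer.getvalue()
print(csv_text)
```

Writing to a file instead of an in-memory buffer works the same way: open it with `newline=""` and pass the file object to `csv.writer`.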

Model lifecycle

State	Description
Active	Available for inference
Archived	Automatically archived after 1 year of inactivity
Removed	Inactive models are cleaned up weekly to optimize storage
