IronLabs Custom Router lets you train a personalized model selection system on your own data. The router learns from your examples to automatically pick the most appropriate AI model for each task — improving accuracy and reducing costs.
Python SDK: pip install ironlabs
Node.js SDK: npm install ironlabs
When to use Custom Router
Cost optimization — route simple tasks to cheaper models, reserve powerful ones for complex requests
Domain specialization — match prompts to models that excel in your domain (code, legal, creative writing, etc.)
Multi-model pipelines — let the router decide which model handles each stage instead of hardcoding choices
Replace guesswork — use a data-driven system trained on your own examples instead of manual model selection
Prerequisites
Before you start, make sure you have:
An IronLabs API key from the Settings page
A training data file hosted at a publicly accessible URL (GitHub, S3, CDN, etc.)
Installation
Install the SDK for your language:
Python: pip install ironlabs
Node.js: npm install ironlabs
Initialize the client
Set your API key as an environment variable:
export IRONLABS_API_KEY="your_api_key_here"
Then initialize the trainer in your code:
from ironlabs import RouterTrainer
trainer = RouterTrainer()
The client automatically picks up IRONLABS_API_KEY from your environment — no need to pass it explicitly.
Training a Custom Router
Prepare training data
Training data maps each prompt to the ideal models for it. Supported formats are JSON and CSV; host your file at any publicly accessible URL (GitHub, S3, CDN, etc.).
[
  {
    "problem": "Write a simple hello world function",
    "correct_models": ["openai/gpt-4o-mini", "anthropic/claude-3-5-haiku-20241022"]
  },
  {
    "problem": "Explain quantum computing in detail",
    "correct_models": ["openai/gpt-4o", "anthropic/claude-3-5-sonnet-20241022"]
  }
]
You can pass multiple file URLs to combine datasets from different sources.
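Catching malformed records locally is cheaper than finding out from a failed training job. The sketch below checks a parsed JSON dataset against the shape shown above. The function name `validate_training_data` and the specific rules (non-empty `problem` string, non-empty list of model-name strings) are assumptions for illustration; the service may accept or reject other shapes.

```python
import json


def validate_training_data(records):
    """Return a list of human-readable problems found in the records.

    Hypothetical helper: the rules below are inferred from the example
    dataset, not taken from an official schema.
    """
    if not isinstance(records, list):
        return ["top-level JSON value must be an array"]

    errors = []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            errors.append(f"record {index}: expected an object")
            continue

        problem = record.get("problem")
        if not isinstance(problem, str) or not problem.strip():
            errors.append(f"record {index}: 'problem' must be a non-empty string")

        models = record.get("correct_models")
        models_ok = (
            isinstance(models, list)
            and len(models) > 0
            and all(isinstance(m, str) and m for m in models)
        )
        if not models_ok:
            errors.append(
                f"record {index}: 'correct_models' must be a non-empty list of model names"
            )
    return errors


# Small in-memory dataset: the second record is deliberately broken.
sample = json.loads(
    """
    [
      {"problem": "Write a simple hello world function",
       "correct_models": ["openai/gpt-4o-mini"]},
      {"problem": "", "correct_models": []}
    ]
    """
)

for message in validate_training_data(sample):
    print(message)
```

Running a check like this before uploading the file keeps obviously broken rows out of the training set.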
Start training
Pass one or more data URLs to kick off a training job.
import time
from ironlabs import RouterTrainer

trainer = RouterTrainer()
data_urls = ["https://example.com/path/to/your/training_data.json"]
training_info = trainer.fit(data_urls)
job_id = training_info.get("training_job_id")
print(f"Training job started. Job ID: {job_id}")
Response:
{
  "training_job_id": "abc123-def456-ghi789",
  "status": "queued",
  "version": "v1.0"
}
Check training status
Poll until the job reaches completed or failed. Training typically takes a few minutes.
while True:
    status_info = trainer.get_status()
    status = status_info.get("status")
    print(f"Current status: {status}")
    if status == "completed":
        model_id = status_info.get("model_id")
        print(f"Training completed! Model ID: {model_id}")
        break
    elif status == "failed":
        print("Training failed.")
        break
    time.sleep(10)  # wait 10 seconds between polls
Response:
{
  "training_job_id": "abc123-def456-ghi789",
  "status": "completed",
  "model_id": "model-xyz-top1_0.8542",
  "started_at": "2026-02-04T10:30:00Z",
  "completed_at": "2026-02-04T10:35:00Z",
  "training_config": {
    "timing": {
      "training_time_seconds": 45.2,
      "embedding_time_seconds": 120.5,
      "total_time_seconds": 300.0
    }
  }
}
Status      Description
queued      Job is waiting to start
running     Training is in progress
completed   Training finished successfully
failed      Training encountered an error
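A polling loop with no upper bound can hang a script if a job stays in queued or running. The sketch below wraps the same status check in a deadline. `wait_for_training`, its parameters, and the 30-minute default are hypothetical choices for illustration; the only thing taken from this page is that the status call returns a dict with a `status` field.

```python
import time


def wait_for_training(
    get_status,
    poll_seconds=10.0,
    timeout_seconds=1800.0,  # hypothetical default, tune for your datasets
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll get_status() until the job is completed or failed.

    Returns the final status dict, or raises TimeoutError if the job is
    still queued or running once the deadline has passed.
    """
    deadline = clock() + timeout_seconds
    while True:
        status_info = get_status()
        status = status_info.get("status")
        if status in ("completed", "failed"):
            return status_info
        if clock() >= deadline:
            raise TimeoutError(
                f"training still '{status}' after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
```

With a trainer from the earlier steps, the call would look like `final = wait_for_training(trainer.get_status)`, followed by a check of `final.get("status")`. Passing `sleep` and `clock` as parameters is only there to make the helper easy to test without real delays.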
Get model details
Retrieve metadata and performance metrics for your trained model.
details = trainer.get_model_details()
print(f"Model details: {details}")
Response:
{
  "model_id": "model-xyz-top1_0.8542",
  "version": "v1.0",
  "status": "active",
  "embedding_model": "Qwen/Qwen3-Embedding-4B",
  "num_classes": 8,
  "created_at": "2026-02-04T10:35:00Z",
  "metrics": {
    "top1_hit_rate": 0.8542,
    "top3_hit_rate": 0.9521,
    "accuracy": 0.8542
  }
}
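The response does not spell out how `top1_hit_rate` and `top3_hit_rate` are computed. One common reading, which is an assumption here and not confirmed by this page, is that an example counts as a hit when any of its top-k ranked models appears in that example's `correct_models`. The hypothetical helper below computes that quantity so you can sanity-check a router on your own held-out examples.

```python
def top_k_hit_rate(ranked_predictions, correct_models, k):
    """Fraction of examples whose top-k ranked models include a correct model.

    ranked_predictions: list of model-name lists, best first.
    correct_models: list of acceptable model-name lists, same order.
    Assumed definition; the service may compute its metrics differently.
    """
    if len(ranked_predictions) != len(correct_models):
        raise ValueError("ranked_predictions and correct_models differ in length")
    if not ranked_predictions:
        return 0.0

    hits = sum(
        1
        for ranked, correct in zip(ranked_predictions, correct_models)
        if set(ranked[:k]) & set(correct)
    )
    return hits / len(ranked_predictions)


# Three toy examples where "a" is always the only acceptable model.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
correct = [["a"], ["a"], ["a"]]

print(top_k_hit_rate(ranked, correct, k=1))
print(top_k_hit_rate(ranked, correct, k=3))
```

Under this definition the top-3 rate can never be lower than the top-1 rate, which matches the relationship between the two values in the sample response.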
Run inference
Use your trained router to select the best model for new prompts.
inputs = [
    "Write a Python function to calculate fibonacci numbers",
    "How to implement a binary search tree in JavaScript?",
    "What is the time complexity of quicksort?"
]
predictions = trainer.predict(inputs)
for i, pred in enumerate(predictions.get("predictions", [])):
    print(f"Input: {inputs[i]}")
    print(f"Recommended Model: {pred.get('top_model')}")
    print(f"Confidence: {pred.get('top_prob'):.0%}")
    print("-" * 40)
Response:
{
  "predictions": [
    {
      "top_model": "openai/gpt-4o-mini",
      "top_prob": 0.92,
      "models": [
        { "model": "openai/gpt-4o-mini", "confidence": 0.92, "rank": 1 },
        { "model": "anthropic/claude-3-5-haiku-20241022", "confidence": 0.06, "rank": 2 }
      ]
    }
  ],
  "version": "v1.0"
}
Each prediction includes:
top_model — the recommended model for this input
top_prob — confidence score (0–1)
models — full ranked list with individual confidence scores
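Since each prediction carries a confidence score, one possible pattern is to trust the router only above a threshold and otherwise fall back to a general-purpose model. The sketch below shows that idea. `choose_model`, the 0.6 threshold, and the fallback model are hypothetical choices, not SDK features or recommended values.

```python
def choose_model(prediction, min_confidence=0.6, fallback_model="openai/gpt-4o"):
    """Return the router's top model if it is confident enough, else a fallback.

    prediction: one entry from the "predictions" list in the response above.
    min_confidence and fallback_model are illustrative defaults only.
    """
    top_model = prediction.get("top_model")
    top_prob = prediction.get("top_prob") or 0.0
    if top_model and top_prob >= min_confidence:
        return top_model
    return fallback_model


confident = {"top_model": "openai/gpt-4o-mini", "top_prob": 0.92}
uncertain = {"top_model": "openai/gpt-4o-mini", "top_prob": 0.41}

print(choose_model(confident))  # router's pick is used
print(choose_model(uncertain))  # falls back to the default model
```

A threshold like this trades some cost savings for safety: a higher value sends more traffic to the fallback model.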
Complete example
import time
from ironlabs import RouterTrainer


def main():
    trainer = RouterTrainer()

    # 1. Start training
    data_urls = ["https://example.com/path/to/your/training_data.json"]
    training_info = trainer.fit(data_urls)
    job_id = training_info.get("training_job_id")
    print(f"Training job started. Job ID: {job_id}")

    # 2. Poll until complete
    while True:
        status_info = trainer.get_status()
        status = status_info.get("status")
        print(f"Current status: {status}")
        if status == "completed":
            model_id = status_info.get("model_id")
            print(f"Training completed! Model ID: {model_id}")
            break
        elif status == "failed":
            print("Training failed.")
            return
        time.sleep(10)

    # 3. Inspect model
    details = trainer.get_model_details()
    print(f"Model details: {details}")

    # 4. Run inference
    test_inputs = [
        "Write a Python function to calculate fibonacci numbers",
        "How to implement a binary search tree in JavaScript?",
        "What is the time complexity of quicksort?"
    ]
    predictions = trainer.predict(test_inputs)
    for i, pred in enumerate(predictions.get("predictions", [])):
        print(f"Input: {test_inputs[i]}")
        print(f"Recommended Model: {pred.get('top_model')}")
        print(f"Confidence: {pred.get('top_prob'):.0%}")
        print("-" * 40)


if __name__ == "__main__":
    main()
Loading an existing model
Reuse a previously trained model without going through training again.
from ironlabs import RouterTrainer

trainer = RouterTrainer()
trainer.model_id = "model-xyz-top1_0.8542"
predictions = trainer.predict(["What is the time complexity of quicksort?"])
print(f"Recommended model: {predictions['predictions'][0]['top_model']}")
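To reuse a model across runs, the model ID has to be kept somewhere between the training script and the inference script. A minimal sketch, assuming a small local JSON file is acceptable storage; `save_model_id`, `load_model_id`, and the file name are hypothetical and not part of the SDK.

```python
import json
import tempfile
from pathlib import Path


def save_model_id(model_id, path):
    """Write the trained model's ID to a small JSON state file."""
    Path(path).write_text(json.dumps({"model_id": model_id}), encoding="utf-8")


def load_model_id(path):
    """Read the model ID back from the state file written by save_model_id."""
    return json.loads(Path(path).read_text(encoding="utf-8"))["model_id"]


# Round trip in a temporary directory so the demo leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "router_state.json"
    save_model_id("model-xyz-top1_0.8542", state_file)
    print(load_model_id(state_file))
```

In an inference script, the loaded value would be assigned to `trainer.model_id` before calling `trainer.predict(...)`, as in the snippet above.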
Batch processing
The predict endpoint supports up to 500 inputs per request. For larger datasets, split the inputs into chunks.
from ironlabs import RouterTrainer

trainer = RouterTrainer()
trainer.model_id = "model-xyz-top1_0.8542"
large_dataset = [f"Test query {i}" for i in range(1200)]
BATCH_SIZE = 500  # maximum inputs per predict request
all_predictions = []
for i in range(0, len(large_dataset), BATCH_SIZE):
    batch = large_dataset[i:i + BATCH_SIZE]
    result = trainer.predict(batch)
    all_predictions.extend(result.get("predictions", []))
    print(f"Processed {min(i + BATCH_SIZE, len(large_dataset))} / {len(large_dataset)} inputs")
print(f"Total predictions: {len(all_predictions)}")
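The chunking loop above can be factored into a reusable helper that also checks each batch came back complete. This is a sketch under one assumption: the response contains exactly one prediction per input, in input order, as the indexing in the inference example suggests. `predict_in_batches` is a hypothetical name; it takes the predict callable as a parameter so it does not depend on any particular client object.

```python
def predict_in_batches(predict, inputs, batch_size=500):
    """Call predict() on slices of inputs and merge the predictions in order.

    predict: callable taking a list of strings and returning a dict with a
             "predictions" list (for example, trainer.predict).
    batch_size: 500 matches the per-request limit stated above.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    merged = []
    for start in range(0, len(inputs), batch_size):
        batch = inputs[start:start + batch_size]
        predictions = predict(batch).get("predictions", [])
        # Assumed contract: one prediction per input. Fail loudly otherwise
        # so results never get silently misaligned with their inputs.
        if len(predictions) != len(batch):
            raise RuntimeError(
                f"expected {len(batch)} predictions for batch starting at "
                f"{start}, got {len(predictions)}"
            )
        merged.extend(predictions)
    return merged
```

With the trainer from the example above, usage would be `all_predictions = predict_in_batches(trainer.predict, large_dataset)`.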
Supported training data formats:
Format          Description
JSON            Array of objects with problem and correct_models fields
CSV             Columns for problem and correct_models (comma-separated model names in a single cell)
Multiple files  Pass multiple URLs in data_urls to combine datasets
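Because the CSV format puts several comma-separated model names in one cell, that cell must be quoted or it will be read as extra columns. The sketch below uses Python's standard `csv` module, which adds the quotes automatically. The header row and the choice of no space after each comma are assumptions; `training_rows_to_csv` is a hypothetical helper.

```python
import csv
import io


def training_rows_to_csv(records):
    """Serialize records with 'problem' and 'correct_models' keys to CSV text.

    Model names are joined with commas into a single cell; csv.writer quotes
    any cell that contains a comma, so the file still has two columns.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["problem", "correct_models"])  # assumed header row
    for record in records:
        writer.writerow([record["problem"], ",".join(record["correct_models"])])
    return buffer.getvalue()


records = [
    {
        "problem": "Write a simple hello world function",
        "correct_models": ["openai/gpt-4o-mini", "anthropic/claude-3-5-haiku-20241022"],
    }
]

print(training_rows_to_csv(records))
```

Reading the text back with `csv.reader` yields two cells per row, and splitting the second cell on commas recovers the original model list.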
Model lifecycle
State     Description
Active    Available for inference
Archived  Automatically archived after 1 year of inactivity
Removed   Inactive models are cleaned up weekly to optimize storage