The inference platform for the real world
Compile once, deploy everywhere.
Inference infrastructure is broken
Homogeneous assumptions
Most platforms assume identical GPUs in a single cluster. Real deployments mix CPUs, GPUs, and NPUs across machines.
Fragile pipelines
Multi-model pipelines break silently in production. Type mismatches between models surface only after burning compute.
Scattered tooling
Separate tools for compilation, orchestration, serving, and monitoring. Each seam is a place things break.
From models to production
Models
PyTorch, ONNX, NNEF
Compile
Target hardware natively
Compose
Pipeline builder with validation
Serve
Typed API endpoint
Built for production inference
Heterogeneous Execution
Run on the hardware you actually have. CPUs, GPUs, NPUs, or a mix across machines. Compile once, deploy everywhere.
Typed Pipelines
Compose multi-model pipelines with strong type guarantees on inputs and outputs. Catch integration errors before models load into memory, not after.
Distributed by Default
Computation routes across nodes automatically. Split graphs across a network of heterogeneous accelerators without manual orchestration.
Declarative SDK
Define inference products, not just model calls. Chain models, add transformations, expose typed endpoints. Built in Rust for performance, with bindings for Python and more.
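The Orthos SDK is not yet public, so as an illustration only, here is a minimal Python sketch of the idea behind typed pipelines: each stage declares an input and output contract, and composition fails at definition time if the contracts don't line up. All names here (`Stage`, `compose`) are hypothetical, not the actual Orthos API.

```python
from dataclasses import dataclass

# Hypothetical sketch of definition-time contract checking;
# these names are illustrative, not the real Orthos SDK.
@dataclass(frozen=True)
class Stage:
    name: str
    in_type: type   # declared input contract
    out_type: type  # declared output contract

def compose(*stages: Stage) -> list[Stage]:
    """Reject mismatched contracts before any model weights load."""
    for a, b in zip(stages, stages[1:]):
        if a.out_type is not b.in_type:
            raise TypeError(
                f"{a.name} outputs {a.out_type.__name__}, "
                f"but {b.name} expects {b.in_type.__name__}"
            )
    return list(stages)

detector = Stage("detector", in_type=bytes, out_type=list)
classifier = Stage("classifier", in_type=list, out_type=dict)

pipeline = compose(detector, classifier)  # OK: contracts line up

captioner = Stage("captioner", in_type=str, out_type=str)
try:
    compose(detector, captioner)          # fails fast, at definition time
except TypeError as e:
    print("rejected:", e)
```

The point of the sketch is the ordering: the mismatch surfaces when the pipeline is declared, not after compute has been spent loading and running models.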
How it works
Define
Bring your models in PyTorch, ONNX, or NNEF. Declare your pipeline, types, and transformations.
Compose
Chain models into pipelines declaratively. Add pre/post-processing and custom logic between stages.
Validate
Pipelines are type-checked before any model loads into memory. Input/output contracts are enforced at definition time.
Compile
Compile ahead of time for distributed, multi-machine workloads; at runtime for single-node execution; or use our compilation service for large graphs and complex infrastructure.
Deploy
Expose your pipeline as a typed inference endpoint. Orthos handles distribution across your hardware fleet automatically.
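The flow above can be sketched end to end in plain Python. This is purely illustrative: `Pipeline`, `then`, and the typed-callable endpoint are hypothetical stand-ins, not the real SDK, and plain functions stand in for compiled models.

```python
from typing import Callable

# Illustrative only: hypothetical names, not the Orthos API.
class Pipeline:
    def __init__(self, in_type: type, out_type: type):
        self.in_type, self.out_type = in_type, out_type
        self.stages: list[Callable] = []

    def then(self, fn: Callable) -> "Pipeline":
        # Declarative chaining: stages run in order at serve time.
        self.stages.append(fn)
        return self

    def __call__(self, x):
        # Typed endpoint: enforce the input contract at the boundary.
        if not isinstance(x, self.in_type):
            raise TypeError(f"expected {self.in_type.__name__} input")
        for stage in self.stages:
            x = stage(x)
        if not isinstance(x, self.out_type):
            raise TypeError(f"expected {self.out_type.__name__} output")
        return x

# Define -> compose -> serve, with stand-ins for real models.
pipeline = (
    Pipeline(in_type=str, out_type=dict)
    .then(str.lower)                       # pre-processing
    .then(lambda text: {"label": text})    # "model" stage
)

print(pipeline("HELLO"))   # {'label': 'hello'}
```

In the real system the stages would be compiled models distributed across a hardware fleet; the sketch only shows the shape of the contract: declarative chaining in, a type-enforced endpoint out.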
Orthos is in development.
Interested? Reach out to us at [email protected]