Rebellions
Efficient.
Scalable.
Deployment-Ready.

Performance
One Engine.
Mixed Precision.
Energy Efficiency
Smarter Prefetch.
Faster Execution.
Scalability
Modular Architecture.
Monolithic Efficiency.
Synchronization
Always On.
Always Through.
REBEL-Quad vs. H200
Throughput (TPS)
Efficiency (TPS/Watt)
Power Consumption (Watt)
Benchmark condition: performance measured on Llama 3.3 70B (TP2, FP8) with runtime input/output length 2048/2048.
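For readers who want to reproduce the comparison's metrics on their own runs, here is a minimal sketch of how throughput (TPS) and efficiency (TPS/Watt) relate to each other. It assumes TPS means generated tokens per second of wall-clock time and that power is averaged over the same window. The function names are hypothetical, and the numbers are illustrative placeholders, not measured results for any device.

```python
def tokens_per_second(total_output_tokens: int, elapsed_seconds: float) -> float:
    """Throughput: generated tokens divided by wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_output_tokens / elapsed_seconds


def tokens_per_second_per_watt(tps: float, average_power_watts: float) -> float:
    """Efficiency: throughput divided by average power draw over the same run."""
    if average_power_watts <= 0:
        raise ValueError("average_power_watts must be positive")
    return tps / average_power_watts


# Placeholder inputs chosen only to make the arithmetic easy to follow;
# they are NOT benchmark data for REBEL-Quad, H200, or any other hardware.
tps = tokens_per_second(total_output_tokens=204_800, elapsed_seconds=100.0)
efficiency = tokens_per_second_per_watt(tps, average_power_watts=512.0)

print(f"Throughput: {tps:.1f} TPS")
print(f"Efficiency: {efficiency:.2f} TPS/Watt")
```

Because efficiency is a ratio, a device can lead on TPS/Watt either by producing more tokens per second or by drawing less power, which is why the comparison reports all three quantities.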
MoE in Action
Open-Source Frameworks.
Real Deployment.
Architecture for any model. Silicon for any scale.
Built on PyTorch and vLLM. Runs in real time with dynamic expert routing.
HW-SW co-optimized for disaggregated inference.
Develop with familiar tools. Scale with system-level efficiency.
Rebellions Chiplet Ecosystem
Beyond the Die.
Seamless Dataflow and Compute Scalability across Chiplets.
Rebellions Chiplet Design Strategy
Rebellions SDK
Deploy with Confidence from Day One.
Purpose-built for PyTorch.
Tuned for production.
High-QPS vLLM serving.
Ready out of the box.
Full Triton access with dev tools you’ll actually use.
One-click deployment.
Zero guesswork.