Rebellions at Hot Chips 2025

Efficient at the Core.
Scalable by Architecture.
Usable from Day One.


Rebellions at Hot Chips 2025 in Detail

Optimized for Today’s Leading AI Models. Without the Energy Tax.

See REBEL-Quad run live and watch MoE inference in action at Rebellions’ booth

REBEL-Quad

Peta-Scale MoE Inference.
Without the Energy Tax.

Performance

One Engine.
Mixed Precision.

Energy Efficiency

Smarter Prefetch.
Faster Execution.

Scalability

Modular Architecture.
Monolithic Efficiency.

Synchronization

Always On.
Always Through.

REBEL-Quad vs. B200 SXM

Throughput (TPS): 1.4
Efficiency (TPS/Watt): 1.6
Power Consumption (Watt): 0.9
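
The three comparison figures (1.4, 1.6, 0.9) are consistent with each other if they are read as REBEL-Quad values relative to B200 SXM. That reading is an assumption; the page does not state the baseline or the workload. Under it, efficiency is throughput divided by power, and a short check shows the quoted 1.6 follows from the other two numbers:

```python
# Figures quoted in the comparison, ASSUMED to be REBEL-Quad / B200 SXM ratios.
throughput_ratio = 1.4   # Throughput (TPS)
power_ratio = 0.9        # Power Consumption (Watt)
quoted_efficiency = 1.6  # Efficiency (TPS/Watt)

# Efficiency is throughput per watt, so the ratios should divide the same way.
derived_efficiency = throughput_ratio / power_ratio

print(f"derived efficiency ratio: {derived_efficiency:.2f}")
print("matches quoted value after rounding:",
      round(derived_efficiency, 1) == quoted_efficiency)
```

The derived ratio is about 1.56, which rounds to the quoted 1.6.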

Rebellions Chiplet Ecosystem

Beyond the Die.
Seamless Dataflow and Compute Scalability across Chiplets.

Rebellions Chiplet Design Strategy

UCIe

Rebellions SDK
Deploy with Confidence from Day One.

Purpose-built for PyTorch.
Tuned for production.

High-QPS vLLM serving.
Ready out of the box.

Full Triton access with dev tools you’ll actually use.

One-click deployment.
Zero guesswork.

MoE in Action
Open-Source Frameworks.
Real Deployment.

Architecture for any model. Silicon for any scale.

Built on PyTorch and vLLM. Runs in real time with dynamic expert routing.

HW-SW co-optimized for disaggregated inference.

Develop with familiar tools. Scale with system-level efficiency.
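
For readers unfamiliar with the "dynamic expert routing" mentioned above, the following is a minimal, framework-free sketch of generic top-k MoE gating. It is not Rebellions' implementation and does not use the Rebellions SDK, PyTorch, or vLLM; the names `route_token` and `moe_forward` and the toy experts are hypothetical, chosen only for illustration. In a real model the gate logits would come from a learned router layer.

```python
import math

def route_token(gate_logits, top_k=2):
    """Pick the top_k experts for one token; return (expert_index, weight) pairs."""
    # Rank experts by gate logit, highest first, and keep the top_k.
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(gate_logits[i] for i in chosen)
    exps = [math.exp(gate_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]

def moe_forward(x, gate_logits, experts, top_k=2):
    """Combine the outputs of the selected experts, weighted by the gate."""
    return sum(w * experts[i](x) for i, w in route_token(gate_logits, top_k))

# Four toy "experts" (hypothetical): each just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

gate_logits = [0.1, 2.0, -1.0, 2.0]  # illustrative router scores for one token
routes = route_token(gate_logits, top_k=2)
output = moe_forward(1.0, gate_logits, experts, top_k=2)
print(routes, output)
```

Routing is "dynamic" in the sense that the selected experts depend on each token's gate scores, so only `top_k` of the experts run per token instead of all of them.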