Rebellions SDK
Built for developers who scale AI
High-Performance Server for Large-Scale AI Inference
ATOM™-Max Server is a power-efficient, single-server solution built for large-scale AI inference. It supports up to 8 ATOM™-Max PCIe cards, enabling deployment of hundreds of AI models spanning vision, LLMs, multimodal, and even physical AI workloads. Fully compatible with leading inference and orchestration tools such as vLLM, Triton, and Kubernetes, it lets you transition seamlessly from GPU workflows using familiar tools and guided tutorials.
Peak Performance
Maximum FP16 Performance
GDDR6 Memory
High-Capacity, High-Bandwidth Memory
Power Consumption
Built for Optimal Energy Use
Form Factor
Optimized for Data Centers
Ubuntu, RHEL 9, AlmaLinux, Rocky Linux
Hugging Face, PyTorch, TensorFlow, Triton
vLLM, Triton Inference Server, TorchServe
Docker, OpenStack, Kubernetes, Ray
Even under heavy demand, ATOM™-Max delivers stable, high-throughput performance, sustaining thousands of tokens and image frames per second, all from a single system.
ATOM™-Max delivers maximum AI inference performance within limited server room power budgets. Its exceptional power efficiency significantly lowers total cost of ownership (TCO) and enables a more sustainable AI infrastructure.
ATOM™-Max is compatible with popular open-source ecosystems, supporting efficient serving, flexible resource management, and monitoring through tools like vLLM, Triton, Kubernetes, and Prometheus—so you can build full end-to-end services with ease.
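As an illustration of the open-source serving path described above, here is a minimal client sketch against a vLLM OpenAI-compatible endpoint. It assumes a vLLM server is already running locally (for example via `vllm serve <model>`); the host, port, and model name are placeholders, and how vLLM is bound to ATOM™-Max devices is covered by the SDK rather than shown here.

```python
from openai import OpenAI

# Placeholder endpoint: assumes a vLLM OpenAI-compatible server is already
# running locally, e.g. started with `vllm serve <model>`.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Give three ways to lower inference TCO."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```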
Run hundreds of AI models out of the box—from LLMs and vision AI to multimodal and physical AI. Build tailored services like chatbots, search, summarization, smart CCTV, and image generation.
No need to abandon your existing development environment. Start right away with familiar workflows (PyTorch, TensorFlow, etc.) and step-by-step tutorials.
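A minimal sketch of what "familiar workflows" means in practice: the development loop below is plain PyTorch, unchanged from a GPU setup. How the model is then compiled and deployed to the ATOM™-Max NPU is an additional step covered by the Rebellions SDK tutorials, not shown here.

```python
import torch
import torchvision.models as models

# Familiar PyTorch workflow: load a pretrained vision model exactly as on GPU.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
example = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    logits = model(example)

print(logits.argmax(dim=1))  # predicted ImageNet class index
# Targeting the ATOM™-Max NPU is then an export/compile step in the
# Rebellions SDK; see its step-by-step tutorials for the exact entry point.
```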
Streamline enterprise-wide AI adoption—from development to deployment—with scalable AI infrastructure
Enable proactive safety monitoring on construction sites with AI-powered surveillance
Support the AI healthcare ecosystem, from personalized wellness to precision medicine
Build next-gen financial services with secure, real-time AI processing of financial data
Boost manufacturing productivity with Physical AI-powered smart factories
Power advanced telecom services and elevate customer experience with reliable large-scale AI operations
Purpose-built for PyTorch.
Tuned for production.
High-QPS vLLM serving.
Ready out of the box.
Full Triton access.
Dev tools you’ll actually use.
One-click deployment.
Zero guesswork.
Core system software and essential tools for NPU execution
Developer tools for model and service development
300+ ready-to-use PyTorch and TensorFlow models optimized for Rebellions NPUs
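For orientation, a minimal sketch of what out-of-the-box usage looks like, using the standard Hugging Face pipeline API listed above as a supported framework. The model shown is only a placeholder; which of the 300+ zoo models map to which Rebellions-optimized loading paths is documented in the SDK's model zoo.

```python
from transformers import pipeline

# Standard Hugging Face entry point; the zoo-specific, NPU-optimized loading
# classes are an SDK detail and not assumed here.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

text = (
    "ATOM-Max Server is a power-efficient, single-server solution for "
    "large-scale AI inference, supporting up to 8 PCIe accelerator cards."
)
print(summarizer(text, max_length=40, min_length=10)[0]["summary_text"])
```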