Effortlessly Accelerate AI with Over 200 Supported Models.

Discover how to quickly deploy your AI models on Rebellions' NPU using RBLN SDK.
You can find detailed information on our compiler, runtime, model zoo, and serving frameworks.

Get Started with Frameworks

HuggingFace
PyTorch
TensorFlow

RBLN SDK supports Hugging Face transformer and diffuser models through the Optimum RBLN library. Deploy the latest models, such as Llama3-8B and SDXL, straight from the Hugging Face Hub.

💡 Run HuggingFace models on Rebellions hardware.

  • Compilation and inference with Hugging Face models optimized for Rebellions’ hardware.
  • Efficient, developer-friendly API using RBLN Runtime.
  • Multi-chip support for Llama and SDXL models.
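
As a hedged sketch of this workflow (the class name, `export` flag, and `rbln_tensor_parallel_size` keyword follow typical Optimum RBLN usage but should be checked against your SDK version; compiling and running requires an RBLN NPU):

```python
# Illustrative sketch only: names follow the usual Optimum RBLN pattern
# and may differ in your installed SDK version.
from optimum.rbln import RBLNLlamaForCausalLM

# export=True compiles the Hugging Face checkpoint for the NPU;
# tensor parallelism spreads the model across multiple chips.
model = RBLNLlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    export=True,
    rbln_tensor_parallel_size=4,  # assumed kwarg for multi-chip deployment
)
model.save_pretrained("llama3-8b-rbln")
```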

RBLN SDK supports PyTorch 2.0. Accelerate your PyTorch-trained NLP, speech, and vision models on Rebellions’ hardware.

💡 RBLN SDK integrates PyTorch models.

  • Compilation of PyTorch models optimized for Rebellions’ hardware.
  • Efficient, developer-friendly API using RBLN Runtime.
  • Run Torch 2.0 models without additional tuning and build a powerful serving pipeline.
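
A hedged sketch of the compile-and-serve flow for a PyTorch model (the API names `rebel.compile_from_torch` and `rebel.Runtime` follow the documented RBLN SDK pattern but may differ by version, and executing this requires an RBLN NPU):

```python
# Illustrative sketch only: check names against your RBLN SDK version.
import torch
import rebel  # RBLN SDK compiler/runtime package

model = torch.nn.Linear(4, 2).eval()

# Compile the traced model for the NPU, declaring input name/shape/dtype.
compiled = rebel.compile_from_torch(model, [("x", [1, 4], "float32")])
compiled.save("linear.rbln")

# Load the compiled artifact and run inference through RBLN Runtime.
runtime = rebel.Runtime("linear.rbln")
out = runtime.run(torch.randn(1, 4).numpy())
```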

RBLN SDK supports TensorFlow. Optimize inference for models ranging from LLMs to ImageNet classifiers and YOLO detectors.

💡 RBLN SDK integrates TensorFlow models.

  • Inference with a multitude of pre-trained Keras Applications.
  • Efficient, developer-friendly API using RBLN Runtime.
  • Run TensorFlow models without additional tuning and build a powerful serving pipeline.

Featured Resources

Rebellions specializes in AI accelerators optimized for efficient AI inference across advanced applications in fields such as finance and cloud computing.
Explore our latest documentation, tutorials, and webinars.

Rebellions’ Software Stack

The Rebellions Software Stack supports our hardware to deliver maximum performance.

Machine Learning Framework

Machine Learning (ML) frameworks are essential tools in the development and deployment of AI models, including NLP, Vision, Speech, and Generative models. Currently, the most popular frameworks are TensorFlow, PyTorch, and Hugging Face, each offering unique features and capabilities that cater to different aspects of machine learning development and deployment.

Compiler

The RBLN Compiler transforms models into executable instructions for ATOM™. It comprises two main components: the Frontend Compiler and the Backend Compiler. The Frontend Compiler abstracts deep learning models into Intermediate Representations (IRs), optimizing them before handing them off to the Backend Compiler. The Backend Compiler further optimizes these IRs and produces the Command Stream, the Program Binary for the hardware to execute the tasks, and serialized weights.
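
The frontend/backend split can be pictured with a toy pipeline. This is a loose, stdlib-only sketch, not the RBLN Compiler: the real frontend lowers deep learning graphs, and the real backend emits a hardware Command Stream, Program Binary, and serialized weights.

```python
# Toy two-stage compiler pipeline: frontend abstracts a "model" into IR,
# backend optimizes the IR (here: fusing add+relu) and emits a command
# stream. All names are illustrative.
from dataclasses import dataclass

@dataclass
class IROp:
    name: str      # operation, e.g. "matmul"
    inputs: tuple  # operand identifiers
    output: str    # result identifier

def frontend(model_graph):
    """Abstract a 'model' (a list of layer names) into IR ops."""
    return [IROp(layer, (f"t{i}",), f"t{i + 1}")
            for i, layer in enumerate(model_graph)]

def backend(ir):
    """Fuse adjacent add+relu ops, then emit one instruction per IR op."""
    fused, i = [], 0
    while i < len(ir):
        if i + 1 < len(ir) and ir[i].name == "add" and ir[i + 1].name == "relu":
            fused.append(IROp("add_relu", ir[i].inputs, ir[i + 1].output))
            i += 2
        else:
            fused.append(ir[i])
            i += 1
    return [f"EXEC {op.name} {','.join(op.inputs)} -> {op.output}" for op in fused]

stream = backend(frontend(["matmul", "add", "relu"]))
# stream == ["EXEC matmul t0 -> t1", "EXEC add_relu t1 -> t3"]
```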

Compute Library

The Compute Library includes a comprehensive suite of highly optimized low-level operations, which are essential for model inference. These low-level operations form the programmable components of the arithmetic logic units within the Neural Engines. The Compute Library prepares the Program Binary at the Compiler’s command. The RBLN SDK supports low-level operations for both traditional Convolutional Neural Networks (CNNs) and state-of-the-art GenAI models, spanning hundreds of operations such as General Matrix Multiply (GEMM), normalization, and nonlinear activation functions. Thanks to the flexibility of the Neural Engines, the list of supported low-level operations continues to expand, enabling acceleration across a wide range of AI applications.
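
To ground two of the operation families named above, here is a naive pure-Python sketch of the math they implement. Real compute-library kernels are heavily optimized for the Neural Engines; this only shows the reference semantics.

```python
# Reference semantics for two low-level operations: GEMM (general matrix
# multiply) and a nonlinear activation (ReLU), on row-major lists of lists.

def gemm(a, b):
    """C = A @ B."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]

def relu(m):
    """Elementwise max(0, x)."""
    return [[x if x > 0 else 0 for x in row] for row in m]

out = relu(gemm([[1, 2], [3, 4]], [[5, -6], [7, 8]]))
# out == [[19, 10], [43, 14]]
```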

Runtime Module

The Runtime Module acts as the intermediary between the compiled model and the hardware, managing the actual execution of programs. It prepares executable instructions generated by the Compiler, manages data transfer between memory and the Neural Engines, and monitors performance to optimize the execution process.
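
A minimal stdlib-only sketch of the idea, under loose assumptions (the real Runtime Module manages transfers between host memory and the Neural Engines; here a dict stands in for memory and Python lambdas stand in for hardware kernels):

```python
# Toy runtime: execute a compiled "program" (a list of instructions) by
# dispatching each op to a kernel and moving data through a memory dict.

KERNELS = {
    "add": lambda x, y: x + y,
    "mul": lambda x, y: x * y,
}

def run(program, memory):
    """Execute (op, in_a, in_b, out) tuples against a memory namespace."""
    for op, a, b, out in program:
        memory[out] = KERNELS[op](memory[a], memory[b])
    return memory

# y = x * w + b, expressed as two instructions.
mem = run([("mul", "x", "w", "h"), ("add", "h", "b", "y")],
          {"x": 3, "w": 2, "b": 1})
# mem["y"] == 7
```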

Driver

The Driver, consisting of the Kernel-Mode Driver (KMD) and User-Mode Driver (UMD), provides efficient, safe, and flexible access to the hardware. The KMD allows the operating system to recognize the hardware and exposes APIs to the UMD. It also delivers the Command Stream from the Compiler stack to the device. The UMD, running in user space, intermediates between the application software and the hardware, managing their interactions.

Firmware

The Firmware is the lowest-level software component on ATOM™, serving as the final interface between software and hardware. It controls the tasks of the Command Processor, which orchestrates ATOM™’s operations. Located on the SoC, the Command Processor manages the Command Stream (the actual AI workloads) across multiple layers of the memory architecture and monitors the hardware’s health status.

RBLN Backend: Rebellions Hardware

Rebellions’ ATOM™ is an AI accelerator engineered specifically for AI inference tasks with formidable capacity, manufactured on Samsung’s advanced 5nm process. It delivers 32 Tera Floating Point Operations per Second (TFLOPS) for FP16 and 128 Trillion Operations Per Second (TOPS) for INT8, enhanced by eight Neural Engines and 64 MB of on-chip SRAM. With an intricate memory architecture engineered with unparalleled technical mastery, ATOM™ is designed for high performance and peak efficiency.


Frequently Asked Questions

Need help finding information?

  • Get Started: Get started with our user-friendly RBLN SDK.
  • SDK Docs: Discover best practices and explore our comprehensive APIs.
  • Developer Support: Reach out to us with any inquiries.