ION™

Scalable AI Compute Core

Compute Granule with Maximum Flexibility and Efficiency

Description

Versatile and Powerful AI Compute Engine

Rebellions' AI Compute Core, ION™, provides flexible inference capabilities with low power consumption, a small footprint, and high performance for edge computing systems. Featuring a customized instruction set architecture (ISA) designed for over 1,000 multiply-and-accumulate (MAC) units, ION™ delivers inference acceleration with an exceptionally high utilization rate compared to other AI accelerators. ION™’s versatility, compact size, and low power requirements make it the optimal choice for edge deployments.

Energy Efficiency with TSMC 7nm Technology

The ION™ Compute Core was fabricated in TSMC's 7nm process and has demonstrated its functionality and efficiency on benchmark networks. Supporting mixed-precision (FP16, INT8/4/2) computation at operating frequencies up to 2 GHz, ION™ delivers over 2.0 TFLOPS/Watt on FP16-based vision tasks and over 10 TOPS/Watt on INT8-based language tasks.
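The quoted efficiency figures follow from the peak rates in the System specs table below and the power figures quoted on this page. A minimal arithmetic sketch in Python; the pairing of the FP16 figure with the 2W low-end TDP and of the INT8 figure with the 1.6W average power is an assumption, as the page does not state which power figure each efficiency number uses:

    # Back-of-the-envelope efficiency check using the peak rates from the
    # System specs table and the power figures quoted on this page. Real
    # task-level efficiency also depends on MAC utilization, so these are
    # illustrative upper bounds, not measured results.

    PEAK_FP16_TFLOPS = 4.0   # peak FP16 performance (System specs)
    PEAK_INT8_TOPS = 16.0    # peak INT8 performance (System specs)
    TDP_LOW_W = 2.0          # low end of the 2-6 W TDP (assumed pairing)
    AVG_POWER_W = 1.6        # average power of a single ION (assumed pairing)

    print(f"FP16: {PEAK_FP16_TFLOPS / TDP_LOW_W:.1f} TFLOPS/W")  # 2.0 TFLOPS/W
    print(f"INT8: {PEAK_INT8_TOPS / AVG_POWER_W:.1f} TOPS/W")    # 10.0 TOPS/W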

60x Performance for Finance and Edge Computing

LightTrader™, a Field Programmable Gate Array (FPGA) card integrating multiple ION™ chips and powered by ION™'s full-stack hardware and software implementation, has been deployed in high-frequency trading (HFT) solutions at Wall Street investment banks. LightTrader™ achieves up to 60x AI-enabled HFT performance, upgrading any HFT solution into a more intelligent trading system.

Performance

Competitive Energy Efficiency for Intelligent Edge

ION™ achieves up to 10x higher performance per watt than state-of-the-art mobile Neural Processing Units (NPUs), making it ideal for applications such as mobile companion chips, smart cities, robotics, and retail.

(Benchmark charts: Finance Trading, Vision, Language)

Key features

ION™ Brings Innovative Trading Solution Architecture

* Zhang, Zohren, and Roberts, "DeepLOB: Deep Convolutional Neural Networks for Limit Order Books," IEEE Transactions on Signal Processing, 2019

ION™ Compute Core

The first-generation ION™ Compute Core supports accurate FP16-based stock prediction, using single-batch inference and predictive execution pipelines (a toy illustration of single-batch inference follows the spec list below). ION™ also supports BFloat16 and low-precision integer operations (INT8/4/2).

  • Number of ION™s
    1
  • TOPS
    16
  • Average Power
    1.6W
  • DL Inference throughput for DeepLOB*
    12.5K Symbols per second
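To make single-batch inference concrete: in latency-critical HFT, each order-book snapshot is pushed through the network alone (batch size 1) rather than queued into larger batches. Below is a toy, loosely DeepLOB-inspired model in Python/PyTorch; it is an illustrative stand-in, not the published DeepLOB architecture and not Rebellions' deployed model:

    import torch
    import torch.nn as nn

    # Toy, loosely DeepLOB-inspired classifier: convolutions over
    # limit-order-book snapshots followed by an LSTM, predicting
    # short-horizon price movement in {down, stationary, up}.
    class ToyLOBNet(nn.Module):
        def __init__(self, lob_features=40, hidden=64):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=(1, 2), stride=(1, 2)), nn.ReLU(),
                nn.Conv2d(16, 16, kernel_size=(4, 1)), nn.ReLU(),
            )
            self.lstm = nn.LSTM(16 * (lob_features // 2), hidden, batch_first=True)
            self.head = nn.Linear(hidden, 3)

        def forward(self, x):                     # x: (batch, 1, ticks, features)
            f = self.conv(x)                      # (batch, 16, ticks', features')
            f = f.permute(0, 2, 1, 3).flatten(2)  # (batch, ticks', channels)
            out, _ = self.lstm(f)
            return self.head(out[:, -1])          # logits at the last tick

    model = ToyLOBNet().eval()
    snapshot = torch.randn(1, 1, 100, 40)  # batch size 1: one 100-tick x 40-feature snapshot
    with torch.no_grad():
        logits = model(snapshot)           # (1, 3) movement logits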

LightTrader™: The World's First AI-Enabled HFT Card

This PCIe card integrates the custom AI accelerator ION™ with a conventional FPGA-based HFT pipeline, delivering a low-latency, high-throughput trading solution with a minimized symbol miss rate. It is, to date, the only solution for AI-based HFT.

  • Number of ION™s
    4
  • TOPS
    64
  • Average Power
    20W
  • DL Inference throughput for DeepLOB*
    50K Symbols per second

Ultra-Low-Latency, High-Throughput Finance AI Acceleration Server

Integrating eight LightTrader™ boards into one standard 4U server, the proposed server-level solution scales ION™-based finance inference up to 0.5 PetaOPS of compute and 3.2 Tb/s of symbol-processing throughput (see the scaling sketch after the spec list below).

  • Number of ION™s
    32
  • TOPS
    512
  • Average Power
    300W
  • DL Inference throughput for DeepLOB*
    400K Symbols per second
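The three configurations above scale linearly in ION™ count. A minimal sketch of that arithmetic in Python; linear scaling is an idealization, since aggregate throughput also depends on the host and I/O path:

    # Scale-out arithmetic for the three ION configurations above.
    # Per-core figures are taken from the single-ION entry.

    TOPS_PER_ION = 16         # peak INT8 TOPS per compute core
    SYMBOLS_PER_ION = 12_500  # DeepLOB symbols/s per core

    configs = {
        "ION (1 core)": 1,
        "LightTrader (4 cores)": 4,
        "4U server (8 cards, 32 cores)": 32,
    }

    for name, n in configs.items():
        tops = n * TOPS_PER_ION
        print(f"{name}: {tops} TOPS ({tops / 1000:.3f} PetaOPS), "
              f"{n * SYMBOLS_PER_ION / 1000:g}K symbols/s")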

System specs

  • Technology
    TSMC 7nm
  • Package size
    8.7mm x 8.7mm
  • Compute cores
    1
  • Peak FP16 Perf.
    4 TFLOPS
  • Peak INT8 Perf.
    16 TOPS
  • Peak INT4 Perf.
    32 TOPS
  • Max TDP
    2 - 6 W (Configurable)
  • Highlights
    Mixed precision; customized ISA; various vision and language models (CNN, LSTM, BERT, etc.)
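These peak figures are consistent with the MAC-array description at the top of this page (over 1,000 MAC units at up to 2 GHz). A sketch of that derivation in Python, assuming a nominal 1,000 FP16 MACs and 4x/8x operation packing for INT8/INT4; the packing ratios are inferred from the table above, not stated by the page:

    # Peak-rate sanity check. Each MAC counts as 2 ops (multiply +
    # accumulate); MAC count, frequency, and packing are assumptions
    # consistent with the figures quoted on this page.

    N_MACS = 1_000       # nominal FP16 MAC count ("over 1,000")
    FREQ_GHZ = 2.0       # maximum operating frequency
    OPS_PER_MAC = 2      # multiply + accumulate

    fp16_tflops = N_MACS * OPS_PER_MAC * FREQ_GHZ / 1e3
    print(f"FP16: {fp16_tflops:.0f} TFLOPS")    # 4 TFLOPS
    print(f"INT8: {fp16_tflops * 4:.0f} TOPS")  # 16 TOPS
    print(f"INT4: {fp16_tflops * 8:.0f} TOPS")  # 32 TOPS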

See also...

RBLN-CA22

Cost-efficient, Powerful AI Acceleration for Small-sized Data Centers

RBLN-CA25

Boosted Performance for Hyperscalers

RBLN-CA21

Low-power, Yet Highly Powerful AI Inference at the Edge

System Solutions

Start Lean, Scale Green

ATOM™

Inference AI Accelerator for Data Centers

ATOM™ is a fast, power-efficient System-on-Chip for AI inference with remarkably low latency, designed for deployment in data centers and by cloud service providers.
Internationally recognized through the industry-standard MLPerf™ v3.0 benchmark, ATOM™ scales to accelerate state-of-the-art AI models of various sizes.