FuriosaAI

Furiosa is featured in The Wall Street Journal.

Furiosa RNGD - Gen 2 data center accelerator

Powerfully efficient AI inference for enterprise and cloud

EFFICIENT AI INFERENCE IS HERE


RNGD (pronounced "Renegade") delivers high-performance LLM and multimodal deployment capabilities while maintaining a radically efficient 180W power profile.

512 TFLOPS
64 TFLOPS (FP8) × 8 processing elements
48 GB
HBM3 memory capacity
2 × HBM3
CoWoS-S, 6.0 Gbps
256 MB SRAM
384 TB/s on-chip bandwidth
1.5 TB/s
HBM3 memory bandwidth
180 W TDP
Targeting air-cooled data centers
PCIe P2P support for LLMs
BF16, FP8, INT8, INT4 support
Multi-instance support and virtualization
Secure boot & model encryption
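The headline figures in the spec list above are mutually consistent, and combining them gives two useful derived numbers: the arithmetic intensity at which the chip becomes compute-bound rather than memory-bound, and FP8 throughput per watt. A quick back-of-the-envelope check (all inputs are taken from the spec list; the derived ratios are illustrative, not vendor claims):

```python
# Sanity-check RNGD's headline specs using only the numbers listed above.
PE_TFLOPS_FP8 = 64   # FP8 TFLOPS per processing element
NUM_PES = 8          # processing elements per chip
HBM_BW_TBPS = 1.5    # HBM3 memory bandwidth, TB/s
TDP_W = 180          # thermal design power, watts

peak_tflops = PE_TFLOPS_FP8 * NUM_PES        # headline 512 TFLOPS figure
flops_per_byte = peak_tflops / HBM_BW_TBPS   # intensity needed to be compute-bound
tflops_per_watt = peak_tflops / TDP_W        # efficiency figure of merit

print(peak_tflops, round(flops_per_byte), round(tflops_per_watt, 2))
# → 512 341 2.84
```

The roughly 341 FLOPs/byte crossover illustrates why memory bandwidth, not raw TFLOPS, usually bounds LLM token generation, where per-token arithmetic intensity is far lower.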

INFERENCE WITHOUT CONSTRAINTS

Performance

Deploy the most capable models with low latency and high throughput

Efficiency

Lower total cost of ownership with less energy, fewer racks, and compatibility with today's air-cooled data centers

Programmability

Stay future-proof for tomorrow’s models and transition with ease

SOFTWARE FOR LLM DEPLOYMENT

The Furiosa SW Stack consists of a model compressor, serving framework, runtime, compiler, profiler, debugger, and a suite of APIs for easy programming and deployment.

Available now.


Built for advanced inference deployment

A comprehensive software toolkit for optimizing large language models on RNGD, with user-friendly APIs that make deploying state-of-the-art LLMs straightforward.

Maximizing data center utilization

Ensure higher utilization and flexibility for deployments small and large with containerization, SR-IOV, Kubernetes, and other cloud-native components.

Robust ecosystem support

Effortlessly deploy models from library to end-user with PyTorch 2.x integration. Leverage the vast advancements of open-source AI and seamlessly transition models into production.


BE IN THE KNOW

Sign up to be notified first about RNGD availability and product updates.