
Kalray Data Acceleration Cards
for High-Performance Storage
and Compute

Kalray K200 & TURBOCARD4

Efficient support of data
intensive workloads

AI acceleration: up to 25 TFLOPS
(16-bit) / 50 TOPS (8-bit)

Best performance
per watt and per $

Power consumption as low
as 30W


featuring open and pluggable architectures

Kalray Acceleration Cards

Coolidge is the third generation of Kalray’s MPPA DPUs. It offers an unprecedented level of performance and programmability, and it is a game changer for data-centric heterogeneous processing tasks because it significantly increases performance per watt and per dollar. Kalray DPUs are engineered to balance performance and power consumption, giving organizations optimized power usage for their data processing workloads. The MPPA architecture maximizes energy efficiency without compromising processing capabilities. Kalray DPUs are featured in Kalray’s Storage and Compute Acceleration Cards.

K200 Storage Acceleration Cards

K200 is Kalray’s family of data-centric acceleration cards, based on Coolidge. The K200 storage acceleration cards offer an unprecedented level of performance and programmability, making them a game changer in terms of performance per watt and per dollar. Kalray K200 cards are easy to configure with their easy-to-program, open software environment and can be used for networking, storage, security, AI and compute acceleration.




K200 datasheet box: By integrating K200-LP DPU PCIe cards into data storage servers and storage enclosures, data centers can accelerate their workloads…

TURBOCARD4 Compute Acceleration Cards

TC4 embodies Kalray’s vision for addressing the intricate demands of modern computing workloads. Housing four of Kalray’s latest-generation DPUs, Coolidge2™, in a single PCIe card, TC4 is designed to let customers merge classical and AI-based processing technologies and build superior, efficiency-driven systems for the most processing-intensive AI applications. DPUs offer an architecture complementary to that of GPUs, processing a large number of different operations in parallel and asynchronously. They are well suited to pre-processing data that is later used by GPUs, and to complex intelligent systems that run many different algorithms in parallel.






High-performance data-centric processor with real-time processing: process more data faster
Parallel execution of heterogeneous multi-processing tasks: process more data faster
Fully programmable with open software environment: easy to program and integrate
High-speed interfaces connected to high-speed fabrics: leverage state-of-the-art networking technologies
Secure islands, encrypt/decrypt, secure boot: data is secure during processing
Enables power efficiency: process more data per watt

Kalray Solutions



5G / Edge


Intelligent Data

Kalray MPPA® DPU Manycore

A Massively Parallel Processor Array Architecture

The world is facing an explosion of data that current technologies were not designed for and cannot always handle efficiently. The industry needs a new type of processor. Enter the era of the Intelligent DPU (Data Processing Unit) processor.

The MPPA® DPU’s two interconnects are each suited to a different type of data transfer. The first is an AXI Fabric bus grid, used for read/write accesses from cores to memories and to peripherals connected by PCIe. The second is an RDMA NoC (Network-on-Chip), which supports data transfers to and from the Ethernet network interfaces and connects all clusters together.
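As a rough illustration of this split, the sketch below maps each kind of transfer named above to the interconnect that carries it. It is a toy lookup table in Python, not Kalray software: the names `Interconnect`, `Transfer` and `interconnect_for` are hypothetical and exist only for this example.

```python
from enum import Enum


class Interconnect(Enum):
    """The two on-chip interconnects described in the text."""
    AXI_FABRIC = "AXI Fabric bus grid"
    RDMA_NOC = "RDMA NoC (Network-on-Chip)"


class Transfer(Enum):
    """Hypothetical labels for the transfer types the text mentions."""
    CORE_TO_MEMORY = "core read/write access to memory"
    CORE_TO_PCIE_PERIPHERAL = "core read/write access to a PCIe peripheral"
    ETHERNET = "transfer to or from an Ethernet interface"
    CLUSTER_TO_CLUSTER = "transfer between computing clusters"


# Assumption: a one-to-one mapping taken directly from the prose above.
ROUTES = {
    Transfer.CORE_TO_MEMORY: Interconnect.AXI_FABRIC,
    Transfer.CORE_TO_PCIE_PERIPHERAL: Interconnect.AXI_FABRIC,
    Transfer.ETHERNET: Interconnect.RDMA_NOC,
    Transfer.CLUSTER_TO_CLUSTER: Interconnect.RDMA_NOC,
}


def interconnect_for(transfer: Transfer) -> Interconnect:
    """Return the interconnect that carries the given kind of transfer."""
    return ROUTES[transfer]


if __name__ == "__main__":
    for transfer in Transfer:
        print(f"{transfer.value}: {interconnect_for(transfer).value}")
```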

The overall architecture of Kalray’s third-generation MPPA® DPU (Data Processing Unit) processor, also known as Coolidge™, is a “Massively Parallel Processor Array”: a set of computing clusters connected to each other, to the external memory and to the I/O interfaces via two independent interconnects.

The robust partitioning necessary for safe operation of the processor is carried out at the granularity of the computing cluster. It relies on the configuration of memory management units (MMUs) and memory protection units (MPUs), and on selectively deactivating network-on-chip links.
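The link-deactivation part of this partitioning can be pictured with a small graph model. The Python sketch below is a toy illustration only, not Kalray's configuration interface: it assumes a hypothetical chip with five clusters and a fully connected NoC (the text specifies neither the cluster count nor the real topology), and it ignores the MMU and MPU side entirely. The function names are invented for this example.

```python
from collections import deque


def build_full_mesh(n_clusters):
    """Return every cluster-to-cluster link as a frozenset pair (assumed topology)."""
    return {
        frozenset((a, b))
        for a in range(n_clusters)
        for b in range(a + 1, n_clusters)
    }


def isolate(links, partitions):
    """Keep only links whose two endpoints belong to the same partition.

    Dropping a link models deactivating it on the network-on-chip.
    """
    owner = {cluster: i for i, part in enumerate(partitions) for cluster in part}
    return {link for link in links if len({owner[c] for c in link}) == 1}


def reachable(links, start):
    """Breadth-first search over the links that are still active."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen


if __name__ == "__main__":
    links = build_full_mesh(5)            # hypothetical 5-cluster chip
    partitions = [{0, 1}, {2, 3, 4}]      # two isolated domains
    active = isolate(links, partitions)
    print(sorted(reachable(active, 0)))
    print(sorted(reachable(active, 2)))
```

In this model, a cluster in one partition can no longer reach any cluster in the other once the cross-partition links are removed, which is the isolation property the paragraph describes at cluster granularity.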

evolution from single core to manycore

Start Your Journey


A new generation of processor is needed to support the “explosion of data” we’re experiencing today



Efficiency and
reducing cost



Explore the