d-Matrix aspires to rack scale AI with JetStream I/O cards
Briefly

"AI chip startup d-Matrix is pushing into rack scale with the introduction of its JetStream I/O cards, which are designed to allow larger models to be distributed across multiple servers or even racks while minimizing performance bottlenecks."
"At face value, JetStream presents itself as a fairly standard PCIe 5.0 NIC. It supports two ports at 200 Gb/s or a single port at 400 Gb/s and operates over standard Ethernet."
"Two of these NICs are designed to be paired with up to eight of the chip vendor's 600-watt Corsair AI accelerators in a topology resembling the following. The custom ASICs are no slouch either, with each card capable of churning out 2.4 petaFLOPS when using the MXINT8 data type or 9.6 petaFLOPS when using the lower precision MXINT4 type. d-Matrix's memory hierarchy pairs a rather large amount of blisteringly fast SRAM with much, much slower, but higher capacity LPDDR5 memory, the same kind you'd find in a high-end notebook. Each Corsair card contains 2GB of SRAM good for 150 TB/s along with 256GB of LPDDR5 capable of 400 GB/s. To put those figures in perspective, a single Nvidia B200 offers 180GB of HBM3e and 8TB/s of bandwidth."
JetStream I/O cards provide rack-scale networking for larger AI models, presenting as fairly standard PCIe 5.0 NICs with two 200 Gb/s ports or a single 400 Gb/s port running over Ethernet. A custom FPGA-based NIC drawing about 150W keeps network latency to roughly two microseconds. Two NICs pair with up to eight 600W Corsair AI accelerator cards per node. Each Corsair card delivers 2.4 petaFLOPS (MXINT8) or 9.6 petaFLOPS (MXINT4) and combines 2GB of SRAM at 150 TB/s with 256GB of LPDDR5 at 400 GB/s. The resulting 2TB of LPDDR5 per node can hold multi-trillion-parameter models at 4-bit precision, though inference performance remains constrained by that memory's bandwidth.
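To make that last sentence concrete, here is a minimal sketch of the arithmetic, assuming 4-bit (0.5-byte) weights and the simplifying model that decoding one token streams every weight from LPDDR5 once. These assumptions are illustrative, not d-Matrix's own figures:

```python
# Illustrative arithmetic behind the capacity and bandwidth claims above.
# Assumes 4-bit weights (0.5 bytes per parameter) and, for the throughput
# bound, that generating one token reads every weight from memory once.

NODE_LPDDR5_BYTES = 8 * 256 * 1e9          # 8 cards x 256 GB ~= 2 TB
NODE_LPDDR5_BW = 8 * 400 * 1e9             # 8 cards x 400 GB/s = 3.2 TB/s
BYTES_PER_PARAM_4BIT = 0.5

max_params = NODE_LPDDR5_BYTES / BYTES_PER_PARAM_4BIT
print(f"Max 4-bit parameters per node: ~{max_params / 1e12:.1f} trillion")

# Bandwidth-bound decode rate for a hypothetical 4-trillion-parameter model:
model_bytes = 4e12 * BYTES_PER_PARAM_4BIT   # ~2 TB of weights
tokens_per_s = NODE_LPDDR5_BW / model_bytes
print(f"Bandwidth-bound decode rate: ~{tokens_per_s:.1f} tokens/s per sequence")
```

On this simple model a single sequence decodes at only a couple of tokens per second, which is one way to read the article's point that inference on this memory hierarchy stays bandwidth-constrained rather than compute-constrained.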
Read at The Register