#distributed-inference

Artificial intelligence
from InfoWorld
4 days ago

Perplexity's open-source tool to run trillion-parameter models without costly upgrades

TransferEngine enables full-speed GPU-to-GPU communication across AWS and Nvidia hardware, letting trillion-parameter models run on older H100 and H200 systems.