#distributed-inference
#distributed-inference

[ follow ]

Perplexity's open-source tool to run trillion-parameter models without costly upgrades

TransferEngine enables full-speed GPU-to-GPU communication across AWS and Nvidia hardware, letting trillion-parameter models run on older H100 and H200 systems.

[ Load more ]

#distributed-inference#distributed-inference

Perplexity's open-source tool to run trillion-parameter models without costly upgrades

#distributed-inference
#distributed-inference