#memory-allocation

Scala
from HackerNoon
1 month ago

vAttention System Design: Dynamic KV-Cache with Contiguous Virtual Memory | HackerNoon

vAttention improves the efficiency of large language model serving by pre-reserving contiguous virtual memory for the KV-cache and allocating physical memory dynamically, only as it is needed.