"Inference, or running trained models, is AI's next act. Expect Nvidia to make a big statement as competitors - from cloud giants to a slew of chip startups - encroach on this space. Huang previously teased "several new chips the world has never seen before," and The Wall Street Journal reported in February that Nvidia is readying an inference-focused product incorporating technology from AI startup Groq, with OpenAI expected to be a key customer."
"The chip's design could have big supply chain implications. Inference relies heavily on memory, and with high bandwidth memory (HBM) in tight supply, investors will see whether Nvidia leans more on SRAM - a fast, on-chip memory used in inference designs - rather than solely relying on HBM."
"Sid Sheth, founder and CEO of inference chip startup d-Matrix, said that while Nvidia will stay dominant in training, "inference is a different ballgame." He added that CUDA, Nvidia's software that underpins most AI training and has locked developers into its ecosystem, is les"
Nvidia's annual GTC conference serves as the primary venue for announcing AI chip roadmaps and major technology partnerships. This year's event follows a strong earnings report that had minimal stock impact, raising concerns about AI spending sustainability. Key expectations include announcements of new inference chips designed to compete with cloud giants and startups entering the inference market. The chip's memory architecture, particularly whether Nvidia prioritizes SRAM over high bandwidth memory (HBM), will have significant supply chain implications. Inference represents AI's next growth phase, distinct from training, with competitors increasingly challenging Nvidia's dominance in this segment.
Read at Business Insider