#ai-inference

Artificial intelligence
from IT Pro
1 week ago

'TPUs just work': Why Google Cloud is betting big on its custom chips

Google's seventh generation TPU, 'Ironwood', aims to lead in AI workload efficiency and cost-effectiveness.
TPUs were developed with a cohesive hardware-software synergy, enhancing their utility for AI applications.
#nvidia
Artificial intelligence
from ZDNET
8 months ago

AI startup Cerebras debuts 'world's fastest inference' service - with a twist

Cerebras Systems aims to capture a share of the rapidly growing AI inference market, challenging Nvidia's dominance with its advanced AI services.
Silicon Valley
from Business Insider
1 month ago

2 reasons why Nvidia's Jensen Huang isn't worried

Nvidia CEO Jensen Huang is confident in sustained demand for Nvidia chips due to new powerful GPUs and an industry shift towards AI inference.
from Business Insider
1 week ago
Artificial intelligence

AMD's CTO says AI inference will move out of data centers and increasingly to phones and laptops

AMD is positioning itself to capitalize on the shift to AI inference, targeting market segments traditionally dominated by Nvidia.
from The Register
1 month ago
Tech industry

Nvidia unveils 288 GB Blackwell Ultra GPUs

Nvidia's Blackwell Ultra architecture enhances AI inference with massive performance and increased memory capacity.
Improved throughput enables AI models like DeepSeek-R1 to respond rapidly, boosting efficiency.
from InfoQ
2 months ago
Agile

Hugging Face Expands Serverless Inference Options with New Provider Integrations

Hugging Face now integrates four serverless inference providers into its platform, enhancing ease of access and speed for AI model inference.
#edge-computing
from InfoQ
5 months ago
Artificial intelligence

Efficient Resource Management with Small Language Models (SLMs) in Edge Computing

Small Language Models (SLMs) enable AI inference on edge devices without overwhelming resource limitations.
from The Register
6 months ago
Miscellaneous

Supermicro crams 18 GPUs into a 3U box

Supermicro's SYS-322GB-NR efficiently accommodates 18 GPUs in a compact design for edge AI and visualization tasks.
#cloud-computing
from www.nextplatform.com
7 months ago
Artificial intelligence

The Battle Begins For AI Inference Compute In The Datacenter

Cloud builders rely heavily on Nvidia GPUs for AI training, limiting options for emerging chip startups.
from The Register
6 months ago
Miscellaneous

TensorWave bags $43M to add 'thousands' of AMD accelerators

TensorWave raised $43 million to scale its cloud platform with AMD accelerators, joining the wave of startups in the generative AI market.
from TechCrunch
6 months ago
Artificial intelligence

Runware uses custom hardware and advanced orchestration for fast AI inference

Runware offers rapid image generation through optimized servers, seeking to disrupt traditional GPU rental models with an API-based pricing structure.