#ai-performance

from TechCrunch
1 week ago

A new AI coding challenge just published its first results - and they aren't pretty | TechCrunch

A Brazilian engineer won the K Prize AI coding challenge with only 7.5% correct answers.
from Techzine Global
1 week ago

Thinking too long makes AI models dumber

Claude models showed a notable sensitivity to irrelevant information during evaluation, leading to declining accuracy as reasoning length increased. OpenAI's models, in contrast, fixated on familiar problems.
Artificial intelligence
from IT Pro
1 week ago

Dedicated servers are back in vogue as IT leaders scramble to meet AI, compliance requirements

Organizations are increasingly migrating workloads from public cloud to dedicated servers, highlighting a significant revival in their use. AI performance requirements drive this trend.
#openai
from Business Insider
1 week ago
Artificial intelligence

OpenAI just won gold at the world's most prestigious math competition. Here's why that's a big deal.

from TechCrunch
3 months ago
Artificial intelligence

OpenAI's o3 AI model scores lower on a benchmark than the company initially implied | TechCrunch

from ZDNET
3 weeks ago

5 ways to be a great AI agent manager, according to business leaders

Antony Hausdoerfer emphasized that successful AI managers must ensure AI agents deliver trusted value and safe operations, focusing on applications that yield meaningful outcomes.
Business
from HackerNoon
1 year ago

phi-3-mini's Triumph: Redefining Performance on Academic LLM Benchmarks | HackerNoon

The results for phi-3-mini on standard open-source benchmarks measure the model's reasoning ability, comparing it to phi-2 and several other notable models.
Artificial intelligence
from The Register
1 month ago

Microsoft Copilot falls to Atari 2600 Video Chess

Copilot struggled to beat Atari 2600 Video Chess despite confidence in its chess capabilities.
from The Register
1 month ago

Chap claims Atari 2600 beat ChatGPT at chess

AI struggles at chess against a retro console's game engine.
from Gadgets 360
2 months ago

Should You Buy an AI PC? In Conversation With Asus' Arnold Su

"I think it's a positive signal to Asus and to the gaming industry that end users are really eagerly waiting for the new graphics card and new chassis coming to the market," said Arnold Su.
Artificial intelligence
from IT Pro
2 months ago

Acer's new Swift Edge 14 AI is a Copilot+ MacBook Air killer

The Swift Edge 14 AI Copilot+ PC is one of the lightest devices in its category, weighing only 0.99kg, making it a highly portable laptop choice.
Apple
#qualcomm
from Techzine Global
2 months ago

Microsoft expands fine-tuning capabilities in Azure AI Foundry

Reinforcement Fine-Tuning (RFT) is a new method that uses chain-of-thought reasoning and task-specific evaluation to improve model performance in specific application domains.
Artificial intelligence
from InfoWorld
3 months ago

Learning how to measure genAI's impact

AI model improvements are often difficult to quantify accurately.
Smaller language models may outperform larger ones in practical applications.
The debate on AGI misdefines human intelligence benchmarks.
from ZDNET
3 months ago

OpenAI's Deep Research has more fact-finding stamina than you, but it's still wrong half the time

OpenAI's Deep Research technology surpasses other models and humans in web searches, but still fails nearly half the time.
from TechCrunch
3 months ago

Meta's vanilla Maverick AI model ranks below rivals on a popular chat benchmark | TechCrunch

The incident prompted the maintainers of LM Arena to apologize, change their policies, and score the unmodified, vanilla Maverick.
Artificial intelligence
#nvidia
from ZDNET
3 months ago
Artificial intelligence

Nvidia dominates in gen AI benchmarks, clobbering 2 rival AI chips

Artificial intelligence
from HackerNoon
4 months ago

Nvidia Promises 40x Hopper Performance in Blackwell Unveil at GTC 2025 | HackerNoon

NVIDIA's Blackwell platform offers groundbreaking AI performance improvements.
Baidu's ERNIE 4.5 outperforms GPT-4o at a fraction of the cost.
from The Register
4 months ago
Gadgets

Nvidia wants to put a Grace Blackwell Ultra on your desk

The Nvidia DGX Station has been upgraded with a new Superchip, achieving 20 petaFLOPS of AI performance.
The DGX Spark provides impressive AI compute capabilities at an entry price for developers.
from TechCrunch
4 months ago
Gadgets

Nvidia announces new GPUs at GTC 2025, including Vera Rubin | TechCrunch

Nvidia is launching its new GPU, Vera Rubin, with significant performance improvements over Blackwell.
Artificial intelligence
from Computerworld
4 months ago

Chat with your data: How 4 genAI tools stack up

AI tools vary in effectiveness for retrieving specific information from social media and structured data sources.
Claude and NotebookLM performed better in targeted searches than ChatGPT and Perplexity.
Challenges of navigating extensive datasets highlight real-world applications in demographic research.
Artificial intelligence
from Ars Technica
5 months ago

New AI text diffusion models break speed barriers by pulling words from noise

Diffusion models offer comparable performance to traditional models but with dramatically improved speed, changing dynamics in AI applications.
from WIRED
6 months ago

The Essential CPU and GPU Field Guide for 2025

AMD's new advanced CPU boasts either 16 or 12 cores, engineered especially for creators and gamers, promising an 8% performance boost in gaming and 10% in other tasks.
Artificial intelligence