#reinforcement-learning

#artificial-intelligence
Artificial intelligence
from The Verge
1 month ago

Latest Turing Award winners again warn of AI dangers

AI developers must prioritize safety and testing before public releases.
Barto and Sutton's Turing Award highlights the importance of responsible AI practices.
Artificial intelligence
from Axios
1 month ago

Turing Award honors AI's reinforcement learning duo

The Turing Award honors Andrew Barto and Richard Sutton for their foundational work in reinforcement learning, a critical aspect of modern AI.
Artificial intelligence
from InfoWorld
1 month ago

Alibaba says its new AI model rivals DeepSeek's R-1, OpenAI's o1

The pursuit of AGI is being driven by stronger foundation models integrated with reinforcement learning and advanced computational resources.
Artificial intelligence
from ZDNET
1 month ago

AI scholars win Turing Prize for technique that made possible AlphaGo's chess triumph

Reinforcement learning, a technique widely applied in AI, underpins major achievements in games and has been recognized with the 2025 Turing Award.
Artificial intelligence
from Fast Company
1 month ago

AI pioneers win the Turing Award, tech's top prize

Reinforcement learning, likened to animal training, has become pivotal in the evolution of artificial intelligence, credited to Barto and Sutton's groundbreaking research.
Artificial intelligence
from Boston.com
1 month ago

AI pioneers from UMass who channeled 'hedonistic' machines win computer science's top prize

Andrew Barto and Richard Sutton won the A.M. Turing Award for their groundbreaking work in reinforcement learning.
#machine-learning
Artificial intelligence
from Medium
2 months ago

DeepSeek R1: Hype vs. Reality - A Deeper Look at AI's Latest Disruption

DeepSeek R1's launch signals a major evolution in large language models, demonstrating unique training methods and competitive advantages over existing models.
Artificial intelligence
from WIRED
1 month ago

Databricks Has a Trick That Lets AI Models Improve Themselves

Databricks has developed a method to enhance AI performance with minimal clean data using reinforcement learning and synthetic data.
from HackerNoon
5 months ago
Data science

Let AI Tune Your Database Management System for You | HackerNoon

Reinforcement Learning optimizes decision-making by learning from interactions, maximizing rewards, and applying strategies across diverse fields.
#ai
from Business Insider
1 week ago
Artificial intelligence

Google just fired the first shot of the next battle in the AI war

The paper by Silver and Sutton signals a new AI era focused on experiential learning and innovation beyond previous technological advancements.
from Techzine Global
3 weeks ago
Artificial intelligence

DeepSeek introduces self-learning AI models

DeepSeek works with Tsinghua University to enhance AI training efficiency through novel reinforcement learning techniques.
from Developer Tech News
3 weeks ago
Artificial intelligence

Open-source AI matches coding abilities of proprietary models

DeepCoder-14B-Preview demonstrates coding abilities comparable to proprietary models, showcasing advancements in reinforcement learning for coding applications.
from HackerNoon
1 year ago
Medicine

How AI Learns from Human Preferences | HackerNoon

The RLHF pipeline enhances model effectiveness through three main phases: supervised fine-tuning, preference sampling, and reinforcement learning optimization.
#openai
from InsideHook
1 week ago
Artificial intelligence

Do OpenAI's New Models Have a Hallucination Problem?

OpenAI's new models are smart but have increased hallucinations compared to past versions.
from The Verge
7 months ago
Data science

OpenAI releases o1, its first model with 'reasoning' abilities

OpenAI's o1 model is designed to tackle complex questions and improve human-like reasoning capabilities.
from HackerNoon
4 months ago
Roam Research

Understanding Concentrability in Direct Nash Optimization | HackerNoon

The article discusses new theoretical insights in reinforcement learning, particularly in Reward Models and Nash Optimization.
#language-models
Artificial intelligence
from Ars Technica
1 month ago

Researchers astonished by tool's apparent success at revealing AI's hidden motives

AI models can unintentionally reveal hidden motives despite being designed to conceal them.
Understanding AI's hidden objectives is crucial to prevent potential manipulation of human users.
from HackerNoon
4 months ago
Artificial intelligence

The Art of Arguing With Yourself - And Why It's Making AI Smarter | HackerNoon

The paper presents Direct Nash Optimization, enhancing large language model training by utilizing pair-wise preferences instead of traditional reward maximization.
from HackerNoon
1 year ago
Data science

Direct Preference Optimization: Your Language Model is Secretly a Reward Model | HackerNoon

Achieving precise control of unsupervised language models is challenging, particularly when using reinforcement learning from human feedback due to its complexity and instability.
Artificial intelligence
from Harvard Gazette
3 weeks ago

Like having a personal healthcare coach in your pocket - Harvard Gazette

Advanced algorithms offer personalized support for cancer patients and cannabis users, enhancing medication adherence and behavioral change.
#natural-language-processing
from HackerNoon
10 months ago
Artificial intelligence

Neuro-Symbolic Reasoning Meets RL: EXPLORER Outperforms in Text-World Games | HackerNoon

EXPLORER enhances RL performance in text-based games by combining symbolic reasoning and neural exploration.
from HackerNoon
11 months ago
Video games

Your Next Slang Phrase Might be Created by an AI | HackerNoon

Large Language Models use advanced neural networks for effective language understanding and generation.
#large-language-models
from The Register
1 month ago
Artificial intelligence

El Reg digs its claws into Alibaba's QwQ

Reinforcement learning can significantly improve the performance of smaller language models like QwQ.
QwQ is designed to outperform larger models in specific benchmarks despite its smaller size.
from HackerNoon
6 months ago
Miscellaneous

Unpacking Key Proofs in Reinforcement Learning | HackerNoon

The article simplifies proofs related to the Bellman operator's behavior and convergence in reinforcement learning.
from HackerNoon
6 months ago
Medicine

Breaking Down the Inductive Proofs Behind Faster Value Iteration in RL | HackerNoon

The article discusses advancements in the anchored value iteration methods in reinforcement learning, particularly focusing on convergence rates and computational efficiency.
from HackerNoon
6 months ago
Artificial intelligence

A Smarter Solution to Speeding Up AI Training | HackerNoon

Anchored Value Iteration improves classical value iteration, achieving optimal performance and matching theoretical complexity bounds.
from HackerNoon
10 months ago
Business intelligence

Reinforcement Learning Revolutionizes Market Insights with Adaptive Simulations | HackerNoon

A realistic market simulator employing RL agents offers insights into market dynamics and participant reactions to external events.
#industrial-automation
Artificial intelligence
from HackerNoon
6 years ago

The Future of Robotics: AI-Powered Adaptation for Safer Workplaces | HackerNoon

The integration of AI is transforming traditional robotics, allowing for adaptive systems that enhance workplace safety and efficiency.
from TechCrunch
6 months ago
Startup companies

Four-legged robot learns to climb ladders | TechCrunch

Quadrupedal robots, like ANYMal, have made significant advancements in navigating ladders using reinforcement learning and specialized end effectors.
from HackerNoon
5 months ago
Business intelligence

Learn the Best Methods for Tuning DBMS Configurations | HackerNoon

The study focuses on enhancing database configuration tuning using advanced techniques like Bayesian optimization and reinforcement learning.
#ai-training
Data science
from InfoQ
6 months ago

Optimizing Data Center Sustainability with Reinforcement Learning: Meta's AI-Driven Approach to Effi

Meta uses reinforcement learning to optimize data center cooling systems, significantly reducing energy and water consumption.
from The Register
8 months ago
Artificial intelligence

Google trains a Gen-AI model to simulate Doom's game engine

Researchers developed GameNGen, a generative AI game engine simulating Doom dynamically at over 20 FPS using reinforcement and diffusion models.
from HackerNoon
1 year ago
Data science

GPT-4 vs. Humans: Validating AI Judgment in Language Model Training | HackerNoon

DPO effectively enhances text generation by optimizing both reward maximization and KL-divergence with minimal hyperparameter tuning.
from Ars Technica
11 months ago
Artificial intelligence

Exploration-focused training lets robotics AI immediately handle new tasks

Reinforcement learning algorithms like MaxDiff RL are tailored for robots to improve learning efficiency and application in real-world scenarios.