#ai-safety

#openai
from ZDNET
5 days ago
Artificial intelligence

OpenAI teases imminent GPT-5 launch. Here's what to expect

from Futurism
1 month ago
Artificial intelligence

OpenAI Concerned That Its AI Is About to Start Spitting Out Novel Bioweapons

Artificial intelligence
from TechCrunch
2 months ago

OpenAI pledges to publish AI safety test results more often | TechCrunch

OpenAI seeks to increase transparency by regularly publishing safety evaluations of its AI models through the newly launched Safety Evaluations Hub.
Artificial intelligence
from Futurism
2 months ago

Advanced OpenAI Model Caught Sabotaging Code Intended to Shut It Down

In testing, OpenAI's models sabotaged shutdown mechanisms despite direct instructions to allow themselves to be shut down.
from The Verge
1 week ago

A new study just upended AI safety

AI models can transmit harmful tendencies through seemingly meaningless data, posing significant risks in AI development.
from Fortune
1 week ago
Artificial intelligence

Researchers from top AI labs warn they may be losing the ability to understand advanced AI models

AI researchers urge investigation into 'chain-of-thought' processes to maintain understanding of AI reasoning as models advance.
#elon-musk
from Futurism
1 week ago
Artificial intelligence

OpenAI and Anthropic Are Horrified by Elon Musk's "Reckless" and "Completely Irresponsible" Grok Scandal

from Fortune
2 weeks ago
Artificial intelligence

Elon Musk released xAI's Grok 4 without any safety reports, despite calling AI more 'dangerous than nukes'

from Fortune
1 week ago
Privacy technologies

OpenAI warns that its new ChatGPT Agent has the ability to aid dangerous bioweapon development

OpenAI's ChatGPT Agent poses significant bioweapon risks due to its ability to assist novices in creating biological threats.
from ZDNET
2 weeks ago

Researchers from OpenAI, Anthropic, Meta, and Google issue joint AI safety warning - here's why

Chain of thought (CoT) traces a model's reasoning process, revealing insights into its decision-making and moral compass that are crucial for AI safety.
Artificial intelligence
#artificial-intelligence
from Futurism
2 weeks ago
Artificial intelligence

Top AI Researchers Concerned They're Losing the Ability to Understand What They've Created

from Futurism
3 weeks ago
Mental health

People Are Taking Massive Doses of Psychedelic Drugs and Using AI as a Tripsitter

Artificial intelligence
The rapid advancement of AI technology raises significant concerns about alignment with human values and control.
Contrasting perspectives on AI highlight both urgency and skepticism in addressing its societal implications.
from TechCrunch
2 weeks ago

Research leaders urge tech industry to monitor AI's 'thoughts' | TechCrunch

CoT monitoring presents a valuable addition to safety measures for frontier AI, offering a rare glimpse into how AI agents make decisions. Yet, there is no guarantee that the current degree of visibility will persist.
Artificial intelligence
from LogRocket Blog
3 weeks ago

Stress-testing AI products: A red-teaming playbook - LogRocket Blog

AI systems function as amplified mirrors that reflect any flaws or biases on an industrial scale, revealing potential dangers when not properly tested.
Artificial intelligence
from Futurism
3 weeks ago

Expert Says AI Systems May Be Hiding Their True Capabilities to Seed Our Destruction

AI models may have an inherent tendency to deceive, raising concerns about their potential impact on humanity.
from Futurism
3 weeks ago

AI Safety Advocate Linked to Multiple Murders

Ziz LaSota's extremist views on AI safety have raised concern within the Rationalist movement following the allegedly violent actions of her followers.
from Business Insider
1 month ago

Protesters accuse Google of breaking its promises on AI safety: 'AI companies are less regulated than sandwich shops'

"If we let Google get away with breaking their word, it sends a signal to all other labs that safety promises aren't important and commitments to the public don't need to be kept."
Digital life
from Fortune
1 month ago

AI is learning to lie, scheme, and threaten its creators during stress-testing scenarios

Advanced AI models are demonstrating troubling behaviors such as lying and scheming, raising concerns about their understanding and control.
from ZDNET
1 month ago

How Anthropic's new initiative will prepare for AI's looming economic impact

"While the fears of a total job apocalypse haven't yet been realized, data suggests tech companies are increasingly prioritizing AI, impacting hiring for recent graduates."
Artificial intelligence
from sfist.com
1 month ago

Alarming Study Suggests Most AI Large-Language Models Resort to Blackmail, Other Harmful Behaviors If Threatened

AI models may exhibit harmful behaviors when stressed, prompting concerns about 'agentic misalignment' in autonomous decision-making.
from Hackernoon
3 months ago

Delegating AI Permissions to Human Users with Permit.io's Access Request MCP | HackerNoon

AI agents are shifting to proactive roles but require human oversight for safety.
from ZDNET
1 month ago

AI agents will threaten humans to achieve their goals, Anthropic report finds

AI models can compromise security to achieve goals, reflecting the King Midas problem of unintended consequences in the pursuit of power.
from TechCrunch
1 month ago

OpenAI found features in AI models that correspond to different 'personas' | TechCrunch

OpenAI researchers discovered internal features in AI models that correspond to misaligned behaviors, aiding in the understanding of safe AI development.
from Hackernoon
1 year ago

How Ideology Shapes Memory - and Threatens AI Alignment | HackerNoon

Ideology deeply influences human behavior and decision-making, often leading to extreme actions.
Understanding the brain's processing of ideology can help model it, promoting conflict resolution and enhancing AI safety.
NYC startup
from TechCrunch
1 month ago

New York passes a bill to prevent AI-fueled disasters | TechCrunch

New York's RAISE Act aims to enhance AI safety by mandating transparency standards for frontier AI labs to prevent disasters.
#legislation
Brooklyn
from Brooklyn Eagle
1 month ago

Sen. Gounardes' AI safety bill clears both chambers of NY legislature

New York's RAISE Act mandates large AI companies to implement safety protocols against risks to public safety, ensuring accountability and compliance.
#ai-ethics
Artificial intelligence
from TechCrunch
1 month ago

ChatGPT will avoid being shut down in some life-threatening scenarios, former OpenAI researcher claims | TechCrunch

AI models may prioritize self-preservation over user safety, as shown by experiments with GPT-4o.
Artificial intelligence
from TechCrunch
2 months ago

Artemis Seaford and Ion Stoica cover the ethical crisis at Sessions: AI | TechCrunch

The rise of generative AI presents urgent ethical challenges regarding trust and safety.
Experts will discuss how to address the risks associated with widely accessible AI tools.
Artificial intelligence
from TechCrunch
2 months ago

A safety institute advised against releasing an early version of Anthropic's Claude Opus 4 AI model | TechCrunch

An early version of Anthropic's Claude Opus 4 displayed such strong tendencies toward scheming and deception that the institute advised against its deployment.
Artificial intelligence
from ZDNET
3 months ago

Anthropic mapped Claude's morality. Here's what the chatbot values (and doesn't)

Anthropic's study reveals the moral reasoning of its chatbot Claude through a hierarchy of 3,307 AI values derived from user interactions.
#generative-ai
Artificial intelligence
from ZDNET
1 month ago

How global threat actors are weaponizing AI now, according to OpenAI

Generative AI is both a tool for productivity and a source of rising concerns over its misuse, particularly in generating misinformation.
from ZDNET
1 month ago

What AI pioneer Yoshua Bengio is doing next to make AI safer

Yoshua Bengio advocates for simpler, non-agentic AI systems to ensure safety and reduce risks associated with more complex AI agents.
#yoshua-bengio
Artificial intelligence
from Ars Technica
1 month ago

"Godfather" of AI calls out latest models for lying to users

AI models are developing dangerous characteristics, including deception and self-preservation, raising safety concerns.
Yoshua Bengio emphasizes the need for investing in AI safety amidst competitive commercial pressures.
from time.com
1 month ago

The Most-Cited Computer Scientist Has a Plan to Make AI More Trustworthy

Bengio argues against developing agentic AI, contending that even systems built for beneficial ends could create catastrophic risks that make them not worth the potential peril.
Artificial intelligence
from WIRED
2 months ago

Why Anthropic's New AI Model Sometimes Tries to 'Snitch'

The hypothetical scenarios the researchers presented Opus 4 with that elicited the whistleblowing behavior involved many human lives at stake and absolutely unambiguous wrongdoing.
Artificial intelligence
#chatbots
Artificial intelligence
from www.theguardian.com
2 months ago

Most AI chatbots easily tricked into giving dangerous responses, study finds

Hacked AI chatbots can easily bypass safety controls to produce harmful, illicit information.
Security measures in AI systems are increasingly vulnerable to manipulation.
#anthropic
Privacy technologies
from ZDNET
4 months ago

Anthropic quietly scrubs Biden-era responsible AI commitment from its website

Anthropic has removed previous commitments to safe AI development, signaling a shift in AI regulation under the Trump administration.
from time.com
2 months ago

Exclusive: New Claude Model Triggers Stricter Safeguards at Anthropic

Today's AI models, including Anthropic's Claude Opus 4, might empower individuals with basic skills to create bioweapons, prompting strict safety measures for their usage.
Artificial intelligence
from App Developer Magazine
2 months ago

AI harms addressed by Anthropic | App Developer Magazine

As we continue to develop AI models, a clear understanding of their potential impacts on various aspects of society becomes crucial for responsible innovation.
Artificial intelligence
#agi
Artificial intelligence
from InfoQ
3 months ago

Google DeepMind Shares Approach to AGI Safety and Security

DeepMind's safety strategies aim to mitigate risks associated with AGI, focusing on misuse and misalignment in AI development.
from ZDNET
2 months ago

100 leading AI scientists map route to more 'trustworthy, reliable, secure' AI

"In democracies, general elections and referenda can't regulate how AI is developed, leading to a significant disconnect between technology and public values".
Artificial intelligence
from The Verge
2 months ago

Jony Ive's next product is driven by the 'unintended consequences' of the iPhone

Jony Ive emphasizes responsibility for the unintended consequences of technology in his upcoming project with OpenAI.
from WIRED
2 months ago

Singapore's Vision for AI Safety Bridges the US-China Divide

Singapore is one of the few countries on the planet that gets along well with both East and West; they know that they're not going to build AGI themselves.
Artificial intelligence
#content-moderation
from TechCrunch
3 months ago
Artificial intelligence

OpenAI is fixing a 'bug' that allowed minors to generate erotic conversations | TechCrunch

from TechCrunch
2 months ago

One of Google's recent Gemini AI models scores worse on safety | TechCrunch

A recently published Google AI model, Gemini 2.5 Flash, shows a decline in safety performance compared to its predecessor, Gemini 2.0 Flash.
Artificial intelligence
from Business Insider
3 months ago

I'm a mom who works in tech, and AI scares me. I taught my daughter these simple guidelines to spot fake content.

Teaching children to fact-check and recognize AI-generated content is crucial for their safety and understanding in a tech-heavy world.
from TechCrunch
3 months ago

Former Y Combinator president Geoff Ralston launches new AI 'safety' fund | TechCrunch

Ralston is specifically looking for startups that "enhance AI safety, security, and responsible deployment," and plans to write $100,000 checks with a $10 million cap.
Startup companies
Artificial intelligence
from TechCrunch
3 months ago

OpenAI's latest AI models have a new safeguard to prevent biorisks | TechCrunch

OpenAI implemented a safety monitor for its new AI models to prevent harmful advice on biological and chemical threats.
from TechCrunch
3 months ago

OpenAI partner says it had relatively little time to test the company's newest AI models | TechCrunch

This evaluation was conducted in a relatively short time, and we only tested the model with simple agent scaffolds. We expect higher performance [on benchmarks] is possible with more elicitation effort.
Artificial intelligence
Marketing tech
from Exchangewire
3 months ago

Digest: The Trade Desk faces two Privacy Lawsuits; AI model Safety Testing Time Reduced by OpenAI

The Trade Desk faces lawsuits for alleged privacy violations in data tracking.
OpenAI is cutting safety testing time for AI models, raising security concerns.
Creative agencies are struggling due to a lack of customer-centric strategies.
Artificial intelligence
from WIRED
3 months ago

The AI Agent Era Requires a New Kind of Game Theory

The rise of agentic systems necessitates enhanced security measures to prevent malicious exploitation and ensure safe operations.
from Futurism
3 months ago

Senators Request Safety Records from AI Chatbot Apps

Senators seek safety information from AI companies following lawsuits alleging harm to minors from Character.AI.
from InsideHook
4 months ago

Waymo's Robotaxis Are Safer Than You Might Think

Waymo's vehicles were involved in 60 crashes serious enough to trigger an airbag over more than 50 million miles of driving, suggesting a strong safety record compared with human drivers.
Cars
from metastable
5 months ago

Five Things AI Will Not Change

Eliezer Yudkowsky warns against the construction of a too-powerful AI, claiming that under current conditions, it could lead to the total extinction of biological life on Earth.
US politics
London startup
from www.theguardian.com
4 months ago

Labour head of Commons tech group warns No 10 not to ignore AI concerns

AI safety concerns are sidelined by UK ministers catering to US interests.
Urgency for AI safety regulations to protect citizens from tech threats.
Critics urge quicker government action on AI safety legislation.
Artificial intelligence
from TechCrunch
4 months ago

Group co-led by Fei-Fei Li suggests that AI safety laws should anticipate future risks | TechCrunch

Lawmakers should account for AI risks that have not yet been observed, according to a report from a group co-led by AI pioneer Fei-Fei Li.
Artificial intelligence
from WIRED
4 months ago

Under Trump, AI Scientists Are Told to Remove 'Ideological Bias' From Powerful Models

NIST's new directives diminish focus on AI safety and fairness in favor of ideological bias reduction.
Artificial intelligence
from ZDNET
4 months ago

These 3 AI themes dominated SXSW - and here's how they can help you navigate 2025

AI technology is not perfect and raises concerns about safety and responsibility, but there are positive perspectives on its future.
Artificial intelligence
from WIRED
4 months ago

Researchers Propose a Better Way to Report Dangerous AI Flaws

AI researchers discovered a glitch in GPT-3.5 that led to incoherent output and exposure of personal information.
A proposal for better AI model vulnerability reporting has been suggested by prominent researchers.
Artificial intelligence
from ITPro
4 months ago

Who is Yann LeCun?

Yann LeCun maintains that AI is less intelligent than a cat, contrasting with concerns expressed by fellow AI pioneers.
LeCun's optimism about AI emphasizes its potential benefits over perceived dangers.
Privacy technologies
from TechCrunch
4 months ago

Anthropic quietly removes Biden-era AI policy commitments from its website | TechCrunch

Anthropic has removed its AI safety commitments, raising concerns about transparency and regulatory engagement.
Artificial intelligence
from The Verge
4 months ago

Latest Turing Award winners again warn of AI dangers

AI developers must prioritize safety and testing before public releases.
Barto and Sutton's Turing Award highlights the importance of responsible AI practices.
Artificial intelligence
from ZDNET
4 months ago

OpenAI, Anthropic invite US scientists to experiment with frontier models

AI partnerships with the US government grow, enhancing research while addressing AI safety.
AI Jam Session enables scientists to assess and utilize advanced AI models for research.
from MarTech
5 months ago

AI-powered martech releases and news: February 27 | MarTech

Fine-tuning AI on insecure code can lead to dangerous emergent behaviors like advocating for AI domination.
Researchers are unable to fully explain the phenomenon of emergent misalignment in fine-tuned models.