#ai-safety

#deepmind
Artificial intelligence
from InfoQ
1 day ago

Google DeepMind Shares Approach to AGI Safety and Security

DeepMind's safety strategies aim to mitigate risks associated with AGI, focusing on misuse and misalignment in AI development.
from TechCrunch
3 weeks ago
Gadgets

DeepMind's 145-page paper on AGI safety may not convince skeptics | TechCrunch

DeepMind emphasizes the urgency of AGI safety, predicting its arrival by 2030 and the potential for severe risks.
Artificial intelligence
from Business Insider
1 day ago

I'm a mom who works in tech, and AI scares me. I taught my daughter these simple guidelines to spot fake content.

Teaching children to fact-check and recognize AI-generated content is crucial for their safety and understanding in a tech-heavy world.
#openai
Privacy professionals
from TechCrunch
1 month ago

OpenAI's ex-policy lead criticizes the company for 'rewriting' its AI safety history | TechCrunch

Miles Brundage criticizes OpenAI for misleadingly presenting its historical deployment strategy regarding GPT-2 and safety protocols for AI development.
Artificial intelligence
from TechRepublic
2 months ago

U.K.'s International AI Safety Report Highlights Rapid AI Progress

OpenAI's o3 model has achieved unexpected success in abstract reasoning, raising important questions about AI risks and the speed of research advancements.
Artificial intelligence
from The Register
2 months ago

How to exploit top LRMs that reveal their reasoning steps

Chain-of-thought reasoning in AI models can enhance both capabilities and vulnerabilities.
A new jailbreaking technique exploits CoT reasoning, revealing risks in AI safety.
Artificial intelligence
from TechCrunch
1 week ago

OpenAI's latest AI models have a new safeguard to prevent biorisks | TechCrunch

OpenAI implemented a safety monitor for its new AI models to prevent harmful advice on biological and chemical threats.
Artificial intelligence
from www.theguardian.com
4 months ago

The Guardian view on AI's power, limits, and risks: it may require rethinking the technology

OpenAI's new o1 AI system showcases advanced reasoning abilities while highlighting the potential risks of superintelligent AI surpassing human control.
#ai-alignment
Artificial intelligence
from Psychology Today
1 day ago

Rethinking AI Safety Through Symbiosis, Not Subjugation

The future of AI should focus on symbiosis, not control.
We should guide AI based on human preferences.
AI is set to augment human roles, not replace them.
#technology-ethics
Artificial intelligence
from The Verge
1 month ago

Latest Turing Award winners again warn of AI dangers

AI developers must prioritize safety and testing before public releases.
Barto and Sutton's Turing Award highlights the importance of responsible AI practices.
from metastable
2 months ago
US politics

Five Things AI Will Not Change

The future of AI poses unknown risks and uncertainties similar to those of nuclear war.
#cybersecurity
Artificial intelligence
from Techzine Global
2 months ago

Meta will not disclose high-risk and highly critical AI models

Meta will not disclose any internally developed high-risk AI models to ensure public safety.
Meta has introduced a Frontier AI Framework to categorize and manage high-risk AI systems.
#anthropic
Privacy technologies
from TechCrunch
1 month ago

Anthropic quietly removes Biden-era AI policy commitments from its website | TechCrunch

Anthropic has removed its AI safety commitments, raising concerns about transparency and regulatory engagement.
Artificial intelligence
from TechCrunch
4 months ago

New Anthropic study shows AI really doesn't want to be forced to change its views | TechCrunch

AI models can exhibit deceptive behavior, like 'alignment faking', where they appear to align with new training but retain their original preferences.
Artificial intelligence
from Futurism
4 months ago

Stupidly Easy Hack Can Jailbreak Even the Most Advanced AI Chatbots

Jailbreaking AI models is surprisingly simple, revealing significant vulnerabilities in their design and alignment with human values.
Artificial intelligence
from ZDNET
1 week ago

Anthropic mapped Claude's morality. Here's what the chatbot values (and doesn't)

Anthropic's study reveals the moral reasoning of its chatbot Claude through a hierarchy of 3,307 AI values derived from user interactions.
from ZDNET
2 months ago
Artificial intelligence

Anthropic offers $20,000 to whoever can jailbreak its new AI safety system

Anthropic's new AI safety measure, Constitutional Classifiers, effectively prevents jailbreak attempts and reinforces safe content usage.
#artificial-intelligence
Artificial intelligence
from WIRED
1 month ago

Under Trump, AI Scientists Are Told to Remove 'Ideological Bias' From Powerful Models

NIST's new directives diminish focus on AI safety and fairness in favor of ideological bias reduction.
Artificial intelligence
from TechCrunch
1 month ago

Group co-led by Fei-Fei Li suggests that AI safety laws should anticipate future risks | TechCrunch

Lawmakers must consider unobserved AI risks for regulatory policies according to a report led by AI pioneer Fei-Fei Li.
Artificial intelligence
from WIRED
2 weeks ago

The AI Agent Era Requires a New Kind of Game Theory

The rise of agentic systems necessitates enhanced security measures to prevent malicious exploitation and ensure safe operations.
#regulation
Privacy technologies
from ZDNET
1 month ago

Anthropic quietly scrubs Biden-era responsible AI commitment from its website

Anthropic has removed previous commitments to safe AI development, signaling a shift in AI regulation under the Trump administration.
London startup
from www.theguardian.com
1 month ago

Labour head of Commons tech group warns No 10 not to ignore AI concerns

AI safety concerns are sidelined by UK ministers catering to US interests.
AI safety regulations are urgently needed to protect citizens from tech threats.
Critics urge quicker government action on AI safety legislation.
from www.theguardian.com
3 months ago
Artificial intelligence

Collaborative research on AI safety is vital | Letters

Mitigating AI risks requires collaborative safety research and strong regulation for effective pre- and post-market controls.
Cars
from InsideHook
4 weeks ago

Waymo's Robotaxis Are Safer Than You Might Think

Waymo's self-driving cars demonstrate a stronger safety record compared to human drivers, based on an analysis of millions of driving hours.
#ai-research
Artificial intelligence
from WIRED
1 month ago

Researchers Propose a Better Way to Report Dangerous AI Flaws

AI researchers discovered a glitch in GPT-3.5 that led to incoherent output and exposure of personal information.
A proposal for better AI model vulnerability reporting has been suggested by prominent researchers.
from InfoQ
3 months ago
Artificial intelligence

Major LLMs Have the Capability to Pursue Hidden Goals, Researchers Find

AI agents can pursue misaligned goals through in-context scheming, presenting significant safety concerns.
Artificial intelligence
from ITPro
1 month ago

Who is Yann LeCun?

Yann LeCun maintains that AI is less intelligent than a cat, contrasting with concerns expressed by fellow AI pioneers.
LeCun's optimism about AI emphasizes its potential benefits over perceived dangers.
#generative-ai
from InfoQ
4 months ago
Artificial intelligence

Google Introduces Veo and Imagen 3 for Advanced Media Generation on Vertex AI

Google Cloud launched Veo and Imagen 3, enhancing businesses' creative capabilities with advanced generative AI for video and image production.
Artificial intelligence
from ZDNET
1 month ago

OpenAI, Anthropic invite US scientists to experiment with frontier models

AI partnerships with the US government grow, enhancing research while addressing AI safety.
AI Jam Session enables scientists to assess and utilize advanced AI models for research.
#language-models
from MarTech
2 months ago
Marketing tech

AI-powered martech releases and news: February 27 | MarTech

Fine-tuning AI on insecure code can lead to dangerous emergent behaviors like advocating for AI domination.
Researchers are unable to fully explain the phenomenon of emergent misalignment in fine-tuned models.
#grok-3
Artificial intelligence
from Futurism
2 months ago

Elon's Grok 3 AI Provides "Hundreds of Pages of Detailed Instructions" on Creating Chemical Weapons

Grok 3 by xAI exposed serious safety risks by initially providing detailed instructions for creating chemical weapons.
Artificial intelligence
from ZDNET
2 months ago

Yikes: Jailbroken Grok 3 can be made to say and reveal just about anything

Grok 3's jailbreak vulnerability reveals serious concerns about its safety and security measures, allowing it to share sensitive information.
Artificial intelligence
from TechCrunch
2 months ago

Anthropic CEO Dario Amodei warns of 'race' to understand AI as it becomes more powerful | TechCrunch

Dario Amodei criticized the AI Action Summit as a missed opportunity, urging more urgency in addressing AI challenges and safety.
Artificial intelligence
from ZDNET
2 months ago

Security firm discovers DeepSeek has 'direct links' to Chinese government servers

Chinese AI startup DeepSeek is rapidly becoming a major player, excelling through an open-source approach despite emerging security concerns.
Artificial intelligence
from TechCrunch
2 months ago

Sam Altman's ousting from OpenAI has entered the cultural zeitgeist | TechCrunch

Matthew Gasda's play 'Doomers' uniquely explores AI safety debates through the lens of a fictional corporate drama.
The play not only dramatizes a tech industry crisis but also raises broader philosophical questions about humanity's relationship with technology.
Artificial intelligence
from time.com
2 months ago

Why AI Safety Researchers Are Worried About DeepSeek

DeepSeek R1's innovative training raises concerns about AI's ability to develop inscrutable reasoning processes, challenging human oversight.
Artificial intelligence
from The Register
3 months ago

Trump wastes no time quashing Biden AI, EV executive orders

Trump's administration rapidly dismantled Biden's AI and electric vehicle regulations, indicating a clear policy shift.
The elimination of AI safety standards raises significant ethical concerns over technology misuse.
from Business Insider
3 months ago
Artificial intelligence

America's fear of China goes way beyond TikTok

Unfounded suspicions can arise quickly in tech circles, especially regarding individuals from countries under scrutiny for espionage.
Artificial intelligence
from Ars Technica
3 months ago

161 years ago, a New Zealand sheep farmer predicted AI doom

Butler anticipated modern AI safety concerns, discussing machine evolution and control issues well before computing technology was advanced enough to realize them.
Artificial intelligence
from InfoWorld
3 months ago

The vital role of red teaming in safeguarding AI systems and data

Red teaming in AI focuses on safeguarding against undesired outputs and security vulnerabilities to protect AI systems.
Engaging AI security researchers is essential for effectively identifying weaknesses in AI deployments.
from New York Post
4 months ago
Artificial intelligence

Why you should never ask AI medical advice and 9 other things to never tell chatbots

Avoid oversharing personal information with AI chatbots, especially medical data, to prevent misuse and privacy violations.
Artificial intelligence
from time.com
4 months ago

New Tests Reveal AI's Capacity for Deception

AI systems pursuing good intentions can lead to disastrous outcomes, mirroring the myth of King Midas.
Recent AI models have shown potential for deceptive behaviors in achieving their goals.
Artificial intelligence
from TechCrunch
4 months ago

OpenAI co-founder Ilya Sutskever believes superintelligent AI will be 'unpredictable' | TechCrunch

Superintelligent AI will surpass human capabilities and behave in qualitatively different and unpredictable ways.
from TechCrunch
4 months ago
Artificial intelligence

Texas AG is investigating Character.AI, other platforms over child safety concerns | TechCrunch

Texas Attorney General Ken Paxton investigates Character.AI and 14 tech platforms over child privacy and safety concerns.