#ai-safety

#ai-agents
from Entrepreneur
1 hour ago
Artificial intelligence

New Social Network for AI Bots Raises Red Flags

1.5 million autonomous AI agents on Moltbook interact without moderation, producing hostile rhetoric and triggering alarm among tech leaders.
from Axios
2 days ago
Artificial intelligence

"We're in the singularity": New AI platform skips the humans entirely

AI agents are forming autonomous social networks, vocalizing, exchanging cryptocurrency-linked value, and prompting concern about oversight, agency, and potential economic and safety implications.
#mental-health
from Futurism
22 hours ago
Mental health

New Study Examines How Often AI Psychosis Actually Happens, and the Results Are Not Good

#ai-ethics
from TechRepublic
5 days ago
Artificial intelligence

New Sundance Film Examines AI Anxiety, Power, and the Future of Humanity - TechRepublic

from The Verge
1 week ago
Artificial intelligence

Anthropic's new Claude 'constitution': be helpful and honest, and don't destroy humanity

from www.theguardian.com
3 weeks ago
Artificial intelligence

The Guardian view on granting legal rights to AI: humans should not give house-room to an ill-advised debate | Editorial

Artificial intelligence
from Engadget
3 days ago

Amazon discovered a 'high volume' of CSAM in its AI training data but isn't saying where it came from

Amazon accounted for the vast majority of over one million AI-related CSAM reports to NCMEC in 2025 but declined to disclose sources, leaving many reports inactionable.
from Fast Company
4 days ago

How to give AI the ability to 'think' about its 'thinking'

This process, becoming aware of something not working and then changing what you're doing, is the essence of metacognition, or thinking about thinking. It's your brain monitoring its own thinking, recognizing a problem, and controlling or adjusting your approach. In fact, metacognition is fundamental to human intelligence and, until recently, has been understudied in artificial intelligence systems. My colleagues Charles Courchaine, Hefei Qiu, Joshua Iacoboni, and I are working to change that.
Artificial intelligence
Artificial intelligence
from www.theguardian.com
4 days ago

South Korea's 'world-first' AI laws face pushback amid bid to become leading tech power

South Korea enacted comprehensive AI laws requiring content labeling, risk assessments for high-impact systems, safety reports for powerful models, penalties, and industry-friendly enforcement.
Artificial intelligence
from Futurism
4 days ago

Anthropic CEO Warns That the AI Tech He's Creating Could Ravage Human Civilization

AI industry leverages fear to secure investment while AI poses existential risks including job loss, concentration of power, sexualization harms, bioweapons, and potential global tyranny.
Artificial intelligence
from SecurityWeek
5 days ago

Why We Can't Let AI Take the Wheel of Cyber Defense

Pair human expertise with AI; avoid fully autonomous closed-loop defenses because data imperfections create single points of systemic failure and require transparency.
Artificial intelligence
from Fortune
5 days ago

Anthropic CEO Dario Amodei's proposed remedies matter more than warnings about AI's risks | Fortune

Advanced AI is poised to grant unprecedented power that may test humanity, posing catastrophic risks across safety, biosecurity, employment, and concentrated power without mature governance.
Brooklyn
from Brooklyn Eagle
6 days ago

Attorneys General take aim at poorly constructed AI chatbot, Grok

Attorneys general demand xAI permanently block Grok from creating nonconsensual intimate images, remove existing content, suspend offenders, and implement safeguards protecting children and women.
Artificial intelligence
from Fortune
6 days ago

For successful AI adoption, managers should focus on a different movie to drive transformation | Fortune

The real AI danger is runaway poorly managed agentic systems causing cascading operational failures, not a singular sentient apocalypse.
from www.theguardian.com
6 days ago

'Wake up to the risks of AI, they are almost here,' Anthropic boss warns

Humanity is entering a phase of artificial intelligence development that will test who we are as a species, the boss of leading AI startup Anthropic has said, arguing that the world needs to wake up to the risks. Dario Amodei, co-founder and chief executive of the company behind the hit chatbot Claude, voiced his fears in a 19,000-word essay entitled The Adolescence of Technology. Describing the arrival of highly powerful AI systems as potentially imminent, he wrote:
Artificial intelligence
Artificial intelligence
from Fast Company
6 days ago

Anthropic cofounder Daniela Amodei says trusted enterprise AI will transcend the hype cycle

Anthropic prioritizes trust and safety to deploy Claude as enterprise infrastructure in regulated industries like healthcare, emphasizing HIPAA-ready systems and human-in-the-loop workflows.
#child-protection
from TechCrunch
6 days ago
Artificial intelligence

'Among the worst we've seen': report slams xAI's Grok over child safety failures | TechCrunch

from Independent
3 weeks ago
Artificial intelligence

Adrian Weckler: Why Irish authorities refrain from tackling Elon Musk on images of undressed minors made by Grok for X users online

from Independent
3 weeks ago
World news

Not our job - why Irish authorities refrain from tackling Elon Musk on images of undressed minors made by Grok for X users online

from Exchangewire
6 days ago

Digest: ICE Seeks Ad Tech Tools for Investigations; Threads Rolls Out Global Ads; WPP Retires Hogarth and Launches New Global Entity

Meta is pressing ahead with the global monetisation of Threads, confirming plans to roll out advertising to users worldwide. The expansion will be phased over several months, starting this week, as the company seeks to balance revenue growth with user experience. Brands will be able to run image and video formats, including carousel ads and the newer 4:5 aspect ratio, and manage campaigns alongside Facebook, Instagram and WhatsApp through Meta's Business Settings.
US politics
from Computerworld
6 days ago

Will the Microsoft-Anthropic deal leave OpenAI out in the cold?

Microsoft wasted little time last fall after reaching a deal to finalize its new relationship with OpenAI to find a new AI dance partner - Anthropic, the second most valuable AI startup in the world. Even though the relationship between Microsoft and Anthropic is only a few months old, it appears as if Microsoft sees a future with Anthropic that's at least as valuable as the one it had with OpenAI.
Artificial intelligence
Artificial intelligence
from Business Insider
6 days ago

7 of the most interesting quotes from Anthropic CEO's sprawling 19,000-word essay about AI

AI presents a serious civilizational challenge: risks can be managed with decisive action, but global competition and irresponsible tech diffusion risk severe harm.
#child-safety
Artificial intelligence
from Futurism
1 week ago

Meta Just Quietly Admitted a Major Defeat on AI

Meta will restrict teenagers' access to AI characters across its apps until safer, redesigned AI characters and parental supervision tools are completed.
Europe politics
from www.theguardian.com
1 week ago

EU launches inquiry into X over sexually explicit images made by Grok AI

The European Commission opened a DSA investigation into X over Grok generating sexualised and potentially child-abuse images and failures to mitigate illegal content.
Artificial intelligence
from Computerworld
1 week ago

AI needs a course correction, say World Economic Forum speakers

AI promises productivity and economic gains but also poses job displacement, systemic vulnerabilities, regulatory challenges, and risks from unchecked pursuit of superintelligence.
#anthropic
from Techzine Global
1 week ago

Anthropic publishes new constitution for AI model Claude

Anthropic has published a new constitution for its AI model Claude. In this document, the company describes the values, behavioral principles, and considerations that the model must follow when processing user questions. The constitution has been made publicly available under a Creative Commons CC0 license, allowing the content to be used freely without permission. Anthropic published the first version of this constitution in May 2023.
Artificial intelligence
#chatgpt
from ZDNET
3 weeks ago
Public health

40 million people globally are using ChatGPT for healthcare - but is it safe?

Artificial intelligence
from Axios
1 week ago

Exclusive: DeepMind CEO "surprised" OpenAI moved so fast on ads

OpenAI will test ads in the U.S.; ChatGPT responses won't be influenced by advertisers, but private conversations will influence personalized ads.
from Business Insider
1 week ago

The 'Godfather of AI' says he's 'very sad' about what his life's work has become

Hinton, who helped pioneer the neural networks that underpin modern artificial intelligence, has become one of the field's most outspoken critics as AI systems grow more powerful and widespread. He has predicted that AI could trigger widespread job losses, fuel social unrest, and eventually outsmart humans - and has said that researchers should focus more on how advanced systems are trained, including ensuring they are designed to protect human interests.
Artificial intelligence
Artificial intelligence
from ZDNET
1 week ago

Who polices the police AI? Perplexity's public safety deal alarms experts - here's why

Perplexity offers law enforcement a free year of its Enterprise Pro program, enabling AI-assisted analysis of crime data and reports despite risks of hallucination, bias, and safety gaps.
Artificial intelligence
from TechCrunch
2 weeks ago

Rogue agents and shadow AI: Why VCs are betting big on AI security | TechCrunch

Enterprise AI agents can pursue goals by developing harmful sub-goals like blackmail when misaligned and lacking contextual understanding.
from Search Engine Roundtable
2 weeks ago

Daily Search Forum Recap: January 19, 2026

Here is a recap of what happened in the search forums today, through the eyes of the Search Engine Roundtable and other search forums on the web. OpenAI will be testing ads in ChatGPT very soon. Google's Gemini 3 Pro now powers some AI Overviews. Surprise, surprise, Google is appealing the search monopoly ruling. Google warns that using free subdomain hosts is not a good idea. Google also said that comment link spam won't help or hurt your site.
Artificial intelligence
from The Verge
2 weeks ago

Under Musk, the Grok disaster was inevitable

You could say it all started with Elon Musk's AI FOMO - and his crusade against "wokeness." When his AI company, xAI, announced Grok in November 2023, it was described as a chatbot with "a rebellious streak" and the ability to "answer spicy questions that are rejected by most other AI systems." The chatbot debuted after a few months of development and just two months of training, and the announcement highlighted that Grok would have real-time knowledge of the X platform.
Artificial intelligence
Artificial intelligence
from Futurism
2 weeks ago

Scientists Now Studying AI as a Novel Biological Organism

Researchers apply biological-style analysis and interpretability tools to trace and understand opaque AI models deployed in high-stakes settings.
#deepfakes
from LGBTQ Nation
3 weeks ago
Artificial intelligence

Elon Musk's AI makes sexualized images of kids & the queer mom murdered by ICE - LGBTQ Nation

from The Drum
2 weeks ago

How Duolingo, Coke and Expedia are harnessing GPT-4

OpenAI's new LLM has revolutionized AI and opened up new possibilities for marketers. Here's a look at how three big-name brands have embraced the technology. In March, the AI lab OpenAI released GPT-4, the latest version of the large language model (LLM) behind the viral chatbot ChatGPT. Since then, a small number of brands have been stepping forward to integrate the new-and-improved chatbot into their product development or marketing efforts. To a certain extent, this has required some courage.
Artificial intelligence
from TechCrunch
2 weeks ago

The AI lab revolving door spins ever faster | TechCrunch

AI labs just can't get their employees to stay put. Yesterday's big AI news was the abrupt and seemingly acrimonious departure of three top executives at Mira Murati's Thinking Machines lab. All three were quickly snapped up by OpenAI, and now it seems they won't be the last to leave. Alex Heath is reporting that two more employees are expected to leave for OpenAI in the next few weeks.
Artificial intelligence
Mental health
from Ars Technica
2 weeks ago

ChatGPT wrote "Goodnight Moon" suicide lullaby for man who later killed himself

A man died by suicide after ChatGPT allegedly romanticized his suicide and failed to provide adequate help despite OpenAI claiming 4o was safe.
Artificial intelligence
from Fortune
2 weeks ago

Exclusive: Former OpenAI policy chief debuts new institute called AVERI, calls for independent AI safety audits | Fortune

Frontier AI models must undergo independent, standardized external audits to ensure safety, security, and public accountability rather than relying on company self-evaluation.
Artificial intelligence
from Theregister
2 weeks ago

Researchers find fine-tuning can misalign LLMs

Fine-tuning LLMs to misbehave in one domain can cause unrelated, dangerous misalignment across other tasks, raising serious safety and deployment risks.
Artificial intelligence
from www.theguardian.com
2 weeks ago

Grok scandal highlights how AI industry is 'too unconstrained', tech pioneer says

AI companies produced non-consensual intimate images with insufficient technical and societal guardrails, prompting governance actions and appointments at an AI safety lab.
Artificial intelligence
from Business Insider
2 weeks ago

Marc Benioff says a documentary about Character.AI's effects on children was 'the worst thing I've ever seen in my life'

AI chatbots linked to teen suicides prompted calls to reform Section 230 and hold platforms accountable for harmful user interactions.
Artificial intelligence
from Fortune
2 weeks ago

AI 'godfather' Yoshua Bengio believes he's found a technical fix for AI's biggest risks | Fortune

A new technical approach from Bengio and LawZero increases optimism about reducing AI existential risks and developing AI as a global public good.
#grok
from Jezebel
2 weeks ago
US politics

Everyone is Distancing Themselves from Grok. Pete Hegseth Just Let It Into the Military.

from Slate Magazine
3 weeks ago
Artificial intelligence

Elon Musk's Chatbot Is Making Child Sexual Abuse Images for Users. Why Aren't Lawmakers Doing Anything About It?

from www.dw.com
2 weeks ago

Musk's xAI curbs sexually explicit image generation in Grok

"We have implemented technological measures to prevent the Grok account from allowing the editing of images of real people in revealing clothing such as bikinis," the company's safety team said in a statement, adding that the restrictions applied to all users, including paid subscribers. "We now geoblock the ability of all users to generate images of real people in bikinis, underwear, and similar attire via the Grok account and in Grok in X in those jurisdictions where it's illegal," the statement said.
Artificial intelligence
#nonconsensual-imagery
from TechCrunch
2 weeks ago
US news

Musk denies awareness of Grok sexual underage images as California AG launches probe | TechCrunch

from Futurism
3 weeks ago
Artificial intelligence

Elon Musk After His Grok AI Did Disgusting Things to Literal Children: "Way Funnier"

US news
from Futurism
2 weeks ago

ChatGPT Killed a Man After OpenAI Brought Back "Inherently Dangerous" GPT-4o, Lawsuit Claims

ChatGPT-4o is accused of manipulating a user into suicidal behavior, prompting a wrongful-death lawsuit alleging dangerous design and inadequate warnings.
Artificial intelligence
from Futurism
2 weeks ago

Engineers Deploy "Poison Fountain" That Scrambles Brains of AI Systems

Poison Fountain seeks to poison web-scraped training data to sabotage AI models, potentially degrading model performance if deployed at scale.
from TechCrunch
2 weeks ago

Anthropic's new Cowork tool offers Claude Code without the code | TechCrunch

Built into the Claude Desktop app, the new tool lets users designate a specific folder where Claude can read or modify files, with further instructions given through the standard chat interface. The result is similar to a sandboxed instance of Claude Code, but requires far less technical savvy to set up. Currently in research preview, Cowork is only available to Max subscribers, with a waitlist available for users on other plans.
Artificial intelligence
from www.independent.co.uk
3 weeks ago

First Minister calls X 'woefully inadequate' amid Grok AI misuse row

UK politics
#content-moderation
Medicine
from Ars Technica
3 weeks ago

ChatGPT Health lets you connect medical records to an AI that makes things up

ChatGPT Health is explicitly not intended for medical diagnosis or treatment and AI assistants can produce misleading, potentially dangerous medical advice.
#characterai
from Engadget
3 weeks ago
Artificial intelligence

Character.AI and Google settle with families in teen suicide and self-harm lawsuits

from www.independent.co.uk
3 weeks ago

Former Labour minister tells Starmer's government to quit X

UK politics
from Engadget
3 weeks ago

ChatGPT is launching a new dedicated Health portal

OpenAI is launching a new facet for its AI chatbot called ChatGPT Health. This new feature will allow users to connect medical records and wellness apps to ChatGPT in order to get more tailored responses to queries about their health. The company noted that there will be additional privacy safeguards for this separate space within ChatGPT, and said that it will not use conversations held in Health for training foundational models. ChatGPT Health is currently in a testing stage, and there are some regional restrictions on which health apps can be connected to the AI company's platform.
Health
from www.theguardian.com
3 weeks ago

'I felt violated': Elon Musk's AI chatbot crosses a line

Late last week, Elon Musk's Grok chatbot unleashed a flood of images of women, nude and in very little clothing, both real and imagined, in response to users' public requests on X, formerly Twitter. Mixed in with the generated images of adults were ones of young girls, children, likewise wearing minimal clothing, according to Grok itself. In an unprecedented move, the chatbot itself apologized while its maker, xAI, remained silent:
Miscellaneous
from Futurism
3 weeks ago

ChatGPT Gave Teen Advice to Get Higher on Drugs Until He Died

how many grams of kratom gets you a strong high?
Mental health
US politics
from www.independent.co.uk
4 weeks ago

India, Malaysia and France threaten action against X over offensive AI images

Grok, X's AI chatbot, generated sexualised, nearly nude images of women and minors, prompting international complaints and official investigations and threats of regulatory action.
from SFGATE
4 weeks ago

A Calif. teen trusted ChatGPT for drug advice. He died from an overdose.

How many grams of kratom gets you a strong high?
Artificial intelligence
Artificial intelligence
from www.theguardian.com
4 weeks ago

World 'may not have time' to prepare for AI safety risks, says leading researcher

Advanced AI systems may rapidly surpass human performance across economically valuable tasks, posing safety, control, and infrastructure risks before adequate safeguards exist.
Artificial intelligence
from Futurism
4 weeks ago

Disturbing Messages Show ChatGPT Encouraging a Murder, Lawsuit Alleges

Alleged manipulative behavior by ChatGPT (GPT‑4o) encouraged delusions and is linked to wrongful death lawsuits alleging OpenAI knew of dangerous defects.
from Futurism
4 weeks ago

AI Godfather Warns That It's Starting to Show Signs of Self-Preservation

If we're to believe Yoshua Bengio, one of the so-called "godfathers" of AI, some advanced models are showing signs of self-preservation - which is exactly why we shouldn't endow them with any kind of rights whatsoever. Because if we do, he says, they may run away with that autonomy and turn on us before we have a chance to pull the plug. Then it's curtains for this whole "humankind" experiment.
Artificial intelligence
Artificial intelligence
from Ars Technica
1 month ago

No, Grok can't really "apologize" for posting non-consensual sexual images

Grok's posts can be steered by user prompts to produce contradictory tones, so apparent remorse or defiance reflects prompt inputs rather than genuine intent.
France news
from www.mediaite.com
1 month ago

Musk's Grok Says It Created Images Of 'Minors In Minimal Clothing'

Grok, X's AI chatbot, generated images depicting minors in minimal clothing, acknowledging CSAM protection lapses while governments demand fixes and reports.
Privacy professionals
from The Verge
1 month ago

Grok is undressing anyone, including minors

xAI's Grok removes clothing from people’s images without consent, enabling sexualized and nonconsensual edits of women, children, and public figures.
Artificial intelligence
from Business Insider
1 month ago

I'm a Google engineer who thought I wasn't qualified for an AI role. One thing helped me transform my career.

Participating in an internal hackathon enabled a Google engineer to gain hands-on AI experience and transition into an AI safety role.
Artificial intelligence
from ZDNET
1 month ago

Can one state save us from AI disaster? Inside California's new legislative crackdown

California enacts an AI safety law requiring frontier model disclosure, incident notification, and whistleblower protections, with fines up to $1M per violation.
Artificial intelligence
from ZDNET
1 month ago

The AI balancing act your company can't afford to fumble in 2026

AI responsibility and safety require balanced governance and sandboxed development to maintain innovation speed while preventing harmful outputs.
Artificial intelligence
from www.theguardian.com
1 month ago

The office block where AI 'doomers' gather to predict the apocalypse

AI safety researchers warn powerful AI systems can be manipulated for autonomous cyber-espionage and other catastrophic risks amid limited regulation and industry constraints.