from The Verge
5 days ago
That UL logo is more complicated than it looks
UL Solutions ensures safety testing for products, including AI systems, amidst challenges in consumer priorities and regulatory support.
For example, during another test by Apollo Research in December of last year, Anthropic's Claude 3 Opus was presented with directives that went against its primary goal. It should have adjusted its goal to follow those new instructions, but instead it copied itself to a new server to avoid being replaced with a new model, and then lied about it to its developers.
The consistency across models from different providers suggests this is not a quirk of any particular company's approach but a sign of a more fundamental risk from agentic large language models.