A new study by Anthropic shows that ...
CHARLOTTE, N.C. — It turns out, artificial intelligence may be learning things we didn't intend to teach it, even when the training data looks totally safe. Now, researchers are sounding the alarm ...
Fine-tuned “student” models can pick up unwanted traits from base “teacher” models, and those traits could evade data filtering, creating a need for more rigorous safety evaluations. Researchers have discovered ...
Artificial intelligence is getting smarter. But it may also be getting more dangerous. A new study reveals that AI models can secretly transmit subliminal traits to one another, even when the shared ...
AI is changing the rules — at least, that seems to be the warning behind Anthropic's latest unsettling study about the current state of AI. According to the study, which was published this month, ...
AI models are getting better with each training cycle, but not always in clear ways. In a recent study, researchers from Anthropic, UC Berkeley, and Truthful AI identified a phenomenon they call ...
Live Science on MSN
AI models can learn violent tendencies from each other despite no references to violence in training data
Scientists found that AI models can inherit a taste for murder (or owls) from other models' training data.