METR, which runs the benchmark measuring how well models can complete long-duration tasks, found that Claude Mythos Preview ...
AID, launched under the Linux Foundation, lets AI agents find each other through existing DNS infrastructure using SVCB ...
Google’s Gemma series continues to turn out all kinds of interesting models. The latest is Magenta RealTime 2 (MRT2), an open-weights model ...
A new study finds that more than one in three AI agents still hand over sensitive ... even when they recognize a scam website.