News
AI models can pick up hidden behaviors from seemingly harmless data—even when there are no obvious clues. Researchers warn that this might be a fundamental property of neural networks.
OpenAI says its experimental language model has solved International Mathematical Olympiad (IMO) problems at a gold medal level—a possible breakthrough for AI with general reasoning skills. The ...
Terence Tao, widely regarded as a mathematical prodigy, says that AI still lacks what he calls a mathematical "sense of smell." According to Tao, even when generative AI produces flawed proofs, they ...
Anthropic has published the technical details behind its new Claude Research agent, which uses a multi-agent approach to speed up and improve complex searches.
OpenAI has lowered the price of its o3 language model by 80 percent, CEO Sam Altman said. The new cost is $2 per million input tokens and $8 per million output tokens. The move follows Google’s Gemini ...
Yann LeCun, Meta's chief AI scientist, has taken a direct shot at Anthropic CEO Dario Amodei on Threads, making clear just how sharply the AI community is split over the future of general artificial ...
Google Deepmind CEO Demis Hassabis says world models—AI systems that simulate the structure of the real world—are already making surprising progress toward general intelligence.
Anthropic has released its next generation of AI models, Claude Opus 4 and Claude Sonnet 4, and is introducing new safety measures designed to prevent their use in developing chemical, biological, ...
The French AI startup Mistral has launched Devstral Small 24B, a new open language model built for software development and described as "agentic." ...
Geoffrey Hinton, a leading AI researcher and Turing Award winner, now says he was too quick to declare that artificial intelligence would replace radiologists.
OpenAI says GPT-4.1 is particularly strong when it comes to programming tasks and following instructions precisely. In our tests, the model is noticeably less "chatty" than GPT-4o—not being overly ...
OpenAI has released a new benchmark for testing AI systems in healthcare. Called HealthBench, it's designed to evaluate how well language models handle realistic medical conversations. According to ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results