Understanding the Most Viral Chart in Artificial Intelligence | Odd Lots
www.bloomberg.com
METR, which stands for Model Evaluation and Threat Research, is focused on understanding the degree to which AI models can engage in autonomous, complex tasks. METR sees this as a particularly important benchmark, given the risk that AI could one day be engaged in recursive self-improvement, taking humans out of the loop. But how do you really gauge a model's ability to do complex problems? And what exactly is being measured? On this episode we speak with METR's President Chris Painter as well as Joel Becker, a member of the technical staff who works on evaluation methods for the organization. We discuss both the mechanics and the philosophy of METR's work, and what it means when we see a chart showing that Claude Opus 4.6 can do a task that would take a human nearly 12 hours. (Source: Bloomberg)