BEGIN:VCALENDAR
VERSION:2.0
PRODID:icalendar-ruby
CALSCALE:GREGORIAN
X-WR-CALNAME:Joel Becker | Reconciling Impressive AI Benchmark Performance 
 with Limited Developer Productivity Impacts
X-WR-TIMEZONE:Pacific Time (US & Canada)
BEGIN:VEVENT
DTSTAMP:20260518T010328Z
UID:tag:localist.com\,2008:EventInstance_52250404154565
DTSTART:20260316T190000Z
DTEND:20260316T200000Z
DESCRIPTION:AI coding agents now complete multi-hour coding benchmarks with
  roughly 50% reliability\, yet a randomized trial found experienced open-s
 ource developers took about 19% longer when allowed frontier AI tools than
  when tools were disallowed. This talk presents the evidence on the produc
 tivity paradox in AI coding\, shows the bottlenecks in deployment\, and ou
 tlines the next steps for understanding AI's productivity impacts.\n\nJo
 el Becker works on AI evaluation methods at METR such as time horizon and 
 developer productivity RCTs. Previously he worked in economics and genomic
 s research\, ran a statistics consultancy advising professional soccer tea
 ms\, and was a very minorly successful play-money prediction markets trade
 r.\n\nLink to time horizon paper\n\nLink to developer productivity paper
GEO:37.429987;-122.17333
LOCATION:Gates Computer Science Building\, 119
SUMMARY:Joel Becker | Reconciling Impressive AI Benchmark Performance with 
 Limited Developer Productivity Impacts
URL;VALUE=URI:https://events.stanford.edu/event/joel-becker-reconciling-imp
 ressive-ai-benchmark-performance-with-limited-developer-productivity-impac
 ts
CATEGORIES:Class/Seminar
END:VEVENT
END:VCALENDAR
