2026-05-01
I deleted this post since I think I was imprecise in my language. It is an interesting pre-registered RCT, but I should have been clearer that .3 SD is a modest effect size; the "big results" I was referring to were that it was cheap and had no apparent downsides. Paper: www.iza.org/publications...
We had SO MUCH FUN!! I'm curious to see what this will sound like as a pod ep.
Not too long ago, someone I follow introduced a citation-checking tool. I cannot find it anymore (and cannot search posts from only people I follow). Can anyone point me in the right direction? Thanks!
New paper (on an old AI model) tests o1 against doctors on medical benchmarks & real ER cases: “across a variety of scenarios and applications, the large language model outperformed both human physicians and older models” The high potential of AI suggests an “urgent need for prospective trials.”
But wait, there's more! Fresh off our live show in Brooklyn, @alexhanna.bsky.social and I will be doing the next MAIHT3K livestream on Monday, May 4. We will be witnessing with dismay Bernie's descent into x-risk-ism. Monday, May 4, noon PT twitch.tv/dair_institute