2026-04-14
A CLI to estimate inference memory requirements for Hugging Face models, written in Python.
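The core of such an estimate is roughly parameter count × bytes per parameter, plus some overhead for activations and buffers. A minimal sketch of that arithmetic, assuming an illustrative dtype table and overhead factor (not the tool's actual logic):

```python
# Illustrative bytes-per-parameter for common inference dtypes.
DTYPE_BYTES = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1, "int4": 0.5}

def estimate_inference_gib(num_params: float, dtype: str = "float16",
                           overhead: float = 1.2) -> float:
    """Rough inference memory in GiB: weights × dtype size × overhead.

    The 1.2x overhead factor is an assumption standing in for
    activations, KV cache, and framework buffers.
    """
    total_bytes = num_params * DTYPE_BYTES[dtype] * overhead
    return total_bytes / (1024 ** 3)

# A 7B-parameter model served in float16:
print(round(estimate_inference_gib(7e9, "float16"), 1))  # ≈ 15.6 GiB
```

A real CLI would pull the parameter count and dtype from the model's config or safetensors metadata on the Hub rather than taking them as arguments.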
Soon, with each AI release along the current capability curve, you will start to see large discrete jumps in ability in economically important areas, because the previous level of AI ability in some aspect of the job was bottlenecking progress. When a bottleneck is released, it looks like a leap forward.
Learn about the "AI-as-Amplifier Paradox" at #CHI2026. Skill amplification? Or skill erosion? Or both? (CHI Honorable Mention Paper)
One of my key strategies with Interconnects is to develop the practice of making my work obviously compelling to a wider audience, keeping them hooked over time and wondering what I'm up to, etc. www.interconnects.ai/p/what-ive-b...
Excited to launch the accompanying free RLHF Course for my book. To kick it off, I've released: - Welcome video - Lecture 1: Overview of RLHF & Post-training - Lecture 2: IFT, Reward Models, Rejection Sampling - Lecture 3: RL Math - Lecture 4: RL Implementation Landing page: rlhfbook.com/course
AI keeps getting better, but the last time the shape of the jagged frontier changed radically was o1 & the Reasoner. A good mental model for the coming months is that models get extremely good at the things they are already quite good at (coding), but their weaknesses will stay similar (long-form fiction).
Interesting: "Currently, 38% of Americans live within 5 miles of at least one operational data center... Living near a data center doesn’t have much of an effect on public opinion about the facilities." From now on, it looks like most DCs will be rural, though. www.pewresearch.org/short-reads/...
Time to #TalkAboutHumanities -- Linguistics is the study of how language works and how we work with language, and linguists end up very sensitized to language use and how it shapes our social world.
I heard a reporter from Axios interviewed on NPR the other day (Marketplace Tech, I think) talking about how the tech companies are putting out new models every 6 months to 1 year and how each model is more "powerful" than the previous. 🧵>>
I voluntarily read plenty of LLM output, but there should be consent. My default assumption when I read text is it reflects a human's thoughts, and it gives me the ick to realize half a line in that it doesn't.
What I've been up to!