Intelligence.Log

2026-04-25

Extracted: 51 items. Sources: GitHub, Bluesky, X.
++ AI OVERVIEW ++
Today’s discourse centers on the operational challenges of multi-agent AI systems, with Ethan Mollick pinpointing organizational design and collaborative benchmarking as the next "critical frontier" for enterprise value. Meanwhile, Emily M. Bender introduces the term "demythifying" from a review of *The AI Con*, signaling a continued pushback against AI hype. On GitHub, repositories focused on agent orchestration frameworks and evaluation toolkits saw a surge in stars, reflecting the community’s pivot from single-model capabilities to managing agent swarms at scale. The tension between scaling agentic systems and maintaining rigorous, myth-busting critique remains the dominant theme of the day.
◆ Signal

Co-Starred · Last 7 days

Repos independently starred by multiple AI leaders in the week ending 2026-04-25. Stronger signal = more overlap.

huggingface/ml-intern
×2 starrers7/10688

🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models

|
[Agent][LLM][Tooling]
grep TOPIC=
grep SOURCE=
sort --by=
GH
ROCm/FlyDSL0.2k4/10

FlyDSL is the Python front‑end of the project: Flexible LaYout DSL.

Starred bytridao|[Tooling]
FlyDSL is a Python front-end for a flexible layout DSL, enabling dynamic and customizable UI layouts. It focuses on simplifying layout design for Python applications.
BSKY
emollick.bsky.socialEthan Mollick

Organizational design for agents is hard, benchmarking agents working in concert is hard. Together, this is the next critical frontier for making AI matter in large-scale valuable tasks, and we really don’t know very much about it. www.strangeloopcanon.com/p/when-align...

❤️ 20 Likes|[Agent][Evaluation]
BSKY
emilymbender.bsky.socialEmily M. Bender

Favorite new to me word, from a review of The AI Con: demythifying.

❤️ 21 Likes|[Safety]
BSKY
simonwillison.netSimon Willison

I think ChatGPT Images 2.0 deciding to add a "WHY ARE YOU LIKE THIS" sign to the background of this image is the first time I've felt a glimpse of AGI simonwillison.net/2026/Apr/25/...

❤️ 214 Likes|[Multi-modal]
BSKY
emollick.bsky.socialEthan Mollick

If you believe that AI is going to have a big impact on work and life, the only real tool for mitigating bad impacts and channeling usage for good will be government policy And that policy will necessarily be very complicated: AI will impact employment & healthcare & education & etc. differently

❤️ 65 Likes|[Safety]
BSKY
emollick.bsky.socialEthan Mollick

I think that academia has not absorbed the fact that AI agents are now good enough to independently reconstruct complex papers without access to code or the papers themselves; just the methods & data. They aren’t perfect but the errors are often in the human paper, not the AI making a mistake.

❤️ 88 Likes|[Agent][Evaluation]
BSKY
angelamczhou.bsky.socialangela zhou

im just a rat that types and writes papers (cough revises) at verve coffee

❤️ 4 Likes|
X
Bought a new Mac mini to properly tinker with claws over the weekend. The apple store person told me they are selling like hotcakes and everyone is confused :) I'm definitely a bit sus'd to run OpenClaw specifically - giving my private data/keys to 400K lines of vibe coded
[Agent][Tooling]
“DeepSeek Summary: Karpathy bought a Mac mini to experiment with 'claws' (likely a typo for 'Claude' or 'Claw' agent), noting that Apple Store staff said they are selling well and customers are confused. He is cautious about running OpenClaw due to security concerns with vibe-coded code.
X
2025 LLM Year in Review
[LLM]
“DeepSeek Summary: Karpathy posted a summary of LLM developments in 2025, likely reflecting on key trends and milestones.
X
I'm beginning to suspect that a key skill in working effectively with coding agents is developing an intuition for when you don't need to
[Agent][Tooling]
“DeepSeek Summary: Simon suggests that a key skill with coding agents is knowing when to step back.
X
Vibe coding is irresponsibly building software through dice rolls, not caring what code is produced
[Agent][Evaluation]
“DeepSeek Summary: Simon defines 'vibe coding' as irresponsible software development.
X
hwchase17Harrison Chase
im excited about agent harnesses because i think are the first stable agent abstractions we can build on top (which is why we're investing so much in deepagents) we always wanted to run llms in a loop and have them call tools (remember autoGPT? that's all that was) but the
[Agent][Infra][Tooling]
“DeepSeek Summary: Agent harnesses provide stable abstractions for building agent loops with tool calling, a key evolution from early attempts like AutoGPT.
X
hwchase17Harrison Chase
This means that operations you would do on code in the software world, you now do on traces in the agent world. Debugging, testing, profiling
[Evaluation][Deployment][Agent]
“DeepSeek Summary: Traces in agent systems replace code as the primary artifact for debugging, testing, and profiling.
X
hwchase17Harrison Chase
TL;DR: More and more agents need a workspace: a computer where they can run code, install packages, and access files. Sandboxes provide this
[Infra][Safety][Agent]
“DeepSeek Summary: Agents require sandboxed workspaces to execute code and access resources safely.
X
hwchase17Harrison Chase
When you ship traditional software to production, you have a good sense of what to expect. Users click buttons, fill out forms,
[Deployment][Agent]
“DeepSeek Summary: Traditional software deployment has predictable user interactions, unlike agent systems.
X
DrJimFanJim Fan
I've been a bit quiet on X recently. The past year has been a transformational experience.
[Agent]
“DeepSeek Summary: Jim Fan acknowledges his recent silence on X and describes the past year as transformational.
X
jeremyphowardJeremy Howard
I replicated this result, that Grok focuses nearly entirely on finding out what Elon thinks in
[Safety][Evaluation]
“DeepSeek Summary: Jeremy Howard replicated a finding that Grok focuses almost entirely on determining Elon Musk's thoughts.
X
jeremyphowardJeremy Howard
Absolutely any time I try to explore something even slightly against commonly accepted beliefs,
[Safety][Evaluation]
“DeepSeek Summary: Jeremy Howard notes that exploring ideas against commonly accepted beliefs is met with resistance.
X
soumithchintalaSoumith Chintala
reading "AI News" (previously Smol Talk) is probably the highest-leverage 45 mins
[LLM]
“DeepSeek Summary: Soumith recommends reading 'AI News' as a high-leverage activity.
X
I think it's clear that for many smaller companies that invested in deep learning, it turned out
[Evaluation]
“DeepSeek Summary: Deep learning investments may not have paid off for smaller companies.
X
Folks who work in AI or software engineering feel like the world is changing exponential fast.
[Agent]
“DeepSeek Summary: AI and software engineers perceive rapid exponential change in the world.
X
h
David Ha
Don't miss David Ha @hardmaru's keynote at @ALifeConf #ALIFE2021 on "World Models and Attention for Reinforcement Learning"!
[Agent][Multi-modal]
“DeepSeek Summary: David Ha is giving a keynote on world models and attention for reinforcement learning at ALIFE 2021.
X
h
David Ha
It's spectacular to have followed David Ha's (@hardmaru) incredible career arc —MD of Fixed Income at Goldman Sachs —restarted his career
[Agent]
“DeepSeek Summary: David Ha transitioned from a managing director at Goldman Sachs to a career in AI research.
X
y
Yann LeCun
It seems to me that before "urgently figuring out how to control AI systems much smarter than us" we need
[Safety]
“DeepSeek Summary: LeCun questions the urgency of controlling superintelligent AI, implying such systems don't exist yet.
X
y
Yann LeCun
An A.I. Pioneer Warns the Tech 'Herd' Is Marching Into a Dead End. www.nytimes.com.
[LLM]
“DeepSeek Summary: LeCun shares a NYT article warning that the AI field is heading in the wrong direction.
X
y
Yann LeCun
The emergence of superintelligence is not going to be an event. We don't have anything close to a
[Safety]
“DeepSeek Summary: LeCun argues superintelligence will not appear suddenly and we are far from it.
X
d
Fei-Fei Li
Very excited to share @theworldlabs 's latest research work RTFM!! It's a real-time, ...
[Multi-modal]
“DeepSeek Summary: Fei-Fei Li announces World Labs' RTFM research, a real-time 3D world generation model.
X
minimaxirMax Woolf
me irl
[Tooling]
“DeepSeek Summary: Max Woolf posted a self-referential meme 'me irl'.
X
srush_ioSasha Rush
On the infra side, composer 2 uses CP. This is (i think?) the first real detail from using CP on MLA. My understanding is that each rank first computes the compressed KVs, all gather this compressed latents. while the all gather is in flight, compute the Q proj
[Infra][LLM]
“DeepSeek Summary: Sasha discusses infrastructure details of composer 2 using CP (context parallelism) on MLA, describing the process of computing compressed KVs and all-gathering latents.
X
srush_ioSasha Rush
⛏️
[LLM]
“DeepSeek Summary: A single pickaxe emoji, possibly indicating a mining or digging metaphor.
X
srush_ioSasha Rush
“DeepSeek Summary: No text content available from search snippet.
X
If you were holding off to try @MSFTDeepSpeed ZeRO++ it looks like deepspeed@master should
[Infra][Fine-tuning]
“DeepSeek Summary: Stas Bekman notes that DeepSpeed ZeRO++ is now available on master branch, encouraging users to try it.
X
Hear, hear, I'm excited to introduce a new performance metric: Maximum Achievable Matmul
[Evaluation][Infra]
“DeepSeek Summary: Stas Bekman introduces a new performance metric called Maximum Achievable Matmul for evaluating matrix multiplication efficiency.
X
If you're trying out FA4, you're likely to run into not being able to load cutlass.cute
[Tooling][Infra]
“DeepSeek Summary: Stas Bekman warns about a common issue with FA4 (Flash Attention 4) involving cutlass.cute loading.
X
Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can
[Tooling]
“DeepSeek Summary: Stas Bekman thanks a contributor for improving the Machine Learning Engineering Open Book.
X
sayakpaulSayak Paul
Install `diffusers` from source and start using Kontext from @bfl_ml 🧨 Use your favorite optims, too :) Training is also supported (@linoy_tsaban and yours truly) 🤗
[Multi-modal][Fine-tuning][Deployment]
“DeepSeek Summary: Announces support for Kontext from Black Forest Labs in diffusers, with training support.
X
sayakpaulSayak Paul
Release notes: Release Diffusers 0.34.0: New Image and Video Models, Better torch.
[Multi-modal][Deployment][Infra]
“DeepSeek Summary: Announces Diffusers 0.34.0 release with new image and video models and torch improvements.
X
philschmidPhilipp Schmid
Guide: ReAct agent from scratch with Gemini 2.5 and LangGraph | Gemini API | Google AI for Developers. ai.google.dev.
[Agent][LLM]
“DeepSeek Summary: Philipp Schmid published a guide on building a ReAct agent from scratch using Gemini 2.5 and LangGraph.
X
e
Ethan Mollick
AI is actually pretty good at ideas as well.
[LLM]
“DeepSeek Summary: Ethan Mollick notes that AI can generate good ideas, challenging the notion that creativity is exclusively human.
X
e
Ethan Mollick
My most popular AI post was a bunch of made-up "graphs" four years ago.
[Evaluation]
“DeepSeek Summary: Mollick reflects on viral AI content, noting that fabricated graphs gained high engagement.
X
e
Ethan Mollick
So much work is going into faking continual learning and memory for AIs,
[LLM][Fine-tuning]
“DeepSeek Summary: Mollick criticizes efforts to simulate continuous learning and memory in AI models.
X
e
Ethan Mollick
If it helps, I teach at a business school & many of my smartest students are hired by funds because they can reliably turn their only-human
[Deployment]
“DeepSeek Summary: Mollick notes that human judgment remains valuable, as his students are hired for their unique human skills.
X
e
Emily M. Bender
@kohntom A synthetic text extruding machine is not well-matched to any application where the accuracy of the content matters. This is clearly one such application.
[LLM][Safety][Evaluation]
“DeepSeek Summary: Bender criticizes LLMs as 'synthetic text extruding machines' unsuitable for accuracy-critical applications.
X
N
Naomi Saphra
This book starts like it's gonna be a fun microhistory of TB (it gave us the Stetson!
“DeepSeek Summary: Naomi Saphra comments on a book about tuberculosis, noting its engaging start.
X
N
Naomi Saphra
New preprint! Everyone loves causal interp. It's coherently defined! It makes testable predictions
[Evaluation]
“DeepSeek Summary: Announces a new preprint on causal interpretation, emphasizing its coherence and testability.
X
N
Naomi Saphra
New preprint! Phase transitions! We love to see them during LM training.
[LLM]
“DeepSeek Summary: Announces a new preprint about phase transitions in language model training.
X
a
Angela Zhou
#throwback to the beginnings of a beautiful friendship =D @ansonmount @HellOnWheelsAMC #HellonWheels #onlocation.
[Deployment]
“DeepSeek Summary: Angela Zhou shares a throwback post about her friendship with co-stars on the set of Hell on Wheels.
X
b
Ben Recht
I weigh in on the Trump administration’s newfound obsession with Gold Standard Science and reproducibility. Though it’s not all in bad faith, it’s likely to backfire.
[Evaluation]
“DeepSeek Summary: Critique of the Trump administration's focus on reproducibility in science, warning it may backfire despite some good faith.
X
b
Ben Recht
For the first time in almost a decade, I'm teaching a class on learning and control.
[Infra]
“DeepSeek Summary: Announcement of teaching a class on learning and control after a long hiatus.
X
b
Ben Recht
Revisiting Sutton's Bitter Lesson in the wake of GPT-5.
[LLM]
“DeepSeek Summary: Revisiting a classic AI lesson in context of latest GPT advancements.
-- END OF LOG --
[STATS] 51 items · Filter applied
Powered by Horizon + DeepSeek