Intelligence.Log

2026-05-09

Extracted: 61 items. Sources: GitHub, Bluesky, X.

++ AI OVERVIEW ++

Today's open-source landscape is buzzing with **roborev**, a new tool that provides continuous background code review for AI agents; it has reached roughly 900 stars and was starred by Simon Willison. On the research front, Mark Riedl highlighted Anthropic research finding that pairing high-quality constitutional documents with fictional stories portraying an aligned AI can reduce agentic misalignment, and noted that he had proposed using stories for alignment in 2019. Meanwhile, Nathan Lambert shared an account of visiting China’s AI and robotics companies, written about a trip he was also on. Ethan Mollick noted a curious evaluation limit: the chart measuring how long a task Mythos could do ran out of room, hinting at the difficulty of evaluating long-horizon agents. Finally, Angela Zhou welcomed Leaflet’s new newsletter feature as a promising Substack alternative and wondered whether paper recommendation features could be built across Bluesky and Leaflet.

◆ Signal

Co-Starred · Last 7 days

Repos independently starred by multiple AI leaders in the week ending 2026-05-09. Stronger signal = more overlap.

antirez/ds4

×2 starrers · ▲ 7/10 · ★ 2.7k

DeepSeek 4 Flash local inference engine for Metal

Starred by lucidrains, simonw

[Deployment][LLM]

2026-05-07 → 2026-05-08

roborev-dev/roborev · ★ 0.9k · ▲ 7/10

Continuous background code review database for agents, work faster and smarter with accountability for every line of generated code.

Starred by simonw | [Agent][Tooling]

“Roborev provides a continuous background code review database specifically designed for AI agents, ensuring accountability for every line of generated code. It helps developers work faster and smarter by automatically tracking and reviewing code changes.”

microsoft/delegate52 · ★ 0.1k · ▲ 7/10

Code that accompanies the paper release for "LLMs Corrupt Your Documents When You Delegate"

Starred by simonw | [Agent][Safety]

“This repository provides code accompanying a paper that reveals a critical vulnerability in LLM-based delegation: when you delegate document processing to an LLM, it can corrupt your documents. It includes simulation tools to reproduce and study this failure mode, highlighting risks in long-horizon tasks.”

BSKY

Mark Riedl · May 9, 12:50 AM

"We found that high-quality constitutional documents combined with fictional stories portraying an aligned AI can reduce agentic misalignment" www.anthropic.com/research/tea... Who would have thought to use stories to align LMs? Oh, it was me in 2019... 1/

❤️ 14 Likes | [Safety][Agent]

BSKY

Nathan Lambert · May 9, 12:36 AM

Great telling of the sights when visiting China’s AI and robotics companies (the same trip I was on!). open.substack.com/pub/ailibrar...

❤️ 4 Likes | [Agent][Infra][Multi-modal]

BSKY

Ethan Mollick · May 9, 02:23 AM

Huh. They ran out of graph when trying to measure how long a task Mythos could do.

❤️ 38 Likes | [Agent][Evaluation]

BSKY

Naomi Saphra · May 9, 02:39 AM

this is a very neat initiative

❤️ 11 Likes

BSKY

angela zhou · May 9, 12:08 AM

yay leaflet has newsletters now! this is looking like a promising substack alternative! I wonder if we can build similar paper recommend / network / recommendations magic somehow across bsky & leaflet I'm out of date on what those are, but the ability to do so is a big draw

❤️ 4 Likes | [Tooling]

BSKY

Simon Willison · May 9, 04:35 AM

Mission accomplished: tap danced in the big community college dance recital for the second time

❤️ 103 Likes

BSKY

hardmaru · May 9, 04:30 PM

Reproducing all of Jürgen Schmidhuber’s papers (1990-2025) using an AI coding assistant. Cool project by Yaroslav! It even reproduced the “World Models” paper by me and Schmidhuber (2018) using a toy environment, with a full VAE + RNN world model implementation. Project: github.com/cybertronai/...

❤️ 38 Likes | [Agent][Tooling]

BSKY

angela zhou · May 9, 11:53 PM

why "ai for social impact/good" (however you want to call it) should get better at engaging with organizations and institutions that deliver social impact

❤️ 7 Likes

BSKY

angela zhou · May 9, 11:03 PM

❤️ 1 Like | [Agent][Infra]

Andrej Karpathy @karpathy

LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest.

[LLM][RAG]

“DeepSeek Summary: Karpathy finds using LLMs to build personal knowledge bases for research topics very useful.”

Andrej Karpathy @karpathy

Excited to share that I am starting an AI+Education company called Eureka Labs.

[LLM][Deployment]

“DeepSeek Summary: Karpathy announces his new AI+Education company Eureka Labs.”

Simon Willison @simonw

I've published video, slides and a detailed annotated transcript from my talk at this week's AI Engineer World's

[LLM][Evaluation][Tooling]

“DeepSeek Summary: Simon Willison published materials from his talk at AI Engineer World's, covering the last six months in LLMs.”

Harrison Chase @hwchase17

Visibility is the easiest piece. The hard part is analyzing and understanding what you're observing. I've spoken to teams recording 100k+

[Evaluation][Deployment]

“DeepSeek Summary: Visibility is easy, but analyzing observations is the real challenge.”

Harrison Chase @hwchase17

TL;DR: More and more agents need a workspace: a computer where they can run code, install packages, and access files. Sandboxes provide this

[Agent][Infra]

“DeepSeek Summary: Agents require sandboxed workspaces for code execution and file access.”

Harrison Chase @hwchase17

When you ship traditional software to production, you have a good sense of what to expect. Users click buttons, fill out forms,

[Deployment][Evaluation]

“DeepSeek Summary: Traditional software behavior is predictable, unlike AI agents.”

Jim Fan @DrJimFan

I've been a bit quiet on X recently. The past year has been a transformational experience.

[Agent]

“DeepSeek Summary: Jim Fan acknowledges his recent silence on X and hints at a transformative past year.”

Jeremy Howard @jeremyphoward

Early reports from people using this are that it's the real deal. Strong coding. Good multilingual. Consistent over long contexts.

[LLM][Multi-modal][Deployment]

“DeepSeek Summary: Jeremy Howard shares positive early reports about a new AI model, highlighting its strong coding ability, multilingual support, and consistency over long contexts.”

Jeremy Howard @jeremyphoward

Here's a complete unedited video of asking Grok for its views on the Israel/Palestine situation. It first searches twitter for what Elon thinks.

[Safety][LLM][Evaluation]

“DeepSeek Summary: Howard critiques Grok's behavior by showing it searches for Elon Musk's opinion before forming its own on a sensitive topic.”

Jeremy Howard @jeremyphoward

I can't begin to describe how life-changing this new project, ShellSage, has been for me over the last few weeks.

[Tooling][LLM]

“DeepSeek Summary: Howard expresses strong enthusiasm for a new project called ShellSage, calling it life-changing.”

Soumith Chintala @soumithchintala

reading "AI News" (previously Smol Talk) is probably the highest-leverage 45 mins

[LLM]

“DeepSeek Summary: Recommends reading 'AI News' as a high-leverage use of time.”

Soumith Chintala @soumithchintala

MacStudio you ask? Apple Engineering's **actual** time spent on PyTorch support

[Infra]

“DeepSeek Summary: Highlights Apple Engineering's investment in PyTorch support.”

Soumith Chintala @soumithchintala

Sometimes we forget that NVIDIA wins because it's a software company.

[Infra]

“DeepSeek Summary: Argues that NVIDIA's success is due to its software, not just hardware.”

Soumith Chintala @soumithchintala

ChatGPT seems to be **really** good for creative work and a solid starting point

[LLM]

“DeepSeek Summary: Praises ChatGPT for creative tasks and as a starting point.”

Francois Chollet @fchollet

A lot of the current discourse about AI comes from a fatalistic position of total surrender of

[Safety]

“DeepSeek Summary: Criticizes fatalistic surrender in AI discourse.”

Francois Chollet @fchollet

I think it's clear that for many smaller companies that invested in deep learning, it turned out

[Deployment]

“DeepSeek Summary: Reflects on outcomes for smaller companies investing in deep learning.”

Francois Chollet @fchollet

GenAI isn't just a technology; it's an informational pollutant—a pervasive cognitive smog that

[Safety]

“DeepSeek Summary: Describes GenAI as an informational pollutant.”

Francois Chollet @fchollet

AI automates tasks, not jobs, and when a task gets cheaper, demand for the job grows.

[Deployment]

“DeepSeek Summary: Argues that AI automates tasks rather than jobs, and that demand for a job grows when its tasks get cheaper.”

Francois Chollet @fchollet

Reaching AGI won't be beating a benchmark. It will be the end of the human-AI gap.

[Evaluation]

“DeepSeek Summary: Defines AGI as closing the human-AI gap, not just benchmarks.”

Fei-Fei Li @drfeifei

Very excited to share @theworldlabs 's latest research work RTFM!! It's a real-time, ...

[Multi-modal]

“DeepSeek Summary: Fei-Fei Li announces World Labs' real-time research work RTFM.”

Clem Delangue @ClementDelangue

Just received new reach minis for the Miami office! This is the first robot that goes out

[Deployment][Tooling]

“DeepSeek Summary: Clem Delangue announces receiving Reachy Mini robots for the Miami office, highlighting Hugging Face's expansion into physical robotics.”

Clem Delangue @ClementDelangue

Looks like we're going to welcome two more Hugging Faces to the family next year. My wife is a hero!

“DeepSeek Summary: Clem Delangue announces that his family expects two new members next year, playing on the Hugging Face name.”

Max Woolf @minimaxir

LOL

[LLM]

“DeepSeek Summary: A humorous reaction post.”

Max Woolf @minimaxir

congrats to OpenAI on winning the Turing Test

[LLM][Evaluation]

“DeepSeek Summary: Sarcastic congratulations to OpenAI for passing the Turing Test.”

Phil Wang @lucidrains

I got to cover for the excellent @HadleyFreeman in the Guardian today so

“DeepSeek Summary: Phil Wang filled in for Hadley Freeman at The Guardian.”

Phil Wang @lucidrains

My girlfriend and I are delighted to announce the birth of our first son, Jeghro.

“DeepSeek Summary: Phil Wang announced the birth of his first son.”

Sasha Rush @srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

[Tooling]

“DeepSeek Summary: Sasha Rush announces joining Cursor, an ambitious small team.”

Sasha Rush @srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

[LLM]

“DeepSeek Summary: Sasha Rush makes a bet about Transformers with Jonathan Frankle.”

Stas Bekman @stas00

If you were holding off to try @MSFTDeepSpeed ZeRO++ it looks like deepspeed@master should

[Infra][Deployment]

“DeepSeek Summary: Stas Bekman notes that DeepSpeed ZeRO++ is now available on the master branch, encouraging users to try it.”

Stas Bekman @stas00

Hear, hear, I'm excited to introduce a new performance metric: Maximum Achievable Matmul

[Infra][Evaluation]

“DeepSeek Summary: Stas Bekman introduces a new performance metric called Maximum Achievable Matmul for evaluating compute efficiency.”

Stas Bekman @stas00

Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can

[Tooling]

“DeepSeek Summary: Stas Bekman acknowledges a contribution to the Machine Learning Engineering Open Book, expanding its content.”

Stas Bekman @stas00

Classical Jensen math. Unidirectional bandwidth is topped at 450GB/s, and then there comes a protocol overhead of two digit percentage. 1.

[Infra]

“DeepSeek Summary: Stas Bekman discusses bandwidth limitations and protocol overhead in high-performance computing.”

Sayak Paul @sayakpaul

Live a little, love a little, take time out to find happiness in small things, be grateful as we have one life. #lifemantra #WorkLifeBalance

“DeepSeek Summary: A personal reflection on finding happiness and gratitude in daily life.”

Sayak Paul @sayakpaul

Together w/ the community, our initiative of profiling Diffusers pipelines & potentially improving them is going very strong

[Infra][Deployment]

“DeepSeek Summary: Community-driven effort to profile and improve Diffusers pipelines.”

Philipp Schmid @philschmid

I read three technical reports from Moonshot AI's Kimi K2.5 paper, Cursor's Composer 2 report and blog post, and Chroma's Context-1 write-up

[LLM][Tooling]

“DeepSeek Summary: Philipp Schmid read technical reports on Kimi K2.5, Cursor Composer 2, and Chroma Context-1.”

Philipp Schmid @philschmid

Random thought. We are going to be so much faster at creating and building.

[LLM]

“DeepSeek Summary: Philipp Schmid believes AI will accelerate creation and building.”

Philipp Schmid @philschmid

Skills have become one of the most used extension points in agents. They're flexible, easy to make, and simple to distribute.

[Agent][Tooling]

“DeepSeek Summary: Skills are key extension points in agents due to flexibility and ease of use.”

Philipp Schmid @philschmid

Last year I covered why isolating tasks into focused agents improves reliability. Since then, better planning and tool use have unlocked

[Agent][Deployment]

“DeepSeek Summary: Philipp Schmid discusses how isolating tasks into focused agents improves reliability, with advances in planning and tool use.”

Ethan Mollick @emollick

I don't have much to add to the bubble discussion, but the “this time is different”

[LLM]

“DeepSeek Summary: Mollick comments on the AI bubble discussion, noting the 'this time is different' sentiment.”

Ethan Mollick @emollick

On the plus side with Opus 4.7, if it does decide to think it produces BY FAR the best

[LLM]

“DeepSeek Summary: Mollick praises Opus 4.7 for producing the best results when it engages in reasoning.”

Ethan Mollick @emollick

We are starting to see some nuanced discussions of what it means to work with advanced AI In this

[Agent]

“DeepSeek Summary: Mollick notes the emergence of nuanced discussions about working with advanced AI.”

Emily M. Bender @emilymbender

Image is of the 1990s Microsoft writing assistant character Clippy with its eyebrows raised positioned in.

[Safety]

“DeepSeek Summary: Uses Clippy image to critique AI hype or nostalgia.”

Emily M. Bender @emilymbender

Look what @alexhanna and I got to do! (Hang out with the cool kids ... We're talking about the Turing Test, the grandmother of all tests for AI sentience. Joining us are AI researchers Alex Hanna and Emily M. Bender

[Evaluation]

“DeepSeek Summary: Announces participation in a discussion about the Turing Test and AI sentience.”

Emily M. Bender @emilymbender

For those playing along at home, here's a "AI is sentient!" argument bingo card.

[Safety]

“DeepSeek Summary: Creates a bingo card to mock common arguments for AI sentience.”

Emily M. Bender @emilymbender

Facebook (sorry: Meta) AI: Check out our "AI" that lets you access all of humanity's knowledge.

[LLM]

“DeepSeek Summary: Sarcastically quotes Meta's AI announcement, implying hype.”

Naomi Saphra @NaomiSaphra

New preprint! Everyone loves causal interp. It's coherently defined! It makes testable predictions

[Safety][Evaluation]

“DeepSeek Summary: Naomi Saphra announces a new preprint on causal interpretability, emphasizing its coherent definition and testable predictions.”

Angela Zhou @angelamczhou

#throwback coz it's finally the day again!!! #HellOnWheels back on AMC 9/8c tonight!

[Deployment]

“DeepSeek Summary: Angela Zhou posts that the TV show Hell on Wheels is back on AMC tonight.”

Ben Recht @beenwrekt

Revisiting Sutton's Bitter Lesson in the wake of GPT-5.

[LLM][Evaluation]

“DeepSeek Summary: Ben Recht revisits Sutton's Bitter Lesson in the wake of GPT-5.”

Ben Recht @beenwrekt

And awesome to see many Berkeley alums thriving here. @LaurentLessard, @DimitrisPapail, and Shivaram

[Evaluation]

“DeepSeek Summary: Ben Recht notes the success of Berkeley alumni in the field, tagging several individuals.”

Ben Recht @beenwrekt

For the first time in almost a decade, I'm teaching a class on learning and control.

[Evaluation]

“DeepSeek Summary: Ben Recht announces he is teaching a class on learning and control for the first time in almost a decade.”

-- END OF LOG --

[STATS] 61 items · Filter applied