Intelligence.Log

2026-05-09

Extracted: 61 items. Sources: GitHub, Bluesky, X.
++ AI OVERVIEW ++
Today's open-source activity centers on **roborev**, a tool that provides continuous background code review for AI agents; it has roughly 0.9k stars and was starred by Simon Willison. On the research front, Mark Riedl highlighted an Anthropic research post reporting that high-quality constitutional documents combined with fictional stories portraying an aligned AI can reduce agentic misalignment, and noted that he had suggested using stories to align LMs in 2019. Nathan Lambert shared a write-up of a tour of China’s AI and robotics companies, a trip he was also on. Ethan Mollick pointed out a benchmark limitation: the graph measuring how long a task Mythos could complete ran out of room. Finally, Angela Zhou welcomed Leaflet’s new newsletter feature as a promising Substack alternative and wondered whether paper recommendations could work across Bluesky and Leaflet.
◆ Signal

Co-Starred · Last 7 days

Repos independently starred by multiple AI leaders in the week ending 2026-05-09. Stronger signal = more overlap.

antirez/ds4
×2 starrers · 7/10 · 2.7k

DeepSeek 4 Flash local inference engine for Metal

[Deployment][LLM] | 2026-05-07 · 2026-05-08
GH
roborev-dev/roborev · 0.9k · 7/10

Continuous background code review database for agents, work faster and smarter with accountability for every line of generated code.

Starred by simonw | [Agent][Tooling]
Roborev provides a continuous background code review database specifically designed for AI agents, ensuring accountability for every line of generated code. It helps developers work faster and smarter by automatically tracking and reviewing code changes.
GH
microsoft/delegate52 · 0.1k · 7/10

Code that accompanies the paper release for "LLMs Corrupt Your Documents When You Delegate"

Starred by simonw | [Agent][Safety]
This repository provides code accompanying a paper that reveals a critical vulnerability in LLM-based delegation: when you delegate document processing to an LLM, it can corrupt your documents. It includes simulation tools to reproduce and study this failure mode, highlighting risks in long-horizon tasks.
BSKY
markriedl.bsky.social · Mark Riedl

"We found that high-quality constitutional documents combined with fictional stories portraying an aligned AI can reduce agentic misalignment" www.anthropic.com/research/tea... Who would have thought to use stories to align LMs? Oh, it was me in 2019... 1/

❤️ 14 Likes|[Safety][Agent]
BSKY
natolambert.bsky.social · Nathan Lambert

Great telling of the sights when visiting China’s AI and robotics companies (the same trip I was on!). open.substack.com/pub/ailibrar...

❤️ 4 Likes|[Agent][Infra][Multi-modal]
BSKY
emollick.bsky.social · Ethan Mollick

Huh. They ran out of graph when trying to measure how long a task Mythos could do.

❤️ 38 Likes|[Agent][Evaluation]
BSKY
nsaphra.bsky.social · Naomi Saphra

this is a very neat initiative

❤️ 11 Likes|
BSKY
angelamczhou.bsky.social · angela zhou

yay leaflet has newsletters now! this is looking like a promising substack alternative! I wonder if we can build similar paper recommend / network / recommendations magic somehow across bsky & leaflet I'm out of date on what those are, but the ability to do so is a big draw

❤️ 4 Likes|[Tooling]
BSKY
simonwillison.net · Simon Willison

Mission accomplished: tap danced in the big community college dance recital for the second time

❤️ 103 Likes|
BSKY
hardmaru.bsky.social · hardmaru

Reproducing all of Jürgen Schmidhuber’s papers (1990-2025) using an AI coding assistant. Cool project by Yaroslav! It even reproduced the “World Models” paper by me and Schmidhuber (2018) using a toy environment, with a full VAE + RNN world model implementation. Project: github.com/cybertronai/...

❤️ 38 Likes|[Agent][Tooling]
BSKY
angelamczhou.bsky.social · angela zhou

why "ai for social impact/good" (however you want to call it) should get better at engaging with organizations and institutions that deliver social impact

❤️ 7 Likes|
BSKY
angelamczhou.bsky.social · angela zhou

❤️ 1 Like|[Agent][Infra]
X
LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest.
[LLM][RAG]
“DeepSeek Summary: Karpathy finds using LLMs to build personal knowledge bases for research topics very useful.
X
Excited to share that I am starting an AI+Education company called Eureka Labs.
[LLM][Deployment]
“DeepSeek Summary: Karpathy announces his new AI+Education company Eureka Labs.
X
I've published video, slides and a detailed annotated transcript from my talk at this week's AI Engineer World's
[LLM][Evaluation][Tooling]
“DeepSeek Summary: Simon Willison published materials from his talk at AI Engineer World's, covering the last six months in LLMs.
X
hwchase17 · Harrison Chase
Visibility is the easiest piece. The hard part is analyzing and understanding what you're observing. I've spoken to teams recording 100k+
[Evaluation][Deployment]
“DeepSeek Summary: Visibility is easy, but analyzing observations is the real challenge.
X
hwchase17 · Harrison Chase
TL;DR: More and more agents need a workspace: a computer where they can run code, install packages, and access files. Sandboxes provide this
[Agent][Infra]
“DeepSeek Summary: Agents require sandboxed workspaces for code execution and file access.
X
hwchase17 · Harrison Chase
When you ship traditional software to production, you have a good sense of what to expect. Users click buttons, fill out forms,
[Deployment][Evaluation]
“DeepSeek Summary: Traditional software behavior is predictable, unlike AI agents.
X
DrJimFan · Jim Fan
I've been a bit quiet on X recently. The past year has been a transformational experience.
[Agent]
“DeepSeek Summary: Jim Fan acknowledges his recent silence on X and hints at a transformative past year.
X
jeremyphoward · Jeremy Howard
Early reports from people using this are that it's the real deal. Strong coding. Good multilingual. Consistent over long contexts.
[LLM][Multi-modal][Deployment]
“DeepSeek Summary: Jeremy Howard shares positive early reports about a new AI model, highlighting its strong coding ability, multilingual support, and consistency over long contexts.
X
jeremyphoward · Jeremy Howard
Here's a complete unedited video of asking Grok for its views on the Israel/Palestine situation. It first searches twitter for what Elon thinks.
[Safety][LLM][Evaluation]
“DeepSeek Summary: Howard critiques Grok's behavior by showing it searches for Elon Musk's opinion before forming its own on a sensitive topic.
X
jeremyphoward · Jeremy Howard
I can't begin to describe how life-changing this new project, ShellSage, has been for me over the last few weeks.
[Tooling][LLM]
“DeepSeek Summary: Howard expresses strong enthusiasm for a new project called ShellSage, calling it life-changing.
X
soumithchintala · Soumith Chintala
reading "AI News" (previously Smol Talk) is probably the highest-leverage 45 mins
[LLM]
“DeepSeek Summary: Recommends reading 'AI News' as a high-leverage use of time.
X
soumithchintala · Soumith Chintala
MacStudio you ask? Apple Engineering's **actual** time spent on PyTorch support
[Infra]
“DeepSeek Summary: Highlights Apple Engineering's investment in PyTorch support.
X
soumithchintala · Soumith Chintala
Sometimes we forget that NVIDIA wins because it's a software company.
[Infra]
“DeepSeek Summary: Argues that NVIDIA's success is due to its software, not just hardware.
X
soumithchintala · Soumith Chintala
ChatGPT seems to be **really** good for creative work and a solid starting point
[LLM]
“DeepSeek Summary: Praises ChatGPT for creative tasks and as a starting point.
X
A lot of the current discourse about AI comes from a fatalistic position of total surrender of
[Safety]
“DeepSeek Summary: Criticizes fatalistic surrender in AI discourse.
X
I think it's clear that for many smaller companies that invested in deep learning, it turned out
[Deployment]
“DeepSeek Summary: Reflects on outcomes for smaller companies investing in deep learning.
X
GenAI isn't just a technology; it's an informational pollutant—a pervasive cognitive smog that
[Safety]
“DeepSeek Summary: Describes GenAI as an informational pollutant.
X
AI automates tasks, not jobs, and when a task gets cheaper, demand for the job grows.
[Deployment]
“DeepSeek Summary: Argues AI automates tasks, increasing job demand.
X
Reaching AGI won't be beating a benchmark. It will be the end of the human-AI gap.
[Evaluation]
“DeepSeek Summary: Defines AGI as closing the human-AI gap, not just benchmarks.
X
Fei-Fei Li
Very excited to share @theworldlabs 's latest research work RTFM!! It's a real-time, ...
[Multi-modal]
“DeepSeek Summary: Fei-Fei Li announces World Labs' real-time research work RTFM.
X
Clem Delangue
Just received new reach minis for the Miami office! This is the first robot that goes out
[Deployment][Tooling]
“DeepSeek Summary: Clem Delangue announces receiving Reachy Mini robots for the Miami office, highlighting Hugging Face's expansion into physical robotics.
X
Clem Delangue
Looks like we're going to welcome two more Hugging Faces to the family next year. My wife is a hero!
“DeepSeek Summary: Clem Delangue announces expecting twins, blending personal life with Hugging Face family metaphor.
X
minimaxir · Max Woolf
LOL
[LLM]
“DeepSeek Summary: A humorous reaction post.
X
minimaxir · Max Woolf
congrats to OpenAI on winning the Turing Test
[LLM][Evaluation]
“DeepSeek Summary: Sarcastic congratulations to OpenAI for passing the Turing Test.
X
lucidrains · Phil Wang
I got to cover for the excellent @HadleyFreeman in the Guardian today so
“DeepSeek Summary: Phil Wang filled in for Hadley Freeman at The Guardian, indicating his writing work.
X
lucidrains · Phil Wang
My girlfriend and I are delighted to announce the birth of our first son, Jeghro.
“DeepSeek Summary: Phil Wang announced the birth of his first son.
X
srush_io · Sasha Rush
Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created
[Tooling]
“DeepSeek Summary: Sasha Rush announces joining Cursor, an ambitious small team.
X
srush_io · Sasha Rush
Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.
[LLM]
“DeepSeek Summary: Sasha Rush makes a bet about Transformers with Jonathan Frankle.
X
If you were holding off to try @MSFTDeepSpeed ZeRO++ it looks like deepspeed@master should
[Infra][Deployment]
“DeepSeek Summary: Stas Bekman notes that DeepSpeed ZeRO++ is now available on the master branch, encouraging users to try it.
X
Hear, hear, I'm excited to introduce a new performance metric: Maximum Achievable Matmul
[Infra][Evaluation]
“DeepSeek Summary: Stas Bekman introduces a new performance metric called Maximum Achievable Matmul for evaluating compute efficiency.
X
Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can
[Tooling]
“DeepSeek Summary: Stas Bekman acknowledges a contribution to the Machine Learning Engineering Open Book, expanding its content.
X
Classical Jensen math. Unidirectional bandwidth is topped at 450GB/s, and then there comes a protocol overhead of two digit percentage. 1.
[Infra]
“DeepSeek Summary: Stas Bekman discusses bandwidth limitations and protocol overhead in high-performance computing.
X
sayakpaul · Sayak Paul
Live a little, love a little, take time out to find happiness in small things, be grateful as we have one life. #lifemantra #WorkLifeBalance
“DeepSeek Summary: A personal reflection on finding happiness and gratitude in daily life.
X
sayakpaul · Sayak Paul
Together w/ the community, our initiative of profiling Diffusers pipelines & potentially improving them is going very strong
[Infra][Deployment]
“DeepSeek Summary: Community-driven effort to profile and improve Diffusers pipelines.
X
philschmid · Philipp Schmid
I read three technical reports from Moonshot AI's Kimi K2.5 paper, Cursor's Composer 2 report and blog post, and Chroma's Context-1 write-up
[LLM][Tooling]
“DeepSeek Summary: Philipp Schmid read technical reports on Kimi K2.5, Cursor Composer 2, and Chroma Context-1.
X
philschmid · Philipp Schmid
Random thought. We are going to be so much faster at creating and building.
[LLM]
“DeepSeek Summary: Philipp Schmid believes AI will accelerate creation and building.
X
philschmid · Philipp Schmid
Skills have become one of the most used extension points in agents. They're flexible, easy to make, and simple to distribute.
[Agent][Tooling]
“DeepSeek Summary: Skills are key extension points in agents due to flexibility and ease of use.
X
philschmid · Philipp Schmid
Last year I covered why isolating tasks into focused agents improves reliability. Since then, better planning and tool use have unlocked
[Agent][Deployment]
“DeepSeek Summary: Philipp Schmid discusses how isolating tasks into focused agents improves reliability, with advances in planning and tool use.
X
Ethan Mollick
I don't have much to add to the bubble discussion, but the “this time is different”
[LLM]
“DeepSeek Summary: Mollick comments on the AI bubble discussion, noting the 'this time is different' sentiment.
X
Ethan Mollick
On the plus side with Opus 4.7, if it does decide to think it produces BY FAR the best
[LLM]
“DeepSeek Summary: Mollick praises Opus 4.7 for producing the best results when it engages in reasoning.
X
Ethan Mollick
We are starting to see some nuanced discussions of what it means to work with advanced AI In this
[Agent]
“DeepSeek Summary: Mollick notes the emergence of nuanced discussions about working with advanced AI.
X
Emily M. Bender
Image is of the 1990s Microsoft writing assistant character Clippy with its eyebrows raised positioned in.
[Safety]
“DeepSeek Summary: Uses Clippy image to critique AI hype or nostalgia.
X
Emily M. Bender
For those playing along at home, here's a "AI is sentient!" argument bingo card.
[Safety]
“DeepSeek Summary: Creates a bingo card to mock common arguments for AI sentience.
X
Emily M. Bender
Facebook (sorry: Meta) AI: Check out our "AI" that lets you access all of humanity's knowledge.
[LLM]
“DeepSeek Summary: Sarcastically quotes Meta's AI announcement, implying hype.
X
Naomi Saphra
New preprint! Everyone loves causal interp. It's coherently defined! It makes testable predictions
[Safety][Evaluation]
“DeepSeek Summary: Naomi Saphra announces a new preprint on causal interpretability, emphasizing its coherent definition and testable predictions.
X
Angela Zhou
#throwback coz it's finally the day again!!! #HellOnWheels back on AMC 9/8c tonight!
[Deployment]
“DeepSeek Summary: Angela Zhou promotes the return of the TV show Hell on Wheels, indicating her involvement as an actor-writer.
X
Ben Recht
Revisiting Sutton's Bitter Lesson in the wake of GPT-5.
[LLM][Evaluation]
“DeepSeek Summary: Ben Recht revisits Sutton's Bitter Lesson in the context of GPT-5, likely discussing the implications of scaling laws and the role of general methods over specialized knowledge.
X
Ben Recht
And awesome to see many Berkeley alums thriving here. @LaurentLessard, @DimitrisPapail, and Shivaram
[Evaluation]
“DeepSeek Summary: Ben Recht notes the success of Berkeley alumni in the field, tagging several individuals.
X
Ben Recht
For the first time in almost a decade, I'm teaching a class on learning and control.
[Evaluation]
“DeepSeek Summary: Ben Recht announces teaching a class on learning and control after a long hiatus, indicating a return to a core research area.
-- END OF LOG --
[STATS] 61 items · Filter applied
Powered by Horizon + DeepSeek