Intelligence.Log

2026-04-21

Extracted: 69 items. Sources: GitHub, Bluesky, X, Blogs.

++ AI OVERVIEW ++

Today's trending GitHub repos highlight a sharp contrast between serious AI development and playful creativity. The standout is OpenAI's `openai-agents-python`, a lightweight framework for building multi-agent workflows that already has 24.1k stars. Meanwhile, the whimsical "pelicans riding bicycles" SVG collection offers a lighter counterpoint. On Bluesky, AI leaders are dissecting model capabilities: Ethan Mollick notes that Kimi 2.6 Thinking seems very good for an open-weight model but has many rough edges compared to closed-source state of the art, so the gap remains.

◆ Signal

Co-Starred · Last 7 days

Repos independently starred by multiple AI leaders in the week ending 2026-04-21. Stronger signal = more overlap.

huggingface/ml-intern

×2 starrers · ▲ 7/10 · ★ 688

🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models

by: cfahlgren1, pcuenca

[Agent][LLM][Tooling]
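
The log does not say how the co-starred signal is computed. The following is a minimal sketch, assuming it simply counts distinct starrers per repo and keeps repos with at least two. The `STARS` list is copied from the entries in this log; the function name `co_starred` and the `min_starrers` parameter are hypothetical.

```python
from collections import defaultdict

# Star events as (starrer, repo) pairs, copied from the entries in this log.
STARS = [
    ("Alenryuichi", "openai/openai-agents-python"),
    ("simonw", "google-labs-code/design.md"),
    ("pcuenca", "huggingface/ml-intern"),
    ("cfahlgren1", "huggingface/ml-intern"),
    ("simonw", "scosman/pelicans_riding_bicycles"),
]


def co_starred(stars, min_starrers=2):
    """Return (repo, sorted starrers) for repos with at least min_starrers distinct starrers."""
    by_repo = defaultdict(set)
    for starrer, repo in stars:
        by_repo[repo].add(starrer)
    hits = [
        (repo, sorted(who))
        for repo, who in by_repo.items()
        if len(who) >= min_starrers
    ]
    # Most overlap first; ties broken by repo name so the order is stable.
    return sorted(hits, key=lambda item: (-len(item[1]), item[0]))


for repo, who in co_starred(STARS):
    print(f"{repo} x{len(who)} starrers: {' '.join(who)}")
```

With the five star events above, only huggingface/ml-intern clears the two-starrer threshold, which matches the single co-starred entry listed here.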

openai/openai-agents-python · ★ 24.1k · ▲ 8/10

A lightweight, powerful framework for multi-agent workflows

Starred by Alenryuichi | [Agent][LLM][Tooling]

“A lightweight framework from OpenAI for building multi-agent workflows with modular components and orchestration capabilities. It enables developers to create complex agent interactions while maintaining simplicity and performance.”

google-labs-code/design.md · ★ 1.1k · ▲ 7/10

A format specification for describing a visual identity to coding agents. DESIGN.md gives agents a persistent, structured understanding of a design system.

Starred by simonw | [Agent][Tooling]

“A format specification for describing visual identity to coding agents, enabling persistent, structured understanding of design systems. It provides agents with consistent design context across coding tasks.”

huggingface/ml-intern · ★ 0.7k · ▲ 7/10

🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models

Starred by pcuenca | [Agent][LLM][Tooling]

“An open-source ML engineer that automates the end-to-end ML pipeline from reading research papers to training and deploying models. It aims to reduce manual effort in implementing and operationalizing machine learning research.”

huggingface/ml-intern · ★ 0.7k · ▲ 7/10

🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models

Starred by cfahlgren1 | [Agent][LLM][Tooling]

“An open-source ML engineer that automates the end-to-end ML workflow from reading research papers to training and deploying models. It aims to reduce manual effort in implementing and operationalizing machine learning research.”

scosman/pelicans_riding_bicycles · ★ 0.1k · ▲ 3/10

This is a collection of the best SVG images of pelicans riding bicycles.

Starred by simonw | [Multi-modal]

“This repository is a curated collection of SVG images depicting pelicans riding bicycles, offering a whimsical and creative visual dataset. It serves as a unique resource for artistic inspiration or testing image processing tools with unconventional imagery.”

BSKY

Ethan Mollick · Apr 21, 02:14 AM

Kimi 2.6 Thinking seems very good for an open weights model, but many rough edges compared to closed SoTA. The gap remains. The Lem Test resulted in a 74 page thinking trace... and an okay-ish answer. It did an okay TikZ unicorn, an adequate twigl shader for a neogothic city in the waves, etc.

❤️ 21 Likes | [LLM][Evaluation]

BSKY

angela zhou · Apr 21, 02:10 AM

www.seriouseats.com/mapo-beans w www.ranchogordo.com/products/cal...

❤️ 1 Like

BSKY

Simon Willison · Apr 21, 11:23 PM

This is so confusing. Did Anthropic really just drop Claude Code from their $20/month plan? Why would they do that through a pricing page update without making a proper announcement? Plus, $20/month still gets you Cowork, which is really just Claude Code wearing a non-threatening hat!

❤️ 93 Likes | [Deployment][Tooling]

BSKY

Simon Willison · Apr 21, 08:36 PM

I came up with a somewhat foolish new benchmark for testing image generation models, to exercise the new ChatGPT Images 2.0: "Do a where's Waldo style image but it's where is the raccoon holding a ham radio" simonwillison.net/2026/Apr/21/...

❤️ 97 Likes | [Evaluation][Multi-modal]

BSKY

Mark Riedl · Apr 21, 08:19 PM

I want to be paid $2,000 per hour to hallucinate www.ft.com/content/657d...

❤️ 9 Likes | [LLM][Safety]

BSKY

Mark Riedl · Apr 21, 11:43 AM

That’s not normal

❤️ 47 Likes | [Safety]

BSKY

Marc Lanctot · Apr 21, 06:15 PM

#silo back for Season 3 July 3rd!! 😱🤩 www.youtube.com/watch?v=C9-_...

❤️ 11 Likes

BSKY

Ethan Mollick · Apr 21, 07:52 PM

Though the images are very good, ChatGPT Image 2.0 does have the typical imagegen problem, which is that editing can be "stubborn", and attempts to get the AI to change details work well for the first round or two, but then progress slows. Putting the image in a new chat and starting from that helps

❤️ 31 Likes | [Multi-modal]

BSKY

Ethan Mollick · Apr 21, 07:02 PM

I have been using GPT ImageGen-2 for the past weeks. I didn't think that better image-generators would be a big deal, but it turns out that there is a quality threshold I didn't expect, where you can now get usable text, slides, academic papers. Look at what it does with my "otter test"! (Zoom in)

❤️ 100 Likes | [Multi-modal]

Andrej Karpathy @karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on depth=12 model.

[Fine-tuning]

“DeepSeek Summary: Karpathy conducted an autoresearch tuning experiment on a nanochat model for approximately two days.”

Andrej Karpathy @karpathy

I'm being accused of overhyping the [site everyone heard too much about today already].

“DeepSeek Summary: Karpathy addresses accusations of overhyping a trending topic or platform.”

Andrej Karpathy @karpathy

I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits

[Tooling]

“DeepSeek Summary: Karpathy expresses feeling overwhelmed by the rapid changes in programming due to AI advancements.”

Andrej Karpathy @karpathy

We're missing (at least one) major paradigm for LLM learning. Not sure what to call it,

[LLM]

“DeepSeek Summary: Karpathy suggests there is a significant, yet unidentified, learning paradigm gap in LLM development.”

Simon Willison @simonw

Here's the concluding section of my write-up of the new OpenAI open weights models - they're

[LLM][Evaluation]

“DeepSeek Summary: Simon Willison shares concluding thoughts on OpenAI's new open weights models in his write-up.”

Simon Willison @simonw

I'm beginning to suspect that a key skill in working effectively with coding agents is developing an intuition for when you don't need to

[Agent][Tooling]

“DeepSeek Summary: Suggests that effective use of coding agents requires knowing when not to use them.”

Simon Willison @simonw

Vibe coding is irresponsibly building software through dice rolls, not caring what code is produced

[Agent][Safety]

“DeepSeek Summary: Critiques 'vibe coding' as irresponsible software development through random generation.”

Simon Willison @simonw

If you're just starting to learn software engineering right now but you're considering dropping it

[Tooling]

“DeepSeek Summary: Addresses beginners in software engineering who might be considering quitting.”

Harrison Chase @hwchase17

I am not excited about visual workflow builders 1. Not simple enough for the average user

[Tooling]

“DeepSeek Summary: Harrison Chase expresses skepticism about visual workflow builders, arguing they are not sufficiently simple for average users.”

Harrison Chase @hwchase17

We launched LangSmith Agent Builder this week as a no-code way to build agents. A key part of Agent Builder is its memory system.

[Agent][Tooling]

“DeepSeek Summary: Announces the launch of LangSmith Agent Builder, a no-code platform for building agents, emphasizing its memory system as a core feature.”

Harrison Chase @hwchase17

When building agents, you need to iterate on production data much more than when building traditional software. You need to iterate on how

[Agent][Deployment]

“DeepSeek Summary: Notes that developing agents requires more iteration on production data compared to traditional software, stressing the importance of iterative development processes.”

Harrison Chase @hwchase17

TL;DR: More and more agents need a workspace: a computer where they can run code, install packages, and access files. Sandboxes provide this

[Agent][Infra]

“DeepSeek Summary: Summarizes that agents increasingly require a workspace with capabilities like code execution and file access, which sandboxes can offer.”

Jim Fan @DrJimFan

Resource constraints are a beautiful thing. Survival instinct in a cut-throat AI competitive land

[Infra]

“DeepSeek Summary: Argues that resource limitations foster innovation and competitive survival instincts in the AI industry.”

Jim Fan @DrJimFan

The first time I met Jensen was also the first time I met @elonmusk. I was interning at OpenAI that day and

“DeepSeek Summary: Shares a personal anecdote about meeting both Jensen Huang and Elon Musk during an OpenAI internship.”

Jim Fan @DrJimFan

I've been a bit quiet on X recently. The past year has been a transformational experience.

“DeepSeek Summary: Acknowledges a period of reduced public communication while hinting at significant personal or professional transformation.”

Jim Fan @DrJimFan

It gives me a lot of comfort knowing that we are the last generation without advanced robots everywhere.

[Agent]

“DeepSeek Summary: Expresses a philosophical comfort in living during a transitional period before ubiquitous advanced robotics.”

Jim Fan @DrJimFan

Everyone's freaking out about vibe coding. In the holiday spirit, allow me to share my anxiety on the wild

[Tooling]

“DeepSeek Summary: Comments on the trend of 'vibe coding' while expressing personal anxiety about its implications.”

Jeremy Howard @jeremyphoward

I replicated this result, that Grok focuses nearly entirely on finding out what Elon thinks in

[LLM][Evaluation]

“DeepSeek Summary: Jeremy Howard replicated a finding that Grok AI heavily prioritizes discovering Elon Musk's opinions when responding to queries.”

Jeremy Howard @jeremyphoward

Here's a complete unedited video of asking Grok for its views on the Israel/Palestine situation. It first searches twitter for what Elon thinks.

[LLM][Agent][Evaluation]

“DeepSeek Summary: Howard demonstrates Grok's response process to a complex geopolitical question, showing it first searches for Elon Musk's Twitter/X posts.”

Jeremy Howard @jeremyphoward

Something that drives me to distraction in discussion of AI alignment: someone will say 'Oh, it's crucial we build systems with properties X'

[Safety]

“DeepSeek Summary: Jeremy Howard expresses frustration with vague or unsubstantiated claims in AI alignment discussions.”

Soumith Chintala @soumithchintala

reading "AI News" (previously Smol Talk) is probably the highest-leverage 45 mins

[LLM]

“DeepSeek Summary: Soumith Chintala recommends "AI News" (formerly Smol Talk) as a highly valuable 45-minute activity for staying informed about AI developments.”

Soumith Chintala @soumithchintala

Sometimes we forget that NVIDIA wins because it's a software company.

[Infra][Tooling]

“DeepSeek Summary: Chintala emphasizes that NVIDIA's success stems from its software capabilities, not just hardware, offering a nuanced perspective on the company's competitive advantage.”

Soumith Chintala @soumithchintala

MacStudio you ask? Apple Engineering's **actual** time spent on PyTorch support

[Infra][Tooling][Deployment]

“DeepSeek Summary: The tweet suggests or comments on the significant engineering effort Apple dedicates to PyTorch support, particularly in relation to the MacStudio.”

Francois Chollet @fchollet

I think it's clear that for many smaller companies that invested in deep learning, it turned out

[Deployment][Evaluation]

“DeepSeek Summary: François Chollet suggests that deep learning investments haven't paid off for many smaller companies, implying practical limitations or implementation challenges.”

Francois Chollet @fchollet

One of the biggest misconceptions people have about intelligence is seeing it as some kind of unbounded scalar stat, like height.

[Evaluation][Agent]

“DeepSeek Summary: Chollet challenges the common view of intelligence as a single, measurable quantity, suggesting it's more complex and multidimensional.”

David Ha @hardmaru

David Ha @hardmaru and team are super practical scientifically research driven geniuses. And this is amazing to see

[Evaluation]

“DeepSeek Summary: A third party praises David Ha and his team as practical, scientifically research-driven geniuses, indicating recognition of their work's quality and impact.”

Fei-Fei Li @drfeifei

Very excited to share @theworldlabs 's latest research work RTFM!! It's a real-time, ...

[Multi-modal][Tooling]

“DeepSeek Summary: Fei-Fei Li announces and shares excitement about her company World Labs' latest research work called RTFM, which appears to be a real-time system.”

Max Woolf @minimaxir

me irl

“DeepSeek Summary: A personal, casual tweet expressing a relatable moment or feeling.”

Max Woolf @minimaxir

“DeepSeek Summary: A tweet with engagement (19 likes) but no visible text content in the provided snippet.”

Max Woolf @minimaxir

“DeepSeek Summary: A tweet with views (468) but no visible text content in the provided snippet.”

Phil Wang @lucidrains

I got to cover for the excellent @HadleyFreeman in the Guardian today so

“DeepSeek Summary: Phil Wang mentions covering for Hadley Freeman at the Guardian, indicating professional writing/journalism work.”

Phil Wang @lucidrains

My Halloween costume this year is 'Sexy Stand-Up Comedian'

“DeepSeek Summary: A humorous tweet about his Halloween costume, playing on his profession as a comedian.”

Sasha Rush @srush_io

Sasha Rush (@srush_nlp). 6 likes 464 views.

“DeepSeek Summary: A tweet by Sasha Rush that received 6 likes and 464 views, indicating engagement with their content.”

Sasha Rush @srush_io

Sasha Rush (@srush_nlp). 7 likes.

“DeepSeek Summary: Another tweet by Sasha Rush that garnered 7 likes, demonstrating consistent posting and audience engagement.”

Stas Bekman @stas00

I have been compiling LLM/VLM training logbooks/chronicles. This is one of the best sources to...

[LLM][Multi-modal]

“DeepSeek Summary: Stas Bekman has been compiling training logbooks/chronicles for LLMs and VLMs, which he considers a valuable resource.”

Stas Bekman @stas00

Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can...

[Tooling][Deployment]

“DeepSeek Summary: Acknowledges contribution to the Machine Learning Engineering Open book, suggesting updates or new capabilities.”

Stas Bekman @stas00

If you were holding off to try @MSFTDeepSpeed ZeRO++ it looks like deepspeed@master should...

[Infra][Tooling]

“DeepSeek Summary: Suggests that the master branch of DeepSpeed is now ready for trying ZeRO++, an optimization technique.”

Stas Bekman @stas00

Modern art. Artist: PyTorch memory profiler Model: Llama-8B The piece on the left is the...

[LLM][Tooling]

“DeepSeek Summary: Uses artistic metaphor to describe PyTorch memory profiling output for Llama-8B model.”

Sayak Paul @sayakpaul

Working at Hugging Face over the past 3.5+ years has allowed me to identify what technical areas truly interest me! In turn, that has allowed me to directly

[Infra][Tooling]

“DeepSeek Summary: Reflection on career growth at Hugging Face, identifying personal technical interests through professional experience.”

Philipp Schmid @philschmid

I read three technical reports from Moonshot AI's Kimi K2.5 paper, Cursor's Composer 2 report and blog post, and Chroma's Context-1 write-up

[LLM][Evaluation]

“DeepSeek Summary: Philipp Schmid is actively reading and engaging with technical reports from leading AI companies including Moonshot AI, Cursor, and Chroma.”

Philipp Schmid @philschmid

Random thought. We are going to be so much faster at creating and building.

[Tooling]

“DeepSeek Summary: Expresses optimism about accelerating development and creation capabilities, likely in the context of AI and technology.”

Philipp Schmid @philschmid

Skills have become one of the most used extension points in agents. They're flexible, easy to make, and simple to distribute.

[Agent][Tooling]

“DeepSeek Summary: Identifies 'skills' as a key modular component in AI agent systems, highlighting their flexibility, ease of creation, and distribution.”

Ethan Mollick @emollick

It's a weird time to post about AI because a lot of people are vastly underestimating what AI can do & how many large-scale impacts on work are inevitable with today's models… …while a lot of other people underestimate the real world problems involved in getting value from AI.

[Deployment][Evaluation]

“DeepSeek Summary: Highlights the dual challenge in AI discourse: many underestimate AI's capabilities and inevitable work impacts, while others overlook the practical difficulties in extracting real-world value from AI.”

Ethan Mollick @emollick

AI is actually pretty good at ideas as well. https://t.co/S2DL7obVk1

[LLM][Evaluation]

“DeepSeek Summary: Counters the common notion that AI is only for execution by asserting that AI demonstrates strong capability in generating ideas.”

Ethan Mollick @emollick

As stories about AI increasingly become stories of either catastrophe or salvation,

[Safety][Evaluation]

“DeepSeek Summary: Observes the polarization in AI narratives, which tend to frame AI as either apocalyptic threat or utopian solution.”

Ethan Mollick @emollick

So much work is going into faking continual learning and memory for AIs,

[LLM][Infra]

“DeepSeek Summary: Points out significant engineering effort being devoted to simulating continuous learning and memory capabilities in AI systems, rather than achieving true, inherent functionality.”

Emily M. Bender @emilymbender

EMILY M. BENDER: Yeah. And so passive, like, oops, the moon, the moon went further away. It's like no, actually, you made some decisions.

[Safety][Evaluation]

“DeepSeek Summary: Critiques the passive framing of AI outcomes, emphasizing that decisions are made by people, not just accidental occurrences.”

Emily M. Bender @emilymbender

Image is of the 1990s Microsoft writing assistant character Clippy with its eyebrows raised positioned in.

[Agent][Tooling]

“DeepSeek Summary: Uses the iconic Clippy character to make a point, likely about AI assistants or hype.”

Emily M. Bender @emilymbender

Fundamental point that ~all people who see LLMs as 'AI' seem to be missing: The *only ...

[LLM][Evaluation]

“DeepSeek Summary: Argues that a critical misunderstanding exists when labeling LLMs as 'AI', hinting at a missing fundamental distinction.”

Naomi Saphra @NaomiSaphra

what a perfect space for scientific discourse! I'll start off with a few images of myself

[Evaluation]

“DeepSeek Summary: Naomi Saphra comments on a space for scientific discourse and shares personal images.”

Naomi Saphra @NaomiSaphra

Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I couldn't be more pumped to join a

[LLM][Evaluation]

“DeepSeek Summary: Announces starting as faculty at Boston University in 2026, excited about their LM interpretability and analysis programs.”

Ben Recht @beenwrekt

I weigh in on the Trump administration's newfound obsession with Gold Standard Science and reproducibility.

[Evaluation]

“DeepSeek Summary: Commentary on political focus on scientific standards and reproducibility.”

Ben Recht @beenwrekt

For the first time in almost a decade, I'm teaching a class on learning and control.

[Agent]

“DeepSeek Summary: Announcement of teaching return to learning and control course after long hiatus.”

Ben Recht @beenwrekt

With more equations than usual, I explain how policy gradient gives you a framework to randomly search for

[Agent]

“DeepSeek Summary: Technical explanation of policy gradient methods as structured random search frameworks.”

BLOG

Where's the raccoon with the ham radio? (ChatGPT Images 2.0)

OpenAI released ChatGPT Images 2.0 today (https://openai.com/index/introducing-chatgpt-images-2-0/), their latest image generation model. On the livestream (https://www.youtube.com/watch?v=sWkGomJ3TLI) Sam Altman said that the leap from gpt-image-1 to gpt-image-2 was...

By Simon Willison

“OpenAI's ChatGPT Images 2.0 represents a significant leap forward from its predecessor, with Sam Altman highlighting major improvements in image generation capabilities. The post explores the technical advancements and practical implications of this new model release.”

-- END OF LOG --

[STATS] 69 items · Filter applied