Global Timeline

What I expect to come next and why, focused on the open-closed gap.

Highlights: The author predicts that by mid-2026, the gap between open and closed AI models will significantly narrow, with open models achieving performance parity in key areas. This shift is expected to be driven by advancements in training efficiency, data curation, and collaborative development within the open-source community.

Worth reading: It offers a forward-looking perspective on the evolving AI landscape, grounded in technical trends, making it valuable for developers, researchers, and anyone interested in the future of accessible AI technology.

Blog

What I've been up to!

Highlights: The post offers a personal update on Nathan Lambert's multifaceted contributions to AI/ML, including the ATOM Report for technical insights, a post-training course for practical education, and his book for broader dissemination of knowledge. It highlights the importance of bridging research, education, and community engagement in advancing the field.

Worth reading: It provides a concise overview of current projects from an active researcher, useful for those interested in AI/ML trends, educational resources, or community contributions.

Blog

Further reflections on China's high-participation, open-first AI ecosystem.

Highlights: The post explores how open model ecosystems, particularly in China, create compounding benefits through high participation and open-first strategies. It argues that openness accelerates innovation and leads to more robust AI development compared to closed systems.

Worth reading: It provides a unique perspective on the dynamics of open AI ecosystems, especially in the context of China's approach, which is often underrepresented in Western discussions.

Blog

Another dance around fears of open-source.

Highlights: The post critiques the 'Claude Mythos' narrative that overstates risks of open-weight AI models, arguing it's a form of fearmongering that distracts from more substantive discussions. It suggests this pattern reflects recurring anxieties in open-source debates rather than new, evidence-based concerns.

Worth reading: It offers a critical perspective on current AI discourse, challenging common assumptions about open-source risks and encouraging more nuanced evaluation of model accessibility.

Blog

Lessons from my trip to talk to most of the leading AI labs in China.

Highlights: China's AI labs are highly focused on practical applications and large-scale engineering, often prioritizing rapid iteration over theoretical novelty. The ecosystem is characterized by intense competition, strong government support, and a unique blend of open-source contributions and proprietary development.

Worth reading: Offers rare firsthand insights into China's AI landscape, revealing how cultural and policy differences shape research priorities and innovation cycles.

Blog
How to Work and Compound with AI

Eugene Yan · May 3, 2026

Context as infra, taste as config, verification for autonomy, scale via delegation, closing the loop.

Highlights: The post frames working with AI as a compound process, where context serves as infrastructure, taste as configuration, and verification enables autonomy. It emphasizes scaling through delegation and closing the loop for continuous improvement. The core insight is that effective AI collaboration requires intentional design of these elements.

Worth reading: It offers a practical framework for integrating AI into workflows, moving beyond hype to actionable strategies. The author's experience with applied AI provides credible, nuanced advice for practitioners.

Blog

A learning-oriented workflow for understanding new open-weight model releases

Highlights: The post presents a systematic, learning-focused approach for analyzing new open-weight LLM architectures, emphasizing practical understanding over theoretical abstraction. It likely details a repeatable workflow that helps practitioners efficiently grasp architectural innovations and their implications.

Worth reading: It offers actionable guidance for staying current with rapidly evolving LLM releases, making it valuable for developers, researchers, and enthusiasts seeking to deepen their practical understanding of model architectures.

Blog

An eventful month with one flagship release after another

Highlights: The post reviews a wave of major open model releases including Gemma 4, DeepSeek V4, Kimi K2.6, MiMo 2.5, and GLM-5.1, highlighting the rapid pace of innovation in open AI. It focuses on CAISI's V4 assessment, providing a comparative analysis of performance and capabilities across these models.

Worth reading: If you follow open-source AI, this post offers a concise yet comprehensive snapshot of the latest model landscape, saving you hours of individual paper reviews.

Blog

From Gemma 4 to DeepSeek V4, How New Open-Weight LLMs Are Reducing Long-Context Costs

Highlights: This post covers recent advances in LLM architectures aimed at reducing memory and compute costs for long-context processing, including KV sharing, manifold-constrained hyper-connections (mHC), and compressed attention mechanisms. Key examples include Gemma 4's and DeepSeek V4's approaches to efficient attention, which enable handling longer sequences without proportional resource increases.

Worth reading: For practitioners and researchers working with LLMs, this article provides a concise overview of cutting-edge techniques that address the scalability bottleneck of long-context models, offering practical insights into how open-weight models are evolving.

Blog
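To make the KV-sharing idea in the entry above concrete, here is a minimal back-of-the-envelope sketch of how KV-cache memory scales with context length and how sharing one K/V cache across groups of layers reduces it. The function name, the `kv_share_group` parameter, and the example configuration are hypothetical and chosen for illustration; they are not the actual Gemma 4 or DeepSeek V4 settings, and real sharing schemes may differ in detail.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, kv_share_group=1):
    """Approximate KV-cache size in bytes for a single sequence.

    kv_share_group is a hypothetical knob: the number of consecutive
    layers that reuse one K/V cache (1 means no sharing).
    """
    if kv_share_group < 1 or num_layers % kv_share_group != 0:
        raise ValueError("kv_share_group must be >= 1 and divide num_layers")
    # Only one K/V cache is stored per group of sharing layers.
    unique_layers = num_layers // kv_share_group
    # Factor 2 accounts for storing both keys and values.
    return 2 * unique_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


# Illustrative (made-up) configuration: 32 layers, 8 KV heads,
# head dimension 128, 128K-token context, 16-bit cache entries.
baseline = kv_cache_bytes(32, 8, 128, 131072)
shared = kv_cache_bytes(32, 8, 128, 131072, kv_share_group=2)
print(f"no sharing:     {baseline / 2**30:.1f} GiB")
print(f"pairs of layers: {shared / 2**30:.1f} GiB")
```

Under these assumptions the cache grows linearly with sequence length, so halving the number of layers that keep their own K/V tensors halves the memory at any context size.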

And yes, I hate consortia too.

Highlights: The article argues that despite general skepticism toward consortia, the AI field urgently requires an open model consortium to ensure transparency, collaboration, and ethical standards. This collective approach is framed as essential for addressing the rapid, often opaque advancements in AI development.

Worth reading: It offers a pragmatic perspective on overcoming industry fragmentation and highlights the critical role of open collaboration in shaping responsible AI innovation.

Blog

The complex factors that determine the single evaluation number so many focus on. Plus, how this changes in the future.

Highlights: The post critiques the oversimplification of AI performance metrics, particularly the 'open-closed performance gap' often reduced to a single number. It argues this gap is shaped by complex, interdependent factors beyond simple comparisons, and explores how these dynamics might evolve with future AI advancements.

Worth reading: It offers a nuanced perspective on evaluating AI systems, moving beyond surface-level metrics to consider underlying complexities and future implications, which is valuable for practitioners and enthusiasts seeking deeper understanding.

Blog

‘Distillation attacks’ is a horrible term for what is happening right now.

Highlights: The post criticizes the term 'distillation attacks' as misleading and argues that the current trend of smaller models learning from larger ones is a natural and beneficial progression in AI development.

Worth reading: It offers a clear, critical perspective on a hot topic in AI, helping readers understand the nuances of model distillation beyond the hype.

Blog
If you're trying out FA4, you're likely to run into not being able to load cutlass.cute

Highlights: Stas Bekman warns that anyone trying out FA4 (FlashAttention-4) is likely to hit a failure to load cutlass.cute.

Worth reading: Useful for developers experimenting with Flash Attention 4.

Infra, Tooling
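As a quick way to check whether an environment is affected by the import problem described above, the following is a minimal diagnostic sketch. The helper name `can_import` is hypothetical, and the snippet only reports whether the module can be imported; the cause and the fix from the original post are not captured here, so none is assumed.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if the named module imports cleanly, False otherwise."""
    try:
        importlib.import_module(module_name)
    except Exception:
        # Broad catch on purpose: this is a diagnostic probe, and a broken
        # install can fail with errors other than ImportError.
        return False
    return True


# Probe the parent package first, then the submodule named in the post.
for name in ("cutlass", "cutlass.cute"):
    status = "ok" if can_import(name) else "not importable"
    print(f"{name}: {status}")
```

If the parent package imports but the submodule does not, that at least narrows the problem to the installed package's contents or version.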
Ben Recht

@beenwrekt

For the first time in almost a decade, I'm teaching a class on learning and control.

Highlights: Ben Recht announces teaching a class on learning and control after a long hiatus.

Worth reading: Shows his renewed engagement with teaching at the intersection of learning and control theory.

Evaluation
Ben Recht

@beenwrekt

Building a theory of the architecture of organizing machines and people.

Highlights: Recht is working on a theory for organizing both machines and people.

Worth reading: Reflects his interest in the broader implications of machine learning on organizational structures.

Agent
Ben Recht

@beenwrekt

On unquantifiable costs and inherent tradeoffs in decision theory.

Highlights: Recht discusses the challenges of unquantifiable costs and tradeoffs in decision theory.

Worth reading: Highlights his critical perspective on the limitations of optimization in decision-making.

Safety
Naomi Saphra

@NaomiSaphra

New preprint! Phase transitions! We love to see them during LM training.

Highlights: Announces a new preprint about phase transitions in language model training.

Worth reading: Relevant for researchers interested in training dynamics and phase transitions in LLMs.

LLM, Fine-tuning
Naomi Saphra

@NaomiSaphra

Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I couldn't be more pumped to join a

Highlights: Announces new faculty position at Boston University focusing on LM interpretability.

Worth reading: Highlights career move and BU's research initiatives in interpretability.

Evaluation, Safety
Ethan Mollick

@emollick

I broke my own rule to never post about AI detection as it is fraught in many ways. The problem is that if you use AI a lot, you know AI writing on sight, which makes the difficulty of objectively proving that AI use to others very frustrating

Highlights: Mollick argues that heavy AI users can recognize AI writing intuitively, but struggle to prove it objectively, highlighting the limitations of AI detection.

Worth reading: It exposes the gap between subjective AI recognition and objective proof, a key issue in AI evaluation.

Evaluation
Ethan Mollick

@emollick

In 1980, the philosopher John Searle proposed a thought experiment: a person locked in a room, manipulating Chinese characters according to a

Highlights: Mollick references Searle's Chinese Room argument, likely to discuss implications for AI understanding and consciousness.

Worth reading: Connects classic philosophy to modern AI debates about whether LLMs truly understand language.

LLM
Sayak Paul

@sayakpaul

1. Read the post. 2. Contemplate. 3. Repeat 1.

Highlights: Sayak Paul shares a simple three-step process for engaging with content: read, contemplate, repeat.

Worth reading: It emphasizes thoughtful engagement over passive consumption.

LLM
Sayak Paul

@sayakpaul

While I was in SF, I had a chance to present all the things in the diffusion community enabled by PyTorch at the

Highlights: Sayak Paul presented diffusion community advancements enabled by PyTorch during a visit to San Francisco.

Worth reading: Showcases the intersection of PyTorch and diffusion models in AI research.

Multi-modal, Infra
I have been compiling LLM/VLM training logbooks/chronicles. This is one of the best sources to ...

Highlights: Stas Bekman compiles LLM/VLM training logbooks, providing a valuable resource for training insights.

Worth reading: Essential for anyone involved in training large language or vision models.

LLM, Fine-tuning, Infra
Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can ...

Highlights: Acknowledges a contribution to the Machine Learning Engineering Open Book, expanding its capabilities.

Worth reading: Highlights collaborative improvements to open-source ML engineering resources.

Tooling, Infra
This is a long overdue section of the ML Engineering Understanding Training Loss Patterns ...

Highlights: Introduces a new section on understanding training loss patterns in ML engineering.

Worth reading: Provides crucial knowledge for diagnosing and improving model training.

LLM, Fine-tuning, Evaluation
Modern art. Artist: PyTorch memory profiler Model: Llama-8B The piece on the left is the ...

Highlights: Uses PyTorch memory profiler output as a form of modern art, showing memory patterns of Llama-8B.

Worth reading: Creative visualization of memory profiling, useful for understanding model memory usage.

Infra, Tooling
Max Woolf

@minimaxir

me irl

Highlights: A short, relatable post with an image (content not fully captured).

Worth reading: Demonstrates Woolf's casual, personal style on social media.

Yann LeCun

@ylecun

Dario is wrong. He knows absolutely nothing about the effects of technological revolutions on the labor market.

Highlights: LeCun criticizes Dario's understanding of technological revolutions and labor market effects.

Worth reading: Shows LeCun's stance on AI's impact on jobs and his disagreement with other AI leaders.

Safety
Yann LeCun

@ylecun

I love Geoff. But he understands even less than Dario about the effects of technological revolutions on

Highlights: LeCun critiques Geoff Hinton's understanding of technological revolutions on labor, while expressing personal affection.

Worth reading: Highlights LeCun's differing views from other prominent AI researchers on economic impacts.

Safety
Yann LeCun

@ylecun

It seems to me that before "urgently figuring out how to control AI systems much smarter than us" we need

Highlights: LeCun questions the urgency of controlling superintelligent AI, suggesting a different priority.

Worth reading: Reflects his skepticism about AI risk narratives and his focus on more immediate challenges.

Safety