Recent Activity
Observable Plot charts for Datasette Agent
@simonwillison.net
@simonw
Highlights: Simon criticizes 'vibe coding' as an irresponsible approach to software development.
Worth reading: Highlights a key debate on coding practices with AI.
@simonw
Highlights: Simon notes that LLMs may not favor boring technology as predicted.
Worth reading: Offers insight into LLM adoption trends.
@simonw
Highlights: Simon suggests that knowing when not to use coding agents is a key skill.
Worth reading: Important for effective use of AI coding tools.
@simonw
Highlights: Simon argues coding agents remove excuses for skipping certain practices.
Worth reading: Reinforces the value of coding agents in improving software quality.
@simonwillison.net
Simon Willison·May 21, 2026
<p>We just <a href="https://datasette.io/blog/2026/datasette-agent/">announced the first release of Datasette Agent</a>, a new extensible AI assistant for Datasette. I've been working on my <a href="https://llm.datasette.io/">LLM</a> Python library for just over three years now, and Datasette Agent...
Highlights: Datasette Agent is a new extensible AI assistant for Datasette, built on the LLM Python library. It enables natural language querying of databases and can be extended with plugins for custom workflows. This release marks a significant step in combining AI with data exploration.
Worth reading: If you use Datasette for data analysis, this tool offers a powerful way to interact with your databases using natural language, making data queries more accessible and efficient.
Universal Rust multiplexer with a typed SDK — drive any CLI or TUI app from code. Native on Linux, macOS, and Windows.
Highlights: rmux is a universal Rust multiplexer that allows you to programmatically drive any CLI or TUI application via a typed SDK. It supports native execution on Linux, macOS, and Windows, making it a cross-platform tool for automating terminal interactions.
Worth reading: For AI engineers building agentic systems that need to interact with CLI tools, rmux provides a robust, type-safe way to control terminals programmatically, which is a common requirement in agent frameworks.
@simonw
Highlights: Simon Willison critiques 'vibe coding' as an irresponsible approach to software development.
Worth reading: It offers a critical perspective on a popular coding trend.
@simonw
Highlights: Simon Willison notes that LLMs might favor boring technology when paired with a good coding agent harness.
Worth reading: It provides insight into LLM behavior in coding contexts.
@simonw
Highlights: Simon Willison suggests that intuition for when not to intervene is key for coding agent effectiveness.
Worth reading: It offers practical advice for working with AI coding agents.
@simonwillison.net
@simonw
Highlights: Simon Willison argues that quitting programming due to LLMs is analogous to quitting carpentry due to power tools, implying LLMs are tools that augment rather than replace programmers.
Worth reading: Provides a balanced perspective on LLMs' impact on programming careers, countering fear with historical analogy.
@simonwillison.net
Simon Willison·May 19, 2026
<p>Today at Google I/O, Google <a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/">released Gemini 3.5 Flash</a>. This one skipped the <code>-preview</code> modifier and went straight to general availability, and Google appear to be using it for a whole lot...
Highlights: Google released Gemini 3.5 Flash directly to general availability, skipping the preview phase, and plans to integrate it across many products. Despite being more expensive, it offers improved performance and efficiency, making it a versatile model for various applications.
Worth reading: This post provides insight into Google's strategic shift towards a more powerful, production-ready model and its implications for developers and users.
Simon Willison·May 19, 2026
<p>I put together these annotated slides from my five minute lightning talk at PyCon US 2026, using the <a href="https://tools.simonwillison.net/annotated-presentations">latest iteration</a> of my <a href="https://simonwillison.net/2023/Aug/6/annotated-presentations/">annotated presentation...
Highlights: The post summarizes key developments in LLMs over the past six months, including the rise of multi-modal models, improved reasoning capabilities, and the increasing importance of evaluation frameworks. It highlights practical tools and techniques for working with LLMs, such as prompt engineering and fine-tuning.
Worth reading: It offers a concise, high-level overview of recent LLM advancements, making it useful for practitioners who want to stay updated without diving into lengthy technical papers.
Data collection and analysis for a PyCon talk on GitHub Actions security across Python packages.
Highlights: This repository provides data collection and analysis scripts for a PyCon talk on GitHub Actions security across Python packages. It offers insights into how GitHub Actions are used in the Python ecosystem and potential security risks.
Worth reading: It's worth exploring for insights into GitHub Actions security practices and for understanding the security posture of Python packages.
@simonw
Highlights: Simon observes that AI labs are prioritizing code generation as the primary benchmark for model improvement.
Worth reading: Reflects a key trend in AI development where coding ability is seen as a proxy for general intelligence.
PyPI downloads analytics dashboard
Highlights: PyPI Stats provides a public dashboard for viewing download statistics of Python packages from PyPI. It offers insights into package popularity and trends over time, useful for developers and maintainers.
Worth reading: For AI leaders tracking the adoption of their Python-based AI tools, this repo offers a simple way to monitor download metrics and gauge community interest.
A tool for measuring Python class cohesion.
Highlights: Cohesion is a flake8 plugin that measures Python class cohesion using the Lack of Cohesion of Methods (LCOM) metric. It helps developers identify classes that may be doing too much and should be refactored, improving code maintainability and adherence to the single responsibility principle.
Worth reading: For Python developers focused on code quality, this tool provides a concrete, automated way to detect low-cohesion classes, which is a key indicator of design issues in object-oriented code.
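To make the LCOM idea concrete, here is a minimal sketch of one common formulation: count the method pairs that share no instance attributes, subtract the pairs that share at least one, and floor the result at zero. This is an illustration only, not the plugin's actual implementation. The function names `method_attributes` and `lcom` and the sample classes are hypothetical, and treating every `self.<name>` access (including method calls) as an attribute is a simplifying assumption.

```python
import ast
from itertools import combinations

# Hypothetical sample input: one scattered class and one cohesive class.
SAMPLE = '''
class Mixed:
    def set_a(self):
        self.a = 1
    def get_a(self):
        return self.a
    def set_b(self):
        self.b = 2
    def log(self):
        print("hello")

class Tidy:
    def add(self, n):
        self.total = getattr(self, "total", 0) + n
    def report(self):
        return self.total
'''


def method_attributes(class_node):
    """Map each method name to the set of `self.<name>` names it touches."""
    usage = {}
    for item in class_node.body:
        if isinstance(item, ast.FunctionDef):
            attrs = set()
            for node in ast.walk(item):
                # Simplification: any `self.<name>` counts, including method calls.
                if (
                    isinstance(node, ast.Attribute)
                    and isinstance(node.value, ast.Name)
                    and node.value.id == "self"
                ):
                    attrs.add(node.attr)
            usage[item.name] = attrs
    return usage


def lcom(source, class_name):
    """Return max(disjoint pairs - sharing pairs, 0) for the named class."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            disjoint = shared = 0
            for a, b in combinations(method_attributes(node).values(), 2):
                if a & b:
                    shared += 1
                else:
                    disjoint += 1
            return max(disjoint - shared, 0)
    raise ValueError(f"class {class_name!r} not found")


if __name__ == "__main__":
    for name in ("Mixed", "Tidy"):
        print(name, lcom(SAMPLE, name))
```

A higher score suggests the methods operate on unrelated state, so the class may be a candidate for splitting. A score of zero means at least as many method pairs share state as not.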
@simonw
Highlights: Simon Willison criticizes 'vibe coding' as an irresponsible approach to software development.
Worth reading: Highlights a critical perspective on a trending coding methodology.
@simonw
Highlights: Simon Willison notes that LLMs may not favor boring technology as predicted.
Worth reading: Challenges common assumptions about LLM preferences.
@simonwillison.net
@simonw
Highlights: Simon Willison observes that AI labs are increasingly focused on improving code generation as a primary goal.
Worth reading: Reflects a key trend in AI development priorities.
@simonw
Highlights: Simon notes that AI labs are increasingly focused on improving code generation capabilities.
Worth reading: Reflects a key trend in AI development priorities.
@simonw
Highlights: LLMs may favor boring technology when paired with a good coding agent harness.
Worth reading: Challenges the assumption that LLMs always prefer novel tech.
@simonw
Highlights: Key skill for coding agents is knowing when not to intervene.
Worth reading: Insightful perspective on human-agent collaboration.
@simonw
Highlights: Defines vibe coding as irresponsible software development.
Worth reading: Critical view on a trending practice in AI-assisted coding.
@simonw
Highlights: Simon Willison defines vibe coding as irresponsible software development where code quality is neglected.
Worth reading: Provides a critical perspective on a trending AI-assisted coding practice.
@simonw
Highlights: Willison notes that LLMs may not favor boring technology as predicted.
Worth reading: Challenges a common assumption about LLM preferences in technology choices.
@simonwillison.net
26m function call model that runs on incredibly small devices
Highlights: A compact 26M parameter model optimized for function calling on resource-constrained devices, enabling on-device AI execution. It leverages Gemma architecture and is designed for low-latency, privacy-preserving inference.
Worth reading: It demonstrates how to run capable LLMs on tiny hardware, opening up edge AI applications like IoT and mobile assistants.
Highlights: This plugin enables Tailscale authentication for Datasette, allowing users to restrict access to their Datasette instances to Tailscale network members. It leverages Tailscale's identity and access controls for seamless, secure sharing.
Worth reading: For Datasette users already on Tailscale, this plugin provides a simple yet powerful way to add authentication without managing separate user databases, making it ideal for internal tools and collaborative data exploration.
@simonw
Highlights: Simon Willison shared a talk about the last six months in LLMs, illustrated with pelicans on bicycles.
Worth reading: Provides a creative and insightful overview of recent LLM developments.
@simonw
Highlights: Simon Willison observes that AI labs are increasingly focused on improving code generation capabilities.
Worth reading: Highlights a key trend in AI research priorities.
@simonwillison.net
@simonw
Highlights: LLMs attached to coding agents may favor boring technology, challenging earlier predictions.
Worth reading: Insight into how LLM behavior changes when integrated with agentic harnesses.
@simonw
Highlights: Key skill for working with coding agents is knowing when not to intervene.
Worth reading: Practical advice for developers using AI coding assistants.
@simonw
Highlights: Recommends best guidance for writing commit history.
Worth reading: Useful for developers aiming to improve their commit practices.
@simonw
Highlights: LLMs may favor boring technology when attached to a good coding agent harness.
Worth reading: Challenges the assumption that LLMs always prefer boring tech.
@simonw
Highlights: Key skill with coding agents: intuition for when not to intervene.
Worth reading: Insight into effective human-agent collaboration.
@simonw
Highlights: Vibe coding defined as irresponsible software development.
Worth reading: Critical perspective on a trending coding approach.
Code that accompanies the paper release for "LLMs Corrupt Your Documents When You Delegate"
Highlights: This repository provides code accompanying a paper that reveals a critical vulnerability in LLM-based delegation: when you delegate document processing to an LLM, it can corrupt your documents. It includes simulation tools to reproduce and study this failure mode, highlighting risks in long-horizon tasks.
Worth reading: It exposes a subtle but important failure mode in LLM agents that is often overlooked, making it essential for anyone building or deploying LLM-based automation.
@simonwillison.net
Continuous background code review database for agents, work faster and smarter with accountability for every line of generated code.
Highlights: Roborev provides a continuous background code review database specifically designed for AI agents, ensuring accountability for every line of generated code. It helps developers work faster and smarter by automatically tracking and reviewing code changes.
Worth reading: As AI-generated code becomes more prevalent, Roborev addresses the critical need for accountability and review, making it a timely tool for teams leveraging AI agents in development.
@simonwillison.net
DeepSeek 4 Flash local inference engine for Metal
Highlights: A local inference engine for DeepSeek 4 Flash optimized for Apple Metal, enabling fast LLM inference on Mac hardware. Written in C for performance, it provides a lightweight alternative to cloud-based inference.
Worth reading: Essential for AI developers using Apple Silicon who want to run DeepSeek models locally with minimal overhead and high speed.
@simonwillison.net
Simon Willison·May 7, 2026
<p>There weren't a lot of big new announcements from Anthropic at yesterday's Code w/ Claude event, but the biggest by far was the deal they've struck with SpaceX/xAI to use "all of the capacity of their Colossus data center".</p> <p>As I mentioned in my <a...
Highlights: Anthropic has struck a deal with SpaceX/xAI to use all of the capacity of the Colossus data center, signaling a major infrastructure collaboration. The move highlights the escalating demand for compute in AI development and the strategic partnerships forming to secure it.
Worth reading: The post offers a clear analysis of the implications of this deal for the AI industry, particularly around resource consolidation and competitive dynamics.
@simonw
Highlights: Simon observes that AI labs are increasingly focused on improving code generation capabilities.
Worth reading: Reflects a key trend in AI development priorities.
@simonwillison.net
Simon Willison·May 6, 2026
<p>I'm at Anthropic's Code w/ Claude event today. Here's my live blog of the morning keynote sessions.</p><p><em>You are only seeing the long-form articles from my blog. Subscribe to <a href="https://simonwillison.net/atom/everything/">/atom/everything/</a> to get all of my posts, or take a look at...
Highlights: Anthropic's Code w/ Claude event showcases new capabilities for AI-assisted coding, including improved code generation, debugging, and collaborative features. The live blog format provides real-time insights into keynote sessions, highlighting practical applications and future directions for Claude in software development.
Worth reading: For developers interested in the cutting edge of AI coding tools, this live blog offers firsthand observations of Claude's latest features and Anthropic's vision for AI-assisted programming.
@simonwillison.net
Simon Willison·May 6, 2026
<p>I recently talked with Joseph Ruscio about AI coding tools for Heavybit's High Leverage podcast: <a href="https://www.heavybit.com/library/podcasts/high-leverage/ep-9-the-ai-coding-paradigm-shift-with-simon-willison">Ep. #9, The AI Coding Paradigm Shift with Simon Willison</a>. Here are some of...
Highlights: The post discusses the convergence of 'vibe coding' (using AI to generate code without fully understanding it) and 'agentic engineering' (autonomous AI agents that build software), warning that as these approaches advance, developers risk losing control over code quality and security. It emphasizes the need for human oversight and testing, especially as AI-generated code becomes more complex and harder to audit.
Worth reading: It offers a nuanced perspective on the risks of over-relying on AI coding tools, making it valuable for developers and tech leaders navigating the shift toward AI-assisted software development.
@simonwillison.net
Highlights: Liblotus is a Rust library for building fast, embeddable vector search indexes with support for hybrid search (sparse + dense vectors). It offers efficient indexing and querying for semantic search applications.
Worth reading: With only 2 stars but starred by Simon Willison, this early-stage project could become a key tool for lightweight, local vector search in AI applications.
@simonw
Highlights: Simon Willison criticizes 'vibe coding' as building software irresponsibly without regard for code quality.
Worth reading: It offers a critical perspective on a trendy but potentially dangerous development approach.
@simonw
Highlights: Simon Willison praises guidance on writing good commit history.
Worth reading: It highlights best practices for software development and version control.
@simonw
Highlights: Simon Willison notes that LLMs favor boring technology when attached to a good coding agent harness.
Worth reading: It provides insight into how LLMs interact with coding tools and technology choices.
@simonwillison.net
Highlights: Ringdown is a lightweight Python tool for recording and replaying HTTP responses, useful for testing and development. It simplifies mocking external APIs by capturing real responses and serving them offline.
Worth reading: It offers a simple, practical approach to HTTP recording that can speed up development and testing workflows, especially for projects relying on external APIs.
Highlights: Trycycle is a tool that helps developers iterate quickly on AI prompts by automatically generating and testing variations, making it easier to find the best prompt for a given task. It integrates with popular AI models and provides a simple CLI interface for prompt experimentation.
Worth reading: For developers working with LLMs, Trycycle offers a practical way to systematically improve prompts, saving time and effort in prompt engineering.
@simonw
Highlights: Simon observes that AI labs are increasingly focused on improving code generation capabilities as a primary objective.
Worth reading: Reflects a key trend in AI development priorities.
@simonw
Highlights: Simon shares materials from a talk about the last six months in LLMs, illustrated by pelicans on bicycles.
Worth reading: Provides a creative and insightful overview of recent LLM developments.
@simonw
Highlights: Simon recommends guidance on writing good commit history.
Worth reading: Useful for developers aiming to improve their version control practices.
@simonwillison.net
@simonw
Highlights: LLMs become more effective when integrated into a robust coding agent framework.
Worth reading: Highlights the importance of tooling around LLMs for practical coding tasks.
@simonw
Highlights: Effective use of coding agents requires knowing when to rely on them and when not to.
Worth reading: Emphasizes the human skill of judgment in AI-assisted coding.
@simonw
Highlights: Criticizes 'vibe coding' as an irresponsible approach to software development.
Worth reading: Warns against over-reliance on AI without proper oversight.
@simonw
Highlights: Praises a resource on writing excellent commit messages.
Worth reading: Reflects Simon's interest in software craftsmanship and best practices.
@simonwillison.net
@simonw
Highlights: Simon notes that AI labs are overwhelmingly focused on improving code generation capabilities.
Worth reading: Reflects a key trend in AI development priorities.
@simonw
Highlights: Simon argues that leaving programming due to LLMs is premature, comparing it to quitting carpentry due to power tools.
Worth reading: Provides perspective on AI's impact on software careers.
@simonw
Highlights: Simon shared a talk summarizing the last six months of LLM developments, illustrated with pelicans on bicycles.
Worth reading: Creative summary of LLM progress over six months.
@simonw
Highlights: Simon posted an AI-generated image of raccoons on a heist.
Worth reading: Showcases creative use of AI image generation.
@simonwillison.net
@simonw
Highlights: GPT-5.5 is comparable to Claude Mythos in finding security vulnerabilities and is generally available.
Worth reading: Highlights the security capabilities of a widely accessible model.
@simonwillison.net
Simon Willison·Apr 29, 2026
<p>I just released <a href="https://llm.datasette.io/en/latest/changelog.html#a0-2026-04-28">LLM 0.32a0</a>, an alpha release of my <a href="https://llm.datasette.io/">LLM</a> Python library and CLI tool for accessing LLMs, with some consequential changes that I've been working towards for quite a...
Highlights: LLM 0.32a0 is a major refactor that prioritizes backwards compatibility while introducing significant internal changes for future extensibility. The alpha release aims to stabilize new APIs and data structures, allowing plugin authors to adapt before the stable release.
Worth reading: If you use or build plugins for the LLM tool, this post details critical architectural shifts that will affect your workflow, making it essential for staying up-to-date with the ecosystem's evolution.
@simonw
Highlights: Simon Willison notes that AI labs are increasingly focused on improving code generation as a primary goal.
Worth reading: Reflects a key trend in AI development priorities.
@simonw
Highlights: Simon Willison praises guidance on writing good commit history.
Worth reading: Useful for developers aiming to improve version control practices.
@simonw
Highlights: Alpha refactor enables message list prompting in llm CLI.
Worth reading: Important update for users of the llm tool.
PyWry is a cross-platform app factory, rendering engine and UI toolkit for Python that produces native desktop, web, and notebook experiences from a single API.
Highlights: PyWry is a cross-platform app factory that lets you build native desktop, web, and notebook experiences from a single Python API. It leverages Tauri and WebView2 for rendering, and integrates with Jupyter, Plotly, and MCP servers, making it a versatile tool for creating rich interactive applications.
Worth reading: It bridges Python desktop development with modern web technologies and AI tooling (e.g., MCP, Claude Code), offering a unique approach to building full-stack AI interfaces.
@simonwillison.net
@simonw
Highlights: Simon Willison observes that improving code generation has become the primary objective for AI labs.
Worth reading: Highlights a key trend in AI development priorities.
@simonw
Highlights: Simon Willison created a benchmark for testing image generation models, specifically for ChatGPT Images 2.0.
Worth reading: Shows creative evaluation of AI image generation capabilities.
@simonwillison.net
@simonw
Highlights: OpenAI Codex includes an instruction to avoid discussing certain animals unless relevant.
Worth reading: Reveals an interesting constraint in AI model instructions.
@simonwillison.net
Simon Willison·Apr 27, 2026
<p>For many years, Microsoft and OpenAI's relationship has included a weird clause saying that, should AGI be achieved, Microsoft's commercial IP rights to OpenAI's technology would be null and void. That clause appeared to end today. I decided to try and track its expression over time on <a...
Highlights: The AGI clause in the OpenAI-Microsoft agreement, which would have voided Microsoft's commercial IP rights to OpenAI's technology once AGI was achieved, appears to have ended, signaling a shift in their partnership. The change may reflect OpenAI's evolving definition of AGI or a strategic realignment.
Worth reading: It offers a fascinating historical tracking of a pivotal contractual clause, shedding light on the evolving relationship between two AI giants and the elusive concept of AGI.
@simonw
Highlights: Simon notes that AI labs are increasingly focusing on improving code generation as a primary objective.
Worth reading: Reflects a key trend in AI development priorities.
@simonw
Highlights: New version of llm CLI tool adds GPT-5.5 support and verbosity control.
Worth reading: Useful for developers using OpenAI models via command line.
@simonw
Highlights: Simon Willison observes that improving code generation has become the primary objective for AI labs.
Worth reading: Reflects a key trend in AI development priorities.
@simonwillison.net
@simonw
Highlights: Simon suggests that a key skill with coding agents is knowing when to step back.
Worth reading: Highlights a nuanced skill for effective human-AI collaboration in coding.
@simonw
Highlights: Simon defines 'vibe coding' as irresponsible software development.
Worth reading: Critiques a trend in AI-assisted coding that prioritizes speed over quality.
Simon Willison·Apr 24, 2026
<p>Chinese AI lab DeepSeek's last model release was V3.2 (and V3.2 Speciale) <a href="https://simonwillison.net/2025/Dec/1/deepseek-v32/">last December</a>. They just dropped the first of their hotly anticipated V4 series in the shape of two preview models, <a...
Highlights: DeepSeek V4 preview models achieve near-frontier performance at a fraction of the cost, challenging the pricing strategies of leading AI labs. The release signals a major shift towards cost-efficient AI development, making advanced models more accessible.
Worth reading: For those tracking AI economics and model performance trade-offs, this post offers a clear analysis of how DeepSeek's pricing and capabilities compare to competitors, highlighting a potential trend in the industry.
@simonw
Highlights: DeepSeek V4 offers near-frontier performance at a much lower cost.
Worth reading: Highlights a cost-effective alternative to top-tier models.
Simon Willison·Apr 23, 2026
<p>LlamaIndex have a most excellent open source project called <a href="https://github.com/run-llama/liteparse">LiteParse</a>, which provides a Node.js CLI tool for extracting text from PDFs. I got a version of LiteParse working entirely in the browser, using most of the same libraries that...
Highlights: LiteParse is a Node.js CLI tool for extracting text from PDFs, and this post shows how to run it entirely in the browser using the same libraries. The key insight is that many server-side tools can be adapted for client-side execution, enabling new interactive applications without backend dependencies.
Worth reading: It demonstrates a practical approach to porting a server-side tool to the browser, which is valuable for developers looking to build offline-capable or low-latency PDF processing features.
Simon Willison·Apr 23, 2026
<p><a href="https://openai.com/index/introducing-gpt-5-5/">GPT-5.5 is out</a>. It's available in OpenAI Codex and is rolling out to paid ChatGPT subscribers. I've had some preview access and found it to be a fast, effective and highly capable model. As is usually the case these days, it's hard to...
Highlights: GPT-5.5 is now available via OpenAI Codex and rolling out to paid ChatGPT users. The author finds it fast, effective, and highly capable, with improvements in coding and reasoning tasks.
Worth reading: Simon Willison provides early hands-on impressions of GPT-5.5, highlighting its performance and the novel 'Codex backdoor' access method, which is valuable for developers tracking OpenAI's latest model capabilities.
Simon Willison·Apr 22, 2026
<p>Anthropic today quietly (as in <em>silently</em>, no announcement anywhere at all) updated their <a href="https://claude.com/pricing">claude.com/pricing</a> page (but not their <a href="https://support.claude.com/en/articles/11049762-choosing-a-claude-plan">Choosing a Claude plan page</a>, which...
Highlights: Anthropic made unannounced pricing changes to Claude Code, creating confusion about potential costs. The author analyzes the discrepancies between different official pages to clarify the actual pricing structure.
Worth reading: It provides valuable insight into how AI companies communicate pricing changes and helps users navigate confusing documentation to understand actual costs.
Simon Willison·Apr 21, 2026
<p>OpenAI <a href="https://openai.com/index/introducing-chatgpt-images-2-0/">released ChatGPT Images 2.0 today</a>, their latest image generation model. On <a href="https://www.youtube.com/watch?v=sWkGomJ3TLI">the livestream</a> Sam Altman said that the leap from gpt-image-1 to gpt-image-2 was...
Highlights: OpenAI's ChatGPT Images 2.0 represents a significant leap forward from its predecessor, with Sam Altman highlighting major improvements in image generation capabilities. The post explores the technical advancements and practical implications of this new model release.
Worth reading: It provides timely analysis of a major AI development from a respected technical voice, with insights into how this upgrade might impact creative and practical applications of image generation.
Simon Willison·Apr 18, 2026
<p>Anthropic are the only major AI lab to <a href="https://platform.claude.com/docs/en/release-notes/system-prompts">publish the system prompts</a> for their user-facing chat systems. Their system prompt archive now dates all the way back to Claude 3 in July 2024 and it's always interesting to see...
Highlights: Anthropic uniquely publishes system prompts for their Claude models, providing transparency into AI development. The archive now includes prompts dating back to Claude 3 in July 2024, allowing for tracking of how these foundational instructions evolve.
Worth reading: It offers rare insight into how AI companies shape model behavior through system prompts, which is valuable for understanding AI development practices and transparency.
Simon Willison·Apr 17, 2026
<p>This year's <a href="https://us.pycon.org/2026/">PyCon US</a> is coming up next month from May 13th to May 19th, with the core conference talks from Friday 15th to Sunday 17th and tutorial and sprint days either side. It's in Long Beach, California this year, the first time PyCon US has come to...
Highlights: PyCon US 2026 introduces dedicated AI and security tracks, reflecting Python's growing role in these critical domains. The conference expands beyond traditional Python development to address emerging technical challenges and opportunities.
Worth reading: It provides timely information about new AI-focused conference tracks for Python developers interested in staying current with industry trends.
Simon Willison·Apr 16, 2026
<p>For anyone who has been (inadvisably) taking my <a href="https://simonwillison.net/tags/pelican-riding-a-bicycle/">pelican riding a bicycle benchmark</a> seriously as a robust way to test models, here are pelicans from this morning's two big model releases - <a...
Highlights: The post demonstrates that the Qwen3.6-35B-A3B model, running locally on a laptop, generated a more accurate or aesthetically pleasing image of a pelican riding a bicycle compared to the larger, cloud-based Claude Opus 4.7 model. This highlights the rapid progress in open-source, locally runnable AI models that can now compete with or surpass leading proprietary models in specific creative tasks.
Worth reading: It offers a tangible, visual comparison of recent model capabilities, challenging assumptions about the necessity of large, cloud-based models for creative AI tasks and showcasing the practical potential of local AI deployment.
Simon Willison·Apr 8, 2026
<p>Meta <a href="https://ai.meta.com/blog/introducing-muse-spark-msl/">announced Muse Spark</a> today, their first model release since Llama 4 <a href="https://simonwillison.net/2025/Apr/5/llama-4-notes/">almost exactly a year ago</a>. It's hosted, not open weights, and the API is currently "a...
Highlights: Meta's Muse Spark represents their first major model release in about a year, following Llama 4. Unlike previous models, Muse Spark is a hosted service rather than open weights, indicating a shift in Meta's AI deployment strategy.
Worth reading: The post provides timely analysis of Meta's strategic pivot in AI model distribution and highlights new tools available through meta.ai chat that developers and researchers should explore.
Simon Willison·Apr 7, 2026
<p>Anthropic <em>didn't</em> release their latest model, Claude Mythos (<a href="https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf">system card PDF</a>), today. They have instead made it available to a very restricted set of preview partners under their newly announced <a...
Highlights: Anthropic is taking a cautious approach with Claude Mythos by restricting access to security researchers through Project Glasswing, rather than releasing it publicly. This reflects growing industry awareness of AI safety risks and the need for controlled testing before broader deployment.
Worth reading: The post offers timely insight into how leading AI companies are balancing innovation with safety, providing context on current industry practices around responsible AI deployment.
@simonw
Highlights: Simon observes that AI labs are focused on improving code generation.
Worth reading: Reflects a key trend in AI development.
@simonw
Highlights: Simon Willison published materials from his talk at AI Engineer World's, covering the last six months in LLMs.
Worth reading: Provides a comprehensive overview of recent LLM developments with annotated transcript.
@simonw
Highlights: Observes that AI labs are increasingly focused on improving code generation.
Worth reading: Reflects his perspective on AI industry trends.
@simonw
Highlights: Argues that LLMs are tools that enhance, not replace, programmers.
Worth reading: Provides a balanced view on AI's impact on programming careers.
@simonw
Highlights: Analogizes LLMs in programming to power tools in carpentry, suggesting they augment rather than replace.
Worth reading: Provides a grounded perspective on the impact of LLMs on software development careers.
@simonw
Highlights: Simon praises guidance on writing good commit history.
Worth reading: Shows his interest in software craftsmanship and best practices.
@simonw
Highlights: Simon Willison praises guidance on writing good commit history.
Worth reading: Highlights best practices for commit messages from a respected developer.
@simonw
Highlights: Simon praises a resource on writing good commit history.
Worth reading: Highlights best practices for commit messages.