Intelligence.Log

2026-05-15

Extracted: 52 items. Sources: GitHub, Bluesky, X.

++ AI OVERVIEW ++

Today's AI discourse is dominated by a push for academic integrity and a reaffirmation of inference-time scaling. The ACL conference has announced that papers containing hallucinated references will be desk-rejected, a hard line against AI-generated slop that reflects growing concern about the reliability of published work. Meanwhile, Ethan Mollick points to the "Second Scaling Law": giving models more tokens (more time to "think") keeps improving performance on tasks like hacking, math, and science, with no plateau so far in a new study from the UK's AI Security Institute. On GitHub, the week's starred repos lean toward local inference, with antirez/ds4 and abetlen/llama-cpp-python both drawing attention. The message is clear: the field is cracking down on fabricated citations while doubling down on letting models reason longer.

◆ Signal

Co-Starred · Last 7 days

Repos independently starred by multiple AI leaders in the week ending 2026-05-15. Stronger signal = more overlap.

antirez/ds4

×2 starrers · ▲ 8/10 · ★ 9.0k

DeepSeek 4 Flash local inference engine for Metal and CUDA

Starred by minimaxir, pcuenca

[Deployment][LLM]

|2026-05-12 → 2026-05-14

marimo-team/marimo · ★ 21.0k · ▲ 8/10

A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.

Starred by minimaxir | [Tooling][Deployment][Infra]

“Marimo is a reactive Python notebook that combines the best of Jupyter and Streamlit: it runs cells automatically when dependencies change, supports SQL queries, and can be deployed as an interactive web app. Notebooks are stored as pure Python files, enabling easy version control with git and execution as scripts.”

abetlen/llama-cpp-python · ★ 10.3k · ▲ 7/10

Python bindings for llama.cpp

Starred by pcuenca | [Infra][Deployment]

“llama-cpp-python provides Python bindings for llama.cpp, enabling efficient inference of LLMs on CPU and GPU. It supports quantization, GPU acceleration, and a wide range of model architectures, making it a key tool for local LLM deployment.”

qualisero/awesome-pi-agent · ★ 0.9k · ▲ 6/10

Awesome list of add-ons, hooks, tools, skills, and resources for the pi coding agent (pi-mono).

Starred by philschmid | [Agent][Tooling]

“A curated list of add-ons, hooks, tools, and skills for the pi coding agent, enabling users to extend its capabilities. It serves as a central resource for the pi-agent ecosystem, similar to awesome lists for other frameworks.”

Tyriar/vscode-theme-sapphire · ★ 0.0k · ▲ 2/10

Sapphire is a vibrant blue theme for Visual Studio Code

Starred by minimaxir | [Tooling]

“A vibrant blue theme for Visual Studio Code with a focus on readability and aesthetics. It offers a consistent color palette that reduces eye strain during long coding sessions.”

BSKY

Mark Riedl · May 15, 02:15 AM

The ACL conference has put out a statement that papers with hallucinated references will be desk-rejected 2026.aclweb.org/acl_statement/

❤️ 9 Likes | [Evaluation]

BSKY

Ethan Mollick · May 15, 12:15 AM

The Second Scaling Law of AI remains undefeated. If you want better hacking (or math, or science, or crossword puzzle solving) out of an LLM, just let it use more tokens. There doesn't seem to be any plateau so far in the new study by the UK's governmental AI Security Institute.

❤️ 43 Likes | [LLM][Infra]

BSKY

Mark Riedl · May 15, 08:19 PM

Imagine getting upset over a movie that doesn’t involve Optimus Prime dying

❤️ 10 Likes

BSKY

Mark Riedl · May 15, 11:55 AM

Relative change in A grades given since the release of ChatGPT www.wsj.com/us-news/educ...

❤️ 26 Likes | [Evaluation]

BSKY

Ethan Mollick · May 15, 04:46 PM

Anton labs have hooked up a bunch of AI models to harnesses and had them working as DJs, programming and running a radio station, including taking callers and donations, which they use to buy more music. The results are both hilarious and a good reminder of how working with AI is deeply weird.

❤️ 72 Likes | [Agent][Tooling]

BSKY

Emily M. Bender · May 15, 10:43 PM

Seattle-area friends: See you Sunday? www.pikeplacemarket.org/events-calen...

❤️ 11 Likes

BSKY

angela zhou · May 15, 04:57 PM

❤️ 0 Likes

BSKY

Lijun An · May 15, 02:19 PM

If you want to learn about proteomics signatures of APOE genetic variants in a massive multi-cohort sample, you should not miss this tour de force! Huge congrats to Lu Lina and Niklas!

❤️ 0 Likes

Andrej Karpathy@karpathy

I'm being accused of overhyping the [site everyone heard too much about today already].

[LLM]

“DeepSeek Summary: Karpathy responds to criticism of overhyping a popular site.”

Andrej Karpathy@karpathy

Power to the people: How LLMs flip the script on technology diffusion. So it strikes me as quite unique and remarkable that LLMs display a dramatic reversal of this pattern - they generate disproportionate benefit for regular people, while their impact is a lot more...

[LLM]

“DeepSeek Summary: Karpathy argues that LLMs benefit regular people more than experts, reversing typical tech diffusion.”

Andrej Karpathy@karpathy

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. Also I just talk to Composer with SuperWhisper.

[Tooling][Agent]

“DeepSeek Summary: Karpathy coins 'vibe coding' as an AI-assisted coding style where developers rely on AI and ignore code details.”

Simon Willison@simonw

It's interesting how "better at code" has become the defining goal of almost every AI lab over the

[LLM][Tooling]

“DeepSeek Summary: Simon notes that AI labs are increasingly focused on improving code generation capabilities.”

Harrison Chase@hwchase17

TL;DR: More and more agents need a workspace: a computer where they can run code, install packages, and access files. Sandboxes provide this

[Agent][Infra][Deployment]

“DeepSeek Summary: Agents require a sandboxed workspace to execute code, install packages, and access files.”

Harrison Chase@hwchase17

When building agents, you need to iterate on production data much more than when building traditional software. You need to iterate on how

[Agent][Evaluation][Deployment]

“DeepSeek Summary: Agent development requires more iteration on production data than traditional software.”

Harrison Chase@hwchase17

Traditional Application Performance Monitoring (APM) tools focus on metrics like latency, traffic, errors, and saturation. They track HTTP

[Agent][Infra][Tooling]

“DeepSeek Summary: Traditional APM metrics are insufficient for monitoring agents.”

Harrison Chase@hwchase17

I am not excited about visual workflow builders 1. Not simple enough for the average user

[Agent][Tooling]

“DeepSeek Summary: Visual workflow builders are not simple enough for average users.”

Jeremy Howard@jeremyphoward

I replicated this result, that Grok focuses nearly entirely on finding out what Elon thinks in

[Safety][Evaluation][LLM]

“DeepSeek Summary: Jeremy Howard replicated a finding that Grok's responses are heavily influenced by Elon Musk's opinions.”

Jeremy Howard@jeremyphoward

Absolutely any time I try to explore something even slightly against commonly accepted beliefs,

[Safety][Fine-tuning]

“DeepSeek Summary: Jeremy Howard notes difficulty in exploring ideas that challenge commonly accepted beliefs.”

Soumith Chintala@soumithchintala

We are scientists, engineers, and builders behind some of the most widely used AI products and libraries, including ChatGPT.

[Agent][LLM][Multi-modal]

“DeepSeek Summary: Soumith Chintala announces his involvement with Thinking Machines Lab, highlighting the team's background in creating major AI products.”

Francois Chollet@fchollet

It was always the case that agency was self-compounding, but AI is magnifying the effect. Low-agency AI users further lose agency, high-agency AI users further gain agency.

[Agent][Safety]

“DeepSeek Summary: Agency compounds with AI use: low-agency users lose more agency, high-agency users gain more.”

Francois Chollet@fchollet

Current AI is a librarian of existing knowledge. Science requires an explorer of the unknown.

[Evaluation][LLM]

“DeepSeek Summary: Contrasts current AI's role as a librarian with the need for exploration in science.”

Yann LeCun@ylecun

Dario is wrong. He knows absolutely nothing about the effects of technological revolutions on the labor market.

[Safety]

“DeepSeek Summary: LeCun dismisses Dario's views on AI's labor market impact, asserting Dario lacks understanding.”

Yann LeCun@ylecun

The emergence of superintelligence is not going to be an event. We don't have anything close to a

[Safety]

“DeepSeek Summary: LeCun argues superintelligence will not emerge suddenly, challenging common narratives.”

Yann LeCun@ylecun

It seems to me that before 'urgently figuring out how to control AI systems much smarter than us' we need

[Safety]

“DeepSeek Summary: LeCun questions the urgency of AI control, suggesting other priorities.”

Fei-Fei Li@drfeifei

Very excited to share @theworldlabs 's latest research work RTFM!! It's a real-time, ...

[Multi-modal]

“DeepSeek Summary: Fei-Fei Li announces RTFM, a real-time research work from World Labs.”

Clem Delangue@ClementDelangue

We're facing an LLM bubble, not a broader AI bubble. The industry is obsessed with building one massive model when we should be focusing on practical applications.

[LLM]

“DeepSeek Summary: Clem Delangue distinguishes between an LLM bubble and a broader AI bubble, arguing that the hype is concentrated on large language models rather than AI as a whole.”

Max Woolf@minimaxir

LOL. Remove the code in the algorithm that boosts the tweets of Elon by elvodqa · Pull Request #160 ·... github.com.

[Deployment][Infra]

“DeepSeek Summary: Max Woolf finds humor in a pull request that removes code boosting Elon Musk's tweets.”

Max Woolf@minimaxir

me irl

“DeepSeek Summary: A short, relatable post expressing a personal sentiment.”

Max Woolf@minimaxir

what

“DeepSeek Summary: A brief expression of confusion or surprise.”

Phil Wang@lucidrains

I got to cover for the excellent @HadleyFreeman in the Guardian today so

[Deployment]

“DeepSeek Summary: Phil Wang filled in for Hadley Freeman at The Guardian.”

Sasha Rush@srush_io

Some news: moving this fall from Harvard -> Cornell Tech. Sad to leave such an incredible

[Deployment]

“DeepSeek Summary: Sasha Rush announced moving from Harvard to Cornell Tech.”

Sasha Rush@srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

[Deployment]

“DeepSeek Summary: Sasha Rush announced joining Cursor, a small ambitious team.”

Sasha Rush@srush_io

⛏️

[Tooling]

“DeepSeek Summary: A tweet with a pickaxe emoji, possibly hinting at mining or hard work.”

Stas Bekman@stas00

I have been compiling LLM/VLM training logbooks/chronicles. This is one of the best sources to

[LLM][Fine-tuning]

“DeepSeek Summary: Compiling LLM/VLM training logbooks as a valuable resource.”

Stas Bekman@stas00

Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can

[Tooling]

“DeepSeek Summary: Acknowledges contribution to the Machine Learning Engineering Open Book.”

Stas Bekman@stas00

This is a long overdue section of the ML Engineering Understanding Training Loss Patterns

[LLM][Fine-tuning]

“DeepSeek Summary: Introduces a section on understanding training loss patterns in ML Engineering.”

Stas Bekman@stas00

Modern art. Artist: PyTorch memory profiler Model: Llama-8B The piece on the left is the

[Infra][LLM]

“DeepSeek Summary: Humorously compares PyTorch memory profiler output to modern art.”

Sayak Paul@sayakpaul

After working on releasing the v5, this is the latest release from the Transformers team at

[Deployment][Infra]

“DeepSeek Summary: Sayak Paul mentions the latest release from the Transformers team after working on v5.”

Philipp Schmid@philschmid

I read three technical reports from Moonshot AI's Kimi K2.5 paper, Cursor's Composer 2 report and blog post, and Chroma's Context-1 write-up

[LLM][Tooling]

“DeepSeek Summary: Philipp Schmid read and shared three technical reports: Moonshot AI's Kimi K2.5, Cursor's Composer 2, and Chroma's Context-1.”

Philipp Schmid@philschmid

Random thought. We are going to be so much faster at creating and building.

[Agent][Deployment]

“DeepSeek Summary: Philipp Schmid expresses optimism about increased speed in creation and building.”

Philipp Schmid@philschmid

83 likes 3 replies ...

“DeepSeek Summary: This post has engagement but the content is not fully captured.”

Philipp Schmid@philschmid

4 likes 882 views ...

“DeepSeek Summary: This post has high views but low likes.”

Ethan Mollick@emollick

So much work is going into faking continual learning and memory for AIs,

[LLM][Fine-tuning][Evaluation]

“DeepSeek Summary: Critique of efforts to simulate continual learning and memory in AI, implying it may be misguided.”

Ethan Mollick@emollick

Oh no.

[Safety]

“DeepSeek Summary: Brief expression of concern, likely about an AI development.”

Naomi Saphra@NaomiSaphra

what a perfect space for scientific discourse! I'll start off with a few images of myself

[LLM]

“DeepSeek Summary: Naomi Saphra humorously comments on using images of herself for scientific discourse.”

Naomi Saphra@NaomiSaphra

Life update: I'm starting as faculty at Boston University in 2026! BU ...

[LLM]

“DeepSeek Summary: Naomi Saphra announces her new faculty position at Boston University starting in 2026.”

Ben Recht@beenwrekt

For the first time in almost a decade, I'm teaching a class on learning and control.

[Evaluation]

“DeepSeek Summary: Ben Recht is teaching a class on learning and control after a long hiatus.”

Ben Recht@beenwrekt

Everyone knows actions are fundamentally different than predictions, but it's hard to write this

[Evaluation]

“DeepSeek Summary: Ben Recht discusses the fundamental difference between actions and predictions.”

-- END OF LOG --

[STATS] 52 items · Filter applied