Intelligence.Log

2026-05-08

Extracted: 59 items. Sources: GitHub, Bluesky, X.

++ AI OVERVIEW ++

Today’s log is led by academic peer review: Mark Riedl points to AAAI’s use of a novel AI paper-reviewing system on all 22k submitted papers. In phase 1, each author received one clearly marked AI-generated review alongside one human review (arxiv.org/abs/2604.13940). On the project front, the only co-starred repository is antirez/ds4, a local inference engine for DeepSeek 4 Flash on Apple Metal, starred by both lucidrains and simonw. Elsewhere, hardmaru announces a Sakana AI and NVIDIA paper on GPU kernels for sparse transformer LLMs, and several posts on X cover agent tooling such as memory systems and sandboxes.

◆ Signal

Co-Starred · Last 7 days

Repos independently starred by multiple AI leaders in the week ending 2026-05-08. Stronger signal = more overlap.
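
The overlap idea above can be sketched in a few lines. This is a minimal illustration only: the log does not say how its signal is computed, so the `co_starred` function, the `(user, repo)` event shape, and the `example/solo-repo` entry are all assumptions, and the ▲ 7/10 score shown in the entries is not reproduced here.

```python
from collections import defaultdict


def co_starred(stars, min_starrers=2):
    """Return repos starred by at least `min_starrers` distinct users.

    `stars` is an iterable of (user, repo) pairs from one time window.
    Both the input shape and the threshold are assumptions for this sketch.
    """
    starrers = defaultdict(set)
    for user, repo in stars:
        starrers[repo].add(user)  # a set ignores repeat stars by one user

    ranked = [
        (repo, sorted(users))
        for repo, users in starrers.items()
        if len(users) >= min_starrers
    ]
    # More overlap first; ties broken alphabetically by repo name.
    ranked.sort(key=lambda item: (-len(item[1]), item[0]))
    return ranked


events = [
    ("lucidrains", "antirez/ds4"),
    ("simonw", "antirez/ds4"),
    ("simonw", "example/solo-repo"),  # hypothetical repo, one starrer only
]
result = co_starred(events)
print(result)  # [('antirez/ds4', ['lucidrains', 'simonw'])]
```

Counting distinct users per repo, rather than raw star events, keeps one person's repeated activity from inflating the overlap.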

antirez/ds4

×2 starrers · ▲ 7/10 · ★ 2.7k

DeepSeek 4 Flash local inference engine for Metal

by:lucidrains simonw

[Deployment][LLM]

|2026-05-07 → 2026-05-08


antirez/ds4 · ★ 2.7k · ▲ 7/10

DeepSeek 4 Flash local inference engine for Metal

Starred by simonw|[LLM][Deployment]

“A local inference engine for DeepSeek 4 Flash optimized for Apple Metal, enabling fast LLM inference on Mac hardware. Written in C for performance, it provides a lightweight alternative to cloud-based inference.”

BSKY

Mark Riedl · May 8, 01:50 AM

AAAI used a novel AI paper reviewing system on all 22k papers submitted. In phase 1, authors received 1 clearly marked AI generated review and 1 human review. arxiv.org/abs/2604.13940

❤️ 9 Likes|[Evaluation][LLM]

BSKY

Naomi Saphra · May 8, 02:12 AM

❤️ 3 Likes|[Evaluation][Safety]

BSKY

Simon Willison · May 8, 05:57 PM

Just realized that the reason I like TikTok so much is that it's lightning talks! I've always loved lightning talks

❤️ 40 Likes|

BSKY

Mark Riedl · May 8, 10:45 PM

Goals

❤️ 5 Likes|[Safety]

BSKY

Mark Riedl · May 8, 03:52 PM

This

❤️ 2 Likes|

BSKY

hardmaru · May 8, 04:15 PM

Excited to share Sakana AI’s new #ICML2026 paper in collaboration with NVIDIA: "Sparser, Faster, Lighter Transformer Language Models" arxiv.org/abs/2603.23198 This work introduces new open-source GPU kernels and data formats for faster inference and training of sparse transformer LLMs: 🧵 Thread 👇

❤️ 38 Likes|[LLM][Infra][Deployment]

BSKY

Yoshua Bengio · May 8, 01:07 PM

Thank you to Rob Wiblin for inviting me on the @80000hours.bsky.social podcast to discuss the research progress we’re making at @law-zero.bsky.social to create safe-by-design AI systems.

❤️ 6 Likes|[Safety]

BSKY

Ethan Mollick · May 8, 04:26 AM

I have always found it charming that the fourth, fifth and sixth derivatives of position are snap, crackle, and pop. Because I could, I asked Codex to throw together a little simulation so you can play with them (as well as velocity, acceleration & jerk). motion-derivatives-exhibit.netlify.app

❤️ 63 Likes|[Tooling]

BSKY

Ethan Mollick · May 8, 04:10 AM

Professions with guilds or membership associations are going to get different AI policy reactions than those without The Bar & the AMA will ensure that human doctors or lawyers are legally required for key activities. There is no equivalent organization for consultants or coders

❤️ 83 Likes|

BSKY

Emily M. Bender · May 8, 11:07 PM

Seattle friends -- two showings of @ghostdoc2026.bsky.social at SIFF on Sunday and Monday! And I'll be part of the post-screening Q&A :) www.thestranger.com/arts/the-sif...

❤️ 10 Likes|

BSKY

Emily M. Bender · May 8, 08:41 PM

Tip for dealing with busy people: If you're asking someone to speak (esp. for free) at some event, AND you expect them to spend some time on a meeting beforehand to plan how it will go, put this in the initial invitation. Demanding extra time after someone has already agreed is rude, honestly.

❤️ 74 Likes|

BSKY

Emily M. Bender · May 8, 01:07 PM

“‘AI’ might not be good for xyz, but you can’t deny that it’s helpful for programming” -- sound familiar? On the next Mystery AI Hype Theater 3000 @alexhanna.bsky.social and I will be digging into that bullshit. Join us for the livestream: Monday, May 11, noon PT twitch.tv/dair_institute

❤️ 57 Likes|[LLM][Tooling]

BSKY

Naomi Saphra · May 8, 01:11 PM

Goodfire released a megapost of all the random feature geometry stuff they're finding, and it's worth a read

❤️ 112 Likes|[Safety][Evaluation]

Andrej Karpathy @karpathy

Drafted a blog post. Used an LLM to meticulously improve the argument over 4 hours. Wow, feeling great, it’s so convincing! Fun idea let’s ask it to argue the opposite. LLM demolishes the entire argument and convinces me that the opposite is in fact true. lol

[LLM][Safety]

“DeepSeek Summary: LLMs can be used to improve arguments, but they can also convincingly argue the opposite, revealing their persuasive power and potential pitfalls.”

Andrej Karpathy @karpathy

The hottest new programming language is English

[LLM][Tooling]

“DeepSeek Summary: Natural language is becoming a dominant interface for programming, especially with LLMs.”

Andrej Karpathy @karpathy

By training LLMs against automatically verifiable rewards across a number of environments (e.g. think math/code puzzles), the LLMs spontaneously develop strategies that look like 'reasoning' to humans - they learn to break down problem solving into intermediate calculations and they learn a number of problem-solving techniques.

[LLM][Fine-tuning][Evaluation]

“DeepSeek Summary: LLMs can develop emergent reasoning-like behaviors through reinforcement learning with verifiable rewards.”

Simon Willison @simonw

This may be the best guidance I've seen anywhere on writing a really good commit history.

[Tooling]

“DeepSeek Summary: Simon praises a resource on writing good commit history.”

Simon Willison @simonw

It's interesting how "better at code" has become the defining goal of almost every AI lab over the

[LLM][Agent]

“DeepSeek Summary: Simon observes that AI labs are focused on improving code generation.”

Harrison Chase @hwchase17

We launched LangSmith Agent Builder this week as a no-code way to build agents. A key part of Agent builder is it's memory system.

[Agent][Tooling]

“DeepSeek Summary: LangSmith Agent Builder is a no-code agent builder with a memory system.”

Harrison Chase @hwchase17

Your harness, your memory ... The “best” way to build agentic systems has changed dramatically over the past three years. When ChatGPT came out,

[Agent][LLM]

“DeepSeek Summary: The best way to build agentic systems has evolved significantly since ChatGPT.”

Harrison Chase @hwchase17

TL;DR: More and more agents need a workspace: a computer where they can run code, install packages, and access files. Sandboxes provide this

[Agent][Infra]

“DeepSeek Summary: Agents increasingly require a sandboxed workspace to execute code and access files.”

Jim Fan @DrJimFan

It gives me a lot of comfort knowing that we are the last generation without advanced robots everywhere.

[Safety][Deployment]

“DeepSeek Summary: Reflects on the imminent proliferation of advanced robots, suggesting a future where robots are ubiquitous.”

Jim Fan @DrJimFan

I've been a bit quiet on X recently. The past year has been a transformational experience.

[Agent]

“DeepSeek Summary: Acknowledges a period of personal and professional transformation, hinting at significant developments.”

Jim Fan @DrJimFan

Resource constraints are a beautiful thing. Survival instinct in a cut-throat AI competitive land.

[Fine-tuning][Deployment]

“DeepSeek Summary: Emphasizes the positive role of constraints in driving innovation and survival in AI competition.”

Jim Fan @DrJimFan

In this context, I define world modeling as predicting the next plausible world state (or a longer duration of states) conditioned on an action.

[Multi-modal][Agent]

“DeepSeek Summary: Defines world modeling in AI as predicting future states based on actions.”

Jeremy Howard @jeremyphoward

I replicated this result, that Grok focuses nearly entirely on finding out what Elon thinks in

[Safety][Evaluation]

“DeepSeek Summary: Jeremy replicated a finding that Grok's responses are heavily influenced by Elon Musk's views.”

Jeremy Howard @jeremyphoward

Wow I can already say after just 5 hours using @AnthropicAI Opus 4.7 that this is the first

[LLM][Evaluation]

“DeepSeek Summary: After 5 hours with Anthropic's Opus 4.7, Jeremy calls it "the first" of something; the post is truncated before saying what.”

Soumith Chintala @soumithchintala

we've been working on democratizing fast kernel writing on the @PyTorch team. try

[Infra]

“DeepSeek Summary: Soumith highlights efforts to democratize fast kernel writing within the PyTorch team.”

Soumith Chintala @soumithchintala

reading "AI News" (previously Smol Talk) is probably the highest-leverage 45 mins

[LLM]

“DeepSeek Summary: Soumith recommends reading 'AI News' as a high-leverage activity.”

Francois Chollet @fchollet

I think it's clear that for many smaller companies that invested in deep learning, it turned out

[Evaluation]

“DeepSeek Summary: Comments on how deep learning investments turned out for many smaller companies (outcome truncated in search).”

Francois Chollet @fchollet

A lot of the current discourse about AI comes from a fatalistic position of total surrender of

[Safety]

“DeepSeek Summary: Criticizes fatalistic views in AI discourse.”

Francois Chollet @fchollet

GenAI isn't just a technology; it's an informational pollutant—a pervasive cognitive smog that

[Safety][LLM]

“DeepSeek Summary: Compares generative AI to informational pollution.”

Fei-Fei Li @drfeifei

Very excited to share @theworldlabs 's latest research work RTFM!! It's a real-time,

[Multi-modal]

“DeepSeek Summary: Fei-Fei Li announces World Labs' real-time research work RTFM.”

Fei-Fei Li @drfeifei

I can now confess that I participated in the new #TronAres movie, playing myself I had a great time working with everyone especially Greta

“DeepSeek Summary: Fei-Fei Li reveals her cameo in the movie Tron: Ares.”

Max Woolf @minimaxir

LOL

[Tooling]

“DeepSeek Summary: Brief humorous reaction.”

Max Woolf @minimaxir

@simonw

[Tooling]

“DeepSeek Summary: Mentions Simon Willison.”

Max Woolf @minimaxir

congrats to OpenAI on winning the Turing Test

[LLM][Evaluation]

“DeepSeek Summary: Sarcastic or ironic congratulations to OpenAI.”

Sasha Rush @srush_io

Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created

[Tooling]

“DeepSeek Summary: Sasha Rush announced joining Cursor, a small ambitious team.”

Sasha Rush @srush_io

Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.

[LLM]

“DeepSeek Summary: Sasha Rush made a bet about Transformers with Jonathan Frankle.”

Sasha Rush @srush_io

today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote

[Evaluation]

“DeepSeek Summary: Sasha Rush received a reproduction paper of his own work, calling it a PhD student's nightmare.”

Stas Bekman @stas00

If you were holding off to try @MSFTDeepSpeed ZeRO++ it looks like deepspeed@master should

[Infra][Fine-tuning]

“DeepSeek Summary: Stas Bekman suggests that DeepSpeed ZeRO++ is now ready to try on the master branch.”

Stas Bekman @stas00

Hear, hear, I'm excited to introduce a new performance metric: Maximum Achievable Matmul

[Infra][Evaluation]

“DeepSeek Summary: Stas Bekman introduces a new performance metric called Maximum Achievable Matmul.”

Stas Bekman @stas00

Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can

[Tooling][Deployment]

“DeepSeek Summary: Stas Bekman thanks a contributor for enhancing the Machine Learning Engineering Open Book.”

Stas Bekman @stas00

Classical Jensen math. Unidirectional bandwidth is topped at 450GB/s, and then there comes a protocol overhead of two digit percentage. 1.

[Infra]

“DeepSeek Summary: Stas Bekman discusses bandwidth limitations and protocol overhead in computing.”

Sayak Paul @sayakpaul

Working at Hugging Face over the past 3.5+ years has allowed me to identify what technical areas truly interest me! In turn, that has allowed me to directly...

[Infra][Fine-tuning]

“DeepSeek Summary: Reflects on 3.5+ years at Hugging Face, identifying key technical interests.”

Sayak Paul @sayakpaul

Every day you learn something new. Today I learned that diffusion ... Good folks at @photoroom_app decided to change that by releasing PRX under Apache 2.0 with solid reporting.

[Multi-modal][Deployment]

“DeepSeek Summary: Learned about diffusion models and praises Photoroom for releasing PRX under Apache 2.0.”

Sayak Paul @sayakpaul

Based on ...

[LLM]

“DeepSeek Summary: Tweet starting with 'Based on' (content truncated in search).”

Philipp Schmid @philschmid

Guide: ReAct agent from scratch with Gemini 2.5 and LangGraph | Gemini API | Google AI for Developers. ai.google.dev.

[Agent][LLM][Tooling]

“DeepSeek Summary: Philipp shared a guide on building a ReAct agent from scratch using Gemini 2.5 and LangGraph.”

Philipp Schmid @philschmid

How to use Deep Research with the Gemini API. www.philschmid.de.

[LLM][Tooling][RAG]

“DeepSeek Summary: Philipp posted about using Deep Research capabilities with the Gemini API, linking to his blog.”

Philipp Schmid @philschmid

Told an AI agent to read the autoresearch repo and build a version for QMD. Get training data from tobi/qmd github. Went to sleep. Woke up to a 0.8B model

[Agent][Fine-tuning][Deployment]

“DeepSeek Summary: Philipp describes an experiment where an AI agent autonomously built a 0.8B parameter model overnight.”

Ethan Mollick @emollick

So much work is going into faking continual learning and memory for AIs,

[LLM][Safety]

“DeepSeek Summary: Critiques efforts to simulate continual learning and memory in AI, implying it may be superficial.”

Ethan Mollick @emollick

Had early access to GPT-5.4 and Pro. They are very good. One fun illustration of progress,

[LLM][Evaluation]

“DeepSeek Summary: Reports early access to GPT-5.4 and Pro and calls them very good.”

Ethan Mollick @emollick

If it helps, I teach at a business school & many of my smartest students are hired by funds because they can reliably turn their only-human

[Deployment]

“DeepSeek Summary: Notes that many of his smartest business school students are hired by funds (reason truncated in search).”

Naomi Saphra @NaomiSaphra

what a perfect space for scientific discourse! I'll start off with a few images of myself

[Safety]

“DeepSeek Summary: Naomi Saphra humorously comments on using self-images to initiate scientific discourse.”

Naomi Saphra @NaomiSaphra

Life update: I'm starting as faculty at Boston University in 2026! BU ...

[LLM]

“DeepSeek Summary: Naomi Saphra announces her new faculty position at Boston University starting in 2026.”

Ben Recht @beenwrekt

For the first time in almost a decade, I'm teaching a class on learning and control.

[Evaluation]

“DeepSeek Summary: Ben Recht is teaching a class on learning and control after a long hiatus.”

Ben Recht @beenwrekt

Revisiting Sutton's Bitter Lesson in the wake of GPT-5.

[LLM]

“DeepSeek Summary: Recht reflects on Sutton's Bitter Lesson in the context of GPT-5.”

Ben Recht @beenwrekt

Fully open machine learning requires not only GPU access but a community commitment to openness.

[Infra]

“DeepSeek Summary: Recht argues that open ML needs both hardware access and community dedication.”

-- END OF LOG --

[STATS] 59 items · Filter applied