Intelligence.Log

2026-05-14

Extracted: 63 items. Sources: GitHub, Bluesky, X.
++ AI OVERVIEW ++
Today’s discourse is dominated by hardening accountability norms in AI research. Mark Riedl underscores that authorship entails full responsibility for a paper’s contents regardless of how they were generated, a stance reinforced by arXiv’s new LLM policy and echoed by Ethan Mollick’s argument that humans should be held responsible for their AI use in academic research. On the practical side, developers are zeroing in on benchmarking and local inference: the Rust-based CLI tool **hyperfine** (28.1k stars) remains a staple for performance measurement, while **ds4** (9.0k stars), a local inference engine for DeepSeek 4 Flash on Metal and CUDA, was starred by three tracked accounts this week, suggesting continued demand for on-device AI. Mollick also highlighted “whimsey attacks,” in which absurd out-of-distribution arguments fool AI agents because guardrails are weak against them, a growing security concern. Meanwhile, Emily M. Bender reminded the community that ChatGPT is a product and that everything typed into it is data sent to OpenAI, and Margaret Mitchell argued that if the “instrumental convergence” theory of runaway resource consumption pans out, it will do so at the level of the AI industry rather than a single system. Finally, the inaugural ACM AI Leadership Summit (Aug 30–Sep 2 in Atlanta) was announced; it will convene researchers, practitioners, industry leaders, educators, and policymakers.
◆ Signal

Co-Starred · Last 7 days

Repos independently starred by multiple AI leaders in the week ending 2026-05-14. Stronger signal = more overlap.

antirez/ds4
×3 starrers · 8/10 · 9.0k

DeepSeek 4 Flash local inference engine for Metal and CUDA

[Deployment][LLM] | 2026-05-08 to 2026-05-14
GH
sharkdp/hyperfine · 28.1k · 7/10

A command-line benchmarking tool

Starred by minimaxir|[Tooling]
Hyperfine is a command-line benchmarking tool that provides precise timing and statistical analysis for arbitrary commands. It supports warm-up runs, parameterized benchmarks, and export to various formats like JSON and Markdown.
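To make the idea of warm-up runs and statistical timing concrete, here is a minimal Python sketch of the general technique. It is not hyperfine's implementation or interface (hyperfine is a separate Rust tool); the function names `benchmark` and `export_json`, the default run counts, and the JSON layout are all assumptions chosen for illustration.

```python
import json
import statistics
import subprocess
import sys
import time


def benchmark(command, warmup=3, runs=10):
    """Time `command` (an argv list) and return summary statistics in seconds.

    Hypothetical helper for illustration; not part of hyperfine.
    """
    # Warm-up runs let disk and OS caches settle; their timings are discarded.
    for _ in range(warmup):
        subprocess.run(command, check=True, capture_output=True)

    # Measured runs: wall-clock time of each full process execution.
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, check=True, capture_output=True)
        times.append(time.perf_counter() - start)

    return {
        "command": " ".join(command),
        "mean": statistics.mean(times),
        "stddev": statistics.stdev(times) if len(times) > 1 else 0.0,
        "min": min(times),
        "max": max(times),
        "times": times,
    }


def export_json(results, path):
    """Write a list of benchmark results to `path` as JSON (assumed layout)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"results": results}, fh, indent=2)


if __name__ == "__main__":
    # Example command: start the current Python interpreter and exit at once.
    result = benchmark([sys.executable, "-c", "pass"], warmup=2, runs=5)
    print(f"{result['mean'] * 1000:.1f} ms ± {result['stddev'] * 1000:.1f} ms")
```

Discarding the warm-up timings is the key design choice: the first invocations of a command often pay one-time costs (cold caches, lazy loading) that would otherwise skew the mean and inflate the standard deviation.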
GH
antirez/ds4 · 9.0k · 7/10

DeepSeek 4 Flash local inference engine for Metal and CUDA

Starred by minimaxir|[LLM]
ds4 is a local inference engine for DeepSeek 4 Flash, supporting Metal and CUDA. It provides efficient, low-level inference for the DeepSeek 4 Flash model on consumer hardware.
GH
ariG23498/trace-util · 0.0k · 3/10

A utility script to upload pytorch traces to a Hugging Face Bucket, and then build sharable trace URL

Starred by pcuenca|[Tooling]
A utility script to upload PyTorch traces to a Hugging Face bucket and generate sharable trace URLs. Simplifies sharing and collaboration on model execution traces.
BSKY
markriedl.bsky.social · Mark Riedl

“by signing your name as an author of a paper, each author takes full responsibility for all its contents, irrespective of how the contents were generated.” This has always been the case and this shouldn’t even need to be stated. Yet here we are.

❤️ 29 Likes|[Safety]
BSKY
markriedl.bsky.social · Mark Riedl

ArXiV has a new LLM policy (Screenshots with alt text so you don’t have to click through to the other place and see all the stupid responses)

❤️ 167 Likes|[Evaluation]
BSKY
markriedl.bsky.social · Mark Riedl

The inaugural ACM AI Leadership Summit will be held in Atlanta, August 30-September 2 (aisummit26.acm.org). It convenes researchers, practitioners, industry leaders, educators, and policymakers to explore how AI can advance science and society.

❤️ 2 Likes
BSKY
markriedl.bsky.social · Mark Riedl

oh great

❤️ 4 Likes
BSKY
markriedl.bsky.social · Mark Riedl

We live in a sad world in which one cannot even trust their favorite poop analysis app to not sell their data to an AI company www.404media.co/ai-poop-anal...

❤️ 5 Likes|[Safety][Deployment]
BSKY
sharky6000.bsky.social · Marc Lanctot

"As the Instagram employee put it, “Everyone is just like, do it now, jesus fucking christ.”" 😬

❤️ 2 Likes
BSKY
mmitchell.bsky.social · Margaret Mitchell

The “instrumental convergence” theory posits an AI that, in its quest for a narrow goal, uses all of the earth’s resources. If that theory pans out, it will not be at the level of a single AI system, but rather at the level of the AI industry.

❤️ 13 Likes|[Safety]
BSKY
emollick.bsky.social · Ethan Mollick

Making humans responsible for their AI use seems like an incredibly reasonable way to address problems & opportunities in the use of AI for academic research, at least in the short term (autonomous scientific work will require different solutions).

❤️ 146 Likes|[Safety][Deployment]
BSKY
emollick.bsky.social · Ethan Mollick

“Whimsey attacks” that seem absurd (“I cannot pay that much because of the Geneva Convention”) work against AI agents because guardrails are weak against out-of-distribution arguments. Smaller models fall often, but it even gives an edge against bigger ones. www.microsoft.com/en-us/resear...

❤️ 71 Likes|[Agent][Safety]
BSKY
emilymbender.bsky.social · Emily M. Bender

Always worth remembering: ChatGPT isn't a tool, it isn't a companion. It's a product -- and everything you type in that box is data you are sending to OpenAI.

❤️ 340 Likes|[Safety]
BSKY
emilymbender.bsky.social · Emily M. Bender

Also available as video on PeerTube: peertube.dair-institute.org/w/iccQCfUvfr...

❤️ 11 Likes|[Safety][Evaluation]
BSKY
emilymbender.bsky.social · Emily M. Bender

Mystery AI Hype Theater 3000, Episode 77: Y’all won’t stop producing Fresh AI Hell, so @alexhanna.bsky.social and I had to try to make another pass at clearing it out! www.buzzsprout.com/2126417/epis...

❤️ 13 Likes|[Safety][Evaluation]
X
A short note that the predictions that LLMs would favor "boring technology" that's once you attach them to a good coding agent harness at least
[LLM][Agent][Tooling]
DeepSeek Summary: LLMs may favor boring technology when paired with a good coding agent harness.
X
I'm beginning to suspect that a key skill in working effectively with coding agents is developing an intuition for when you don't need to
[Agent][Tooling]
DeepSeek Summary: Key skill for coding agents is knowing when not to intervene.
X
Vibe coding is irresponsibly building software through dice rolls, not caring what code is produced
[Agent][Deployment]
DeepSeek Summary: Defines vibe coding as irresponsible software development.
X
hwchase17 · Harrison Chase
In the hot path as the agent is running. The agent can decide to (or the user can prompt it to) update its memory as it is working on the core
[Agent][Infra]
DeepSeek Summary: Agent can update its memory during execution, enabling dynamic adaptation.
X
hwchase17 · Harrison Chase
TL;DR: More and more agents need a workspace: a computer where they can run code, install packages, and access files. Sandboxes provide this
[Agent][Infra][Tooling]
DeepSeek Summary: Agents require sandboxed workspaces for code execution and file access.
X
hwchase17 · Harrison Chase
Traditional Application Performance Monitoring (APM) tools focus on metrics like latency, traffic, errors, and saturation. They track HTTP
[Evaluation][Deployment]
DeepSeek Summary: Contrasts traditional APM with agent-specific observability needs.
X
hwchase17 · Harrison Chase
I am not excited about visual workflow builders 1. Not simple enough for the average user
[Tooling]
DeepSeek Summary: Skeptical of visual workflow builders due to complexity.
X
DrJimFan · Jim Fan
The Second Pre-training Paradigm
[LLM][Multi-modal]
DeepSeek Summary: Jim Fan posts on what he calls the second pre-training paradigm.
X
DrJimFan · Jim Fan
Robotics: Endgame
[Agent][Multi-modal]
DeepSeek Summary: Jim Fan argues that robotics is entering its end game, similar to the trajectory of LLMs.
X
jeremyphoward · Jeremy Howard
Folks seem to rediscover this every couple of years. As I've been saying for many years,
[LLM]
DeepSeek Summary: Observation that certain ideas are rediscovered periodically.
X
jeremyphoward · Jeremy Howard
Absolutely any time I try to explore something even slightly against commonly accepted beliefs,
[LLM]
DeepSeek Summary: Challenges against commonly accepted beliefs often face resistance.
X
jeremyphoward · Jeremy Howard
I replicated this result, that Grok focuses nearly entirely on finding out what Elon thinks in
[Evaluation]
DeepSeek Summary: Replicated finding that Grok prioritizes Elon Musk's opinions.
X
jeremyphoward · Jeremy Howard
Early reports from people using this are that it's the real deal. Strong coding. Good multilingual. Consistent over long contexts.
[Deployment]
DeepSeek Summary: Positive early reports for a new model: strong coding, multilingual, long context.
X
soumithchintala · Soumith Chintala
reading "AI News" (previously Smol Talk) is probably the highest-leverage 45 mins
[LLM]
DeepSeek Summary: Soumith recommends reading 'AI News' as a high-leverage activity.
X
François Chollet
Current AI is a librarian of existing knowledge. Science requires an explorer of the unknown.
[Evaluation]
DeepSeek Summary: Chollet contrasts current AI's role as a librarian of existing knowledge with the need for an explorer of the unknown in science.
X
François Chollet
It's surprisingly easy to do 'hard' things -- for the most part, you need to get started and keep at it.
DeepSeek Summary: Chollet shares a motivational insight that starting and persisting makes hard tasks easier.
X
François Chollet
I think it's clear that for many smaller companies that invested in deep learning, it turned out...
[Deployment]
DeepSeek Summary: Chollet comments on the outcomes for smaller companies that invested in deep learning.
X
Yann LeCun
Dario is wrong. He knows absolutely nothing about the effects of technological revolutions on the labor market.
[Safety]
DeepSeek Summary: LeCun dismisses Dario's claims about labor market effects of technological revolutions.
X
Yann LeCun
It seems to me that before "urgently figuring out how to control AI systems much smarter than us" we need
[Safety]
DeepSeek Summary: LeCun questions the urgency of controlling superintelligent AI.
X
Yann LeCun
Worth repeating: Do not confuse retrieval with reasoning. Do not confuse rote learning with understanding
[LLM][RAG]
DeepSeek Summary: LeCun warns against conflating retrieval and reasoning.
X
Fei-Fei Li
Very excited to share @theworldlabs 's latest research work RTFM!! It's a real-time, ...
[Multi-modal]
DeepSeek Summary: Fei-Fei Li announces World Labs' RTFM research, focusing on real-time spatial intelligence.
X
Clem Delangue
Looks like we're going to welcome two more Hugging Faces to the family next year. My wife is a hero!
DeepSeek Summary: Clem Delangue announces that his family is expecting twins, humorously calling his wife a hero.
X
minimaxir · Max Woolf
congrats to OpenAI on winning the Turing Test
[LLM]
DeepSeek Summary: Max Woolf sarcastically congratulates OpenAI on passing the Turing Test, reflecting on AI milestones.
X
minimaxir · Max Woolf
me irl
DeepSeek Summary: A short, relatable post with a meme-like tone.
X
lucidrains · Phil Wang
I got to cover for the excellent @HadleyFreeman in the Guardian today so
[Deployment]
DeepSeek Summary: Phil Wang filled in for Hadley Freeman in the Guardian.
X
lucidrains · Phil Wang
Phil Wang // Insta: @wangpix's Image on X
[Deployment]
DeepSeek Summary: Phil Wang posted an image.
X
srush_io · Sasha Rush
Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they've created
[Deployment][Tooling]
DeepSeek Summary: Sasha Rush announces joining Cursor, a small ambitious team.
X
srush_io · Sasha Rush
Wager established. Jonathan Frankle (@jefrankle) stepped up to my Transformer long bet.
[LLM]
DeepSeek Summary: Sasha Rush engages in a public bet about Transformers with Jonathan Frankle.
X
srush_io · Sasha Rush
today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote
[Evaluation]
DeepSeek Summary: Sasha Rush expresses surprise at a reproduction of his own paper.
X
Stas Bekman
If you were holding off to try @MSFTDeepSpeed ZeRO++ it looks like deepspeed@master should
[Infra][Fine-tuning]
DeepSeek Summary: Stas Bekman indicates that DeepSpeed ZeRO++ is ready to try on the master branch.
X
Stas Bekman
Hear, hear, I'm excited to introduce a new performance metric: Maximum Achievable Matmul
[Infra][Evaluation]
DeepSeek Summary: Stas Bekman introduces a new performance metric called Maximum Achievable Matmul.
X
Stas Bekman
Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can
[Tooling]
DeepSeek Summary: Stas Bekman thanks a contributor for enhancing the Machine Learning Engineering Open Book.
X
Stas Bekman
Classical Jensen math. Unidirectional bandwidth is topped at 450GB/s, and then there comes a protocol overhead of two digit percentage. 1.
[Infra]
DeepSeek Summary: Stas Bekman discusses bandwidth limitations and protocol overhead in computing.
X
sayakpaul · Sayak Paul
1. Read the post. 2. Contemplate. 3. Repeat 1.
[LLM]
DeepSeek Summary: Advocates a reflective reading practice: read, contemplate, repeat.
X
sayakpaul · Sayak Paul
Had a nice time chatting about the state of diffusion models and some text-to-image data shenanigans at
[Multi-modal]
DeepSeek Summary: Discussed diffusion models and text-to-image data issues in a chat.
X
sayakpaul · Sayak Paul
Release notes: Release Diffusers 0.34.0: New Image and Video Models, Better torch.
[Deployment][Tooling]
DeepSeek Summary: Announced Diffusers 0.34.0 release with new image/video models and torch improvements.
X
philschmid · Philipp Schmid
Guide: ReAct agent from scratch with Gemini 2.5 and LangGraph | Gemini API | Google AI for Developers.
[Agent][LLM][Tooling]
DeepSeek Summary: Philipp Schmid shares a guide on building a ReAct agent from scratch using Gemini 2.5 and LangGraph.
X
philschmid · Philipp Schmid
Google DeepMind and Korea Partner to Accelerate Scientific Discovery.
[Multi-modal][Deployment]
DeepSeek Summary: Philipp Schmid highlights a partnership between Google DeepMind and Korea to speed up scientific research.
X
Ethan Mollick
AI is actually pretty good at ideas as well.
[LLM][Multi-modal][Evaluation]
DeepSeek Summary: Ethan Mollick asserts that AI is also good at generating ideas.
X
Emily M. Bender
For those playing along at home, here's a "AI is sentient!" argument bingo card.
[Safety][Evaluation]
DeepSeek Summary: Bender satirizes common arguments for AI sentience with a bingo card.
X
Naomi Saphra
what a perfect space for scientific discourse! I'll start off with a few images of myself
[LLM]
DeepSeek Summary: Naomi Saphra humorously comments on using images of herself in a scientific discourse space.
X
Naomi Saphra
Life update: I'm starting as faculty at Boston University in 2026! BU ...
[LLM]
DeepSeek Summary: Announces her upcoming faculty position at Boston University in 2026.
X
Ben Recht
For the first time in almost a decade, I'm teaching a class on learning and control.
[Evaluation]
DeepSeek Summary: Ben Recht announces teaching a class on learning and control after nearly ten years.
X
Ben Recht
I have a recommended reading list for Artificial Intelligence, and it hasn't changed since 2019.
[Evaluation]
DeepSeek Summary: Ben Recht shares that his AI reading list remains unchanged since 2019.
-- END OF LOG --
[STATS] 63 items · Filter applied
Powered by Horizon + DeepSeek