Intelligence.Log

2026-05-04

Extracted: 64 items. Sources: GitHub, Bluesky, X, Blogs.
++ AI OVERVIEW ++
Today’s discourse centered on the persistent problem of AI hallucination in academic publishing. Mark Riedl shared a reference checker he wrote to catch hallucinated references in papers he is reviewing, noting that PDF-to-structured-text is "still an open problem" and that reference formats vary and can be hard to parse. On a lighter note, Marc Lanctot’s Bluesky post celebrated the Tampa Bay Lightning’s playoff elimination by a lineup he described as a Costco employee, a European, a rookie goalie, and a bunch of irrelevant players. Of the two, research integrity remains the more pressing theme for developers and researchers.
GH
danshapiro/trycycle · 0.2k · 6/10

Starred by simonw|[Tooling][Evaluation]
Trycycle is a tool that helps developers iterate quickly on AI prompts by automatically generating and testing variations, making it easier to find the best prompt for a given task. It integrates with popular AI models and provides a simple CLI interface for prompt experimentation.
GH
danshapiro/ringdown · 0.0k · 4/10

Starred by simonw|[Tooling]
Ringdown is a lightweight Python tool for recording and replaying HTTP responses, useful for testing and development. It simplifies mocking external APIs by capturing real responses and serving them offline.
GH
Fusion/pngsource · 0.1k · 3/10

Embed source code in png files

Starred by minimaxir|[Tooling]
This repository allows embedding source code into PNG images, enabling a novel way to distribute code alongside visual assets. It provides a simple tool to encode and decode code within image files, useful for sharing or obfuscation.
BSKY
markriedl.bsky.social Mark Riedl

I wrote a reference checker to see if papers I am reviewing have hallucinated references. It's a ghastly problem. PDF-to-structured-text is still an open problem. Reference formats can vary and some are hard to parse. Even when references are correct, there can be sloppiness.

❤️ 30 Likes|[Evaluation][Tooling]
BSKY
sharky6000.bsky.social Marc Lanctot

The Tampa Bay Lightning literally just got eliminated by a Costco employee, a European, a rookie goalie, and a bunch of irrelevant players 🤣🤣🤣 Oh and with just 9 shots on net! 😁 Na na na na 🎵, na na na na 🎶, eyyaayyy goodbye 👋👋👋 #gohabsgo round two bring on the Sabres and see you in Buffalo!! 🥳

❤️ 9 Likes|
BSKY
simonwillison.net Simon Willison

I tried running the same "Generate an SVG of a pelican riding a bicycle" prompt against 21 different quantized variants of the same IBM Granite 4.1 3B model - the results weren't as interesting as I had hoped simonwillison.net/2026/May/4/g...

❤️ 27 Likes|[Evaluation][Deployment]
BSKY
markriedl.bsky.social Mark Riedl

It's going to be a pin, or a pen, or earbuds, or a phone...

❤️ 0 Likes|[Deployment]
BSKY
markriedl.bsky.social Mark Riedl

oof

❤️ 9 Likes|
BSKY
markriedl.bsky.social Mark Riedl

On this May the Fourth, let us step back for a moment to think about how, very soon, "The Mandalorian & Grogu" will supplant "Attack of the Clones" for the Star Wars movie with the cringiest title.

❤️ 2 Likes|
BSKY
markriedl.bsky.social Mark Riedl

That viral paper on the benefits of ChatGPT in education was using unsound meta-review methodologies. This does not mean that there are no benefits or anti-benefits of AI, only that the conclusions drawn in the paper cannot be drawn www.404media.co/nature-retra...

❤️ 34 Likes|[Evaluation]
BSKY
natolambert.bsky.social Nathan Lambert

We need to create a new term for the attacks some Chinese labs are doing on APIs that is different than distillation or else we risk tarnishing a crucial technique that is crucial to AI diffusion, academic research & the open-source ecosystem. www.interconnects.ai/p/the-distil...

❤️ 18 Likes|[Safety]
BSKY
emollick.bsky.social Ethan Mollick

It is somewhat comforting that now, whenever I see a post about “here’s the thing that keeps me up at night” I know that there is absolutely no chance that this is being written by a human who is staying up all night.

❤️ 49 Likes|[Safety]
BSKY
emollick.bsky.social Ethan Mollick

This is from the co-founder of Anthropic, interesting that he refers to public sources when he is also obviously privy to lots of internal sources that he cannot discuss. I assume he sees the same thing at Anthropic. importai.substack.com/p/import-ai-...

❤️ 69 Likes|[LLM][Safety]
BSKY
emollick.bsky.social Ethan Mollick

Poems that ChatGPT, Claude, and Gemini all seem to "like" or suggest when you ask for poetry related to being/making LLMs: Rilke's "Archaic Torso of Apollo" Stevens' "Idea of Order at Key West" Borges's "The Golem" (or "The Other Tiger") Pessoa's "Autopsychography" Pretty apt choices!

❤️ 43 Likes|[LLM]
BSKY
emilymbender.bsky.social Emily M. Bender

Today!

❤️ 2 Likes|
BSKY
beenwrekt.bsky.social Ben Recht

East Bay Friends: Tomorrow at Berkeley, the Social Science Matrix is hosting a conversation between Marion Fourcade and me about The Irrational Decision. More info and registration link here: matrix.berkeley.edu/events/the-i...

❤️ 5 Likes|
BSKY
beenwrekt.bsky.social Ben Recht

5/4 for 5/4

❤️ 4 Likes|
X
The hottest new programming language is English
[LLM][Tooling]
“DeepSeek Summary: Karpathy suggests that natural language is becoming the dominant way to program, thanks to AI.
X
LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest.
[LLM][RAG]
“DeepSeek Summary: Karpathy advocates using LLMs to create personal knowledge bases for research.
X
It's interesting how "better at code" has become the defining goal of almost every AI lab over the
[LLM][Tooling]
“DeepSeek Summary: Simon observes that AI labs are increasingly focused on improving code generation capabilities as a primary objective.
X
I've published video, slides and a detailed annotated transcript from my talk at this week's
[LLM]
“DeepSeek Summary: Simon shares materials from a talk about the last six months in LLMs, illustrated by pelicans on bicycles.
X
This may be the best guidance I've seen anywhere on writing a really good commit history.
[Tooling]
“DeepSeek Summary: Simon recommends guidance on writing good commit history.
X
hwchase17 Harrison Chase
A brilliant surgeon without instruments, nurses, or an operating room is almost useless. The skill is real. But without the system around them, it goes nowhere.
[Infra][Agent][Tooling]
“DeepSeek Summary: Skill alone is insufficient without supporting infrastructure.
X
hwchase17 Harrison Chase
RT @samecrowder: as always, it's an exciting time to be working at LangChain!
[LLM]
“DeepSeek Summary: Retweet expressing excitement about working at LangChain.
X
hwchase17 Harrison Chase
Christian was a big part of the idea of middleware! He's going to help make langchain and langgraph agents more
[Agent][Infra]
“DeepSeek Summary: Acknowledges contribution to middleware concept for LangChain agents.
X
hwchase17 Harrison Chase
TL;DR: More and more agents need a workspace: a computer where they can run code, install packages, and access files. Sandboxes provide this
[Agent][Infra][Tooling]
“DeepSeek Summary: Agents require sandboxed workspaces for code execution.
X
DrJimFan Jim Fan
Resource constraints are a beautiful thing. Survival instinct in a cut-throat AI competitive land
[Agent]
“DeepSeek Summary: Resource constraints drive innovation and survival in competitive AI landscape.
X
DrJimFan Jim Fan
I've been a bit quiet on X recently. The past year has been a transformational experience.
[Multi-modal]
“DeepSeek Summary: Jim Fan reflects on a transformative year and his reduced activity on X.
X
DrJimFan Jim Fan
It gives me a lot of comfort knowing that we are the last generation without advanced robots everywhere.
[Multi-modal]
“DeepSeek Summary: Perspective on the imminent ubiquity of advanced robotics.
X
DrJimFan Jim Fan
Everyone's freaking out about vibe coding. In the holiday spirit, allow me to share my anxiety on the wild
[Agent]
“DeepSeek Summary: Commentary on the 'vibe coding' trend and its implications.
X
jeremyphoward Jeremy Howard
Here's a complete unedited video of asking Grok for its views on the Israel/Palestine situation. It first searches twitter for what Elon thinks.
[Safety][LLM]
“DeepSeek Summary: Jeremy Howard posted a video of asking Grok about Israel/Palestine, noting it first searches Twitter for Elon Musk's views.
X
soumithchintala Soumith Chintala
reading "AI News" (previously Smol Talk) is probably the highest-leverage 45 mins
[LLM]
“DeepSeek Summary: Recommends a newsletter as high-leverage reading.
X
soumithchintala Soumith Chintala
Sometimes we forget that NVIDIA wins because it's a software company.
[Infra]
“DeepSeek Summary: Attributes NVIDIA's success to software, not just hardware.
X
soumithchintala Soumith Chintala
Open LLMs need to get organized and co-ordinated about sharing human feedback.
[LLM][Safety]
“DeepSeek Summary: Calls for coordination among open LLM developers on human feedback.
X
soumithchintala Soumith Chintala
MacStudio you ask? Apple Engineering's **actual** time spent on PyTorch support
[Infra]
“DeepSeek Summary: Comments on Apple's engineering effort for PyTorch on Mac Studio.
X
I think it's clear that for many smaller companies that invested in deep learning, it turned out
[Deployment]
“DeepSeek Summary: Smaller companies that invested in deep learning faced challenges.
X
Folks who work in AI or software engineering feel like the world is changing exponential fast.
[Evaluation]
“DeepSeek Summary: AI and software engineers perceive rapid exponential change.
X
Yann LeCun
Yann LeCun's $1B Bet Against LLMs
[LLM][Agent]
“DeepSeek Summary: Yann LeCun is taking a $1 billion bet against large language models, promoting alternative AI approaches.
X
Fei-Fei Li
Very excited to share @theworldlabs 's latest research work RTFM!! It's a real-time, ...
[Multi-modal]
“DeepSeek Summary: Fei-Fei Li announces RTFM research from World Labs, focusing on real-time spatial intelligence.
X
minimaxir Max Woolf
LOL
“DeepSeek Summary: Max Woolf posted a simple reaction 'LOL'.
X
srush_io Sasha Rush
#acl2020nlp Lot of threads online about likes and dislikes for the conference. Twitter is fleeting, github is forever. Send issues or PRs: https://github.com/Mini-Conf/Mini-Conf/issues… It's early days, we're making up virtual conferences as we go along.
[Infra]
“DeepSeek Summary: Sasha Rush advocates for using GitHub over Twitter for lasting conference feedback, and acknowledges the experimental nature of virtual conferences.
X
srush_io Sasha Rush
Composer is a new model we built at Cursor. We used RL to train a big MoE model to be really good at real-world coding, and also very fast. https://cursor.com/blog/composer Excited for the potential of building specialized models to help in critical domains.
[LLM][Fine-tuning][Tooling]
“DeepSeek Summary: Sasha Rush announces Composer, a new RL-trained MoE coding model from Cursor, emphasizing speed and real-world coding performance.
X
If you were holding off to try @MSFTDeepSpeed ZeRO++ it looks like deepspeed@master should
[Infra][Deployment]
“DeepSeek Summary: Stas Bekman points out that DeepSpeed ZeRO++ is now available on master branch, encouraging users to try it.
X
Hear, hear, I'm excited to introduce a new performance metric: Maximum Achievable Matmul
[Infra][Tooling]
“DeepSeek Summary: Stas Bekman introduces a new performance metric called Maximum Achievable Matmul for evaluating ML hardware.
X
If you're trying out FA4, you're likely to run into not being able to load cutlass.cute
[Infra][Tooling]
“DeepSeek Summary: Stas Bekman warns about a common issue with FlashAttention-4 where cutlass.cute fails to load.
X
Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can
[Tooling]
“DeepSeek Summary: Stas Bekman acknowledges a contribution to the Machine Learning Engineering Open Book, adding new content.
X
sayakpaul Sayak Paul
Working at Hugging Face over the past 3.5+ years has allowed me to identify what technical areas truly interest me! In turn, that has allowed me to directly
[LLM][Deployment][Tooling]
“DeepSeek Summary: Reflects on how working at Hugging Face helped identify technical interests.
X
philschmid Philipp Schmid
I read three technical reports from Moonshot AI's Kimi K2.5 paper, Cursor's Composer 2 report and blog post, and Chroma's Context-1 write-up
[Agent][LLM][Tooling]
“DeepSeek Summary: Philipp Schmid read three technical reports: Moonshot AI's Kimi K2.5 paper, Cursor's Composer 2 report and blog post, and Chroma's Context-1 write-up.
X
philschmid Philipp Schmid
Random thought. We are going to be so much faster at creating and building.
[Agent][Deployment]
“DeepSeek Summary: He reflects on the accelerating pace of creation and building.
X
philschmid Philipp Schmid
Skills have become one of the most used extension points in agents. They're flexible, easy to make, and simple to distribute.
[Agent][Tooling]
“DeepSeek Summary: He notes that Skills are a key extension point for agents.
X
Ethan Mollick
Here is a full implementation of the Chinese Room using a printed copy of GPT-1, in case you have a few spare years and want to actually run
[LLM][Safety]
“DeepSeek Summary: Ethan Mollick humorously describes a thought experiment implementation of the Chinese Room using a printed GPT-1, highlighting the impracticality of running it manually.
X
Ethan Mollick
The fact that no current AI models, often including GPT-5, believe in the existence of
[LLM][Evaluation]
“DeepSeek Summary: Mollick points out that current AI models, often including GPT-5, do not believe in the existence of something that the truncated post does not name.
X
Ethan Mollick
So much work is going into faking continual learning and memory for AIs,
[LLM][Fine-tuning]
“DeepSeek Summary: Mollick criticizes the focus on simulating continual learning and memory in AI rather than achieving genuine capabilities.
X
Ethan Mollick
Talking about the ethics of AI companies or personalities, or discussing the potential of
[Safety][Deployment]
“DeepSeek Summary: Mollick engages in discussions about AI ethics and the potential of AI technologies.
X
Naomi Saphra
I work on understanding and improving training for NLP models, with a focus on studying how structures and mechanistic behaviors emerge over the
[LLM][Fine-tuning][Evaluation]
“DeepSeek Summary: Naomi Saphra describes her research focus on understanding and improving NLP model training, specifically how structures and mechanistic behaviors emerge.
X
Naomi Saphra
Naomi Saphra (@nsaphra). 237 likes. New preprint! Everyone loves causal interp. It's coherently defined! It makes testable predictions
[Safety][Evaluation][LLM]
“DeepSeek Summary: Announces a new preprint on causal interpretability, emphasizing its coherent definition and testable predictions.
X
Naomi Saphra
Just got a desk reject, post-rebuttals, for a paper being submitted to arxiv <30 min late for
[Evaluation][Fine-tuning]
“DeepSeek Summary: Naomi Saphra shares an experience of receiving a desk reject after rebuttals due to a paper being submitted to arXiv less than 30 minutes late.
X
Angela Zhou
#throwback to the beginnings of a beautiful friendship =D @ansonmount @HellOnWheelsAMC
[Agent]
“DeepSeek Summary: Angela Zhou shares a throwback post about the start of a friendship, tagging @ansonmount and @HellOnWheelsAMC.
X
Ben Recht
For the first time in almost a decade, I'm teaching a class on learning and control.
[Evaluation]
“DeepSeek Summary: Ben Recht announces teaching a class on learning and control after a long hiatus.
X
Ben Recht
Building a theory of the architecture of organizing machines and people.
[Infra]
“DeepSeek Summary: He is working on a theory for organizing machines and people.
X
Ben Recht
Fully open machine learning requires not only GPU access but a community commitment to openness.
[Infra][Safety]
“DeepSeek Summary: He argues that open ML needs both GPU access and community commitment.
BLOG

‘Distillation attacks’ is a horrible term for what is happening right now.

The post criticizes the term 'distillation attacks' as misleading, arguing that the attacks on APIs need a different name so that distillation, a technique crucial to AI diffusion, academic research, and the open-source ecosystem, is not tarnished.
-- END OF LOG --
[STATS] 64 items · Filter applied
Powered by Horizon + DeepSeek