Intelligence.Log

2026-05-19

Extracted: 55 items. Sources: GitHub, Bluesky, X, Blogs.
++ AI OVERVIEW ++
Today's trending signals point to two themes: practical security tooling and the evolution of AI post-training methods. On the security front, simonw starred the `andrew/pycon` repo, which collects and analyzes data for a PyCon talk on GitHub Actions security across Python packages. Meanwhile, Nathan Lambert's Bluesky post argues that on-policy distillation (OPD) is on track to be a lasting post-training method, joining instruction tuning, RLHF, DPO, and RLVR as a distinct class of methods. Together, the two items reflect attention to both hardening infrastructure and refining training methodology.
GH
sapientinc/HRM-Text|0.4k|7/10

HRM-Text is a 1B text generation model based on the HRM architecture, strengthened by task completion and latent space reasoning.

Starred by lucidrains|[LLM][Fine-tuning]
HRM-Text is a 1B-parameter text generation model built on the HRM architecture and strengthened by task completion and latent space reasoning. The architecture could enhance reasoning capabilities in LLMs.
GH
andrew/pycon|0.0k|4/10

Data collection and analysis for a PyCon talk on GitHub Actions security across Python packages.

Starred by simonw|[Infra]
This repository provides data collection and analysis scripts for a PyCon talk on GitHub Actions security across Python packages. It offers insights into how GitHub Actions are used in the Python ecosystem and potential security risks.
GH
elcritch/sarcophagus|0.0k|3/10

auth and other api helpers for mummy

Starred by lucidrains|[Infra]
Sarcophagus provides authentication and API helper utilities for the Mummy web framework in Nim. It simplifies common backend tasks like user auth, session management, and request handling.
BSKY
natolambert.bsky.social Nathan Lambert

On-policy distillation is on track to be a lasting method in post-training. The list of areas would be: Instruction tuning (SFT/IFT); RLHF; Direct Preference Optimization (DPO et al); RLVR; On-policy Distillation (OPD). New classes of methods are rare! Excited to play.

❤️ 14 Likes|[Fine-tuning]
BSKY
simonwillison.net Simon Willison

My notes on Gemini 3.5 Flash - 3x the price of Gemini 3 Flash but Google are planning to use it for many of their own products simonwillison.net/2026/May/19/...

❤️ 47 Likes|[LLM][Deployment]
BSKY
mmitchell.bsky.social Margaret Mitchell

Against the constant pressure of *genAI, genAI, genAI*, I am really appreciating @ai2.bsky.social 's work on creating tools for critical needs -- like crop maps and forest loss analysis. They just did a nice release on @hf.co. huggingface.co/blog/allenai...

❤️ 41 Likes|
BSKY
mmitchell.bsky.social Margaret Mitchell

Gmail's automatically generated responses (which can appear whether or not you ask for them) cement human anchoring bias: The tendency for people to heavily rely on what they have already seen. The effects are insidious, subconsciously influencing what we believe.

❤️ 15 Likes|[Safety]
BSKY
Thomas Dietterich

Yet another sobering post from @noahpinion.blogsky.venki.dev open.substack.com/pub/noahpini...

❤️ 3 Likes|[Evaluation][Safety]
BSKY
emollick.bsky.social Ethan Mollick

🚨Our paper is out in PNAS: we found classic human persuasion techniques worked on AIs in a "parahuman" way, making them agree to objectionable requests (increasing compliance from 35% to 51%) It worked on a range of major recent LLMs though newer models do resist more www.pnas.org/doi/10.1073/...

❤️ 36 Likes|[Safety]
BSKY
emollick.bsky.social Ethan Mollick

Also had some early access to Gemini 3.5 Flash. Very fast for a flash model and very capable, though not as powerful as a full frontier model. I added it to the gallery of procedurally generated one-shot towns (it made one error that it corrected): hg-20f7d1a3ce.netlify.app#gemini-3-5-f...

❤️ 33 Likes|[LLM][Evaluation]
BSKY
emollick.bsky.social Ethan Mollick

Gemini Omni is quite good at instruction following: "sea otter in a pilot's uniform explains why Spirit Airlines went bankrupt to a river otter who is distracted by their laptop while they are in a hot air balloon over NYC. in the next balloon over, william shakespeare fights a robot made of pizza"

❤️ 77 Likes|[LLM]
BSKY
emollick.bsky.social Ethan Mollick

Had early access to Gemini Omni: "a dramatic reading of Death by Water from the Wasteland by a man eating garlic bread while balanced on a unicycle on a small platform over a churning sea of tomato sauce in which, at the center, sits a meatball with bright blue eyes wearing a top hat"

❤️ 59 Likes|[Multi-modal]
BSKY
emilymbender.bsky.social Emily M. Bender

Wow some terrible reporting about Google's latest horrible ideas about how to distort information access in the name of "convenience" (or something): techcrunch.com/2026/05/19/g... A short thread 🧵>>

❤️ 267 Likes|[Evaluation][Safety]
BSKY
emilymbender.bsky.social Emily M. Bender

We gotta find the guy that did this!!

❤️ 84 Likes|
BSKY
angelamczhou.bsky.social angela zhou

Excited to share our paper! Due Process on Hold: A Queueing Framework for Improving Access in SNAP arxiv.org/abs/2605.15165 Millions of Americans interface with the social safety net via call centers that are too congested. In Holmes v. Knodell, bad operations = procedural due process violation.

❤️ 23 Likes|
X
Drafted a blog post - Used an LLM to meticulously improve the argument over 4 hours.
[LLM]
DeepSeek Summary: Karpathy used an LLM to refine a blog post argument over 4 hours.
X
Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around
[Safety]
DeepSeek Summary: Karpathy notes a growing gap in understanding AI capability.
X
I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits
[Tooling]
DeepSeek Summary: Karpathy feels behind as a programmer due to AI-driven refactoring.
X
Quitting programming as a career right now because of LLMs would be like quitting carpentry as a career because of power tools.
[LLM][Tooling]
DeepSeek Summary: Analogizes LLMs in programming to power tools in carpentry, suggesting they augment rather than replace.
X
hwchase17 Harrison Chase
I am not excited about visual workflow builders 1. Not simple enough for the average user
[Tooling][Evaluation]
DeepSeek Summary: Harrison Chase expresses skepticism about visual workflow builders, citing lack of simplicity for average users.
X
hwchase17 Harrison Chase
We launched LangSmith Agent Builder this week as a no-code way to build agents. A key part of Agent Builder is its memory system.
[Agent][Tooling]
DeepSeek Summary: Announcement of LangSmith Agent Builder, a no-code agent builder with a focus on memory systems.
X
hwchase17 Harrison Chase
In the hot path as the agent is running. The agent can decide to (or the user can prompt it to) update its memory as it is working on the core
[Agent][LLM]
DeepSeek Summary: Describes how agents can update memory during execution, either autonomously or via user prompt.
X
DrJimFan Jim Fan
In this context, I define world modeling as predicting the next plausible world state (or a longer duration of states) conditioned on an action.
[Agent][Multi-modal]
DeepSeek Summary: Jim Fan defines world modeling as predicting future world states given actions, a key concept in robotics and embodied AI.
X
jeremyphoward Jeremy Howard
Here's what I would prefer to see:
[LLM]
DeepSeek Summary: Jeremy Howard expresses a preference for an unspecified topic.
X
jeremyphoward Jeremy Howard
hi, i'm a sole proprietor/founder in Austria and i earn many many multiples of what i'd earn as an employee, despite 'predatory income tax'. in fact, i opt out
[Agent]
DeepSeek Summary: Jeremy Howard discusses his income as a sole proprietor in Austria, noting high earnings despite taxes.
X
soumithchintala Soumith Chintala
reading "AI News" (previously Smol Talk) is probably the highest-leverage 45 mins
[LLM]
DeepSeek Summary: Recommends a newsletter called AI News as a high-leverage way to stay informed.
X
soumithchintala Soumith Chintala
MacStudio you ask? Apple Engineering's **actual** time spent on PyTorch support
[Infra][Deployment]
DeepSeek Summary: Comments on Apple's engineering effort for PyTorch support on Mac Studio.
X
soumithchintala Soumith Chintala
Open LLMs need to get organized and co-ordinated about sharing human feedback.
[Fine-tuning][LLM]
DeepSeek Summary: Advocates for coordination among open LLM projects to share human feedback data.
X
Folks who work in AI or software engineering feel like the world is changing exponentially fast.
[Deployment]
DeepSeek Summary: Chollet observes that AI/software engineers perceive rapid exponential change in the world.
X
Fei-Fei Li
We are beyond thrilled to congratulate Dr. Fei-Fei Li for being ranked #9 in the Top 100 Women in #AI by AI Magazine!
[Safety]
DeepSeek Summary: Fei-Fei Li ranked #9 in Top 100 Women in AI.
X
minimaxir Max Woolf
LOL. Remove the code in the algorithm that boosts the tweets of Elon by elvodqa · Pull Request #160 ·... github.com.
[Deployment][Tooling]
DeepSeek Summary: Max Woolf finds humor in a GitHub pull request that aims to remove code boosting Elon Musk's tweets.
X
minimaxir Max Woolf
me irl
DeepSeek Summary: A short, relatable post expressing a personal sentiment.
X
lucidrains Phil Wang
Having a wonderful time hanging out with my uncle James Wong at the Chelsea Flower show!
DeepSeek Summary: Phil Wang posts about spending time with his uncle James Wong at the Chelsea Flower Show.
X
srush_io Sasha Rush
today i woke up to a living version of a phd student's nightmare. a new paper in my inbox: a detailed reproduction of a paper i wrote
[Evaluation]
DeepSeek Summary: Sasha Rush woke up to a detailed reproduction of his own paper, a common PhD student nightmare.
X
srush_io Sasha Rush
Some news: moving this fall from Harvard -> Cornell Tech. Sad to leave such an incredible ...
[Deployment]
DeepSeek Summary: Sasha Rush announced his move from Harvard to Cornell Tech.
X
I have been compiling LLM/VLM training logbooks/chronicles. This is one of the best sources to
[LLM][Fine-tuning][Infra]
DeepSeek Summary: Compiling logbooks/chronicles for LLM/VLM training, sharing a valuable resource.
X
Thanks to an awesome contribution from @omarnomad The Machine Learning Engineering Open book now can
[Tooling][Infra]
DeepSeek Summary: Announces a contribution to the Machine Learning Engineering Open Book.
X
If you were holding off to try @MSFTDeepSpeed ZeRO++ it looks like deepspeed@master should
[Infra][Fine-tuning]
DeepSeek Summary: Encourages trying DeepSpeed ZeRO++ as it should be functional on master.
X
sayakpaul Sayak Paul
Had a nice time chatting about the state of diffusion models and some text-to-image data shenanigans at
[Multi-modal]
DeepSeek Summary: Sayak Paul discussed diffusion models and text-to-image data issues.
X
sayakpaul Sayak Paul
Release notes: Release Diffusers 0.34.0: New Image and Video Models, Better torch.
[Deployment][Tooling]
DeepSeek Summary: Announcement of Diffusers 0.34.0 release with new models and improvements.
X
philschmid Philipp Schmid
I read three technical reports from Moonshot AI's Kimi K2.5 paper, Cursor's Composer 2 report and blog post, and Chroma's Context-1 write-up
[LLM][RAG][Tooling]
DeepSeek Summary: Philipp Schmid read three technical reports: Kimi K2.5, Cursor Composer 2, and Chroma Context-1.
X
philschmid Philipp Schmid
Random thought. We are going to be so much faster at creating and building.
[Agent][Infra]
DeepSeek Summary: Philipp Schmid predicts accelerated creation and building speed.
X
Ethan Mollick
In 1980, the philosopher John Searle proposed a thought experiment: a person locked in a room, manipulating Chinese characters according to a
[LLM]
DeepSeek Summary: References Searle's Chinese Room argument to discuss AI understanding.
X
Naomi Saphra
what a perfect space for scientific discourse! I'll start off with a few images of myself
[Evaluation]
DeepSeek Summary: Saphra humorously comments on a space for scientific discourse with self-deprecating tone.
X
Naomi Saphra
Perfect cute light very short read for a break in a deadline crunch.
[LLM]
DeepSeek Summary: Saphra recommends a short, light read for a break during intense work.
X
Ben Recht
For the first time in almost a decade, I'm teaching a class on learning and control.
[Deployment]
DeepSeek Summary: Ben Recht announces teaching a class on learning and control after nearly a decade.
X
Ben Recht
Building a theory of the architecture of organizing machines and people.
[Agent]
DeepSeek Summary: Recht discusses developing a theory for organizing machines and people.
X
Ben Recht
On unquantifiable costs and inherent tradeoffs in decision theory.
[Safety]
DeepSeek Summary: Recht addresses unquantifiable costs and tradeoffs in decision theory.
X
Ben Recht
With more equations than usual, I explain how policy gradient gives you a framework to randomly search for
[Fine-tuning]
DeepSeek Summary: Recht explains policy gradient as a framework for random search.
BLOG

I put together these annotated slides from my five minute lightning talk at PyCon US 2026, using the latest iteration (https://tools.simonwillison.net/annotated-presentations) of my annotated presentation... (https://simonwillison.net/2023/Aug/6/annotated-presentations/)

The post summarizes key developments in LLMs over the past six months, including the rise of multi-modal models, improved reasoning capabilities, and the increasing importance of evaluation frameworks. It highlights practical tools and techniques for working with LLMs, such as prompt engineering and fine-tuning.
BLOG

Today at Google I/O, Google released Gemini 3.5 Flash (https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/). This one skipped the `-preview` modifier and went straight to general availability, and Google appear to be using it for a whole lot...

Google released Gemini 3.5 Flash directly to general availability, skipping the preview phase, and appears to be using it across many of its own products. It costs 3x as much as Gemini 3 Flash, and an early-access report in this log describes it as very fast and capable for a flash model.
-- END OF LOG --
[STATS] 55 items · Filter applied
Powered by Horizon + DeepSeek