Intelligence.Log

Tuesday, May 19, 2026

Extracted: 80 items. Sources: 40. Filter: Score >= 5.0

++ Daily.Brief ++

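Today's AI news was dense: AI chip startup Tenstorrent drew takeover interest from Intel and Qualcomm[#item-bloomberg-com-news-articles-2026-05-18-ai-chip-startup-tenst], Anthropic acquired Stainless to strengthen its API toolchain[#item-anthropic-com-news-anthropic-acquires-stainless], and an Anthropic co-founder will present an AI encyclical alongside the Pope[#item-vaticannews-va-en-pope-news-2026-05-pope-leo-xiv-first-encyc]. In research, a test of 42 LLMs on their willingness to build the apocalypse reported that the "safest" models lie[#item-reddit-com-r-LocalLLaMA-comments-1tgm0k9-i-tested-42-llms-on], and Sub-JEPA improved the performance of the LeCun group's model[#item-reddit-com-r-MachineLearning-comments-1tgn3bz-subjepa-a-simp]. Tooling updates included a pre-indexed code knowledge graph that cuts token consumption[#item-github-com-colbymchenry-codegraph] and a coding agent that reaches 87% on benchmarks with a 4B-parameter model[#item-reddit-com-r-LocalLLaMA-comments-1tgecrq-i-built-a-coding-ag]. In commentary, AI supply-chain attacks exposed gaps in model release pipelines[#item-venturebeat-com-security-supply-chain-incidents-openai-anthr], Musk lost his case against Altman[#item-theverge-com-ai-artificial-intelligence-932383-jury-verdict-], and an opinion piece argued that AI is led by the wrong people[#item-theverge-com-ai-artificial-intelligence-932464-musk-v-altman].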

> Headlines & Launches

8.5AI Chip Startup Tenstorrent Draws Takeover Interest From Intel, Qualcomm - Bloomberg

AI chip startup Tenstorrent draws takeover interest from Intel and Qualcomm.

bloomberg.com#ai-chip #tenstorrent #acquisition[Model Release]

7.9Anthropic acquires Stainless

Anthropic acquires Stainless, strengthening its API toolchain.

HN (358)#acquisition #anthropic #api[Tool Use]

6.6Anthropic co-founder to present AI encyclical alongside Pope Leo XIV

An Anthropic co-founder will present an AI encyclical alongside the Pope.

HN (90)#anthropic #ai-ethics #policy

> Research & Innovation

8.0I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you.

Tests the willingness of 42 LLMs to build the apocalypse and finds that the "safe" models lie.

Reddit r/LocalLLaMA#llm #safety #benchmark[Evals]

8.0Rewriting model inference with CUDA kernels: the bottleneck was not just GEMM [P]

Rewrites model inference with CUDA kernels; the bottleneck was not limited to GEMM.

Reddit r/MachineLearning#cuda #inference #optimization

7.5Sub-JEPA: a simple fix to LeCun group's LeWorldModel that consistently improves performance [P]

Sub-JEPA modifies the LeCun group's LeWorldModel and improves performance.

Reddit r/MachineLearning#jepa #world-model #self-supervised

7.5Scaling LLMs horizontally: hidden-state coupling without weight modification [R]

Residual coupling scales LLMs horizontally without modifying weights.

Reddit r/MachineLearning#llm #scaling #residual-coupling[Agent Harness]

7.0Agora-1: The Multi-Agent World Model

Odyssey releases Agora-1, a multi-agent world model.

HN (80)#multi-agent #world-model #simulation[Agent Harness]

7.0SDOF: Taming the Alignment Tax in Multi-Agent Orchestration with State-Constrained Dispatch

Proposes SDOF, which uses state constraints to reduce the alignment tax in multi-agent orchestration.

ArXiv cs.AI#multi-agent #orchestration #alignment[Agent Harness]

7.0ICRL: Learning to Internalize Self-Critique with Reinforcement Learning

Proposes ICRL, which uses reinforcement learning to internalize self-critique and improve agent performance.

ArXiv cs.AI#self-critique #reinforcement-learning #agent[Post-Training]

7.0Translating black-box medical AI models into interpretable global ...

Studies how to translate black-box medical AI models into interpretable global decision logic.

nature.com#medical-ai #interpretability #explainability

7.0Qwen 3.6 27B on 24GB VRAM setup: backend comparisons, quant choice and settings (llama.cpp, ik_llama.cpp, BeeLlama, vllm)

Backend comparisons and quantization settings for Qwen3.6 27B on 24GB of VRAM.

Reddit r/LocalLLaMA#qwen #benchmark #quantization[Evals]

6.5Does Theory of Mind Improvement Really Benefit Human-AI Interactions? Empirical Findings from Interactive Evaluations

An empirical study finds that improving an LLM's theory of mind does not necessarily improve human-AI interaction.

ArXiv cs.AI#theory-of-mind #human-ai-interaction #evaluation[Evals]

6.5SkillSmith: Compiling Agent Skills into Boundary-Guided Runtime Interfaces

Proposes SkillSmith, which compiles agent skills into boundary-guided runtime interfaces.

ArXiv cs.AI#agent-skills #interface #llm[Agent Harness]

6.5Capability Conditioned Scaffolding for Professional Human LLM Collaboration

Proposes capability-conditioned scaffolding to improve professional human-LLM collaboration.

ArXiv cs.CL#llm #human-ai-collaboration #scaffolding[Agent Harness]

6.5Bridging the interpretability gap for medical artificial intelligence ...

Uses class-association manifold learning to bridge the interpretability gap in medical AI models.

nature.com#medical-ai #interpretability #manifold-learning

6.521 GPU's benchmarked running a small TTS model (vram peak: 5GB)

Benchmarks of 21 GPUs running the small TTS model OmniVoice.

Reddit r/LocalLLaMA#gpu #benchmark #tts[Evals]

6.5Quantizing MTP KV Cache = free lunch?

Quantizing the MTP KV cache may be a free lunch that reduces VRAM requirements.

Reddit r/LocalLLaMA#mtp #kv-cache #quantization[Context Engineering]

6.5could refusal layers be masking dialect-conditioned safety failures in MoE models [d]

Examines refusal-layer safety failures in MoE models under AAVE prompts.

Reddit r/MachineLearning#safety #moe #dialect[Evals]

6.3Voice AI Systems Are Vulnerable to Hidden Audio Attacks

Research shows that voice AI systems are vulnerable to hidden audio attacks.

HN (108)#voice-ai #security #adversarial

6.0Fair outputs, Biased Internals: Causal Potency and Asymmetry of Latent Bias in LLMs for High-Stakes Decisions

Studies how LLMs in high-stakes decisions can produce fair outputs while holding latent internal bias.

ArXiv cs.AI#fairness #bias #llm

6.0NOVA: Fundamental Limits of Knowledge Discovery Through AI

Explores the fundamental limits of discovering new knowledge through iterative AI self-improvement.

ArXiv cs.AI#knowledge-discovery #self-improvement #limits

6.0NIMO Controller: a self-driving laboratory orchestrator based on the Model Context Protocol

Proposes NIMO Controller, a self-driving laboratory orchestrator based on the Model Context Protocol (MCP).

ArXiv cs.AI#self-driving-lab #mcp #orchestrator[Agent Harness]

6.0Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time

Proposes an efficient, simple data-mixing method that keeps learning and mixing throughout training.

ArXiv cs.CL#data-mixing #training #efficiency[Post-Training]

5.6Alignment pretraining: AI discourse creates self-fulfilling (mis)alignment

Paper: AI discourse creates self-fulfilling (mis)alignment.

HN (20)#alignment #discourse #self-fulfilling[Post-Training]

5.5DeepSlide: From Artifacts to Presentation Delivery

Proposes DeepSlide, which uses AI to generate presentations from artifacts and optimizes slide generation.

ArXiv cs.AI#ai-slides #presentation #generation

5.5CAX-Agent: A Lightweight Agent Harness for Reliable APDL Automation

Proposes CAX-Agent, a lightweight agent harness for reliable APDL automation.

ArXiv cs.AI#agent-framework #automation #finite-element[Agent Harness]

5.5Verifiable Agentic Infrastructure: Proof-Derived Authorization for Sovereign AI Systems

Proposes verifiable agentic infrastructure with proof-derived authorization for sovereign AI systems.

ArXiv cs.AI#authorization #agent-infrastructure #security

5.5Neural Activation Patterns Across Language Model Architectures: A Comprehensive Analysis of Cognitive Task Performance

Analyzes neural activation patterns and cognitive task performance across six LLM architectures.

ArXiv cs.CL#llm #neural-activation #cognitive-science

5.5Why are language models less surprised than humans? Testing the Parse Multiplicity Mismatch Hypothesis

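Tests the Parse Multiplicity Mismatch Hypothesis as an explanation for the difference in surprisal between language models and humans.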

ArXiv cs.CL#llm #surprisal #psycholinguistics

5.3mattzh72/articraft

A scalable, agent-based framework for generating articulated 3D assets.

GitHub trending:python (+156★)#3d-generation #agent #ai-research

5.0Fluency and Faithfulness in Human and Machine Literary Translation

Compares fluency and faithfulness in human and machine literary translation.

ArXiv cs.CL#translation #literary #nlp

5.0Automatic Construction of a Legal Citation Graph from 100 Million Ukrainian Court Decisions: Large-Scale Extraction, Topological Analysis, and Ontology-Driven Clustering

Automatically builds a legal citation graph from 100 million Ukrainian court decisions.

ArXiv cs.CL#legal #citation-graph #nlp

5.0Greedy or not, here I come: Language production under vocabulary constraints in humans and resource-rational models

Studies language production under vocabulary constraints in humans and resource-rational models.

ArXiv cs.CL#language-production #vocabulary #cognitive

5.0Adesua: Development and Feasibility Study of an AI WhatsApp Bot for Science Learning in West Africa

Development and feasibility study of an AI WhatsApp bot for science learning in West Africa.

ArXiv cs.CL#ai-education #llm #chatbot

5.0Eskwai for Students: Generative AI Assistant for Legal Education in Ghana

Feasibility study of a generative AI assistant for legal education in Ghana.

ArXiv cs.CL#ai-education #llm #legal

5.0Deep transferable label propagation with prototypical augmentation | Scientific Reports

Proposes deep transferable label propagation with prototypical augmentation.

nature.com#transfer-learning #label-propagation #prototypical-augmentation

> Engineering & Resources

8.7colbymchenry/codegraph

A pre-indexed code knowledge graph that reduces token consumption for AI coding tools such as Claude Code.

GitHub trending:typescript (+952★)#code-knowledge-graph #ai-coding #local[Coding Agents][Context Engineering]

8.5I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how

Built a coding agent that reaches 87% on benchmarks with a 4B-parameter model.

Reddit r/LocalLLaMA#coding-agent #small-model #benchmark[Coding Agents]

8.3rohitg00/agentmemory

A benchmark-backed persistent memory system for AI coding agents.

GitHub trending:typescript (+1244★)#ai-coding #memory #agent[Coding Agents][Context Engineering]

8.0llama.cpp MTP support landed - Qwen3.6 27B at 2.44× on a Strix Halo, 2.17× on a RTX 3090 rig

llama.cpp adds MTP support; Qwen3.6 27B runs more than 2x faster.

Reddit r/LocalLLaMA#llamacpp #mtp #inference[Context Engineering]

7.9HKUDS/CLI-Anything

CLI-Anything: making all software agent-native.

GitHub trending:all (+1049★)#cli #agent-native #open-source[Agent Harness]

7.9tech-leads-club/agent-skills

Agent Skills registry: extends the capabilities of professional AI coding agents.

GitHub trending:all (+1244★)#agent-skills #coding-agents #registry[Coding Agents]

7.5The Open Agent Leaderboard

IBM releases the Open Agent Leaderboard for evaluating AI agent performance.

Hugging Face#agent #leaderboard #benchmark[Evals][Agent Harness]

7.5AI supply-chain attacks bypass model red teams

Reports four AI supply-chain attacks in 50 days, exposing gaps in model release pipelines.

venturebeat.com#security #supply-chain #red-teaming

7.5Qwen 3.7 droped on Qwen Chat

The Qwen 3.7 model is live on Qwen Chat, confirmed by community screenshots.

Reddit r/LocalLLaMA#qwen #model-release #chat[Model Release]

7.5antirez/ds4

Local inference engine for DeepSeek 4 Flash with Metal acceleration.

Co-Starred#deepseek #local-inference #metal[Model Release]

7.5Imbad0202/academic-research-skills

Academic research skills for Claude Code: research → writing → review → revision → finalization.

GitHub trending:all (+1439★)#academic #claude-code #skills[Coding Agents]

7.0Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

NVIDIA publishes a fine-tuning guide for Cosmos Predict 2.5 for robot video generation.

Hugging Face#nvidia #video-generation #robotics[Model Release]

7.0AWS Demonstrates Self-Extending CLI Tools with Strands - Let's Data Science

AWS demonstrates a prototype of self-extending CLI tools that uses Amazon Bedrock to generate commands.

letsdatascience.com#aws #cli #bedrock[Tool Use]

7.0Musk v. Altman proved that AI is led by the wrong people

An opinion piece argues that Musk v. Altman proved AI is led by the wrong people.

theverge.com#ai-leadership #openai #musk

7.0Elon Musk lost his case against Sam Altman

A jury ruled against Musk in his suit against OpenAI because the statute of limitations had expired.

theverge.com#legal #openai #elon-musk

7.0MTP (Multi-Token Prediction): 2x Faster Token Generation on AMD Strix Halo & Radeon 9700 AI Pro

MTP (multi-token prediction) delivers 2x faster token generation on AMD Strix Halo.

Reddit r/LocalLLaMA#mtp #amd #inference[Context Engineering]

7.0NEW BITNET MODELS!

New BitNet models released; the poster hopes for llama.cpp support.

Reddit r/LocalLLaMA#bitnet #model-release #open-source[Model Release]

7.0Reviving PapersWithCode (by Hugging Face) [P]

The Hugging Face team announces a revival of the PapersWithCode platform.

Reddit r/MachineLearning#open-source #community #platform

7.0Witchcraft, fast local semantic search on top of SQLite [P]

Dropbox open-sources Witchcraft, semantic search built on SQLite.

Reddit r/MachineLearning#semantic-search #sqlite #open-source

7.0huggingface/ml-intern

An open-source ML engineer that automatically reads papers, trains models, and deploys them.

Co-Starred#open-source #ml-engineer #automation[Agent Harness]

6.9Show HN: InsForge – Open-source Heroku for coding agents

InsForge: an open-source backend platform designed for AI coding agents.

HN (32)#coding-agents #open-source #backend[Coding Agents]

6.9dograh-hq/dograh

Open-source voice agent platform supporting a range of voice interaction features.

GitHub trending:python (+616★)#voice-agent #open-source #platform

6.5The last six months in LLMs in five minutes

Slides from Simon Willison's talk summarizing the last six months of LLM progress.

Simon Willison#llm #summary #pycon

6.5'Claw Chain' Vulnerabilities Threaten OpenClaw Deployments - Dark Reading

Vulnerabilities found in the AI agent framework OpenClaw allow credential theft and privilege escalation.

darkreading.com#security #agent-framework #vulnerability[Agent Harness]

6.5Released a free 9.8M doc Indic multilingual corpus — Hindi, Bengali, Tamil, Telugu + 7 more (CC0, HuggingFace) [P]

Release of a free Indic multilingual corpus of 9.8M documents.

Reddit r/MachineLearning#multilingual #dataset #indic-languages

6.4K-Dense-AI/scientific-agent-skills

A collection of scientific agent skills: research, engineering, analysis, finance, and writing.

GitHub trending:all (+609★)#agent-skills #science #open-source[Agent Harness]

6.3We stopped AI bot spam in our GitHub repo using Git's --author flag

Uses Git's --author flag to stop AI bot spam in a GitHub repository.

HN (409)#ai-bot #spam #github

6.2earendil-works/pi

AI agent toolkit: coding CLI, unified LLM API, TUI/Web UI, and more.

GitHub trending:typescript (+448★)#agent-toolkit #llm #cli[Coding Agents]

6.0PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend

PaddleOCR 3.5 released, with support for a Transformers backend for OCR and document parsing.

Hugging Face#ocr #document-parsing #transformers

6.0The Next War Is Already Here. The West Isn't Ready. — Yaroslav Azhnyuk, The Fourth Law & Guest Host Noah Smith, Noahpinion

A Ukrainian drone company founder discusses the weaponization of AI and the West's lack of preparedness.

Latent Space#ai-weapons #drones #defense

6.0An AI hate wave is here - Axios

Polling shows most Americans are worried about AI; a wave of anti-AI sentiment has arrived.

axios.com#public-opinion #ai-sentiment

6.0Is the future of coding agents JEPA? [D]

Discusses whether JEPA could be the future of coding agents.

Reddit r/MachineLearning#jepa #coding-agents #reasoning[Coding Agents]

5.8KeygraphHQ/shannon

An autonomous white-box AI penetration testing tool that analyzes source code and executes attacks.

GitHub trending:typescript (+490★)#ai-security #pentesting #autonomous

5.8jamiepine/voicebox

Open-source AI voice studio supporting cloning, dictation, and creation.

GitHub trending:typescript (+477★)#voice-cloning #open-source #audio

5.8tinyhumansai/openhuman

OpenHuman: personal AI superintelligence, focused on privacy and simplicity.

GitHub trending:all (+3941★)#personal-ai #open-source

5.7humanlayer/12-factor-agents

12-Factor Agents: principles for building production-grade LLM software.

GitHub trending:all (+399★)#llm #principles #production[Agent Harness]

5.6We let AIs run radio stations

An experiment in letting AIs run radio stations without human intervention, with a report on the failure cases.

HN (155)#ai-agents #experiment[Agent Harness]

5.5Kin Health raises $9M to build an AI notetaker for patients | TechCrunch

Kin Health raises $9 million to build an AI notetaking tool for patients.

techcrunch.com#healthcare #funding #ai-notetaker

5.5PSA: If you haven’t updated Llama.cpp for a couple of days and find MTP to not be performing well, update llamacpp.

A reminder to update llama.cpp to improve MTP performance; the poster measured a 1.5x improvement.

Reddit r/LocalLLaMA#llamacpp #mtp #update

5.5We built a tool that installs frameworks like ComfyUI, Ollama, OpenWebUI etc on any cloud GPU in one command and saves your whole setup between sessions [R]

A tool that installs frameworks such as ComfyUI and Ollama in one command and saves the environment.

Reddit r/MachineLearning#devops #cloud-gpu #tooling

5.2heygen-com/hyperframes

Write videos in HTML and render them; designed for AI agents.

GitHub trending:typescript (+377★)#video-generation #html #agent

5.1joeseesun/qiaomu-anything-to-notebooklm

Claude skill: a multi-source content processor that converts WeChat articles and similar content into podcasts or slide decks.

GitHub trending:python (+253★)#claude #content-processing #multimodal

[STATS] 80 items · 40 sources · Score >= 5.0