Intelligence.Log

Saturday, May 23, 2026

Extracted: 79 items. Sources: 43. Filter: Score >= 5.0

++ Daily.Brief ++

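Today's AI news is dominated by major funding activity: Anthropic is about to close a round of more than $30 billion, while DeepSeek is advancing a $10.29 billion round and has committed to continuing open-source AI development, with its founder explicitly declaring an AGI goal. In research, Anthropic published an initial update on Project Glasswing focused on AI interpretability, and a paper proposes SOLAR, an autonomous agent for lifelong learning. In tooling, Anthropic's official Claude Code plugin directory and NousResearch's Hermes Agent framework were officially released. In commentary, Google I/O showed how the path for AI-driven science is shifting, a memory shortage driven by AI demand is causing a repricing of consumer electronics, and a former DeepMind researcher warns that benchmarks alone cannot ensure AI safety.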

> Headlines & Launches

9.5 Anthropic to Close Over $30 Billion Round as Soon as Next Week - Bloomberg

Anthropic is about to close a funding round of more than $30 billion, a record for the AI sector.

bloomberg.com #funding #anthropic #investment [Model Release]

9.5 DeepSeek is pushing forward with $10.29 billion financing round, with Liang Wenfeng committing to continue developing open-source AI models rather than pursuing short-term commercialization goals

DeepSeek advances a $10.29 billion funding round, and its founder commits to continuing open-source AI development.

Reddit r/LocalLLaMA #deepseek #funding #open-source [Model Release]

9.0 DeepSeek Founder Declares AGI Goal as $10 Billion Round Advances - Bloomberg

DeepSeek's founder declares an AGI goal while a $10 billion funding round advances.

bloomberg.com #deepseek #agi #funding [Model Release]

8.0 Elon Musk, Mark Zuckerberg derail Trump AI order | Semafor

Musk and Zuckerberg derail Trump's AI executive order, with implications for AI policy.

semafor.com #policy #regulation #us

7.5 Read the AI executive order thwarted by Trump tech allies

Trump's tech allies thwart an AI executive order, and the White House plan is disclosed.

axios.com #ai-policy #executive-order #politics

7.5 EU-Anthropic Talks Stall Over Mythos AI Cybersecurity Concerns, Spain Says - Bloomberg

Talks between the EU and Anthropic stall over cybersecurity concerns about Mythos AI.

bloomberg.com #eu #anthropic #cybersecurity

7.0 [AINews] New AI Infra unicorns: Exa, Modal, TurboPuffer

AI infrastructure startups Exa, Modal, and TurboPuffer become new unicorns.

Latent Space #funding #ai-infrastructure #unicorn

7.0 US scrambles to stop Internet users re-creating dead pilots’ voices - Ars Technica

The US moves to stop Internet users from using AI to re-create the voices of dead pilots.

arstechnica.com #ai-ethics #voice-cloning #regulation

6.0 McKinsey & Company partners with AppliedAI to drive agentic AI in regulated sectors - Consultancy-me.com

McKinsey partners with AppliedAI to advance agentic AI in regulated industries.

consultancy-me.com #partnership #agentic-ai #regulated-industry [Agent Harness]

6.0 China Scrutinizes Companies, Funds After AI-Fueled Stock Moves - Bloomberg

China scrutinizes the companies and funds behind AI-fueled stock moves.

bloomberg.com #china #regulation #stock-market

5.0 FTC to Require Cox Media Group, Two Other Firms to Pay Nearly $1 Million to Settle Charges They Deceived Customers About “Active Listening” AI-Powered Marketing Service

The FTC will require three firms to pay nearly $1 million to settle charges over their AI-powered listening marketing service.

Simon Willison #ftc #regulation #ai-marketing

> Research & Innovation

7.7 Project Glasswing: An Initial Update

Anthropic publishes an initial update on Project Glasswing, covering AI interpretability research.

HN (301) #anthropic #interpretability #research

7.5 SOLAR: A Self-Optimizing Open-Ended Autonomous Agent for Lifelong Learning and Continual Adaptation

Proposes SOLAR, an autonomous agent for lifelong learning and continual adaptation.

ArXiv cs.AI #llm #autonomous-agent #continual-learning [Agent Harness]

7.5 AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows

Retrieval-based synthesis of interoperable multi-agent workflows.

ArXiv cs.AI #multi-agent #workflow #retrieval [Agent Harness]

7.5 Open-World Evaluations for Measuring Frontier AI Capabilities

Open-world evaluations for measuring frontier AI capabilities.

ArXiv cs.AI #benchmark #frontier-ai #evaluation [Evals]

7.5 AgentAtlas: Beyond Outcome Leaderboards for LLM Agents

An evaluation framework for LLM agents that goes beyond outcome leaderboards.

ArXiv cs.AI #llm-agent #benchmark #evaluation [Evals]

7.5 Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models

NVIDIA releases Nemotron diffusion language models aimed at near speed-of-light text generation.

Hugging Face #diffusion #text-generation #nvidia [Model Release]

7.0 Tool-Augmented Agent for Closed-loop Optimization, Simulation, and Modeling Orchestration

A tool-augmented agent for closed-loop industrial design and simulation optimization.

ArXiv cs.AI #tool-use #industrial-design #cad-cae [Tool Use]

7.0 OSCToM: RL-Guided Adversarial Generation for High-Order Theory of Mind

RL-guided adversarial generation for evaluating high-order theory of mind.

ArXiv cs.AI #theory-of-mind #reinforcement-learning #adversarial [Evals]

7.0 CR4T: Rewrite-Based Guardrails for Adolescent LLM Safety

Rewrite-based guardrails for keeping LLM use safe for adolescents.

ArXiv cs.CL #llm-safety #guardrails #adolescent

7.0 RankJudge: A Multi-Turn LLM-as-a-Judge Synthetic Benchmark Generator

A synthetic benchmark generator for multi-turn LLM-as-a-judge evaluation.

ArXiv cs.CL #llm-as-judge #benchmark #multi-turn [Evals]

7.0 A framework for longitudinal health AI agents - Nature

Nature publishes a framework for longitudinal health AI agents, intended for ongoing health management.

nature.com #health-ai #agent #framework [Agent Harness]

7.0 Antigravity 2.0 Tops the OpenSCAD Architectural 3D LLM Benchmark

Antigravity 2.0 takes the top spot on the OpenSCAD 3D LLM benchmark.

HN (346) #llm #benchmark #3d-modeling [Evals]

6.5 High Quality Embeddings for Horn Logic Reasoning

Neural networks learn to rank logical inference steps, improving reasoning efficiency.

ArXiv cs.AI #logical-reasoning #embeddings #neural-symbolic

6.5 $ECUAS_n$: A family of metrics for principled evaluation of uncertainty-augmented systems

Proposes a family of metrics for evaluating uncertainty-augmented systems.

ArXiv cs.AI #uncertainty #evaluation #metrics [Evals]

6.5 Sem-Detect: Semantic Level Detection of AI Generated Peer-Reviews

Semantic-level detection of AI-generated peer reviews.

ArXiv cs.CL #ai-detection #peer-review #semantic

6.5 Probabilistic Attribution For Large Language Models

A probabilistic attribution method for large language models.

ArXiv cs.CL #attribution #llm #probabilistic

6.5 PromptNCE: Pointwise Mutual Information Predictions Using Only LLMs and Contrastive Estimation Prompts

Proposes PromptNCE, which predicts pointwise mutual information using only LLMs and contrastive estimation prompts.

ArXiv cs.CL #llm #mutual-information #prompting

6.5 Reflective Prompt Tuning through Language Model Function-Calling

Reflective prompt tuning implemented through language model function calling.

ArXiv cs.CL #llm #prompt-tuning #function-calling [Tool Use]

6.5 Domain-Camouflaged Injection Attacks Evade Detection in Multi-Agent LLM Systems

Research shows that domain-camouflaged injection attacks can evade detection in multi-agent LLM systems.

HN (33) #multi-agent #security #llm [Agent Harness]

6.3 HKUDS/ViMax

ViMax: an all-in-one agentic video generation framework with roles such as director and screenwriter.

GitHub trending:python (+266★) #video-generation #agent #multimodal [Agent Harness]

6.0 Personality Engineering with AI Agents: A New Methodology for Negotiation Research

Personality engineering with AI agents as a methodology for negotiation research.

ArXiv cs.AI #personality #negotiation #agent

6.0 Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX

A GPU-accelerated Mahjong simulator for reinforcement learning.

ArXiv cs.AI #reinforcement-learning #game #simulator

6.0 Residual Skill Optimization for Text-to-SQL Ensembles

Proposes residual skill optimization to improve Text-to-SQL ensembles.

ArXiv cs.CL #text-to-sql #ensemble #llm

6.0 When Cases Get Rare: A Retrieval Benchmark for Off-Guideline Clinical Question Answering

Builds a retrieval benchmark of rare cases for evaluating clinical question answering systems.

ArXiv cs.CL #retrieval #clinical-qa #benchmark [Evals]

6.0 Does Slightly Mean Somewhat? Measuring Vague Intensity Words in LLM Numeric Actions

Measures how well LLMs preserve the meaning of vague intensity words in numeric actions.

ArXiv cs.CL #llm #semantics #evaluation

6.0 OpenBMB presents the model BitCPM-CANN 1.58 bit

OpenBMB releases the 1.58-bit model BitCPM-CANN, adapted for Huawei Ascend 910B.

Reddit r/LocalLLaMA #quantization #openbmb #huawei

5.5 Broadening Access to Transportation Safety Data with Generative AI: A Schema-Grounded Framework for Spatial Natural Language Queries

Generative AI broadens access to transportation safety data.

ArXiv cs.CL #generative-ai #transportation #nlp

5.0 Live Human Detector on Outbound Phone Calls [R]

A tool for real-time detection of a live human voice on phone calls.

Reddit r/MachineLearning #audio #detection

5.0 Looking for arXiv endorsement + sharing a preprint on homeostatic cognitive architecture for AI companions [R]

A preprint on a homeostatic cognitive architecture for AI companions.

Reddit r/MachineLearning #cognitive-architecture #ai-companion

> Engineering & Resources

8.7 colbymchenry/codegraph

A pre-indexed code knowledge graph that reduces token usage and tool calls for AI coding agents.

GitHub trending:all (+3684★) #knowledge-graph #ai-coding #code-index [Coding Agents]

8.3 anthropics/claude-plugins-official

Anthropic releases its official Claude Code plugin directory.

GitHub trending:all (+2549★) #claude #plugins #ai-coding [Coding Agents]

8.3 NousResearch/hermes-agent

NousResearch releases Hermes Agent, an AI agent framework that grows over time.

GitHub trending:python (+1743★) #agent-framework #open-source [Agent Harness]

7.5 Google I/O showed how the path for AI-driven science is shifting | MIT Technology Review

Google I/O shows how the path for AI-driven science is shifting, with an emphasis on applying AI in scientific research.

technologyreview.com #ai-science #google #research

7.5 BeeLlama v0.2.0 – major DFlash update. Single RTX 3090: Qwen 3.6 27B up to 164 tps (4.40x), Gemma 4 31B up to 177.8 tps (4.93x). Prompt processing speed near baseline.

BeeLlama v0.2.0 is released, with inference on a single RTX 3090 sped up by more than 4x.

Reddit r/LocalLLaMA #inference #optimization #llama-cpp

7.5 NuExtract3 released: open-weight 4B VLM for Markdown, OCR and structured extraction (self-hostable) [P]

NuExtract3 is released: an open-weight 4B VLM for Markdown, OCR, and structured extraction.

Reddit r/MachineLearning #vlm #ocr #open-source

7.5 can1357/oh-my-pi

A terminal AI coding agent with hash-anchored edits, LSP, Python, and more.

GitHub trending:all (+457★) #ai-coding #terminal #agent [Coding Agents]

7.5 Lum1104/Understand-Anything

Turns code into an interactive knowledge graph that supports exploration and question answering.

GitHub trending:all (+1393★) #knowledge-graph #code-analysis #interactive

7.3 DeepSeek makes the V4 Pro price discount permanent

DeepSeek permanently cuts the V4 Pro API price to one quarter of the original.

HN (306) #deepseek #pricing #api [Model Release]

7.1 ChromeDevTools/chrome-devtools-mcp

Chrome DevTools MCP, which gives coding agents browser debugging capabilities.

GitHub trending:all (+501★) #mcp #devtools #ai-coding [Coding Agents] [Tool Use]

7.0 antirez/ds4

A local inference engine for DeepSeek 4 Flash, with Metal support.

Co-Starred #deepseek #local-inference #metal [Model Release]

6.5 dotnet/skills

The dotnet/skills repository, which helps AI coding agents work with .NET and C#.

GitHub trending:all (+389★) #dotnet #ai-coding #skills [Coding Agents]

6.5 The memory shortage is causing a repricing of consumer electronics

A memory shortage is causing a repricing of consumer electronics, with AI demand as the main driver.

Simon Willison #memory-shortage #consumer-electronics #ai-impact

6.5 Ex-Google DeepMind Researcher Warns Benchmarks Won’t Save Us - Gizmodo

A former DeepMind researcher warns that benchmarks alone cannot ensure AI safety.

gizmodo.com #benchmark #ai-safety #opinion [Evals]

6.5 We tried Google’s AI glasses and they’re almost there | TechCrunch

A hands-on with Google's AI glasses finds them close to mature but still lacking in places.

techcrunch.com #wearable #google #ai-glasses

6.5 I fine-tuned Cohere Transcribe to support diarization and timestamps

A fine-tune of Cohere Transcribe that adds speaker diarization and timestamps.

Reddit r/LocalLLaMA #speech-recognition #fine-tuning #open-source

6.5 Blackwell and PDL performance increase

llama.cpp adds support for PDL on NVIDIA Blackwell, improving performance.

Reddit r/LocalLLaMA #llama-cpp #nvidia #inference

6.5 facebookresearch/sam3

Meta releases SAM 3, the latest Segment Anything Model, with support for inference and fine-tuning.

GitHub trending:python (+63★) #segmentation #vision #meta [Model Release]

6.2 Launch HN: Superset (YC P26) – IDE for the agents era

Superset, an open-source IDE for the agents era, launches.

HN (79) #ide #agents #open-source [Coding Agents]

6.1 microsoft/agent-governance-toolkit

Microsoft's AI agent governance toolkit, including policy enforcement, sandboxing, and more.

GitHub trending:python (+86★) #governance #security #agent [Agent Harness]

6.0 MemTensor/MemOS

MemOS: a self-evolving memory system for LLMs and AI agents that saves 35% of tokens.

GitHub trending:typescript (+59★) #memory #llm #token-efficiency [Context Engineering]

6.0 Specialization Beats Scale: A Strategic Variable Most AI Procurement Decisions Overlook

Opinion: specialization beats scale, a variable that AI procurement decisions often overlook.

Hugging Face #ai-procurement #specialization #opinion

6.0 Qwen3.6-35B-A3B Q4 262k context on 8GB 3070 Ti = +30tps

Qwen3.6-35B-A3B reaches a 262K context and 30+ tps on an 8GB GPU.

Reddit r/LocalLLaMA #qwen #quantization #long-context [Context Engineering]

6.0 ByteShape Qwen3.6-35B-A3B: 30% faster than Unsloth IQ on 6GB VRAM laptop

ByteShape releases a Qwen3.6-35B-A3B quantization that is 30% faster than Unsloth IQ.

Reddit r/LocalLLaMA #quantization #qwen #inference

6.0 Experts first llama.cpp

An experimental llama.cpp branch implements experts-first scheduling, targeting 12GB of VRAM.

Reddit r/LocalLLaMA #llama-cpp #mixture-of-experts #optimization

6.0LQS v3.1 — an open methodology for rating AI training data (multi-oracle consensus + signed certificates) [P]

LQS v3.1: an open methodology for rating AI training data.

Reddit r/MachineLearning #data-quality #methodology

6.0 heygen-com/hyperframes

HeyGen Hyperframes: write video in HTML, built for agents.

GitHub trending:typescript (+294★) #video-generation #html #agent

5.9 abhigyanpatwari/GitNexus

GitNexus: an in-browser code knowledge graph engine that needs no server.

GitHub trending:typescript (+239★) #knowledge-graph #code-analysis

5.7 plastic-labs/honcho

Honcho: a memory library for building stateful AI agents.

GitHub trending:python (+133★) #memory #agent #library [Context Engineering]

5.6 google-labs-code/stitch-skills

Google's Stitch Skills library: agent skills designed to work with an MCP server.

GitHub trending:typescript (+41★) #mcp #agent-skills #google [Agent Harness]

5.5 Even If You Hate AI, You Will Use Google AI Search | WIRED

Even users who dislike AI will end up using Google AI Search; the piece analyzes why that is unavoidable.

wired.com #google-search #ai-adoption #opinion

5.5 Qwen3.6 27B Pure Quant: 40 tok/s on 16 GB VRAM

A quantized Qwen3.6 27B reaches 40 tok/s on 16GB of VRAM.

Reddit r/LocalLLaMA #qwen #quantization #inference

5.5 G4-MeroMero-26B-A4B-it-uncensored-heretic Is Out Now, a Finetune of gemma-4-26B-A4B-it, With KLD of 0.0152 and 12/100 Refusals!

An uncensored fine-tune of Gemma-4-26B-A4B is released, with a low refusal rate.

Reddit r/LocalLLaMA #fine-tuning #uncensored #gemma

5.3 code-yeongyu/oh-my-openagent

Oh My OpenAgent: a self-described best agent framework, formerly oh-my-opencode.

GitHub trending:typescript (+159★) #agent-framework #open-source [Agent Harness]

5.2 aiming-lab/AutoResearchClaw

AutoResearchClaw: a fully automated research agent, from idea to paper.

GitHub trending:python (+73★) #research-agent #automation [Agent Harness]

5.2 Open source Kanban desktop app that runs parallel agents on every card

An open-source Kanban desktop app in which every card can run parallel AI agents.

HN (163) #kanban #agents #open-source [Agent Harness]

5.1 awslabs/aidlc-workflows

AWS AI-DLC workflows: adaptive workflow rules for AI coding agents.

GitHub trending:python (+25★) #coding-agent #workflow #aws [Coding Agents]

5.0 Catch up on the Dialogues stage at Google I/O 2026.

A recap of the Dialogues stage at Google I/O 2026, showcasing AI progress.

Google AI Blog #google #io-2026 #ai-showcase

5.0 Qwen-27B-IQ4_KS for ik_llama.cpp, especially for NVIDIA with 16GB VRAM

An IQ4_KS quantization of Qwen-27B for NVIDIA GPUs with 16GB of VRAM.

Reddit r/LocalLLaMA #quantization #qwen #llama-cpp
