Pedro Cuenca

Hugging Face

Recent Activity · 13 stars

TypeScript⭐ 4·starred by pcuenca

Pi coding agent extension: llama.cpp provider with dynamic model + context window discovery

Highlights: Pi-llama is an extension for the Pi coding agent that adds llama.cpp as a provider and discovers available models and their context window sizes dynamically. It lets users run coding tasks on local LLMs with flexible model selection and an automatically detected context size.

Worth reading: Of interest to developers running local coding agents with customizable LLM backends, especially those using llama.cpp for on-device inference.

Agent · LLM · Tooling
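The "dynamic model + context window discovery" idea can be illustrated with a minimal sketch. This is not the extension's actual code. It assumes the llama.cpp server returns an OpenAI-style model list (for example from `/v1/models`) and a properties payload (for example from `/props`) that reports the context size under `default_generation_settings.n_ctx`. The function names and sample payloads are hypothetical, and the field layout should be checked against the server version in use.

```python
from typing import Any


def model_ids(models_response: dict[str, Any]) -> list[str]:
    """Return model ids from an OpenAI-style model list response body."""
    return [entry["id"] for entry in models_response.get("data", []) if "id" in entry]


def context_window(props_response: dict[str, Any], fallback: int = 4096) -> int:
    """Return the context size found under default_generation_settings.n_ctx.

    The field layout is an assumption about the server's properties payload;
    `fallback` is returned when the field is missing or not a positive integer.
    """
    settings = props_response.get("default_generation_settings") or {}
    n_ctx = settings.get("n_ctx")
    if isinstance(n_ctx, int) and n_ctx > 0:
        return n_ctx
    return fallback


# Hypothetical payloads standing in for real server responses.
sample_models = {"object": "list", "data": [{"id": "local-model.gguf", "object": "model"}]}
sample_props = {"default_generation_settings": {"n_ctx": 8192}}

# Map every discovered model to the context window the server reports.
discovered = {name: context_window(sample_props) for name in model_ids(sample_models)}
print(discovered)
```

Keeping the parsing separate from the HTTP calls means the discovery logic can be checked against recorded payloads without a running server.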

TeichAI/teich

Python⭐ 10·starred by pcuenca

Highlights: Teich is a Python library for building and managing AI agents with a focus on modularity and extensibility. It provides tools for agent orchestration, memory management, and tool integration, aiming to simplify the development of complex AI workflows.

Worth reading: As a new entrant in the agent-building space, Teich offers a fresh perspective on modular agent design, which could be valuable for developers looking to experiment with agent architectures.

Agent · Tooling

IBM/AssetOpsBench

Python⭐ 1,572·starred by pcuenca

AssetOpsBench - Industry 4.0

Highlights: AssetOpsBench is a benchmark for evaluating AI agents on Industry 4.0 asset operations tasks, such as predictive maintenance and anomaly detection. It provides realistic scenarios and metrics to assess agent performance in industrial settings.

Worth reading: It bridges the gap between AI agent research and real-world industrial applications, offering the kind of standardized evaluation framework the field currently lacks.

Agent · Evaluation

dacorvo/hf-mount-cache-examples

Shell⭐ 1·starred by pcuenca

Highlights: This repository provides examples for mounting Hugging Face model caches, enabling efficient reuse of downloaded models across environments. It focuses on shell scripts for setup and configuration.

Worth reading: Useful if you manage multiple HF model deployments and want to save storage and bandwidth by sharing cache directories.

Infra · Deployment

ggml-org/llama.cpp

C++⭐ 110,678·starred by pcuenca

LLM inference in C/C++

Highlights: llama.cpp enables efficient LLM inference in C/C++ with minimal dependencies, supporting a wide range of models including LLaMA, Mistral, and GPT-2. It features quantization, GPU acceleration, and a lightweight server for local deployment.

Worth reading: As the de facto standard for local LLM inference, llama.cpp is essential for developers building on-device AI applications or exploring model optimization techniques.

ggml

Infra · Deployment

abetlen/llama-cpp-python

Python⭐ 10,303·starred by pcuenca

Python bindings for llama.cpp

Highlights: llama-cpp-python provides Python bindings for llama.cpp, enabling efficient inference of LLMs on CPU and GPU. It supports quantization, GPU acceleration, and a wide range of model architectures, making it a key tool for local LLM deployment.

Worth reading: Essential for AI engineers deploying LLMs locally or in resource-constrained environments, offering a seamless Python interface to the high-performance llama.cpp backend.

Infra · Deployment

ariG23498/trace-util

Python⭐ 2·starred by pcuenca

A utility script to upload PyTorch traces to a Hugging Face Bucket and then build a sharable trace URL

Highlights: A utility script to upload PyTorch traces to a Hugging Face bucket and generate sharable trace URLs. Simplifies sharing and collaboration on model execution traces.

Worth reading: Useful for AI engineers who need to share PyTorch traces for debugging or collaboration, leveraging Hugging Face infrastructure.

Tooling

apocryphx/ObjCTokenizer

Objective-C⭐ 2·starred by pcuenca

Objective-C port of the tokenizer in HuggingFace's swift-transformers

Highlights: This repository provides an Objective-C port of HuggingFace's swift-transformers tokenizer, enabling tokenization for LLMs in iOS/macOS apps. It bridges the gap between Swift-based tokenizer implementations and Objective-C codebases, making it easier to integrate transformer models into legacy or mixed-language projects.

Worth reading: For developers working with LLMs in Apple ecosystems who need to tokenize text in Objective-C, this is a niche but practical tool that saves rewriting tokenization logic.

LLM · Tooling

merveenoyan/space-doctor

Python⭐ 1·starred by pcuenca

Highlights: Space Doctor is a tool that helps manage and optimize disk space on Hugging Face Hub repositories. It provides insights into storage usage and assists in cleaning up unnecessary files.

Worth reading: For AI practitioners using Hugging Face Hub, this tool can save time and prevent storage issues, making it a practical utility for managing model and dataset repositories.

Tooling

julien-c/hf-speedtest

Python⭐ 24·starred by pcuenca

How fast can you pull from Hugging Face?

Highlights: A simple Python script to benchmark download speeds from Hugging Face Hub, measuring how fast you can pull models and datasets. Useful for diagnosing network performance and optimizing CI/CD pipelines.

Worth reading: If you frequently download from Hugging Face, this tool helps identify speed bottlenecks and compare providers or regions.

hf-extension

Infra · Tooling
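As a rough illustration of what a download benchmark measures, the sketch below times a fetch and converts the result to MiB/s. It is not the repository's implementation. `time_download`, `throughput_mib_per_s`, and the in-memory `fake_fetch` are hypothetical names, and a real measurement would replace `fake_fetch` with an actual download from the Hub.

```python
import time
from typing import Callable


def throughput_mib_per_s(num_bytes: int, seconds: float) -> float:
    """Convert a byte count and an elapsed time into MiB per second."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes / (1024 * 1024) / seconds


def time_download(fetch: Callable[[], bytes]) -> tuple[int, float]:
    """Run `fetch` once and return (bytes received, elapsed seconds)."""
    start = time.perf_counter()
    payload = fetch()
    elapsed = time.perf_counter() - start
    return len(payload), elapsed


def fake_fetch() -> bytes:
    """Stand-in for a real download: returns 1 KiB of zero bytes."""
    return bytes(1024)


size, elapsed = time_download(fake_fetch)
print(f"fetched {size} bytes in {elapsed:.6f} s")

# Worked example with fixed numbers: 10 MiB in 2 seconds.
example_rate = throughput_mib_per_s(10 * 1024 * 1024, 2.0)
print(f"{example_rate:.1f} MiB/s")
```

A single run is noisy, so repeating the measurement and reporting the median would give a steadier figure.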

antirez/ds4

C⭐ 8,134·starred by pcuenca

DeepSeek 4 Flash local inference engine for Metal and CUDA

Highlights: ds4 is a lightweight, high-performance local inference engine for DeepSeek 4 Flash, supporting both Metal (Apple Silicon) and CUDA (NVIDIA GPUs). It offers efficient model execution with minimal dependencies, making it ideal for on-device AI applications.

Worth reading: This repo provides a practical, optimized solution for running DeepSeek 4 Flash locally, which is valuable for developers seeking to deploy LLMs on edge devices without cloud dependencies.

Deployment

huggingface/context-course

Python⭐ 12·starred by pcuenca

A course on context engineering with code agents.

Highlights: This repository offers a course on context engineering specifically for code agents, covering how to design prompts and manage context to improve agent performance. It includes hands-on code examples and practical guidance for building more effective AI agents.

Worth reading: It's a niche but practical resource for developers working on agent-based systems, providing actionable techniques to optimize context handling, a critical but often overlooked aspect of agent design.

Agent · LLM

alvarobartt/dotfiles

Shell⭐ 5·starred by pcuenca

Opinionated Configuration Files

Highlights: This repository contains opinionated configuration files (dotfiles) for shell and development environments, likely including aliases, functions, and tool settings. It is a personal collection that may offer insight into its author's workflow preferences.

Worth reading: While not directly AI-related, it offers a glimpse into the author's development environment setup, which may be useful when tuning your own workflow.

Tooling

13 repos · All time