Run LLMs locally with Psyllama

Psyllama makes it easy to get up and running with large language models locally, with native GGUF support and seamless llama.cpp integration.

🚀 Fast Setup
Get started in minutes with a simple installation process and an intuitive CLI.

🧠 GGUF Support
Native support for GGUF models, with llama.cpp integration for optimal performance.

🔧 Developer Friendly
An OpenAI-compatible API makes it easy to integrate with your existing tools and applications; see the example after the quick start below.

🔒 Private & Secure
Run models locally on your own hardware. Your data stays private and secure.

⚡ GPU Accelerated
Full CUDA support for maximum performance on NVIDIA GPUs.

🌐 Cross Platform
Works on Windows, macOS, and Linux, with a consistent experience across platforms.

Quick start with Psyllama:

# Install Psyllama
curl -fsSL https://psyllama.com/install.sh | sh

# Pull and run a model
psyllama run lexi-8b-uncensored

# Start chatting
psyllama chat lexi-8b-uncensored
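Once a model is running, you can also call it through the OpenAI-compatible API from any HTTP client. A minimal sketch with curl, assuming the server exposes the standard /v1/chat/completions route; the host and port (localhost:11434) are placeholders, so substitute whatever endpoint your Psyllama install reports:

# Hypothetical endpoint; adjust host and port to match your Psyllama server
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "lexi-8b-uncensored",
    "messages": [{"role": "user", "content": "Hello! Can you introduce yourself?"}]
  }'

Because the API mirrors OpenAI's, existing OpenAI SDKs should also work by pointing their base URL at the local server instead of api.openai.com.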