Python Bindings for Llama.cpp Library

Python bindings for llama.cpp provide a simple interface for integrating the llama.cpp library into Python projects. The package ships with documentation, requirements, and installation instructions, and supports a range of hardware acceleration backends for faster inference.

The high-level API offers a managed interface through the Llama class, covering basic text completion as well as chat completion with pre-registered chat formats or custom chat handlers. The package also supports OpenAI-compatible function and tool calling, multi-modal models that combine text and image processing, and speculative decoding for faster completions. A bundled web server acts as a drop-in replacement for the OpenAI API, with Docker support for easy deployment, while the low-level API exposes direct bindings to the C API for advanced users. The project is actively developed, open to contributions, and licensed under the MIT license.
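
As a rough illustration of the high-level API described above, the sketch below loads a local GGUF model with the Llama class and runs a basic text completion followed by a chat completion. The model path, context size, and prompts are placeholder assumptions, not values taken from this page.

```python
from llama_cpp import Llama

# Load a local GGUF model (path and context size are placeholder values).
llm = Llama(model_path="./models/example-7b.Q4_K_M.gguf", n_ctx=2048)

# Basic text completion: call the model directly with a prompt string.
completion = llm(
    "Q: Name the planets in the solar system. A: ",
    max_tokens=64,
    stop=["Q:", "\n"],
)
print(completion["choices"][0]["text"])

# Chat completion using the model's chat format
# (a pre-registered format can also be selected via chat_format=...).
chat = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ]
)
print(chat["choices"][0]["message"]["content"])
```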
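
The OpenAI-compatible web server can be driven from standard OpenAI client code. The sketch below assumes the server extra is installed (pip install 'llama-cpp-python[server]') and that a server was started locally with python -m llama_cpp.server --model <path-to-gguf>, listening on the default port 8000; the model name, API key, and prompt are placeholders rather than values from this page.

```python
# Assumes a local server started with, for example:
#   python -m llama_cpp.server --model ./models/example-7b.Q4_K_M.gguf
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-no-key-required",  # placeholder; no key is configured on the local server here
)

response = client.chat.completions.create(
    model="local-model",  # placeholder name; the locally loaded model serves the request
    messages=[{"role": "user", "content": "Summarize what llama.cpp does in one sentence."}],
)
print(response.choices[0].message.content)
```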

Author: Andrei
Stars: 5,001 to 10,000
Date: April 14, 2024
Source: Python Bindings for Llama.cpp Library GitHub Page