News

Prebuilt .whl for llama-cpp-python 0.3.8: CUDA 12.8 acceleration with full Gemma 3 model support (Windows x64). This repository provides a prebuilt Python wheel (.whl) for llama-cpp-python, ...
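
As a rough sketch of how such a wheel would be used (the wheel filename, model path, and parameters below are illustrative placeholders, not part of this repository), install it with pip and load a Gemma 3 GGUF model with GPU offload:

```python
# Install the prebuilt wheel first, e.g. (filename is illustrative; use the one from the release):
#   pip install llama_cpp_python-0.3.8-cp312-cp312-win_amd64.whl
#
# Minimal check that the CUDA build loads a Gemma 3 GGUF and offloads layers to the GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="models/gemma-3-4b-it-Q4_K_M.gguf",  # placeholder path to a local GGUF file
    n_gpu_layers=-1,  # offload all layers to the GPU (CUDA build)
    n_ctx=4096,       # context window size
)

out = llm("Explain what a prebuilt wheel is in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```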
With apologies, active development of this library has ceased; only bug fixes and corrections of behavioral issues will be addressed. Pull requests for other changes are unlikely to be merged.