andersch.dev
dandersch
contact [at] andersch [dot] dev
<2024-11-26 Tue>
[ai]
LLM Inference
LLM Inference Software
llama.cpp
- LLM inference in C/C++
vLLM
- High-throughput, memory-efficient inference and serving engine for LLMs
ExLlamaV2
- Fast inference library for running LLMs locally on modern consumer-class GPUs
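Both llama.cpp (via its `llama-server` binary) and vLLM (via `vllm serve`) can expose an OpenAI-compatible HTTP API, so a single client works against either backend. A minimal stdlib-only sketch; the base URL, port, and model name are assumptions that depend on how the local server was started:

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "local-model", max_tokens: int = 128) -> dict:
    """Build an OpenAI-compatible /v1/chat/completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def query(base_url: str, prompt: str) -> str:
    """POST the payload to a locally running llama-server or vLLM instance
    and return the first completion. Assumes a server is already up, e.g.:
      llama-server -m model.gguf --port 8080   # llama.cpp
      vllm serve <model-name>                  # vLLM
    """
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Usage: `query("http://localhost:8080", "Hello!")` against whichever backend is listening on that port.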
Resources
Everything I've learned so far about running local LLMs