
<2024-11-26 Tue>
[ ai ]

LLM Inference

LLM Inference Software

  • llama.cpp - LLM inference in C/C++
  • vLLM - High-throughput, memory-efficient inference and serving engine for LLMs (see the sketch below the list)
  • ExLlamaV2 - Library for local inference on modern consumer-class GPUs
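
The engines above expose different interfaces (llama.cpp is a C/C++ library with CLI tools, ExLlamaV2 a Python/CUDA library), but vLLM's Python API gives a compact picture of what offline batch inference looks like. The following is a minimal sketch under stated assumptions: vLLM is installed, a CUDA-capable GPU is available, and the model id is only an example that can be swapped for any supported Hugging Face model that fits in local GPU memory.

  # Minimal sketch: offline batched inference with vLLM's Python API.
  # Assumes vLLM is installed (pip install vllm) and a CUDA-capable GPU.
  from vllm import LLM, SamplingParams

  prompts = [
      "Explain the difference between latency and throughput.",
      "Write a haiku about GPUs.",
  ]

  # Decoding settings: temperature, nucleus sampling, max generated tokens.
  sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

  # Example model id (an assumption) -- any supported Hugging Face model works.
  llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

  # generate() schedules and batches all prompts in a single call.
  outputs = llm.generate(prompts, sampling_params)

  for output in outputs:
      print(output.prompt)
      print(output.outputs[0].text)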

Resources