
<2024-11-26 Tue>
[ ai ]

LLM Inference

LLM Inference Software

  • llama.cpp - LLM inference in C/C++
  • vLLM - High-throughput, memory-efficient inference and serving engine for LLMs (see the sketch below the list)
  • ExLlamaV2 - Library for local inference on modern consumer-class GPUs
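
The engines above expose different interfaces (llama.cpp is a C/C++ library with CLI tools, ExLlamaV2 a Python/CUDA library), but vLLM's Python API gives a compact picture of what offline batch inference looks like. The following is a minimal sketch under stated assumptions: vLLM is installed, a CUDA-capable GPU is available, and the model id is only an example that can be swapped for any supported Hugging Face model that fits in local GPU memory.

  # Minimal sketch: offline batched inference with vLLM's Python API.
  # Assumes vLLM is installed (pip install vllm) and a CUDA-capable GPU.
  from vllm import LLM, SamplingParams

  prompts = [
      "Explain the difference between latency and throughput.",
      "Write a haiku about GPUs.",
  ]

  # Decoding settings: temperature, nucleus sampling, max generated tokens.
  sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

  # Example model id (an assumption) -- any supported Hugging Face model works.
  llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

  # generate() schedules and batches all prompts in a single call.
  outputs = llm.generate(prompts, sampling_params)

  for output in outputs:
      print(output.prompt)
      print(output.outputs[0].text)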

Resources