
4. Running inference

How do we run inference on an open source LLM? The following table contains a (non-exhaustive) list of tools for interacting with open source LLMs; short usage sketches for a few of them follow the table.

| Project | Description | Notes |
|---|---|---|
| GitHub | Build and run quantised LLMs via the command line. | Projects, M1 MBP Tested |
| Ollama | Serve and run any GGUF format LLM via the Ollama CLI. | M1 MBP Tested |
| Hugging Face 🤗 | HF Transformers provides APIs and tools to easily run inference on LLMs available from the HF Hub. | Projects |
| LangChain | A framework for developing applications powered by LLMs. | M1 MBP Tested |
| GitHub | A toolkit for deploying and serving LLMs. | Projects |
| GitHub | A Gradio web UI for LLMs. | M1 MBP Tested |
| LM Studio | An application to discover, download, and run local LLMs. | |
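
As a concrete example of the Ollama entry above, the sketch below calls a locally running Ollama server over its HTTP API. It assumes Ollama is installed, the server is listening on the default port 11434, and a model has already been pulled; `llama3` is an assumed model name, so substitute whatever you have available.

```python
import requests

# Assumes an Ollama server is running locally on the default port (11434)
# and that the "llama3" model has already been pulled (`ollama pull llama3`).
OLLAMA_URL = "http://localhost:11434/api/generate"

payload = {
    "model": "llama3",                        # substitute any locally pulled model
    "prompt": "Explain the GGUF format in one sentence.",
    "stream": False,                          # return a single JSON response
}

response = requests.post(OLLAMA_URL, json=payload, timeout=120)
response.raise_for_status()

# With stream=False, the generated text is returned in the "response" field.
print(response.json()["response"])
```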
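
Similarly, Hugging Face Transformers exposes a high-level `pipeline` API for running inference on models from the HF Hub. The sketch below uses `gpt2` purely because it is small enough to download quickly; any text-generation model you have access to can be substituted.

```python
from transformers import pipeline

# Download (on first run) and load a small text-generation model from the HF Hub.
# gpt2 is used here only because it is tiny; swap in any Hub model you prefer.
generator = pipeline("text-generation", model="gpt2")

result = generator(
    "Running open source LLMs locally is useful because",
    max_new_tokens=40,
    do_sample=True,
)

print(result[0]["generated_text"])
```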
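
LangChain itself does not run models; it wraps backends such as the ones above. The sketch below is one rough illustration: it drives a local Ollama model through LangChain's community integration. The exact import path depends on the LangChain version installed, and `llama3` is again an assumed model name.

```python
# Assumes the langchain-community package and a running local Ollama server.
# Newer LangChain releases move this integration into the langchain-ollama package.
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")  # any model already pulled into Ollama

# invoke() is the standard entry point: prompt in, completion string out.
print(llm.invoke("In one sentence, what is quantisation in the context of LLMs?"))
```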

The Awesome-LLM repository also contains a useful list of tools for deploying LLMs.
