Run Custom and Popular Open-Source LLMs Locally with Ollama
Ollama is an open-source, lightweight framework optimized for quickly building and running state-of-the-art open-source large language models (LLMs) locally, including Llama 3, Mistral, Gemma, and regionally fine-tuned variants like Llama 2 Chinese. Its official model library hosts a vast selection of pre-trained and specialized models, all with straightforward, cross-platform deployment.
Ollama supports Windows, macOS, and Linux, ensuring accessibility across most development and personal environments. It's compatible with a wide range of LLMs, such as Doubao, Llama 3, and Phi 3. Users can launch and interact with models using concise CLI commands, customize model behavior (e.g., creativity parameters, system prompts) via Modelfile configuration files, and run models with billions of parameters without any cloud dependency.
- Installation: Download the appropriate installer for macOS, the Windows Preview build, or use the provided shell script for Linux systems.
- Launch a Prebuilt Model: Use the `ollama run` command with your desired model name to start an interactive session. For example, to run Llama 3: `ollama run llama3`
- Customize and Run a Model: Create a `Modelfile` in your working directory, start it with a `FROM` directive to import a base model, and add custom settings. Then execute `ollama create` to build the new model instance and `ollama run` to launch it.
- List and Manage Models: View all installed models locally with `ollama list`
- Pull Models in Advance: To download a model without immediately running it, use `ollama pull llama3`
- API-based Interaction: For programmatic access, send HTTP POST requests to the local API endpoint (default: `http://localhost:11434/api/chat`). Here's an example query about why the sky is blue, using cURL:

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "Why does the sky appear blue?" }
  ]
}'
```
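The Modelfile customization step above can be sketched as follows. This is a minimal example, not a complete reference: the model name `my-assistant`, the temperature value, and the system prompt are all illustrative choices, and it assumes the `llama3` base model has already been pulled.

```
# Modelfile — builds a customized model on top of an installed base model
FROM llama3

# Lower temperature for more deterministic, less "creative" output
PARAMETER temperature 0.7

# System prompt applied to every session with this model
SYSTEM "You are a concise assistant for technical questions."
```

Build the customized model with `ollama create my-assistant -f Modelfile`, then start a session with `ollama run my-assistant`.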
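The cURL request above can also be issued from any HTTP client. As a sketch using only Python's standard library (the function names `build_chat_payload` and `chat` are illustrative, and running `chat` assumes an Ollama server is listening on the default port):

```python
import json
import urllib.request

# Default local endpoint from the text above
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

def build_chat_payload(model: str, prompt: str) -> bytes:
    """Build the same JSON body as the cURL example."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")

def chat(model: str, prompt: str) -> list:
    """POST a chat request; Ollama streams one JSON object per line by default."""
    req = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=build_chat_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return [json.loads(line) for line in resp]

# Example (requires a running Ollama server with llama3 pulled):
# chunks = chat("llama3", "Why does the sky appear blue?")
```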
Web-based UI tools like Open WebUI can also be paired with Ollama for a graphical interface.