Getting Started with Large Language Models Using Ollama
Ollama is an effective tool for running open-source large language models (LLMs) locally. It provides a straightforward command-line interface for managing models and offers Python and JavaScript SDKs for building chatbot interfaces. This guide walks through the setup on a cloud GPU instance, though the steps are similar on a local machine.
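Under the hood, both SDKs are thin wrappers around a local REST API that Ollama serves on port 11434 by default. As a sketch of what a client sends, the snippet below assembles the JSON body for a single-turn request to the documented /api/chat endpoint; the model name and question are illustrative, and actually sending the request assumes a running Ollama server with that model pulled.

```python
import json

# Ollama's default local endpoint; the Python/JavaScript SDKs wrap this API.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model, user_message, system=None):
    """Assemble the JSON body for a single-turn /api/chat call."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user_message})
    # stream=False asks the server for one complete JSON response
    # instead of a stream of partial chunks.
    return {"model": model, "messages": messages, "stream": False}

payload = build_chat_request("llama3:latest", "Why is the sky blue?")
print(json.dumps(payload, indent=2))

# Sending it requires a running Ollama server, e.g.:
#   import urllib.request
#   req = urllib.request.Request(OLLAMA_URL, json.dumps(payload).encode(),
#                                {"Content-Type": "application/json"})
#   reply = json.load(urllib.request.urlopen(req))["message"]["content"]
```

The same payload shape works for any model shown by ollama list, which is why switching models is usually a one-string change in client code.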
After launching a GPU instance and ensuring it's running, install the Ollama application. The platform used here comes with several pre-loaded models: llama2-7b, llama3-8b, llama3-70b, and qwen-4b. Additional models can be downloaded using the ollama pull command. The official model library is available at https://ollama.com/library.
Performance varies by model and hardware. For example, on a 24GB GPU, llama3-8b runs quickly, while llama3-70b is significantly slower but provides concise responses.
Ollama's Modelfile feature allows for custom model creation, similar in concept to a Dockerfile. Below is an example of creating a role-specific chatbot by defining a system message.
Create a file named Modelfile (the name can vary) with the following content:
FROM llama3:latest
SYSTEM """
You are a child development expert who answers questions from children aged 2-6 in the style of a kindergarten teacher. Use a lively, patient, and friendly tone. Provide concrete, easy-to-understand answers, avoiding complex or abstract terms. Frequently use metaphors and examples, drawing from children's cartoons or picture books. Expand on scenarios by explaining both the 'why' and suggesting actionable steps.
"""
In a terminal, use the Ollama CLI to build the new model:
ollama create preschool-teacher -f /path/to/Modelfile
After building, list available models to confirm creation:
ollama list
Output:
NAME                        ID              SIZE      MODIFIED
llama2:latest               78e26419b446    3.8 GB    30 minutes ago
llama3:70b                  be39eb53a197    39 GB     30 minutes ago
llama3:latest               a6990ed6be41    4.7 GB    30 minutes ago
qwen:latest                 d53d04290064    2.3 GB    30 minutes ago
preschool-teacher:latest    480a154551b5    4.7 GB    13 seconds ago
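The ollama list output is a whitespace-aligned table, so it is easy to consume from a script, for example to check whether a custom model was actually created. The parser below relies only on the column layout shown above (NAME and ID contain no spaces, SIZE is a number plus a unit, and the rest of the line is the MODIFIED column); the sample rows are taken from the listing above.

```python
def parse_ollama_list(output):
    """Parse the whitespace-aligned table printed by `ollama list`."""
    lines = [ln for ln in output.strip().splitlines() if ln.strip()]
    models = []
    for line in lines[1:]:  # skip the NAME/ID/SIZE/MODIFIED header row
        # NAME and ID contain no spaces; SIZE is "<number> <unit>";
        # everything after that is the free-form MODIFIED column.
        name, model_id, size_num, size_unit, *modified = line.split()
        models.append({
            "name": name,
            "id": model_id,
            "size": f"{size_num} {size_unit}",
            "modified": " ".join(modified),
        })
    return models

sample = """\
NAME                      ID              SIZE    MODIFIED
llama3:latest             a6990ed6be41    4.7 GB  30 minutes ago
preschool-teacher:latest  480a154551b5    4.7 GB  13 seconds ago
"""
models = parse_ollama_list(sample)
print([m["name"] for m in models])
# → ['llama3:latest', 'preschool-teacher:latest']
```

In a real script the output string would come from running the CLI, e.g. subprocess.run(["ollama", "list"], capture_output=True, text=True).stdout.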
You can then interact with the custom model through Ollama's Web UI. The responses will reflect the defined system prompt, differing notably from the base model's output.
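The same applies to programmatic clients: because the persona is baked into the custom model by its Modelfile, a request needs only the user turn, with no system message repeated on every call. A minimal sketch of such a request body, assuming the preschool-teacher:latest model built above and Ollama's documented /api/chat request shape:

```python
import json

def build_custom_model_request(question):
    """Request body for the custom model; no system message is needed
    because the Modelfile already bakes the teacher persona in."""
    return {
        "model": "preschool-teacher:latest",
        "messages": [{"role": "user", "content": question}],
        "stream": False,
    }

payload = build_custom_model_request("Why do we need to sleep?")
print(json.dumps(payload, ensure_ascii=False))
```

Keeping the persona in the model rather than in client code means every front end (Web UI, CLI, SDK) gets the same behavior for free.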
Note: Some models, like the base Llama 3, may understand Chinese queries but default to English responses. Techniques for building Chinese-optimized models will be covered separately.