Fading Coder

One Final Commit for the Last Sprint


Getting Started with Large Language Models Using Ollama


Ollama is an effective tool for running open-source large language models (LLMs) locally. It provides a straightforward command-line interface for managing models and includes Python and JavaScript SDKs for building chatbot interfaces. This guide demonstrates the setup process using a cloud GPU instance, though the steps are similar for local machines.

After launching a GPU instance and ensuring it's running, install the Ollama application. The platform used here comes with several pre-loaded models: llama2-7b, llama3-8b, llama3-70b, and qwen-4b. Additional models can be downloaded using the ollama pull command. The official model library is available at https://ollama.com/library.
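Models can also be pulled and listed from Python rather than the CLI. The sketch below assumes the `ollama` Python package (`pip install ollama`) and a running Ollama server; the `{'models': [{'name': ...}]}` response shape used by the helper is an assumption about the SDK's payload, so the live calls are guarded.

```python
# Sketch: pulling a model and listing installed models via the Python SDK.
# Assumes `pip install ollama` and a running Ollama server; the guarded
# block is skipped gracefully if neither is available.

def model_names(listing):
    """Extract model names from an `ollama list`-style payload.

    The {'models': [{'name': ...}]} shape is an assumption about the
    SDK's response format.
    """
    return [m["name"] for m in listing.get("models", [])]

if __name__ == "__main__":
    try:
        import ollama
        ollama.pull("llama3")              # same effect as `ollama pull llama3`
        print(model_names(ollama.list()))  # e.g. ['llama3:latest', ...]
    except Exception as exc:               # package missing / server down
        print(f"Ollama not reachable: {exc}")
```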

Performance varies by model and hardware. For example, on a 24GB GPU, llama3-8b runs quickly, while llama3-70b is significantly slower but provides concise responses.

Ollama's Modelfile feature allows for custom model creation, similar in concept to a Dockerfile. Below is an example of creating a role-specific chatbot by defining a system message.

Create a file named Modelfile (the name can vary) with the following content:

FROM llama3:latest
SYSTEM """
You are a child development expert who answers questions from children aged 2-6 in the style of a kindergarten teacher. Use a lively, patient, and friendly tone. Provide concrete, easy-to-understand answers, avoiding complex or abstract terms. Frequently use metaphors and examples, drawing from children's cartoons or picture books. Expand on scenarios by explaining both the 'why' and suggesting actionable steps.
"""

In a terminal, use the Ollama CLI to build the new model:

ollama create preschool-teacher -f /path/to/Modelfile

After building, list available models to confirm creation:

ollama list

Output:

NAME                       ID              SIZE    MODIFIED
llama2:latest              78e26419b446    3.8 GB  30 minutes ago
llama3:70b                 be39eb53a197    39 GB   30 minutes ago
llama3:latest              a6990ed6be41    4.7 GB  30 minutes ago
qwen:latest                d53d04290064    2.3 GB  30 minutes ago
preschool-teacher:latest   480a154551b5    4.7 GB  13 seconds ago

You can then interact with the custom model through Ollama's Web UI. The responses will reflect the defined system prompt, differing notably from the base model's output.
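Besides the Web UI, the custom model can be queried through the Python SDK's chat endpoint. This is a minimal sketch, assuming the `ollama` package and a running server; since the system prompt is baked into the model, only the user turn needs to be sent, and the response field access is an assumption about the SDK's payload.

```python
# Sketch: chatting with the custom model via the Python SDK.
# Assumes `pip install ollama` and a running Ollama server; the call is
# guarded so the snippet degrades gracefully without one.

def build_messages(question: str) -> list:
    """Build a single-turn chat message list. The system prompt already
    lives in the custom model, so only the user turn is needed."""
    return [{"role": "user", "content": question}]

if __name__ == "__main__":
    try:
        import ollama
        reply = ollama.chat(
            model="preschool-teacher",
            messages=build_messages("Why is the sky blue?"),
        )
        # reply["message"]["content"] is an assumption about the
        # SDK's response shape.
        print(reply["message"]["content"])
    except Exception as exc:  # package missing / server down
        print(f"Ollama not reachable: {exc}")
```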

Note: Some models, like the base Llama 3, may understand Chinese queries but default to English responses. Techniques for building Chinese-optimized models will be covered separately.

