Overview

Ollama makes it easy to run large language models locally. No cloud API is required, your prompts never leave your machine, and it runs on macOS, Linux, and Windows.

Installation

# Linux (the install script targets Linux)
curl -fsSL https://ollama.com/install.sh | sh

# macOS / Windows: download the installer from https://ollama.com/download
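Ollama serves a local HTTP API on localhost:11434 (started automatically by the installer on most setups). A quick sanity check from Python, assuming the requests package and the default port:

import requests

# The Ollama server listens on localhost:11434 by default;
# the root endpoint replies "Ollama is running" when the service is up.
r = requests.get('http://localhost:11434')
print(r.status_code, r.text)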

Running Your First Model

# Pull and run Llama 3.2
ollama run llama3.2

# Pull and run Mistral
ollama run mistral

# Pull and run a coding model
ollama run codellama
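The first run of each command downloads the model, then drops you into an interactive chat. To see what is already on disk, use ollama list on the CLI, or query the /api/tags endpoint; a minimal sketch:

import requests

# /api/tags lists the models stored locally, including name and size on disk.
tags = requests.get('http://localhost:11434/api/tags').json()
for model in tags['models']:
    print(model['name'])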

Available Models

| Model     | Size   | Use Case                  |
|-----------|--------|---------------------------|
| llama3.2  | 1B/3B  | General purpose           |
| mistral   | 7B     | Fast, high quality        |
| codellama | 7B/13B | Code generation           |
| phi3      | 3.8B   | Efficient, from Microsoft |
| gemma2    | 9B     | Google’s open model       |
| qwen2.5   | 7B     | Multilingual              |

Specific sizes are selected with a tag, e.g. ollama run llama3.2:3b.

API Usage

import requests

# Non-streaming request: with 'stream': False the endpoint returns a
# single JSON object whose 'response' field holds the full completion.
response = requests.post('http://localhost:11434/api/generate', json={
    'model': 'mistral',
    'prompt': 'Explain machine learning in one paragraph',
    'stream': False
})
print(response.json()['response'])
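For incremental output, set 'stream': True; the endpoint then emits one JSON object per line until a final object with "done": true. A minimal sketch of consuming the stream:

import json
import requests

# Each streamed line is a JSON object carrying the next chunk of text
# in 'response'; the final object has 'done': true.
with requests.post('http://localhost:11434/api/generate', json={
    'model': 'mistral',
    'prompt': 'Explain machine learning in one paragraph',
    'stream': True
}, stream=True) as resp:
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk.get('response', ''), end='', flush=True)
        if chunk.get('done'):
            print()
            break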

Using with LangChain

# Requires the langchain-ollama package; the older
# langchain_community.llms.Ollama import is deprecated.
from langchain_ollama import OllamaLLM

llm = OllamaLLM(model="mistral")
response = llm.invoke("What is the capital of France?")
print(response)
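Because the wrapper implements LangChain's standard Runnable interface, token streaming works the same way as with any other LLM class; a minimal sketch, again assuming the langchain-ollama package:

from langchain_ollama import OllamaLLM

llm = OllamaLLM(model="mistral")

# .stream() yields text chunks as the model generates them.
for chunk in llm.stream("What is the capital of France?"):
    print(chunk, end="", flush=True)
print()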

Custom Models (Modelfile)

# Modelfile
FROM mistral
SYSTEM You are a helpful coding assistant specialized in Python.
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

# Build the model from the Modelfile, then run it
ollama create my-coder -f Modelfile
ollama run my-coder
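Once created, the custom model is served like any other and can be called by name through the REST API; for example, reusing the pattern from the API Usage section:

import requests

# The Modelfile's SYSTEM prompt and parameters are baked into 'my-coder',
# so requests only need the model name and a prompt.
response = requests.post('http://localhost:11434/api/generate', json={
    'model': 'my-coder',
    'prompt': 'Write a Python function that reverses a string.',
    'stream': False
})
print(response.json()['response'])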

Hardware Requirements

| Model Size | RAM Required | GPU VRAM |
|------------|--------------|----------|
| 3B         | 4 GB         | 4 GB     |
| 7B         | 8 GB         | 8 GB     |
| 13B        | 16 GB        | 16 GB    |
| 70B        | 64 GB        | 48 GB    |

Key Resources

- Website and downloads: https://ollama.com
- Model library: https://ollama.com/library
- GitHub repository: https://github.com/ollama/ollama
- REST API reference: https://github.com/ollama/ollama/blob/main/docs/api.md