Ollama Local Model Setup Guide
Ollama is a local large language model runner that lets you run open-source AI models on your own computer, completely free and with your data kept private.
What is Ollama?
Ollama is an open-source project that simplifies running large language models locally:
- Simple setup: A single command downloads and runs a model
- Completely free: No API fees; all you need is your own computer
- Privacy protection: Data processed entirely locally, never uploaded to the cloud
- Rich model library: Supports Llama, Mistral, Qwen, and many other open-source models
Installing Ollama
macOS
Option 1: Homebrew (Recommended)
brew install ollama
Option 2: Official Installer
- Visit the Ollama download page
- Download the macOS installer
- Open the .dmg file and drag Ollama to your Applications folder
Windows
- Visit the Ollama download page
- Download the Windows installer (OllamaSetup.exe)
- Run the installer and follow the prompts
Linux
One-line install script:
curl -fsSL https://ollama.com/install.sh | sh
Or manual installation:
See the Ollama GitHub for installation instructions.
Starting the Ollama Service
After installation, you need to start the Ollama service:
macOS/Linux:
ollama serve
Windows: After installation, Ollama typically runs automatically in the background. You can see its icon in the system tray.
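To confirm the CLI is installed and on your PATH, you can print its version (the exact output varies by release):
# Verify the Ollama CLI is available
ollama --version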
Downloading Models
Use the ollama pull command to download models:
Recommended Models
# Llama 3.2 - Meta's latest open-source model
ollama pull llama3.2
# Llama 3.2 3B - Smaller version, faster
ollama pull llama3.2:3b
# Qwen 2.5 - Alibaba's open-source model, strong Chinese capability
ollama pull qwen2.5
# Mistral - European open-source model, excellent performance
ollama pull mistral
# DeepSeek Coder - Specialized for code generation
ollama pull deepseek-coder
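Once a model has finished downloading, you can try it directly in the terminal before configuring Chatbox. The ollama run command opens an interactive chat session and also accepts a one-off prompt; for example:
# Chat interactively with a downloaded model (type /bye to exit)
ollama run llama3.2
# Or send a single prompt and print the reply
ollama run llama3.2 "Explain what Ollama is in one sentence."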
Model Selection Guide
| Model | Parameters | Memory Required | Features |
|---|---|---|---|
| llama3.2:3b | 3B | 4GB+ | Fast, good for beginners |
| llama3.2 | 8B | 8GB+ | Balanced choice |
| qwen2.5:7b | 7B | 8GB+ | Excellent Chinese capability |
| mistral | 7B | 8GB+ | Strong reasoning |
| llama3.1:70b | 70B | 64GB+ | Best performance |
View Downloaded Models
ollama list
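If you want more detail on a specific model before choosing it (parameter size, context length, and so on), newer Ollama versions can print a model overview:
# Show details for a downloaded model
ollama show llama3.2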
Configure in Chatbox
Step 1: Ensure Ollama is Running
Run this command in terminal to check:
curl http://localhost:11434
If it returns "Ollama is running", the service is working.
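You can also send a test prompt straight to Ollama's HTTP API, which is the same service Chatbox connects to. A minimal example using the /api/generate endpoint, assuming llama3.2 is already downloaded:
# Send one prompt to the local API and return the full response at once
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'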
Step 2: Open Chatbox Settings
- Open the Chatbox app
- Click the "Settings" entry in the bottom left
- Select "AI Provider" or "Model Settings"
Step 3: Add Ollama
- Click "Add Provider"
- Select "Ollama"
- Set API Host: http://localhost:11434
- Save settings
Step 4: Select a Model
- Choose a downloaded model from the list
- Or manually enter the model name (e.g., llama3.2)
- Start chatting
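If you are unsure of the exact model name to enter, the names printed by ollama list are what Chatbox expects. You can also query them from the API:
# List locally available models; the "name" fields are valid model names
curl http://localhost:11434/api/tags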
Remote Connection Setup (Optional)
If you want to access Ollama on your computer from other devices (like phone, tablet):
Security Warning: Setting OLLAMA_HOST=0.0.0.0 exposes Ollama on all network interfaces. Ollama has no built-in authentication. Only use this on trusted local networks, and do not expose port 11434 to the public internet. Consider using a VPN or SSH tunnel for remote access.
Step 1: Configure Ollama Listen Address
macOS/Linux:
# Set environment variable
export OLLAMA_HOST=0.0.0.0
# Restart Ollama
ollama serve
Make it permanent:
Add export OLLAMA_HOST=0.0.0.0 to ~/.bashrc or ~/.zshrc
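For example, to append it to your shell profile (this assumes zsh; use ~/.bashrc for bash):
# Make OLLAMA_HOST persistent for new shell sessions
echo 'export OLLAMA_HOST=0.0.0.0' >> ~/.zshrc
source ~/.zshrc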
Windows:
- Add OLLAMA_HOST to the system environment variables with the value 0.0.0.0
- Restart the Ollama service
Step 2: Open Firewall Port
macOS: Usually no extra configuration needed
Windows:
- Open "Windows Firewall"
- Select "Advanced Settings" → "Inbound Rules"
- Create a new rule to open TCP port 11434
Linux:
# Ubuntu/Debian
sudo ufw allow 11434
# CentOS/RHEL
sudo firewall-cmd --add-port=11434/tcp --permanent
sudo firewall-cmd --reload
Step 3: Connect from Chatbox
In Chatbox on other devices:
- Add Ollama Provider
- Set API Host to: http://your-computer-IP:11434
- For example: http://192.168.1.100:11434
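Before configuring Chatbox, you can verify the connection from the other device (replace 192.168.1.100 with your computer's actual IP address; this is just the example above):
# From another device on the same network
curl http://192.168.1.100:11434
# Expected reply: Ollama is running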
Common Issues
Model Download is Slow
Solutions:
- Check network connection
- Try downloading during off-peak hours
- Use proxy or mirror
Out of Memory When Running Models
Solutions:
- Use smaller model versions (e.g., the :3b tag)
- Close other memory-intensive programs
- Add more RAM to your computer
Chatbox Cannot Connect to Ollama
Troubleshooting:
- Confirm Ollama service is running
- Check if API Host address is correct
- Try accessing http://localhost:11434 in a browser
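On macOS/Linux you can also check whether anything is listening on Ollama's port (a quick diagnostic; the output format varies by system):
# Check whether a process is listening on port 11434
lsof -i :11434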
Slow Model Response
Optimization tips:
- Use smaller models (e.g., 3B parameters)
- If you have a GPU, ensure Ollama is using GPU acceleration
- Reduce conversation context length
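To check whether a loaded model is actually using the GPU, newer Ollama versions include an ollama ps command; its output shows each loaded model and whether it is running on the GPU or the CPU:
# Show loaded models and whether they run on GPU or CPU
ollama ps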
How to Delete Models
ollama rm model-name
# For example
ollama rm llama3.2
Usage Tips
- Choose models based on hardware:
  - 8GB RAM: Use 3B-7B parameter models
  - 16GB RAM: Can try 13B parameter models
  - 32GB+ RAM: Can run larger models
- Leverage GPU acceleration:
  - NVIDIA GPU: Ollama uses it automatically
  - Apple Silicon Mac: Uses Metal acceleration
- Update models regularly: ollama pull model-name
- Explore the model library: Visit the Ollama model library to discover more models