Ollama Local Model Setup Guide

Ollama is a powerful local large language model runner that lets you run open-source AI models on your own computer, completely free and privacy-preserving.

What is Ollama?

Ollama is an open-source project that simplifies running large language models locally:

  • One-click deployment: Simple commands to download and run models
  • Completely free: No API fees, just need a computer
  • Privacy protection: Data processed entirely locally, never uploaded to the cloud
  • Rich model library: Supports Llama, Mistral, Qwen, and many other open-source models

Installing Ollama

macOS

Option 1: Homebrew (Recommended)

brew install ollama

Option 2: Official Installer

  1. Visit the Ollama download page
  2. Download the macOS installer
  3. Open the .dmg file and drag Ollama to your Applications folder

Windows

  1. Visit the Ollama download page
  2. Download the Windows installer (OllamaSetup.exe)
  3. Run the installer and follow the prompts

Linux

One-line install script:

curl -fsSL https://ollama.com/install.sh | sh

Or manual installation:

See the Ollama GitHub for installation instructions.
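
To confirm the CLI installed correctly, you can check its version from a terminal:

ollama --version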

Starting the Ollama Service

After installation, you need to start the Ollama service:

macOS/Linux:

ollama serve

Windows: After installation, Ollama typically runs automatically in the background. You can see its icon in the system tray.

Downloading Models

Use the ollama pull command to download models:

Recommended Models

# Llama 3.2 - Meta's latest open-source model
ollama pull llama3.2

# Llama 3.2 3B - Smaller version, faster
ollama pull llama3.2:3b

# Qwen 2.5 - Alibaba's open-source model, strong Chinese capability
ollama pull qwen2.5

# Mistral - European open-source model, excellent performance
ollama pull mistral

# DeepSeek Coder - Specialized for code generation
ollama pull deepseek-coder
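
After a model finishes downloading, you can try it in the terminal before setting up Chatbox (a quick sanity check; substitute whichever model you pulled):

# Start an interactive chat with the model; type /bye to exit
ollama run llama3.2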

Model Selection Guide

Model         | Parameters | Memory Required | Features
llama3.2:3b   | 3B         | 4GB+            | Fast, good for beginners
llama3.2      | 8B         | 8GB+            | Balanced choice
qwen2.5:7b    | 7B         | 8GB+            | Excellent Chinese capability
mistral       | 7B         | 8GB+            | Strong reasoning
llama3.1:70b  | 70B        | 64GB+           | Best performance

View Downloaded Models

ollama list
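
To inspect a downloaded model's details, you can use ollama show (assuming llama3.2 appears in the list; substitute your own model name):

ollama show llama3.2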

Configure in Chatbox

Step 1: Ensure Ollama is Running

Run this command in a terminal to check:

curl http://localhost:11434

If it returns Ollama is running, the service is working.
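
To go a step further and confirm that a model responds through the same API Chatbox will use, you can call Ollama's generate endpoint directly (this assumes you have already pulled llama3.2; substitute another model name if not):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Say hello",
  "stream": false
}'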

Step 2: Open Chatbox Settings

  1. Open the Chatbox app
  2. Click the "Settings" entry in the bottom left
  3. Select "AI Provider" or "Model Settings"
💡 On mobile, tap the menu icon (☰) in the top left to open the sidebar first, then tap "Settings".

Step 3: Add Ollama

  1. Click "Add Provider"
  2. Select "Ollama"
  3. Set API Host: http://localhost:11434
  4. Save settings

Step 4: Select a Model

  1. Choose a downloaded model from the list
  2. Or manually enter the model name (e.g., llama3.2)
  3. Start chatting

Remote Connection Setup (Optional)

If you want to access Ollama on your computer from other devices (such as a phone or tablet):

Security Warning: Setting OLLAMA_HOST=0.0.0.0 exposes Ollama on all network interfaces. Ollama has no built-in authentication. Only use this on trusted local networks, and do not expose port 11434 to the public internet. Consider using a VPN or SSH tunnel for remote access.

Step 1: Configure Ollama Listen Address

macOS/Linux:

# Set environment variable
export OLLAMA_HOST=0.0.0.0

# Restart Ollama
ollama serve

Make it permanent: Add export OLLAMA_HOST=0.0.0.0 to ~/.bashrc or ~/.zshrc
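
Note for Linux: if Ollama was installed with the install script, it usually runs as a systemd service and will not pick up variables from your shell profile. In that case the variable is typically set through a service override instead (a sketch, assuming the default ollama.service):

# Opens an editor; add the following under [Service], then save:
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl edit ollama.service

# Apply the change
sudo systemctl daemon-reload
sudo systemctl restart ollama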

Windows:

  1. Add OLLAMA_HOST to system environment variables with value 0.0.0.0
  2. Restart the Ollama service
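
Alternatively, you can set the variable from a Command Prompt; this writes a persistent user environment variable (restart Ollama afterwards for it to take effect):

setx OLLAMA_HOST "0.0.0.0"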

Step 2: Open Firewall Port

macOS: Usually no extra configuration needed

Windows:

  1. Open "Windows Firewall"
  2. Select "Advanced Settings" → "Inbound Rules"
  3. Create new rule, open TCP port 11434

Linux:

# Ubuntu/Debian
sudo ufw allow 11434

# CentOS/RHEL
sudo firewall-cmd --add-port=11434/tcp --permanent
sudo firewall-cmd --reload

Step 3: Connect from Chatbox

In Chatbox on other devices:

  1. Add Ollama Provider
  2. Set API Host to: http://your-computer-IP:11434
  3. For example: http://192.168.1.100:11434
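
Before configuring Chatbox, you can verify from the other device that Ollama is reachable (replace the IP with your computer's address from the example above):

curl http://192.168.1.100:11434

If it returns Ollama is running, the connection works.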

Common Issues

Model Download is Slow

Solutions:

  • Check network connection
  • Try downloading during off-peak hours
  • Use proxy or mirror

Out of Memory When Running Models

Solutions:

  • Use smaller model versions (e.g., :3b)
  • Close other memory-intensive programs
  • Add more RAM to your computer
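
You can also check which models are currently loaded in memory and unload one to free RAM (ollama stop is available in recent Ollama versions):

# Show models currently loaded in memory
ollama ps

# Unload a specific model
ollama stop llama3.2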

Chatbox Cannot Connect to Ollama

Troubleshooting:

  1. Confirm the Ollama service is running
  2. Check that the API Host address is correct
  3. Try accessing http://localhost:11434 in browser

Slow Model Response

Optimization tips:

  • Use smaller models (e.g., 3B parameters)
  • If you have a GPU, ensure Ollama is using GPU acceleration
  • Reduce conversation context length

How to Delete Models

ollama rm model-name
# For example
ollama rm llama3.2

Usage Tips

  1. Choose models based on hardware:

    • 8GB RAM: Use 3B-7B parameter models
    • 16GB RAM: Can try 13B parameter models
    • 32GB+ RAM: Can run larger models
  2. Leverage GPU acceleration:

    • NVIDIA GPU: Ollama uses it automatically
    • Apple Silicon Mac: Uses Metal acceleration
  3. Update models regularly:

    ollama pull model-name
    
  4. Explore the model library: Visit the Ollama model library to discover more models