Local AI Coding Setup: Continue, Ollama & Devstral / DeepSeek Coder

Author: Siu-Ho Fung

May 12, 2025

Want a better coding experience with local models? Try Continue - a code assistant extension for Visual Studio Code. Combined with Ollama and advanced models like Devstral / DeepSeek Coder, it gives you code suggestions, edits, and chat - all without sending your source code online.

Personal note:
For those looking for a robust free alternative to Cursor, I personally use a combination of Continue, Ollama, and Devstral 24B / DeepSeek Coder 33B. I find that DeepSeek Coder is more capable and gives better results. This setup works very well for me in my daily development workflow.

Everything runs locally, so your code stays private and secure - no cloud dependency.

It brings Cursor-like features to your existing VS Code setup.

Whether you're coding in Python, C++, JavaScript, or PHP, this setup provides smart completions, code edits, and chat-based AI assistance, all running locally and offline.


📊 Roo Code vs. Continue – Feature Comparison

| Feature | Roo Code | Continue |
| --- | --- | --- |
| VS Code Extension | ✅ Yes | ✅ Yes |
| Local Model Support | ✅ Yes (via Ollama or local APIs) | ✅ Yes (via Ollama) |
| Cloud Model Support | ✅ Yes (e.g. OpenAI, Anthropic, DeepSeek) | ✅ Yes (e.g. OpenAI, Anthropic) |
| Chat Interface | ✅ Yes | ✅ Yes |
| Autocomplete Integration | ✅ Yes | ✅ Yes (tight integration) |
| Code Edit Suggestions | ✅ Yes | ✅ Yes |
| Multi-model Configuration | ✅ Yes (custom prompt modes) | ✅ Yes (`continue.config.yaml`) |
| File System Access | ✅ Yes | ✅ Yes |
| Terminal Command Execution | ✅ Yes | ✅ Yes |
| Browser Automation | ✅ Yes | ❌ No |
| Offline Use | ✅ Fully supported | ✅ Fully supported |
| Open Source | ✅ Yes | ✅ Yes |
| Ease of Setup | ✅ Easy (VS Code Marketplace) | ✅ Easy (VS Code Marketplace) |
| Customization | ✅ High (Modes, commands, routing) | ✅ High (YAML config for models and roles) |
| Community Size | 🟡 Medium (growing) | 🟢 Large (mature VS Code ecosystem) |

✅ Roo Code – Pros and Cons

Pros

  • Works with both local and cloud-based LLMs.
  • Supports browser automation and custom assistant commands.
  • Highly customizable prompt modes (e.g., Debug, Code, Architect).
  • Terminal and file access for AI agents.
  • Clean UI within VS Code with multi-tab chat.

Cons

  • Slightly heavier footprint compared to Continue.
  • Some features (e.g., agent commands) may be overkill for basic dev tasks.
  • Smaller community and less documentation than Continue.

✅ Continue – Pros and Cons

Pros

  • Lightweight and tightly integrated into VS Code interface.
  • Seamless autocomplete, chat, and code edit experience.
  • Optimized for local model usage with continue.config.yaml.
  • Larger community and robust documentation.
  • Fast setup via Marketplace + Ollama.

Cons

  • Lacks browser automation and advanced multi-modal tools.
  • No native agent system for autonomous tasks.
  • Requires manual config for advanced multi-model workflows.

💡 Recommendation: Use Continue if you want a focused, efficient AI coding assistant tightly integrated into VS Code with excellent local model support.
Choose Roo Code if you prefer an assistant-style developer tool that can automate browser tasks, run terminal commands, and adapt to diverse workflows.


πŸ› οΈ How to Install Continue in VS Code

1. Install the Continue Extension

  1. Open VS Code.
  2. Go to the Extensions view (Ctrl+Shift+X or Cmd+Shift+X).
  3. Search for Continue and install the one by Continue Dev.

Or use the terminal:

code --install-extension Continue.continue
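To verify that the extension is installed (on macOS/Linux; on Windows, pipe to findstr instead of grep):

code --list-extensions | grep -i continue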

2. Install Ollama

Ollama is a local model runner that works seamlessly with Continue.

🪟 On Windows

  1. Download the installer from https://ollama.com
  2. Follow the setup instructions.
  3. Open a terminal and run:
    ollama run llama3

🍎 On macOS

brew install ollama
ollama run llama3
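To confirm the local Ollama server is running (it listens on http://localhost:11434 by default), list your installed models or query its REST API:

ollama list
curl http://localhost:11434/api/tags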

3. Use "Better" Models (e.g., Devstral)

Ollama supports many high-performance open-source models. To pull Devstral 24B and DeepSeek Coder 6.7B:

ollama pull devstral:24b-small-2505-q4_K_M
ollama pull deepseek-coder:6.7b

Then, test it:

ollama run devstral:24b-small-2505-q4_K_M

⚠️ System Requirements: To run this model, a GPU with at least 24 GB of VRAM is recommended, such as an NVIDIA RTX 3090 or higher. If you use a model variant with fewer parameters or quantization, 16 GB of VRAM may be sufficient in some cases.
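On an NVIDIA system with the drivers installed, you can check how much VRAM you have before pulling a large model:

nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv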

This model is well-suited for chat, code editing, and longer prompts due to its extended context window.
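For a quick one-off test, you can also pass a prompt directly on the command line instead of starting an interactive session:

ollama run devstral:24b-small-2505-q4_K_M 'Write a Python function that checks whether a string is a palindrome.'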


βš™οΈ Optional: Advanced Configuration with continue.config.yaml

For deeper integration and multi-model control, you can create a custom config file. This allows Continue to:

  • Assign specific models to tasks like chat, edit, or autocomplete
  • Increase token limits for models with long context windows
  • Add semantic search and embeddings with models like nomic-embed-text (pull command below)
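The embedding model must be available locally before Continue can use it; pull it once with:

ollama pull nomic-embed-text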

📄 Example continue.config.yaml

Save this in the root of your workspace or inside a .continue/ folder:

name: Local Assistant
version: 1.0.0
schema: v1
models:
  - name: Devstral Small 24B (Q4_K_M)
    provider: ollama
    model: devstral:24b-small-2505-q4_K_M
    maxTokens: 131072 # explicitly set
    contextLength: 128000 # if supported by Continue
    roles:
      - chat
      - edit
      - apply
  - name: DeepSeek-Coder 6.7B
    provider: ollama
    model: deepseek-coder:6.7b
    maxTokens: 16384
    contextLength: 16384
    roles:
      - autocomplete
  - name: Nomic Embed
    provider: ollama
    model: nomic-embed-text:latest
    roles:
      - embed
context:
  - provider: code
  - provider: docs
  - provider: diff
  - provider: terminal
  - provider: problems
  - provider: folder
  - provider: codebase

📌 Tip: After editing the config, restart the Continue extension or reload VS Code (Ctrl+Shift+P → "Developer: Reload Window").


🚀 Benefits of Continue + Ollama

✅ Local-first: No cloud dependency
✅ Multi-model support: Use specialized models for specific roles
✅ Fast: Optimized for real-time dev workflows
✅ Customizable: YAML-based configuration
✅ Offline: Ideal for secure environments


By combining Continue with Ollama and powerful open-source models like Devstral and Qwen3, you unlock a next-level AI coding experience - all within VS Code and on your own machine.


🤝 Combining Roo Code and Continue

While Roo Code and Continue each have their own strengths, you can also use them together to create a powerful and flexible AI coding environment.

  • Use Continue for tightly integrated autocomplete, inline code edits, and multi-model roles within VS Code.
  • Use Roo Code as a complement for tasks like browser automation, terminal commands, and structured prompts via custom modes.

By installing both tools side by side in Visual Studio Code, you benefit from:

  • The best of fast, contextual coding assistance (Continue)
  • Advanced AI agent functionality and automation (Roo Code)

💡 Note: Make sure your configuration files (such as continue.config.yaml and Roo's settings) don't conflict. Both tools run independently and can work in parallel. If the two extensions claim the same key combination, you may also need to remap the conflicting keyboard shortcuts, as shown below.
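For example, you can reassign a shortcut in your user keybindings.json (Ctrl+Shift+P → "Preferences: Open Keyboard Shortcuts (JSON)"). The command ID below is a placeholder; look up the actual IDs for Continue and Roo Code in the Keyboard Shortcuts view:

[
  {
    // Hypothetical remap: move a conflicting chat shortcut to Ctrl+Alt+L.
    // Replace "continue.exampleCommand" with the real command ID from the
    // Keyboard Shortcuts view.
    "key": "ctrl+alt+l",
    "command": "continue.exampleCommand"
  }
]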


10 Coding Models for Local Use with Ollama (May 2025)

| Model Name | Precise Model Identifier | Notes |
| --- | --- | --- |
| DevStral 7B | devstral:7b | |
| DeepSeek Coder 6.7B | deepseek-coder:6.7b | |
| DeepSeek Coder 33B Q4 | deepseek-coder:33b-instruct-q4_K_M | Personal favorite |
| CodeLlama 13B Instruct | codellama:13b-instruct | |
| CodeLlama 34B Instruct | codellama:34b-instruct | VRAM-intensive; feasible with quantization |
| Phind-CodeLlama-34B-v2 | phind-codellama:34b-v2 | |
| WizardCoder 34B | wizardcoder:34b | With Q4 or Q5 quantization |
| Mixtral 8x7B Instruct | mixtral:8x7b-instruct | |
| StarCoder 15B | starcoder:15b | Manual setup required |
| aiXcoder 7B | aixcoder:7b | Manual setup required |
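To pull several of these models in one go, a simple shell loop over the identifiers from the table works (pick only the ones your hardware can handle):

for model in devstral:7b deepseek-coder:6.7b codellama:13b-instruct; do
  ollama pull "$model"
done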

Start building smarter, faster, and more securely, right from your editor.
