Running Claude Code Locally Just Got Easier with ollama-code

Back in May, I walked through how to run Claude Code locally by emulating Anthropic’s API and using a local LLM with the same prompt structure. It worked, but it was a little clunky. You had to manually map Claude’s behavior onto a local model, spin up your own fake endpoint, and tweak system prompts to get decent results. Now, thanks to `ollama-code`, you don’t have to.
This repo makes it dead simple to replicate Claude Code’s experience using open models like `codellama`, `deepseek-coder`, or `codestral`, running locally via Ollama.
Here’s what changed.
What is `ollama-code`?
`ollama-code` is a lightweight wrapper around the Ollama ecosystem that mimics Claude Code’s linear, REPL-style development experience. It’s opinionated, and that’s a good thing.
- Handles prompt engineering and system message injection (sketched below)
- Loads your preferred code-capable model (like `codellama:34b`)
- Maintains an in-memory “project context” for incremental development
- Optionally integrates with shell commands or interpreters
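The repo’s internals aren’t reproduced here, but the pattern behind the first and third bullets is easy to sketch. Below is a minimal illustration, not `ollama-code`’s actual code, of injecting a persona-style system prompt and passing project context through Ollama’s standard `/api/chat` endpoint; the persona text and the `ask` helper are my own placeholders:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"

# Hypothetical Claude-style persona; ollama-code ships its own prompt.
SYSTEM_PROMPT = (
    "You are a careful pair programmer. Make minimal, targeted edits "
    "and explain each change briefly."
)

def ask(model: str, user_message: str, project_context: str) -> str:
    """Send one turn to a local Ollama model with an injected system prompt."""
    response = requests.post(OLLAMA_URL, json={
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            # Prepend project context so the model sees the current state.
            {"role": "user", "content": f"{project_context}\n\n{user_message}"},
        ],
        "stream": False,
    })
    response.raise_for_status()
    return response.json()["message"]["content"]

print(ask("codellama:34b", "Add a retry loop to main()", "main.py:\n..."))
```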
It’s built by @tcsenpai, who’s been shipping fast, useful dev tools. This one nails it.
Why It’s Better Than My Last Setup
The original local Claude Code workaround involved:
- Emulating Anthropic’s API with a local Flask server
- Crafting custom system prompts to replicate the Claude Code persona
- Manually managing conversation and code state
`ollama-code` eliminates that overhead.
Instead of faking endpoints or building your own wrapper, you just:
```
git clone https://github.com/tcsenpai/ollama-code
cd ollama-code
ollama run codellama:34b
```
Then start coding with natural language.
It auto-handles:
- Role separation (user/system/assistant)
- Updating a virtual in-memory codebase (see the sketch after this list)
- Partial file edits based on your request
- Choosing intelligently between refactors and full rewrites
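To give a feel for what a virtual in-memory codebase with partial edits means, here is a toy sketch. It’s my own illustration of the concept, not the repo’s implementation: files live in a dict, and an edit splices a new snippet into just the affected region of one file.

```python
# Toy illustration of an in-memory "project context" with partial edits.
codebase: dict[str, str] = {
    "app.py": "def fetch_data():\n    return get(URL)\n",
}

def apply_partial_edit(path: str, old_snippet: str, new_snippet: str) -> None:
    """Replace one region of one file, leaving everything else untouched."""
    if old_snippet not in codebase[path]:
        raise ValueError(f"snippet not found in {path}")
    codebase[path] = codebase[path].replace(old_snippet, new_snippet, 1)

# Apply a model-suggested edit to only the lines that matter.
apply_partial_edit(
    "app.py",
    "    return get(URL)\n",
    "    try:\n        return get(URL)\n"
    "    except RequestError as e:\n"
    "        log_failure(e)\n        raise\n",
)
print(codebase["app.py"])
```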
This gives you a more Claude-like dev experience: the LLM behaves like a smart pair programmer, not a chatbot.
Works With Your Local Setup
If you’re already using Ollama and have a model like `codellama:34b`, `deepseek-coder`, or `codestral`, you’re good to go. No API keys. No Anthropic paywalls.
It supports multi-turn planning and context updates. You can say:

> Add error handling to the `fetchData` function and log failures to a file.

…and it’ll adjust just the lines that matter. No full-file overwrites.
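Multi-turn context like this is, at bottom, accumulated message history. Building on the earlier sketch, here’s a hedged sketch of the pattern against Ollama’s `/api/chat`; the prompts and the `chat` helper are examples of mine, not `ollama-code`’s actual traffic:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"
history = [{"role": "system", "content": "You are a careful pair programmer."}]

def chat(prompt: str, model: str = "codellama:34b") -> str:
    """One conversational turn; the reply is kept so later turns can see it."""
    history.append({"role": "user", "content": prompt})
    r = requests.post(
        OLLAMA_URL,
        json={"model": model, "messages": history, "stream": False},
    )
    r.raise_for_status()
    reply = r.json()["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

chat("Here is fetchData: ...")  # establish context (code elided here)
chat("Add error handling to fetchData and log failures to a file.")
```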
Use Cases
If you’re building a local dev agent or testing workflows, `ollama-code` is a solid base:
- Replace Claude Code in your CLI workflow
- Prototype projects incrementally via REPL
- Integrate into automations (e.g. Raycast, n8n, Bash scripts); see the sketch after this list
- Fine-tune Claude-style prompts
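For the automation case, the simplest glue is a filter script that reads code on stdin and prints the model’s reply, so it can drop into a Bash pipeline or an n8n exec step. A minimal sketch using only Ollama’s documented HTTP API and the Python standard library; the script name and prompt are my own:

```python
#!/usr/bin/env python3
# review.py (hypothetical name): pipe a file in, get a local-model reply out.
# Usage: python review.py < app.py
import json
import sys
import urllib.request

code = sys.stdin.read()
payload = json.dumps({
    "model": "codellama:34b",
    "messages": [
        {"role": "user", "content": f"Review this code briefly:\n\n{code}"},
    ],
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["message"]["content"])
```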
And since it’s open source, you can tweak whatever you need: context handling, file routing, and so on.
What’s Still Missing
It’s early, and there’s room to grow:
- No GUI or editor plugin (yet)
- Limited to one-file or basic multi-file memory
- Performance depends on your local Ollama model
Still, for a tool that removes the glue code between natural language and working software, it’s a leap forward.
Final Thoughts
Running Claude Code locally used to be a hack. Now it’s just `ollama-code`.
If you’re building AI-assisted dev workflows or exploring local coding agents, this is one of the fastest ways to get started.