Description
By default, ollama will use the num_ctx set in the modelfile parameters, or fall back to a low value between 1k and 8k. I think the default depends on how ollama is used (CLI vs API). In a chat, I can change the context window with /set parameter num_ctx 131072 to get the full context of llama3.2, at the cost of much higher memory usage.
In the API, the options object can take a num_ctx (https://ollama.readthedocs.io/en/api/#request_7).
For some tasks, we want a much higher context window.
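For example, a one-off generate request with a larger window might look like this (assuming a local server on the default port 11434; the prompt is just a placeholder):
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Summarize this very long document ...",
  "options": { "num_ctx": 131072 }
}'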
Workaround
The currently available option is to create a new model with the desired num_ctx baked in, either via a Modelfile or by running /set parameter num_ctx 20000 followed by /save llama3.2-20k_ctx in a chat.
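For the Modelfile route, a minimal sketch (the model and file names are just examples):
FROM llama3.2
PARAMETER num_ctx 20000
Then build it with:
ollama create llama3.2-20k_ctx -f Modelfile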
Or set the global default when starting ollama, with the environment variable OLLAMA_CONTEXT_LENGTH=20000.
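For example, to make 20k the server-wide default:
OLLAMA_CONTEXT_LENGTH=20000 ollama serve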
ollama logs (here with num_ctx set to 131072):
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_per_seq = 131072
llama_context: n_batch = 512
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 1
llama_context: freq_base = 500000.0
llama_context: freq_scale = 1
Details and notes
Enable debug logging:
OLLAMA_DEBUG=1 ollama serve
Ollama will log during model loading; pay attention to runner.num_ctx=8192:
time=2025-06-17T15:46:27.259+02:00 level=DEBUG source=sched.go:495 msg="finished setting up" runner.name=registry.ollama.ai/library/llama3.2:latest runner.inference=metal runner.devices=1 runner.size="3.3 GiB" runner.vram="3.3 GiB" runner.parallel=2 runner.pid=36317 runner.model=/Users/kristian/.ollama/models/blobs/sha256-dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff runner.num_ctx=8192
llama_context: constructing llama_context
llama_context: n_seq_max = 2
llama_context: n_ctx = 8192
llama_context: n_ctx_per_seq = 4096
llama_context: n_batch = 1024
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 1
llama_context: freq_base = 500000.0
llama_context: freq_scale = 1
llama_context: n_ctx_per_seq (4096) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
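The n_ctx_train value (131072) is the model's trained maximum context; if I remember correctly, ollama show also prints it in the model info:
ollama show llama3.2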
If I do /set parameter num_ctx 5 in an ollama run llama3.2:latest chat, I get a stupid assistant and this log (note runner.num_ctx=10 below: it looks like the requested value is multiplied by runner.parallel=2, and n_ctx_per_seq = 5 is what each sequence actually gets):
time=2025-06-17T15:50:28.050+02:00 level=DEBUG source=sched.go:495 msg="finished setting up" runner.name=registry.ollama.ai/library/llama3.2:latest runner.inference=metal runner.devices=1 runner.size="2.8 GiB" runner.vram="2.8 GiB" runner.parallel=2 runner.pid=37551 runner.model=/Users/kristian/.ollama/models/blobs/sha256-dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff runner.num_ctx=10
llama_context: constructing llama_context
llama_context: n_batch is less than GGML_KQ_MASK_PAD - increasing to 64
llama_context: n_seq_max = 2
llama_context: n_ctx = 10
llama_context: n_ctx_per_seq = 5
llama_context: n_batch = 64
llama_context: n_ubatch = 64
llama_context: causal_attn = 1
llama_context: flash_attn = 1
llama_context: freq_base = 500000.0
llama_context: freq_scale = 1
llama_context: n_ctx_per_seq (5) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
Also lots of warnings as the tiny context keeps overflowing and ollama shifts out old tokens:
time=2025-06-17T15:50:29.721+02:00 level=DEBUG source=cache.go:240 msg="context limit hit - shifting" id=0 limit=5 input=5 keep=4 discard=1
See also ollama/ollama#2714