Closes #ISSUE
Problem:
- The status bar’s pending keystroke indicator (shown next to --NORMAL--
in Vim mode) didn’t clear when focus moved to another context, e.g.
pressing `g` in the editor and then clicking the Git panel. The keymap
state correctly canceled the prefix, but observers that render the
indicator never received a “pending input changed” notification, so the
UI kept showing stale prefixes until the next keystroke.
Fix:
- The change introduces a `pending_input_changed_queued` flag and a new
helper, `notify_pending_input_if_needed`, which flushes the queued
notification as soon as we have an App context. `pending_input_changed`
now resets the flag after notifying subscribers (sketched below).
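A minimal, self-contained sketch of the pattern (the types here are
stand-ins for the real gpui code, not the actual implementation):
```rust
// Stand-in for gpui's app context; only here so the sketch compiles.
struct App;

struct PendingInputState {
    pending_input_changed_queued: bool,
}

impl PendingInputState {
    /// Called where no `App` context is available (e.g. when a focus
    /// change cancels a pending prefix): just record that observers
    /// still need to be notified.
    fn queue_pending_input_changed(&mut self) {
        self.pending_input_changed_queued = true;
    }

    /// Called as soon as an `App` context is available: flush the
    /// queued notification so the indicator re-renders.
    fn notify_pending_input_if_needed(&mut self, cx: &mut App) {
        if self.pending_input_changed_queued {
            self.pending_input_changed(cx);
        }
    }

    fn pending_input_changed(&mut self, _cx: &mut App) {
        // Reset the flag after notifying subscribers.
        self.pending_input_changed_queued = false;
        // ...notify observers that the pending input changed.
    }
}
```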
Before:
https://github.com/user-attachments/assets/7bec4c34-acbf-42bd-b0d1-88df5ff099aa
After:
https://github.com/user-attachments/assets/2264dc93-3405-4d63-ad8f-50ada6733ae7
Release Notes:
- Fixed: pending keybinding prefixes on the status bar now clear
immediately when focus moves to another panel or UI context.
---------
Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Conrad Irwin <conrad.irwin@gmail.com>
We recently added this `InlineCode` component but I'd forgotten that
many months ago I also introduced an `inline_code` method to the Label
component which does the same thing. That means we don't need a
standalone component at all!
Release Notes:
- N/A
First up: I'm sorry if this is a low quality PR, or if this feature
isn't wanted. I implemented this because I'd like to have this
behaviour. If you don't think that this is useful, feel free to close
the PR without comment. :)
My idea is this: I love to pull random models with Ollama to try them.
At the same time, not all of them are useful for coding, and some won't
work out of the box with the `context_length` that gets set. So, I'd
like to change
Zed's behaviour to not show me all models Ollama has, but to limit it to
the ones that I configure manually.
What I did is add an `auto_discover` field to the settings. The idea is
that you can write a config like this:
```json
"language_models": {
  "ollama": {
    "api_url": "http://localhost:11434",
    "auto_discover": false,
    "available_models": [
      {
        "name": "qwen3:4b",
        "display_name": "Qwen3 4B 32K",
        "max_tokens": 32768,
        "supports_tools": true,
        "supports_thinking": true,
        "supports_images": true
      }
    ]
  }
}
```
Setting `auto_discover: false` means that Zed won't pick up or show the
language models that Ollama knows about, and will only show me the ones
I manually configured in `available_models`. That way, I can pull random
models with Ollama, but in Zed I can only see the ones that I know work
(because I've configured them).
The default for `auto_discover` (when it is not explicitly set) is
`true`, meaning that the existing behaviour is preserved, and this is
not a breaking change for configurations.
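A rough sketch of the gating (names are illustrative, not the actual
provider code):
```rust
// Illustrative sketch of the `auto_discover` behaviour; `Model` and
// this function are stand-ins, not the real Ollama provider code.
#[derive(Clone)]
struct Model {
    name: String,
}

fn visible_models(
    auto_discover: bool,    // defaults to true when unset
    configured: &[Model],   // from `available_models` in settings
    discovered: Vec<Model>, // what the Ollama API reports
) -> Vec<Model> {
    if auto_discover {
        // Existing behaviour: everything Ollama knows about, plus any
        // manually configured entries.
        let mut models = discovered;
        models.extend(configured.iter().cloned());
        models
    } else {
        // New behaviour: only the manually configured models.
        configured.to_vec()
    }
}
```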
Release Notes:
- ollama: Added `auto_discover` setting to optionally limit visible
models to only those manually configured in `available_models`
- Edit prediction providers can now be configured through the settings
UI
- Cleaned up the status bar menu to only show _configured_ providers
- Added the name of the active provider to the status bar icon button
tooltip
- Only show the data collection functionality under "Privacy" for the
Zed models
- Moved the Codestral edit prediction provider out of the Mistral
section in the agent panel into the settings UI
- Refined and improved UI and states for configuring GitHub Copilot as
both an agent and edit prediction provider
#### Todos before merge:
- [x] UI: Unify with settings UI style and tidy it all up
- [x] Unify Copilot modal `impl`s to use separate window
- [x] Remove stop light icons from GitHub modal
- [x] Make dismiss events work on GitHub modal
- [ ] Investigate workarounds to tell whether Copilot is authenticated
even when the LSP is not running
Release Notes:
- settings_ui: Added a section for configuring edit prediction providers
under AI > Edit Predictions, including Codestral and GitHub Copilot.
Once you've updated, you can use the following link to open it:
zed://settings/edit_predictions.providers
---------
Co-authored-by: Ben Kunkle <ben@zed.dev>
Closes #43887
### Problem
DeepSeek's reasoning mode API requires `reasoning_content` to be
included in assistant messages that precede tool calls. Without it, the
API returns a 400 error:
```
Missing `reasoning_content` field in the assistant message at message index 2
```
### Added/Fixed/Improved
- Add `reasoning_content` field to `RequestMessage::Assistant` in
`crates/deepseek/src/deepseek.rs`
- Accumulate thinking content from `MessageContent::Thinking` and attach
it to the next assistant/tool-call message (see the sketch after this
list)
- Wire reasoning content through the language model provider in
`crates/language_models/src/provider/deepseek.rs`
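A sketch of the new field's shape (simplified from
`RequestMessage::Assistant`; the serde attributes and surrounding types
are assumptions, not the exact definition):
```rust
use serde::Serialize;

// Simplified stand-in for `RequestMessage::Assistant`, not the exact
// definition from crates/deepseek/src/deepseek.rs.
#[derive(Serialize)]
struct AssistantMessage {
    role: &'static str,
    #[serde(skip_serializing_if = "Option::is_none")]
    content: Option<String>,
    // DeepSeek's reasoning mode requires this on assistant messages
    // that precede tool calls; it is omitted from the JSON when absent.
    #[serde(skip_serializing_if = "Option::is_none")]
    reasoning_content: Option<String>,
}
```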
### Testing
- Verified with DeepSeek Reasoner model using tool calls
- Confirmed reasoning content is properly included in API requests
Release Notes:
- Fixed tool-call errors when using DeepSeek's reasoning mode.
---------
Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
Closes #43598
Release Notes:
- bedrock: Added opt-in `allow_global` which enables global endpoints
- bedrock: Updated cross-region-inference endpoint and model list
- bedrock: Fixed Opus 4.5 access on Bedrock; it is now only accessible through the `allow_global` setting
This PR simplifies error and event handling by removing the
`Ok(LanguageModelCompletionEvent::Status(CompletionRequestStatus::Failed))`
state from the stream returned by `LanguageModel::stream_completion()`
and turning it into an `Err(LanguageModelCompletionError)` instead. This
was done by collapsing the valid `CompletionRequestStatus` values into
`LanguageModelCompletionEvent`.
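Roughly, the shape of the change (abridged, illustrative variants rather
than the full enums):
```rust
// Abridged sketch; the real enums have many more variants.
enum LanguageModelCompletionError {
    RateLimited,
    Other(String),
}

// Before, failure traveled as a "successful" stream item, e.g.
// Ok(Status(CompletionRequestStatus::Failed)). After, the valid
// statuses are plain events...
enum LanguageModelCompletionEvent {
    Queued,
    Started,
    Text(String),
}

// ...and failure is an `Err` on the stream itself.
type StreamItem =
    Result<LanguageModelCompletionEvent, LanguageModelCompletionError>;
```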
Release Notes:
- N/A
---------
Co-authored-by: Michael Benfield <mbenfield@zed.dev>
This fixes various issues where rustfmt failed to format code due to
overly long strings, most of which I stumbled across over the last week,
plus some additional ones I searched for whilst fixing the others.
Release Notes:
- N/A
Automatically retry the agent's LLM completion requests when the
provider returns 429 Too Many Requests. Uses the Retry-After header to
determine the retry delay if it is available.
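A minimal sketch of that loop (types and backoff are assumptions, not
the exact implementation):
```rust
use std::time::Duration;

// Stand-in response type; `retry_after` is parsed from the Retry-After
// header when the provider sends one.
struct Response {
    status: u16,
    retry_after: Option<Duration>,
}

fn send_with_retries(mut send: impl FnMut() -> Response, max_retries: u32) -> Response {
    let mut attempt = 0;
    loop {
        let response = send();
        if response.status != 429 || attempt >= max_retries {
            return response;
        }
        // Prefer the server-provided delay; otherwise fall back to a
        // simple exponential backoff.
        let delay = response
            .retry_after
            .unwrap_or_else(|| Duration::from_secs(1 << attempt));
        std::thread::sleep(delay);
        attempt += 1;
    }
}
```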
Many providers are frequently overloaded or have low rate limits. These
providers are essentially unusable without automatic retries.
Tested with Cerebras configured via openai_compatible.
Related: #31531
Release Notes:
- Added automatic retries for OpenAI-compatible LLM providers
---------
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Closes #42303
Ollama added tool call identifiers
(https://github.com/ollama/ollama/pull/12956) in its latest version
[v0.12.10](https://github.com/ollama/ollama/releases/tag/v0.12.10).
This broke our JSON schema and made all tool calls fail.
This PR fixes the schema and uses the Ollama-provided tool call
identifier when available. We remain backwards compatible and still use
our own identifier with older versions of Ollama. I added a `TODO` to
remove the `Option` around the new field when most users have updated
their installations to v0.12.10 or above.
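A sketch of the schema shape and the fallback (types simplified, names
illustrative):
```rust
use serde::Deserialize;

// Simplified sketch: the `id` field is present on Ollama >= v0.12.10
// and absent before that, hence the `Option` (and the TODO to drop it).
#[derive(Deserialize)]
struct OllamaToolCall {
    id: Option<String>,
    function: OllamaFunctionCall,
}

#[derive(Deserialize)]
struct OllamaFunctionCall {
    name: String,
    arguments: serde_json::Value,
}

fn tool_call_id(call: &OllamaToolCall, index: usize) -> String {
    // Use the server-provided identifier when available; otherwise
    // fall back to generating our own, as before.
    call.id
        .clone()
        .unwrap_or_else(|| format!("{}-{}", call.function.name, index))
}
```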
Note to reviewer: The fix to this issue should likely get cherry-picked
into the next release, since Ollama becomes unusable as an agent without
it.
Release Notes:
- Fixed tool calling when using the latest version of Ollama
This PR adds a new component to the `language_models` crate called
`ConfiguredApiCard`:
<img width="500" height="420" alt="Screenshot 2025-11-09 at 2 07@2x"
src="https://github.com/user-attachments/assets/655ea941-2df8-4489-a4da-bba34acf33a9"
/>
We were previously recreating this component from scratch with regular
divs in every LLM provider's render function, which was redundant since
they all looked essentially the same and had no major variations aside
from labels. We can clean up a bunch of similar code with this change,
which is cool!
Release Notes:
- N/A
Improve the layout and text display of API key configuration in multiple
language model providers to ensure proper text wrapping and ellipsis
handling when API URLs are long.
Before:
<img width="320" alt="image"
src="https://github.com/user-attachments/assets/2f89182c-34a0-4f95-a43a-c2be98d34873"
/>
After:
<img width="320" alt="image"
src="https://github.com/user-attachments/assets/09bf5cc3-07f0-47bc-b21a-d84b8b1caa67"
/>
Changes include:
- Add proper flex layout with overflow handling
- Replace `truncate_and_trailoff` with CSS-style text ellipsis (sketched
below)
- Ensure consistent UI behavior across all providers
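In gpui terms, the change is roughly this shape (builder method names
assumed from gpui's styled builders; treat as a sketch rather than the
exact diff):
```rust
use gpui::prelude::*;
use ui::prelude::*;

// Sketch only: assumes gpui's styled builders behave like their CSS
// analogues.
fn api_url_row(url: String) -> impl IntoElement {
    h_flex()
        .w_full()
        .gap_2()
        .child(Label::new("API URL"))
        .child(
            div()
                .flex_1()          // take the remaining row width
                .min_w_0()         // let the item shrink below its content size
                .overflow_hidden() // clip instead of spilling out of the card
                .text_ellipsis()   // render "…" instead of hard truncation
                .child(Label::new(url)),
        )
}
```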
Release Notes:
- Improved API key configuration display in language model settings
Closes #40097
When multiple files are added sequentially to the agent panel, the
request JSON incorrectly includes "text" elements containing only
spaces. These empty elements cause the Zhipu AI API to return a "text
cannot be empty" error.
The fix filters out any "text" elements that are empty or contain only
whitespace.
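A minimal sketch of the filter, operating on raw JSON parts for
illustration (the real code works on typed message content):
```rust
use serde_json::Value;

// Drop "text" parts that are empty or whitespace-only before the
// request is serialized; other part types pass through untouched.
fn filter_empty_text_parts(parts: Vec<Value>) -> Vec<Value> {
    parts
        .into_iter()
        .filter(|part| {
            match (
                part.get("type").and_then(Value::as_str),
                part.get("text").and_then(Value::as_str),
            ) {
                (Some("text"), Some(text)) => !text.trim().is_empty(),
                _ => true,
            }
        })
        .collect()
}
```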
UI state when the error occurs:
<img width="300" alt="Image"
src="https://github.com/user-attachments/assets/c55e5272-3f03-42c0-b412-fa24be2b0043"
/>
Request JSON (causing the error):
```json
{
  "model": "glm-4.6",
  "messages": [
    {
      "role": "system",
      "content": "<<CUT>>"
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "[@1.txt](zed:///agent/file?path=C%3A%5CTemp%5CTest%5C1.txt)"
        },
        { "type": "text", "text": " " },
        {
          "type": "text",
          "text": "[@2.txt](zed:///agent/file?path=C%3A%5CTemp%5CTest%5C2.txt)"
        },
        { "type": "text", "text": " describe" },
```
Release Notes:
- Fixed an issue where an OpenAI request contained whitespace-only text content
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
I am using an Azure OpenAI instance since that is what is provided at
work, and with how they have it set up, not all responses contain a
delta, which led to errors and truncated responses. This is related to
how they filter potentially offensive requests and responses. I don't
believe this filter was made in-house; instead I believe it is provided
by Microsoft/Azure, so I suspect this fix may help other users.
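A sketch of the tolerant parsing, assuming a simplified chunk shape:
```rust
use serde::Deserialize;

// Simplified stream-chunk shape: Azure's content filter can emit
// choices that carry no delta, so the field must be optional rather
// than assumed-present.
#[derive(Deserialize)]
struct StreamChoice {
    #[serde(default)]
    delta: Option<Delta>,
}

#[derive(Deserialize)]
struct Delta {
    content: Option<String>,
}

fn text_from_choice(choice: StreamChoice) -> Option<String> {
    // Skip filter-only chunks instead of erroring out and truncating
    // the response.
    choice.delta.and_then(|delta| delta.content)
}
```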
Release Notes:
- N/A
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Closes #17524
This PR adds a button to the bottom right corner of the Ollama settings
UI. It resets the available Ollama models and also resets the
"Connected" state in the process. This means it can be used to check
whether the connection is still valid as well. It's a question whether
we should clear the available models on ALL `fetch_models` calls, since
these only happen during auth anyway.
Ollama is a local model provider, which means clicking the refresh
button often only flashes the "not connected" state because the latency
of the request is so low. This accentuates changes in the UI; however,
I don't think there's a way around this without adding some rather
cumbersome deferred UI updates.
I've attached the refresh button to the "Connected" `ButtonLike`, since
I don't think automatic UI spacing should separate these elements. I
think this is okay because "Connected" isn't actually something that
the user can interact with.
Before:
<img width="211" height="245" alt="image"
src="https://github.com/user-attachments/assets/ea90e24a-b603-4ee2-9212-2917e1695774"
/>
After:
<img width="211" height="250" alt="image"
src="https://github.com/user-attachments/assets/be9af950-86a2-4067-87a0-52034a80a823"
/>
Alternative approach: There was also a suggestion to simply add an
entry to the command palette; however, none of the other providers
currently have this ability either, so I went with this approach. The
current approach also makes the feature more discoverable to the user.
Release Notes:
- Added a button for refreshing the available Ollama models
---------
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Add support for the GitHub Copilot `/responses` endpoint. This gives
the Copilot Chat provider the ability to use the new GPT-5 Codex model
and any other model that lacks support for the `/chat/completions`
endpoint.
Closes #38858
Release Notes:
- Added support for the GitHub Copilot `/responses` endpoint.
# Added
1. Added `copilot_response.rs`, which holds the `/responses` endpoint
types.
2. Use the `/responses` endpoint when a model does not support
`/chat/completions` (sketched below).
3. Added `into_copilot_response()` to map `LanguageModelCompletionEvent`s
to a request.
4. Added `map_stream()` to map response stream events to
`LanguageModelCompletionEvent`s, plus tests.
5. Fixed a bug where parsing non-streaming responses from
`/chat/completions` was failing.
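A sketch of the routing in item 2 (all names are illustrative stand-ins):
```rust
// Illustrative stand-ins for the real types and conversion.
struct Model {
    supports_chat_completions: bool,
}
struct ChatRequest;
struct ResponsesRequest;

fn into_copilot_response(_request: ChatRequest) -> ResponsesRequest {
    ResponsesRequest
}

fn endpoint_for(model: &Model, request: ChatRequest) -> &'static str {
    if model.supports_chat_completions {
        "/chat/completions"
    } else {
        // GPT-5 Codex (and any other model without /chat/completions
        // support) gets its request converted and sent to /responses.
        let _converted = into_copilot_response(request);
        "/responses"
    }
}
```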
# Notes
There is an open PR (https://github.com/zed-industries/zed/pull/39989)
for adding `/responses` support for the OpenAI and OpenAI-compatible
APIs. Although they share some similarities (the Copilot API seems to
mirror OpenAI directly), I've simplified some things and tried to stay
consistent with the vscode-chat implementation where possible. There
might be a case for code reuse, but I think keeping them separate for
now should be fine.
# Tool Calls
<img width="716" height="670" alt="Screenshot from 2025-10-15 17-12-30"
src="https://github.com/user-attachments/assets/14e88a52-ba8b-4209-8f78-73d15034b1e0"
/>
# Image
<img width="923" height="494" alt="Screenshot from 2025-10-21 02-02-26"
src="https://github.com/user-attachments/assets/b96ce97c-331e-45cb-b5b1-7aa10ed387b4"
/>
In the process of adding pickers for the theme and icon theme fields in
the settings UI, I felt like there was an improvement opportunity
regarding where some of these components are stored. The `ui_input`
crate was originally meant only for the text field-like component, which
couldn't be in the regular `ui` crate due to its dependency on
`editor`. Given we had also added the number field there (which is
similar in having the same dependency), it made sense to think of this
crate more as a home for form-like components rather than for only one
component.
However, we were also storing some settings UI-specific stuff in that
crate, which didn't feel right. So I ended up creating a new directory
within the `settings_ui` for components and moved all the pickers and
the custom input field there. I think this makes for a cleaner
structure.
Release Notes:
- settings_ui: Added the ability to search for themes and icon themes
in their respective fields.
We've been considering removing workspace-hack for a couple reasons:
- Lukas ran into a situation where its build script seemed to be causing
spurious rebuilds. This seems more likely to be a cargo bug than an
issue with workspace-hack itself (given that it has an empty build
script), but we don't necessarily want to take the time to hunt that
down right now.
- Marshall mentioned hakari interacts poorly with automated crate
updates (in our case provided by Renovate) because you'd need to run
`cargo hakari generate && cargo hakari manage-deps` after their changes,
and we prefer not to have actions that make commits.
Currently removing workspace-hack causes our workspace to grow from
~1700 to ~2000 crates being built (depending on platform), which is
mainly a problem when you're building the whole workspace or running
tests across the the normal and remote binaries (which is where
feature-unification nets us the most sharing). It doesn't impact
incremental times noticeably when you're just iterating on `-p zed`, and
we'll hopefully get these savings back in the future when
rust-lang/cargo#14774 (which re-implements the functionality of hakari)
is finished.
Release Notes:
- N/A
Previously we were guessing the context window size here:
8c3f09e31e/crates/ollama/src/ollama.rs (L22)
This is inaccurate and must be updated manually. This PR ensures that we
extract the context window size from the API response in the same way
that the Ollama CLI does when running `ollama show <model-name>`
(relevant code is [here](3d32249c74/cmd/cmd.go (L860))).
The format looks like this:
```json
{
  "model_info": {
    "general.architecture": "llama",
    "llama.context_length": 132000
  }
}
```
Once this PR is merged, we could technically remove the old code at
8c3f09e31e/crates/ollama/src/ollama.rs (L22)
I decided to keep it for now, as it is unclear whether the necessary
fields are available via the API on older Ollama versions.
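For illustration, the lookup can be sketched like this (field names from
the JSON above; not the actual parsing code):
```rust
use serde_json::Value;

// Read the context window the way `ollama show` does: find
// "<architecture>.context_length" inside `model_info`.
fn context_length(show_response: &Value) -> Option<u64> {
    let info = show_response.get("model_info")?;
    let architecture = info.get("general.architecture")?.as_str()?;
    info.get(format!("{architecture}.context_length"))?.as_u64()
}
```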
Release Notes:
- Fixed an issue where Ollama models would use the wrong context window
size
Release Notes:
- Added Codestral edit predictions provider which can be enabled by adding an API key in the Mistral section of agent settings.

## Config
Get an API key from https://console.mistral.ai/codestral and add it in
the Mistral section of the agent settings.
```json
"features": {
  "edit_prediction_provider": "codestral"
},
"edit_predictions": {
  "codestral": {
    "model": "codestral-latest",
    "max_tokens": 150
  }
},
```
---------
Co-authored-by: Michael Sloan <michael@zed.dev>
Before this change, the active theme and icon theme were retrofitted
onto `ThemeSettings`.
Now they live in their own new global (`GlobalTheme::theme(cx)` and
`GlobalTheme::icon_theme(cx)`).
This lets us remove `cx` from the settings traits, and tidy up a few
other things along the way.
Release Notes:
- N/A
After the change, we can add "supports_images", "supports_tools" and
"parallel_tool_calls" properties to set up new models. Our
`settings.json` will be as follows:
```json
"language_models": {
  "x_ai": {
    "api_url": "https://api.x.ai/v1",
    "available_models": [
      {
        "name": "grok-4-fast-reasoning",
        "display_name": "Grok 4 Fast Reasoning",
        "max_tokens": 2000000,
        "max_output_tokens": 64000,
        "supports_tools": true,
        "parallel_tool_calls": true
      },
      {
        "name": "grok-4-fast-non-reasoning",
        "display_name": "Grok 4 Fast Non-Reasoning",
        "max_tokens": 2000000,
        "max_output_tokens": 64000,
        "supports_images": true
      }
    ]
  }
}
```
Closes https://github.com/zed-industries/zed/issues/38752
Release Notes:
- xAI: Added support for configuring tool and image support for custom
model configurations
This PR adds an `x-zed-client-supports-x-ai` header to the `GET /models`
request sent to Cloud to indicate that the client supports xAI models.
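Roughly what that looks like with the `http` crate's builder (the URI
and header value here are assumptions for illustration):
```rust
use http::Request;

// Sketch: the capability header rides along on the models request.
fn models_request() -> http::Result<Request<()>> {
    Request::builder()
        .method("GET")
        .uri("https://cloud.example.com/models") // stand-in URL
        .header("x-zed-client-supports-x-ai", "true")
        .body(())
}
```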
Release Notes:
- N/A
The `Duration` argument in `get_models` has been unused for over a
year. The `complete` function is also unused, and it has fallen behind
on new feature additions such as Authorization support. It existed
because Ollama didn't support tools in streaming mode; `with_tools`
also existed because of that. Now, however, there is no reason to keep
either around.
`ChatResponseDelta` had unnecessary `#[allow(unused)]` attributes,
since the fields are marked `pub`. Using `#[expect(unused)]` would've
caught this.
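The difference in miniature (field names illustrative): with `pub`
fields the `unused` lints never fire, so `allow` sits there stale while
`expect` warns that the expectation is unfulfilled.
```rust
pub struct ChatResponseDelta {
    #[allow(unused)] // silently does nothing for a `pub` field
    pub done: bool,
    #[expect(unused)] // rustc warns: this lint expectation is unfulfilled
    pub total_duration: u64,
}
```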
Release Notes:
- N/A