Commit Graph

57 Commits

Author SHA1 Message Date
Xiaobo Liu
225a2a8a20 google_ai: Refactor token count methods in Google AI (#45184)
The change simplifies the `max_token_count` and `max_output_tokens`
methods by grouping Gemini models with identical token limits.
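
The grouping described above can be sketched with or-patterns in a `match`. This is only an illustration: the `Model` variants and the numeric limits are placeholders, not the values used in the actual `google_ai` crate.

```rust
/// Hypothetical model list; the variant names and limits below are
/// placeholders, not the values in the real `google_ai` crate.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Model {
    FlashLite,
    Flash,
    Pro,
    Custom { max_tokens: u64 },
}

impl Model {
    /// Models that share a context-window limit are listed in one arm.
    pub fn max_token_count(self) -> u64 {
        match self {
            Model::FlashLite | Model::Flash | Model::Pro => 1_000_000,
            Model::Custom { max_tokens } => max_tokens,
        }
    }

    /// Same grouping for the output limit; custom models report none here.
    pub fn max_output_tokens(self) -> Option<u64> {
        match self {
            Model::FlashLite | Model::Flash | Model::Pro => Some(64_000),
            Model::Custom { .. } => None,
        }
    }
}
```

Adding a model with the same limits then means extending one pattern instead of adding a new arm.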

Release Notes:

- N/A
2025-12-17 20:12:40 -06:00
Richard Feldman
27c5d39d28 Add Gemini 3 Flash (#45139)
Add support for the new Gemini 3 Flash model

Release Notes:

- Added support for Gemini 3 Flash model
2025-12-17 18:56:15 +00:00
Junseong Park
6404939427 google_ai: Update Gemini models (#43117)
Closes #43040

Release Notes:

- Remove the end-of-support Gemini 1.5 model from the options.
- Remove the older Gemini 2.0 model from the options.
- Please let me know if you think it's better to keep it, as it is still
a usable model.
- Update the incorrect amounts for some input/output tokens.
- Update the default model to Gemini 2.5 Flash-Lite.
- Rename variant `Gemini3ProPreview` to `Gemini3Pro`

When this PR is merged, users will be able to select the following
Gemini models.

- 2.5 Flash
- 2.5 Flash-Lite
- 2.5 Pro
- 3 Pro
2025-11-28 15:48:33 -05:00
Mikayla Maki
19d2532cf8 Update google_ai.rs (#43034)
Release Notes:

- N/A
2025-11-19 05:41:24 +00:00
Martin Bergo
7c0663b825 google_ai: Add gemini-3-pro-preview model (#43015)
Release Notes:

- Added the newly released Gemini 3 Pro Preview Model


https://docs.cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/3-pro
2025-11-18 23:51:32 +00:00
Richard Feldman
c0fadae881 Thought signatures (#42915)
Implement Gemini API's [thought
signatures](https://ai.google.dev/gemini-api/docs/thinking#signatures)

Release Notes:

- Added thought signatures for Gemini tool calls
2025-11-18 10:41:19 -05:00
Julia Ryan
ef5b8c6fed Remove workspace-hack (#40216)
We've been considering removing workspace-hack for a couple reasons:
- Lukas ran into a situation where its build script seemed to be causing
spurious rebuilds. This seems more likely to be a cargo bug than an
issue with workspace-hack itself (given that it has an empty build
script), but we don't necessarily want to take the time to hunt that
down right now.
- Marshall mentioned hakari interacts poorly with automated crate
updates (in our case provided by Renovate) because you'd need to run
`cargo hakari generate && cargo hakari manage-deps` after their changes
and we prefer to not have actions that make commits.

Currently removing workspace-hack causes our workspace to grow from
~1700 to ~2000 crates being built (depending on platform), which is
mainly a problem when you're building the whole workspace or running
tests across the normal and remote binaries (which is where
feature-unification nets us the most sharing). It doesn't impact
incremental times noticeably when you're just iterating on `-p zed`, and
we'll hopefully get these savings back in the future when
rust-lang/cargo#14774 (which re-implements the functionality of hakari)
is finished.

Release Notes:

- N/A
2025-10-17 18:58:14 +00:00
Conrad Irwin
fcdab160f9 Settings refactor (#38367)
Co-Authored-By: Ben K <ben@zed.dev>
Co-Authored-By: Anthony <anthony@zed.dev>
Co-Authored-By: Mikayla <mikayla@zed.dev>

Release Notes:

- settings: Major internal changes to settings. The primary user-facing
effect is that some settings which did not make sense in project
settings files are no longer read from there. (For example the inline
blame settings)

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
Co-authored-by: Anthony <anthony@zed.dev>
2025-09-18 16:47:23 +00:00
Antonio Scandurra
39d86eeb7f Trim API key when submitting requests to LLM providers (#37082)
This prevents the common footgun of copy/pasting an API key
starting/ending with extra newlines, which would lead to a "bad request"
error.
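
A minimal sketch of the idea, assuming the key is passed as a query parameter. `build_request_uri` and the path layout are hypothetical and only illustrate where the trimming happens; they are not the crate's actual request builder.

```rust
/// Hypothetical helper: builds a request URI with the key as a query
/// parameter. The path layout here is an assumption for illustration.
pub fn build_request_uri(api_url: &str, model_id: &str, api_key: &str) -> String {
    // Trim before use so a pasted key with stray newlines or spaces
    // does not end up inside the URI.
    let api_key = api_key.trim();
    format!("{api_url}/models/{model_id}:generateContent?key={api_key}")
}
```

Trimming at the point of use, rather than only when the key is saved, also covers keys that were stored before the fix.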

Closes #37038 

Release Notes:

- agent: Support pasting language model API keys that contain newlines.
2025-08-28 12:00:44 +00:00
Piotr Osiewicz
05fc0c432c Fix a bunch of other low-hanging style lints (#36498)
- **Fix a bunch of low hanging style lints like unnecessary-return**
- **Fix single worktree violation**
- **And the rest**

Release Notes:

- N/A
2025-08-19 21:26:17 +02:00
Bennet Bo Fenner
6b6eb11643 agent2: Fix tool schemas for Gemini (#36507)
Release Notes:

- N/A

---------

Co-authored-by: Agus Zubiaga <agus@zed.dev>
2025-08-19 18:06:09 +00:00
Piotr Osiewicz
8f567383e4 Auto-fix clippy::collapsible_if violations (#36428)
Release Notes:

- N/A
2025-08-19 13:27:24 +00:00
Ben Brandt
0191f16ebc Update Gemini Models (#32902)
Updates google_ai to use latest model information from the respective
model cards: https://ai.google.dev/gemini-api/docs/models

Release Notes:

- google: Update to latest Gemini 2.5 models
2025-06-17 20:26:27 +00:00
Richard Feldman
5405c2c2d3 Standardize on u64 for token counts (#32869)
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.

Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.

I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.
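
As a rough illustration of the reasoning above, the sketch below uses a hypothetical `ModelLimits` struct with `u64` fields and two helper functions that are assumptions, not code from the PR.

```rust
/// Hypothetical limits struct: both fields use `u64`, so the values are
/// identical on every target, including 32-bit ones such as wasm32.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ModelLimits {
    pub max_tokens: u64,
    pub max_output_tokens: Option<u64>,
}

/// Converts a count reported as `usize` (for example by a tokenizer)
/// into the fixed-width representation. Saturates if `usize` were ever
/// wider than 64 bits.
pub fn token_count_from_usize(count: usize) -> u64 {
    u64::try_from(count).unwrap_or(u64::MAX)
}

/// A fixed-width integer always encodes to the same number of bytes,
/// which is what a binary protocol needs.
pub fn encode_token_count(count: u64) -> [u8; 8] {
    count.to_le_bytes()
}
```

With `usize`, the encoded width would depend on the compilation target, so two builds could disagree about the wire format.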

Release Notes:

- N/A
2025-06-17 10:43:07 -04:00
Oleksiy Syvokon
04cd3fcd23 google: Add latest versions of Gemini 2.5 Pro and Flash Preview (#32183)
Release Notes:

- Added the latest versions of Gemini 2.5 Pro and Flash Preview
2025-06-05 19:30:34 +00:00
90aca
cf931247d0 Add thinking budget for Gemini custom models (#31251)
Closes #31243

As described in my issue, the [thinking
budget](https://ai.google.dev/gemini-api/docs/thinking) gets
automatically chosen by Gemini unless it is specifically set to
something. In order to have fast responses (inline assistant) I prefer
to set it to 0.

Release Notes:

- ai: Added `thinking` mode for custom Google models with configurable
token budget

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2025-06-03 13:40:20 +02:00
Fernando Freire
3077abf9cf google_ai: Parse thought parts in Gemini responses (#31925)
Fixes thinking Gemini models.

Closes #31902

Release Notes:

- Updated Google Gemini client to match the latest API
2025-06-03 10:37:06 +00:00
Ben Brandt
119beb210a Update default models to newer versions (#31415)
Follow up to: https://github.com/zed-industries/zed/pull/31209
Changes default models across multiple providers:
- Zed.dev Default Models in settings: claude-3-7-sonnet-latest →
claude-4-sonnet-latest
- Bedrock Default Model: Claude 3.5 Sonnet v2 → Claude Sonnet 4
- Google AI Default Fast Model: Gemini 1.5 Flash → Gemini 2.0 Flash

Release Notes:

- N/A
2025-05-27 10:54:42 +02:00
Kirill Bulatov
16366cf9f2 Use anyhow more idiomatically (#31052)
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without inner error.

The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD
SHA"))?; ` on the top level.

The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.

* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` messages are
replaced with `.context("Doing ...")` messages to remove the
parasitic `failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`

Release Notes:

- N/A
2025-05-20 23:06:07 +00:00
Michael Sloan
76ad1a29a5 Add support for getting the token count for all parts of Gemini generation requests (#29630)
* `CountTokensRequest` now takes a full `GenerateContentRequest` instead
of just content.

* Fixes use of `models/` prefix in `model` field of
`GenerateContentRequest`, since that's required for use in
`CountTokensRequest`. This didn't cause issues before because it was
always cleared and used in the path.
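
The prefix handling in the second point could look roughly like this. Both function names are hypothetical; the sketch only shows normalizing a model name to the `models/` form and recovering the bare id for use in a URL path.

```rust
const MODELS_PREFIX: &str = "models/";

/// Hypothetical helper: returns the name in `models/<id>` form, adding
/// the prefix only when it is missing.
pub fn qualified_model_name(model_id: &str) -> String {
    if model_id.starts_with(MODELS_PREFIX) {
        model_id.to_string()
    } else {
        format!("{MODELS_PREFIX}{model_id}")
    }
}

/// Hypothetical helper: returns the bare id, for callers that place the
/// model in the request path instead of the body.
pub fn bare_model_id(model: &str) -> &str {
    model.strip_prefix(MODELS_PREFIX).unwrap_or(model)
}
```

Making the prefix step idempotent avoids producing `models/models/<id>` when a name is normalized twice.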

Release Notes:

- N/A
2025-05-04 21:32:45 +00:00
Michael Sloan
edf78e770d Fix token counting requests in Gemini (#29643)
Release Notes:

- N/A
2025-04-30 04:55:07 +00:00
Michael Sloan
b4732235e3 Skip serializing None fields in Gemini API (#29632)
Release Notes:

- N/A
2025-04-29 19:03:01 -06:00
Michael Sloan
2beefc8158 Fix gemini model token limits (#29584)
Release Notes:

- N/A
2025-04-29 03:12:59 +00:00
Antonio Scandurra
3fdbc3090d Fix error when deserializing Gemini streams (#29470)
Sometimes Gemini would report `Content` without a `parts` field.

Release Notes:

- Fixed a bug that would sometimes cause Gemini models to fail streaming
their response.
2025-04-26 11:51:04 +00:00
Bennet Bo Fenner
cd365b0cf5 gemini: Fix issue when deserializing tool call (#29363)
Fixes a regression introduced in #29322

Release Notes:

- N/A

Co-authored-by: Agus Zubiaga <hi@aguz.me>
2025-04-24 18:19:05 +00:00
Marshall Bowers
f527df6fa1 google_ai: Remove list of supported countries (#29348)
This PR removes the list of supported countries from the `google_ai`
crate, as it is no longer referenced in this repo.

Release Notes:

- N/A
2025-04-24 15:04:45 +00:00
Nathan Sobo
8836c6fb42 Introduce LanguageModelToolUse::raw_input (#29322)
This is to enable alternative streaming solutions at the application
layer. I'm not sure we really should have performed parsing of the input
at this layer. Either way I want to experiment with streaming approaches
in a separate crate on a branch, and this will help.

/cc @maxdeviant @bennetbo @rtfeldman

Closes #ISSUE

Release Notes:

- N/A
2025-04-24 02:30:48 +00:00
Stephan Seidt
10ded0ab75 agent: Add support for google gemini 2.5 flash preview (#29205)
Adds support for the new gemini-2.5-flash-preview-04-17

Release Notes:

- agent: Added support for gemini-2.5-flash-preview
2025-04-22 09:37:12 +00:00
Michael Sloan
fbf7caf93e Default to fast model for thread summaries and titles + don't include system prompt / context / thinking segments (#29102)
* Adds a fast / cheaper model to providers and defaults thread
summarization to this model. Initial motivation for this was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model. It doesn't seem
correct to use a thinking model for summarization.

* Skips system prompt, context, and thinking segments.

* If tool use is happening, allows 2 tool uses + one more agent response
before summarizing.

Downside of this is that there was potential for some prefix cache reuse
before, especially for title summarization (thread summarization omitted
tool results and so would not share a prefix for those). This seems fine
as these requests should typically be fairly small. Even for full thread
summarization, skipping all tool use / context should greatly reduce the
token use.

Release Notes:

- N/A
2025-04-19 23:26:29 +00:00
Bennet Bo Fenner
ae47829fa8 agent: Fix system instructions typo (#28949)
See #28793, the name of the field is actually `systemInstruction` not
`systemInstructions`.

Release Notes:

- Fixed an issue where Gemini requests would fail
2025-04-17 08:51:05 +00:00
Bennet Bo Fenner
c7e80c80c6 gemini: Pass system prompt as system instructions (#28793)
https://ai.google.dev/gemini-api/docs/text-generation#system-instructions

Release Notes:

- agent: Improve performance of Gemini models
2025-04-15 19:45:47 +02:00
Marshall Bowers
a8b1ef3531 google_ai: Remove unused extract_text_from_events function (#28723)
This PR removes the `extract_text_from_events` function from
`google_ai`, as it was not used anywhere.

Release Notes:

- N/A
2025-04-14 22:01:21 +00:00
Bennet Bo Fenner
97abf21a28 agent: Add support for Google Gemini 2.5 preview (#28326)
Adds support for the new `gemini-2.5-pro-preview-03-25`

Release Notes:

- Added support for `gemini-2.5-pro-preview-03-25` in the assistant
2025-04-08 15:00:23 +00:00
Julia Ryan
01ec6e0f77 Add workspace-hack (#27277)
This adds a "workspace-hack" crate, see
[mozilla's](https://hg.mozilla.org/mozilla-central/file/3a265fdc9f33e5946f0ca0a04af73acd7e6d1a39/build/workspace-hack/Cargo.toml#l7)
for a concise explanation of why this is useful. For us in practice this
means that if I were to run all the tests (`cargo nextest r
--workspace`) and then `cargo r`, all the deps from the previous cargo
command will be reused. Before this PR it would rebuild many deps due to
resolving different sets of features for them. For me this frequently
caused long rebuilds when things "should" already be cached.

To avoid manually maintaining our workspace-hack crate, we will use
[cargo hakari](https://docs.rs/cargo-hakari) to update the build files
when there's a necessary change. I've added a step to CI that checks
whether the workspace-hack crate is up to date, and instructs you to
re-run `script/update-workspace-hack` when it fails.

Finally, to make sure that people can still depend on crates in our
workspace without pulling in all the workspace deps, we use a `[patch]`
section following [hakari's
instructions](https://docs.rs/cargo-hakari/0.9.36/cargo_hakari/patch_directive/index.html)

One possible followup task would be making guppy use our
`rust-toolchain.toml` instead of having to duplicate that list in its
config, I opened an issue for that upstream: guppy-rs/guppy#481.

TODO:
- [x] Fix the extension test failure
- [x] Ensure the dev dependencies aren't being unified by Hakari into
the main dependencies
- [x] Ensure that the remote-server binary continues to not depend on
LibSSL

Release Notes:

- N/A

---------

Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
2025-04-02 13:26:34 -07:00
Piotr Osiewicz
dc64ec9cc8 chore: Bump Rust edition to 2024 (#27800)
Follow-up to https://github.com/zed-industries/zed/pull/27791

Release Notes:

- N/A
2025-03-31 20:55:27 +02:00
Bennet Bo Fenner
c8a9a74e6a Add tool calling support for Gemini models (#27772)
Release Notes:

- N/A
2025-03-31 17:46:42 +02:00
Michael Sloan
7376c6f377 Add support for Gemini 2.5 Pro Experimental model (#27468)
Release Notes:

- Added support for Gemini 2.5 Pro Experimental model to Zed AI.

Co-authored-by: Wilhelm Klopp <wil.klopp@gmail.com>
2025-03-26 00:12:10 +00:00
Antonio Scandurra
f517050548 Partially fix assistant onboarding (#25313)
While investigating #24896, I noticed two issues:

1. The default configuration for the `zed.dev` provider was using the
wrong string for Claude 3.5 Sonnet. This meant the provider would always
appear as not configured until the user selected it from the model
picker, because we couldn't deserialize that string to a valid
`anthropic::Model` enum variant.
2. When clicking on `Open New Chat`/`Start New Thread` in the provider
configuration, we would select `Claude 3.5 Haiku` by default instead of
Claude 3.5 Sonnet.

Release Notes:

- Fixed some issues that caused AI providers to sometimes be
misconfigured.
2025-02-24 07:29:55 +00:00
IaVashik
8114d17cba google_ai: Add support for Gemini 2.0 models (#24448)
Add support for the newly released Gemini 2.0 models. Google announced this new family of models earlier this week (2025-02-05).

Release Notes:

- Added support for Google's new Gemini 2.0 models.
2025-02-07 11:18:18 -05:00
João Marcos
5bd7eaa173 Solve 50+ cargo doc warnings (#24071)
Release Notes:

- N/A
2025-02-01 06:19:29 +00:00
Piotr Osiewicz
c9534e8025 chore: Use workspace fields for edition and publish (#23291)
This prepares us for an upcoming bump to Rust 2024 edition.

Release Notes:

- N/A
2025-01-17 17:39:22 +01:00
volt
799e81ffe5 google_ai: Add Gemini 2.0 Flash support (#22665)
Release Notes:

- Added support for Google's Gemini 2.0 Flash experimental model.

Note:

Weirdly enough the model is slow on small talk responses like 'hi' (in
my tests) but very fast on things that need more tokens like 'write me a
snake game in python'. Likely an API problem.

TESTED ONLY ON WINDOWS! I would test further, but I don't have Linux
installed and don't have a Mac. It will likely work everywhere.

Why?:

I think Gemini 2.0 Flash is an incredibly good model at coding and
following instructions, and it would be nice to have it in the
editor. I made as few changes as possible while adding the model and
streaming validation. I think the commits are worth merging, as they
bring good improvements.

---------

Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
2025-01-06 21:28:31 +00:00
Thorsten Ball
aee01f2c50 assistant: Remove low_speed_timeout (#20681)
This removes the `low_speed_timeout` setting from all providers as a
response to issue #19509.

The reason is that the original `low_speed_timeout` was only added as part of
#9913 because users wanted to _get rid of timeouts_. They wanted to bump
the default timeout from 5sec to a lot more.

Then, in the meantime, the meaning of `low_speed_timeout` changed in
#19055 and was changed to a normal `timeout`, which is a different thing
and breaks slower LLMs that don't reply with a complete response in the
configured timeout.

So we figured: let's remove the whole thing and replace it with a
default _connect_ timeout to make sure that we can connect to a server
in 10s, but then give the server as long as it wants to complete its
response.
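
The difference between a connect timeout and a whole-request timeout can be sketched with the standard library's TCP API. This is not the HTTP client code the PR touches; `connect_without_response_deadline` is a hypothetical name, and plain TCP is used here only because it makes the two kinds of deadline easy to see.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Only establishing the connection is bounded (10 seconds, as in the
/// description above).
pub const CONNECT_TIMEOUT: Duration = Duration::from_secs(10);

/// Hypothetical helper: fails if the server cannot be reached within
/// `CONNECT_TIMEOUT`, but places no deadline on reading the response.
pub fn connect_without_response_deadline(addr: &SocketAddr) -> io::Result<TcpStream> {
    let stream = TcpStream::connect_timeout(addr, CONNECT_TIMEOUT)?;
    // `None` means reads block for as long as the server needs.
    stream.set_read_timeout(None)?;
    Ok(stream)
}
```

An unreachable server still fails fast, while a slow model that streams its answer over several minutes is not cut off.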

Closes #19509

Release Notes:

- Removed the `low_speed_timeout` setting from LLM provider settings.
It was only used to _increase_ the timeout to give LLMs more time, and
we have no other use for it, so we simply remove the setting and give
LLMs as long as they need.

---------

Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Peter Tripp <peter@zed.dev>
2024-11-15 07:37:31 +01:00
Patrick Sy
966b18e142 assistant: Fix Gemini 1.5 Pro throwing "missing field 'index' at line N column M" (#20200)
Closes https://github.com/zed-industries/zed/issues/20033

- Fixed deserialization error of `GenerateContentCandidate` where `index` is unexpectedly nil
2024-11-04 17:01:08 -05:00
Conrad Irwin
e28496d4e2 Stop leaking isahc assumption (#18408)
Users of our http_client crate knew they were interacting with isahc as
they set its extensions on the request. This change adds our own
equivalents for their APIs in preparation for changing the default http
client.

Release Notes:

- N/A
2024-09-26 14:01:05 -06:00
Peter Tripp
fb9d01b0d5 assistant: Add display_name for OpenAI and Gemini (#17508)
2024-09-10 13:41:06 -04:00
Bennet Bo Fenner
f413ea90bf assistant: Fix Google AI provider not respecting low_speed_timeout_in_seconds (#17423)
Release Notes:

- Fixed an issue when using Google Gemini models, where the setting
`low_speed_timeout_in_seconds` was not respected
2024-09-05 18:16:30 +02:00
Marshall Bowers
cf5f4dddf5 Authorize access to language model providers based on country (#15859)
This PR updates the LLM service to authorize access to language model
providers based on the requester's country.

We detect the country using Cloudflare's
[`CF-IPCountry`](https://developers.cloudflare.com/fundamentals/reference/http-request-headers/#cf-ipcountry)
header.

The country code is then checked against the list of supported countries
for the given LLM provider. Countries that are not supported will
receive an `HTTP 451: Unavailable For Legal Reasons` response.
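
A minimal sketch of such a check, assuming the country code has already been read from the header. `authorize_country` is a hypothetical name, and rejecting requests with a missing header is an assumption made for this example, not something the PR description states.

```rust
/// Status code for "Unavailable For Legal Reasons".
pub const HTTP_UNAVAILABLE_FOR_LEGAL_REASONS: u16 = 451;

/// Hypothetical check: `country_code` is the value of the country header
/// (if present) and `supported` is the provider's list of country codes.
/// Returns the HTTP status to respond with when access is denied.
pub fn authorize_country(country_code: Option<&str>, supported: &[&str]) -> Result<(), u16> {
    match country_code {
        // Compare case-insensitively so "us" and "US" are treated alike.
        Some(code) if supported.iter().any(|s| s.eq_ignore_ascii_case(code)) => Ok(()),
        // Assumption: a missing or unsupported country is rejected.
        _ => Err(HTTP_UNAVAILABLE_FOR_LEGAL_REASONS),
    }
}
```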

Release Notes:

- N/A
2024-08-06 11:49:04 -04:00
Antonio Scandurra
d6bdaa8a91 Simplify LLM protocol (#15366)
In this pull request, we change the zed.dev protocol so that we pass the
raw JSON for the specified provider directly to our server. This avoids
the need to define a protobuf message that's a superset of all these
formats.

@bennetbo: We also changed the settings for available_models under
zed.dev to be a flat format, because the nesting seemed too confusing.
Can you help us upgrade the local provider configuration to be
consistent with this? We do whatever we need to do when parsing the
settings to make this simple for users, even if it's a bit more complex
on our end. We want to use versioning to avoid breaking existing users,
but need to keep making progress.

```json
"zed.dev": {
  "available_models": [
    {
      "provider": "anthropic",
      "name": "some-newly-released-model-we-havent-added",
      "max_tokens": 200000
    }
  ]
}
```

Release Notes:

- N/A

---------

Co-authored-by: Nathan <nathan@zed.dev>
2024-07-28 11:07:10 +02:00
Marshall Bowers
02c43a5bf2 Add missing workspace lints (#15237)
This PR adds the missing workspace lint configuration for the following
crates that were missing it:

- `google_ai`
- `open_ai`
- `tab_switcher`

Release Notes:

- N/A
2024-07-25 19:52:24 -04:00