We've realized that a lot of the logic within an
`EditPredictionProvider` is not specific to a particular edit prediction
model / service. Rather, it is just the generic state management
required to perform edit predictions at all in Zed. We want to move to a
setup where there's one "built-in" edit prediction provider in Zed,
which can be pointed at different edit prediction models. The only logic
that is different for different models is how we construct the prompt,
send the request, and parse the output.
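As a rough sketch of that split (not the actual types in this PR), the model-specific surface could be a small trait with the shared driver calling into it. `PredictionInput`, `ModelAdapter`, `predict`, and `CannedAdapter` are all hypothetical names invented for illustration, and the real request is asynchronous rather than a plain function call.

```rust
/// Hypothetical input for one prediction request; not Zed's actual type.
pub struct PredictionInput {
    pub path: String,
    pub excerpt: String,
}

/// The model-specific surface: build the prompt, send it, parse the reply.
pub trait ModelAdapter {
    fn build_prompt(&self, input: &PredictionInput) -> String;
    fn send(&self, prompt: &str) -> String;
    fn parse_output(&self, raw: &str) -> Option<String>;
}

/// Model-independent driver shared by every adapter.
pub fn predict(adapter: &dyn ModelAdapter, input: &PredictionInput) -> Option<String> {
    let prompt = adapter.build_prompt(input);
    let raw = adapter.send(&prompt);
    adapter.parse_output(&raw)
}

/// Toy adapter that returns a canned response instead of calling a service.
pub struct CannedAdapter {
    pub response: String,
}

impl ModelAdapter for CannedAdapter {
    fn build_prompt(&self, input: &PredictionInput) -> String {
        format!("File: {}\n{}", input.path, input.excerpt)
    }

    fn send(&self, _prompt: &str) -> String {
        self.response.clone()
    }

    fn parse_output(&self, raw: &str) -> Option<String> {
        // Treat a whitespace-only reply as "no prediction".
        let trimmed = raw.trim();
        if trimmed.is_empty() {
            None
        } else {
            Some(trimmed.to_string())
        }
    }
}
```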
This PR also changes the behavior of the staff-only `zeta2` feature flag
so that it only gates your *ability* to use Zeta2, but you can still use
your local settings file to choose between different edit prediction
models/services: zeta1, zeta2, and sweep.
This PR also makes zeta1's outcome reporting and prediction-rating
features work with all prediction models, not just zeta1.
To do:
* [x] remove duplicated logic around sending cloud requests between
zeta1 and zeta2
* [x] port the outcome reporting logic from zeta to zeta2.
* [x] get the "rate completions" modal working with all EP models
* [x] display edit prediction diff
* [x] show edit history events
* [x] remove the original `zeta` crate.
Release Notes:
- N/A
---------
Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Ben Kunkle <ben@zed.dev>
1. Introduce a common `PromptFormatter` trait
2. Let models define their generation params.
3. Add support for the experimental 1120-seedcoder prompt format
Release Notes:
- N/A
This PR restructures the subcommands in `zeta-cli`, so that the
prediction engine (currently `zeta1` vs `zeta2`) is no longer the
highest order subcommand. Instead, there is just one layer of
subcommands: `eval`, `predict`, `context`, etc. Within these commands,
there are flags for using `zeta1`, `zeta2`, and now `sweep`.
Release Notes:
- N/A
---------
Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Agus <agus@zed.dev>
Makes it so that a file named `bucketed_analysis.md` is written to the
runs directory after an eval is run with more than one repetition. This
file buckets the model's predictions by comparing the edits they make,
which makes it much easier to see how many times each failure mode was
encountered.
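A minimal sketch of the bucketing idea, assuming each repetition's prediction is available as diff text. The function name and the normalization (trimming trailing whitespace per line) are illustrative assumptions, not the code in this PR.

```rust
use std::collections::BTreeMap;

/// Groups repetition indices by the edit text each repetition produced.
/// Buckets are returned largest first, so the most common outcome leads.
pub fn bucket_predictions(predictions: &[&str]) -> Vec<(String, Vec<usize>)> {
    let mut buckets: BTreeMap<String, Vec<usize>> = BTreeMap::new();
    for (index, prediction) in predictions.iter().enumerate() {
        // Assumed normalization: ignore trailing whitespace on each line.
        let key = prediction
            .lines()
            .map(str::trim_end)
            .collect::<Vec<_>>()
            .join("\n");
        buckets.entry(key).or_default().push(index);
    }
    let mut sorted: Vec<(String, Vec<usize>)> = buckets.into_iter().collect();
    // Stable sort: ties keep the BTreeMap's alphabetical order.
    sorted.sort_by(|a, b| b.1.len().cmp(&a.1.len()));
    sorted
}
```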
Release Notes:
- N/A
This prompt is for a fine-tuned model. It has the following changes,
compared to `minimal`:
- No instructions at all, except for one sentence at the beginning of
the prompt.
- Output is a simplified unified diff -- hunk headers have no line
counts (e.g., `@@ -20 +20 @@`)
- Qwen's FIM tokens are used where possible (`<|file_sep|>`,
`<|fim_prefix|>`, `<|fim_suffix|>`, etc.)
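To illustrate the simplified hunk header, the sketch below converts a standard unified diff header into the count-free form. `simplify_hunk_header` is a hypothetical helper, and dropping the trailing section text after the second `@@` is an assumption based on the example above.

```rust
/// Converts `@@ -20,7 +20,9 @@ fn main() {` into `@@ -20 +20 @@`.
/// Returns `None` if the line is not a unified diff hunk header.
pub fn simplify_hunk_header(header: &str) -> Option<String> {
    let rest = header.strip_prefix("@@ ")?;
    let (ranges, _section) = rest.split_once(" @@")?;
    let mut parts = ranges.split(' ');
    let old = parts.next()?.strip_prefix('-')?;
    let new = parts.next()?.strip_prefix('+')?;
    // Keep only the start line; the count after the comma is dropped.
    let old_start = old.split(',').next()?;
    let new_start = new.split(',').next()?;
    Some(format!("@@ -{old_start} +{new_start} @@"))
}
```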
To evaluate this model:
```
ZED_ZETA2_MODEL=zeta2-exp [usual zeta-cli eval params ...] --prompt-format minimal-qwen
```
This will point to the most recent Baseten deployment of zeta2-exp
(which may change in the future, so the prompt-format may get out of
sync).
Release Notes:
- N/A
This PR improves Zeta2's matching of `old_text`/`new_text` pairs, using
similar code to what we use in the edit agent. For right now, we've
duplicated the code, as opposed to trying to generalize it.
Release Notes:
- N/A
---------
Co-authored-by: Max <max@zed.dev>
Co-authored-by: Michael <michael@zed.dev>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Agus <agus@zed.dev>
1. Add `--prompt-format=minimal` that matches single-sentence
instructions used in fine-tuned models (specifically, in `1028-*` and
`1029-*` models)
2. Use separate configs for the agentic context search model and the edit
prediction model. This is useful when running a fine-tuned EP model while
still running a vanilla model for context retrieval.
3. `zeta2-exp` is a symlink to the same-named Baseten deployment. This
model can be redeployed and updated without having to update the
deployment id.
4. Print scores as a compact table
Release Notes:
- N/A
---------
Co-authored-by: Piotr Osiewicz <piotr@zed.dev>
Adds a `--repeat` flag to the zeta eval that runs each example as many
times as specified. Also makes the output nicer in a few ways.
Release Notes:
- N/A
---------
Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Michael <michael@zed.dev>
- Improves the determinism of the search step for better cache
reusability
- Adds a `--cache force` mode that refuses to make any requests or
searches that aren't cached
- The structure of the `zeta-*` directories under `target` has been
rethought for convenience
Release Notes:
- N/A
---------
Co-authored-by: Agus <agus@zed.dev>
Improves error reporting for various failure modes of zeta2, including
failing to parse the `<old_text>`/`<new_text>` pattern, and the contents
of `<old_text>` failing to match.
Additionally, makes it so that evals are checked out into a worktree
with the _repo_ name instead of the _example_ name, in order to make
sure that the eval name has no influence on the model's prediction. The
repo name worktrees are still namespaced by the example name like
`{example_name}/{repo_name}` to ensure evals pointing to the same repo
do not conflict.
Release Notes:
- N/A
---------
Co-authored-by: Agus <agus@zed.dev>
We'll now cache LLM responses at the request level (by hash of
URL+contents) for both context and prediction. This way we don't need to
worry about mistakenly using the cache when we change the prompt or its
components.
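A sketch of request-level caching under simplifying assumptions: an in-memory map instead of files on disk, and `DefaultHasher` instead of whatever hash the real code uses. `DefaultHasher` output is not guaranteed stable across Rust releases, so a persistent cache would want a fixed hash function. `RequestCache` and `cache_key` are illustrative names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Derives the cache key from the request URL and body together,
/// so any change to the prompt produces a different key.
pub fn cache_key(url: &str, body: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    url.hash(&mut hasher);
    body.hash(&mut hasher);
    hasher.finish()
}

pub struct RequestCache {
    entries: HashMap<u64, String>,
    /// Number of requests that actually had to be sent.
    pub misses: usize,
}

impl RequestCache {
    pub fn new() -> Self {
        Self { entries: HashMap::new(), misses: 0 }
    }

    /// Returns the cached response, or calls `send` and stores its result.
    pub fn get_or_send(
        &mut self,
        url: &str,
        body: &str,
        send: impl FnOnce() -> String,
    ) -> String {
        let key = cache_key(url, body);
        if let Some(hit) = self.entries.get(&key) {
            return hit.clone();
        }
        self.misses += 1;
        let response = send();
        self.entries.insert(key, response.clone());
        response
    }
}
```

Because the key covers the whole request, editing the prompt template invalidates the affected entries without any manual cache management.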
Release Notes:
- N/A
---------
Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
Extracts some of the improvements to the unified diff prompt from
https://github.com/zed-industries/zed/pull/42171 and adds some others
about how context works, to improve the reliability of predictions.
We also now strip the `<|user_cursor|>` marker if it appears in the
output rather than failing.
Release Notes:
- N/A
---------
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Since we removed the filtering step during context gathering, we want
the model to perform more targeted searches. This PR tweaks the search
tool schema, allowing the model to search within syntax nodes such as
`impl` blocks or methods.
This is what the query schema looks like now:
```rust
/// Search for relevant code by path, syntax hierarchy, and content.
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
pub struct SearchToolQuery {
    /// 1. A glob pattern to match file paths in the codebase to search in.
    pub glob: String,
    /// 2. Regular expressions to match syntax nodes **by their first line** and hierarchy.
    ///
    /// Subsequent regexes match nodes within the full content of the nodes matched by the previous regexes.
    ///
    /// Example: Searching for a `User` class
    /// ["class\s+User"]
    ///
    /// Example: Searching for a `get_full_name` method under a `User` class
    /// ["class\s+User", "def\sget_full_name"]
    ///
    /// Skip this field to match on content alone.
    #[schemars(length(max = 3))]
    #[serde(default)]
    pub syntax_node: Vec<String>,
    /// 3. An optional regular expression to match the final content that should appear in the results.
    ///
    /// - Content will be matched within all lines of the matched syntax nodes.
    /// - If syntax node regexes are provided, this field can be skipped to include as much of the node itself as possible.
    /// - If no syntax node regexes are provided, the content will be matched within the entire file.
    pub content: Option<String>,
}
```
We'll need to keep refining this, but the core implementation is ready.
Release Notes:
- N/A
---------
Co-authored-by: Ben <ben@zed.dev>
Co-authored-by: Max <max@zed.dev>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
* Fix a panic that happened because we lost the
`ContextRetrievalStarted` debug message, so we didn't assign `t0`.
* Write the edit prediction response log file as a markdown file
containing the text, not a JSON file. We almost always want the text
content.
Release Notes:
- N/A
* Allow expressing alternative possible context fetches in `Expected
Context` section
* Allow marking a subset of lines as "required" in `Expected Context`.
We still need to improve how we display the results. I've removed the
context pass/fail pretty printing because it would need to be rethought
to work with the new structure, and for now I think we should focus on
getting basic predictions to run. Still, this is progress toward a
better structure for eval examples.
Release Notes:
- N/A
---------
Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
This PR implements the `zeta-cli eval` command. It will:
- Run the edit prediction model if there are no cached results
- Compute precision/recall/F1 for context retrieval at the line level:
every retrieved line of context is counted as a true positive (correct
retrieval), false positive (retrieved something that was not expected),
or false negative (didn't retrieve an expected line)
- Compute similar metrics for edit predictions
- Pretty-print results, highlighting the difference between actual and
expected when printing to tty
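The line-level metrics described above can be sketched as follows, assuming retrieved and expected context are each reduced to a set of line numbers. The `Scores` type here is illustrative and not necessarily the one in `zeta-cli`.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
pub struct Scores {
    pub true_positives: usize,
    pub false_positives: usize,
    pub false_negatives: usize,
}

/// Avoids dividing by zero when nothing was retrieved or expected.
fn ratio(numerator: usize, denominator: usize) -> f64 {
    if denominator == 0 {
        0.0
    } else {
        numerator as f64 / denominator as f64
    }
}

impl Scores {
    pub fn from_lines(expected: &HashSet<usize>, actual: &HashSet<usize>) -> Self {
        Scores {
            // Retrieved and expected.
            true_positives: actual.intersection(expected).count(),
            // Retrieved but not expected.
            false_positives: actual.difference(expected).count(),
            // Expected but not retrieved.
            false_negatives: expected.difference(actual).count(),
        }
    }

    pub fn precision(&self) -> f64 {
        ratio(self.true_positives, self.true_positives + self.false_positives)
    }

    pub fn recall(&self) -> f64 {
        ratio(self.true_positives, self.true_positives + self.false_negatives)
    }

    /// Harmonic mean of precision and recall.
    pub fn f1(&self) -> f64 {
        let (p, r) = (self.precision(), self.recall());
        if p + r == 0.0 {
            0.0
        } else {
            2.0 * p * r / (p + r)
        }
    }
}
```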
Other changes:
- `zeta-cli predict` accepts a `--format` argument with options `md`,
`json`, `diff`
- Code restructure
Release Notes:
- N/A
---------
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
This PR adds a `zeta zeta2 predict` subcommand that takes an edit
prediction example markdown file as an argument, and performs zeta2's
prediction, showing the retrieved context and the predicted edit.
* [x] Apply uncommitted diff to get repo into the right state.
* [x] Apply edits in edit history
* [x] Display predicted edits as unified diff, regardless of model
output format
Release Notes:
- N/A
---------
Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Ben Kunkle <ben.kunkle@gmail.com>
Adds a `convert-example` subcommand to the zeta cli that converts eval
examples from/to `json`, `toml`, and `md` formats.
Release Notes:
- N/A
---------
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
We've been considering removing workspace-hack for a couple reasons:
- Lukas ran into a situation where its build script seemed to be causing
spurious rebuilds. This seems more likely to be a cargo bug than an
issue with workspace-hack itself (given that it has an empty build
script), but we don't necessarily want to take the time to hunt that
down right now.
- Marshall mentioned hakari interacts poorly with automated crate
updates (in our case provided by Renovate) because you'd need to run
`cargo hakari generate && cargo hakari manage-deps` after their changes
and we prefer to not have actions that make commits.
Currently removing workspace-hack causes our workspace to grow from
~1700 to ~2000 crates being built (depending on platform), which is
mainly a problem when you're building the whole workspace or running
tests across the normal and remote binaries (which is where
feature-unification nets us the most sharing). It doesn't impact
incremental times noticeably when you're just iterating on `-p zed`, and
we'll hopefully get these savings back in the future when
rust-lang/cargo#14774 (which re-implements the functionality of hakari)
is finished.
Release Notes:
- N/A
Adds a new `NumberedLines` format which is similar to `MarkedExcerpt`
but each line is prefixed with its line number.
Also fixes a bug where contiguous snippets wouldn't get merged.
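Both ideas can be sketched in a few lines. The `|` separator after the line number and the two function names are assumptions made for illustration; the actual `NumberedLines` output format may differ.

```rust
use std::ops::Range;

/// Merges line ranges that overlap or touch, e.g. `0..3` and `3..5`.
pub fn merge_contiguous(mut ranges: Vec<Range<u32>>) -> Vec<Range<u32>> {
    ranges.sort_by_key(|range| range.start);
    let mut merged: Vec<Range<u32>> = Vec::new();
    for range in ranges {
        if let Some(last) = merged.last_mut() {
            if range.start <= last.end {
                // Extend the previous range instead of starting a new one.
                last.end = last.end.max(range.end);
                continue;
            }
        }
        merged.push(range);
    }
    merged
}

/// Prefixes each line with its line number, starting at `first_line`.
pub fn numbered_lines(text: &str, first_line: u32) -> String {
    text.lines()
        .enumerate()
        .map(|(offset, line)| format!("{}|{}", first_line + offset as u32, line))
        .collect::<Vec<_>>()
        .join("\n")
}
```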
Release Notes:
- N/A
---------
Co-authored-by: Michael Sloan <mgsloan@gmail.com>
Co-authored-by: Michael <michael@zed.dev>
Retrieval stats now use polars to build a large data frame of
references, containing the cartesian product of LSP declarations and
retrieved declaration candidates (with all their score components), and
rebuild the stats summary on top of it.
This data frame is written to a `.parquet` file, which we can load into
advanced analytics tools (such as Metabase), so we can explore our
scoring distributions and find ways to improve retrieval, and then train
the decision tree.
Release Notes:
- N/A
Gathering LSP declarations in zeta_cli can take a really long time for
big repos and has to be started from scratch if interrupted.
Instead of writing the cache file once we have walked the whole
worktree, we'll now do so incrementally as we complete each file. On
subsequent runs, we'll load as many valid declarations as have been
previously written to the cache, and then continue to request the rest
from the LSP which will append to the existing file as it makes
progress. If the last cache entry is incomplete, we'll truncate the
cache file to the end of the last valid line and continue from there, so
we can just `ctrl-c` without breaking resumability.
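A simplified sketch of the resume logic, assuming a line-oriented cache file where a line counts as valid once it is newline-terminated (the real check presumably also requires the entry to parse). `load_and_repair` and `append_entry` are hypothetical names.

```rust
use std::fs::OpenOptions;
use std::io::{self, Read, Write};
use std::path::Path;

/// Loads all complete entries and truncates the file to the end of the
/// last complete line, discarding a partially written final entry.
pub fn load_and_repair(path: &Path) -> io::Result<Vec<String>> {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(false)
        .open(path)?;
    let mut bytes = Vec::new();
    file.read_to_end(&mut bytes)?;
    // Everything after the last newline is an interrupted write.
    let valid_len = bytes.iter().rposition(|&b| b == b'\n').map_or(0, |i| i + 1);
    file.set_len(valid_len as u64)?;
    let text = String::from_utf8_lossy(&bytes[..valid_len]);
    Ok(text.lines().map(str::to_owned).collect())
}

/// Appends one entry as soon as it is available, so progress survives
/// an interruption.
pub fn append_entry(path: &Path, entry: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().append(true).create(true).open(path)?;
    writeln!(file, "{entry}")
}
```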
Release Notes:
- N/A
Before this change, it would save every buffer and wait for diagnostics.
For rust-analyzer, this would cause a lot of rechecking and greatly slow
down the analysis.
Release Notes:
- N/A
Co-authored-by: Agus <agus@zed.dev>
Also skips indexing files that don't have a suffix that indicates a
known language, and skips when the language doesn't have an outline
grammar.
Release Notes:
- N/A
---------
Co-authored-by: Agus <agus@zed.dev>
Closes https://github.com/zed-industries/zed/issues/38690
Closes #37353
### Background
On Windows, paths are normally separated by `\`, unlike macOS and Linux
where they are separated by `/`. When editing code in a project that
uses a different path style than your local system (e.g. remoting from
Windows to Linux, using WSL, and collaboration between Windows and Unix
users), the correct separator for a path may differ from the "native"
separator.
Previously, to work around this, Zed converted paths' separators in
numerous places. This was applied to both absolute and relative paths,
leading to incorrect conversions in some cases.
### Solution
Many code paths in Zed use paths that are *relative* to either a
worktree root or a git repository. This PR introduces a dedicated type
for these paths called `RelPath`, which stores the path in the same way
regardless of host platform, and offers `Path`-like manipulation APIs.
RelPath supports *displaying* the path using either separator, so that
we can display paths in a style that is determined at runtime based on
the current project.
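The idea can be illustrated with a heavily simplified stand-in. `RelPathSketch` and this `PathStyle` enum are hypothetical and do not reflect the real `RelPath` API or its internal representation; the sketch only shows platform-independent storage combined with a display style chosen at runtime.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PathStyle {
    Posix,
    Windows,
}

const POSIX_SEPARATORS: &[char] = &['/'];
const WINDOWS_SEPARATORS: &[char] = &['/', '\\'];

/// Stores a relative path as components, independent of the host platform.
#[derive(Debug, PartialEq, Eq)]
pub struct RelPathSketch {
    components: Vec<String>,
}

impl RelPathSketch {
    /// Parses a relative path written in `style`. Paths that start with a
    /// separator are rejected as absolute.
    pub fn parse(input: &str, style: PathStyle) -> Option<Self> {
        let separators = match style {
            PathStyle::Posix => POSIX_SEPARATORS,
            PathStyle::Windows => WINDOWS_SEPARATORS,
        };
        if input.starts_with(separators) {
            return None;
        }
        let components = input
            .split(separators)
            .filter(|component| !component.is_empty() && *component != ".")
            .map(str::to_owned)
            .collect();
        Some(Self { components })
    }

    /// Renders the path with the separator of the requested style.
    pub fn display(&self, style: PathStyle) -> String {
        let separator = match style {
            PathStyle::Posix => "/",
            PathStyle::Windows => "\\",
        };
        self.components.join(separator)
    }
}
```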
The representation of absolute paths is left untouched, for now.
Absolute paths are different from relative paths because (except in
contexts where we know that the path refers to the local filesystem)
they should generally be treated as opaque strings. Currently we use a
mix of types for these paths (std::path::Path, String, SanitizedPath).
Release Notes:
- N/A
---------
Co-authored-by: Cole Miller <cole@zed.dev>
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Peter Tripp <petertripp@gmail.com>
Co-authored-by: Smit Barmase <heysmitbarmase@gmail.com>
Co-authored-by: Lukas Wirth <me@lukaswirth.dev>