This PR adds a staff-only button to the edit prediction menu for
capturing your current editing session as an edit prediction example
file. Clicking the button opens a markdown tab containing the example.
By default, the most recent change you've made is used as the expected
patch, and all previous events are used as the editing history.
<img width="303" height="123" alt="Screenshot 2025-12-14 at 6 58 33 PM"
src="https://github.com/user-attachments/assets/600c7bf2-7cf4-4d27-8cd4-8bb70d0b20b0"
/>
Release Notes:
- N/A
Previously, we panicked whenever something went wrong with an example in
the CLI. This can be very disruptive when running many examples and,
say, a single request fails. Instead, when running more than one
example, errors are now logged alongside instructions to explore and
re-run the failing example by itself.
<img width="1454" height="744" alt="CleanShot 2025-12-12 at 13 32 04@2x"
src="https://github.com/user-attachments/assets/87c59e64-08b9-4461-af5b-03af5de94152"></img>
You can still opt in to stopping as soon as an error occurs with the new
`--failfast` argument.
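
For illustration, here's a minimal sketch of how this behavior could be
wired up, assuming `clap` for argument parsing; apart from the
`--failfast` flag itself, all names here are hypothetical:

```rust
use clap::Parser;

/// Hypothetical CLI surface; only the `--failfast` flag is from this PR.
#[derive(Parser)]
struct Args {
    /// Paths of example files to run.
    examples: Vec<String>,

    /// Stop at the first error instead of logging it and continuing.
    #[arg(long)]
    failfast: bool,
}

fn main() {
    let args = Args::parse();
    let mut failures = 0;
    for path in &args.examples {
        if let Err(err) = run_example(path) {
            if args.failfast || args.examples.len() == 1 {
                // Single example, or opted in to fail-fast: abort immediately.
                eprintln!("error: {path}: {err}");
                std::process::exit(1);
            }
            // Otherwise log the failure with a hint for re-running the
            // example by itself, and keep going.
            eprintln!("error: {path}: {err} (re-run this example alone to debug)");
            failures += 1;
        }
    }
    if failures > 0 {
        eprintln!("{failures} example(s) failed");
        std::process::exit(1);
    }
}

fn run_example(path: &str) -> Result<(), String> {
    // Stand-in for the real per-example evaluation.
    let _ = path;
    Ok(())
}
```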
Release Notes:
- N/A
- Limit status lines to 10 when `max_parallelism` is specified with a
greater value
- Handle logging gracefully rather than writing over it when clearing
status lines
Release Notes:
- N/A
This PR restructures the commands of the Edit Prediction CLI (now called
`ep`) to support some flows that are important for the training process:
* generating the zeta2 prompt and expected output without running
predictions
* scoring outputs that are generated by a system other than the
production code (to evaluate the model during training)
To achieve this, we've restructured the CLI commands so that they all
take as input, and produce as output, a consistent, uniform data format:
a set of one or more `Example` structs, expressible either in the
original markdown format or as JSON Lines. The `Example` struct starts
with the basic fields from the human-readable eval format, but contains
a number of optional fields that are filled in by different steps in the
processing pipeline (`context`, `predict`, `format-prompt`, and
`score`).
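
For a rough idea of the shape, here's a minimal sketch of such a struct;
the field names are assumptions, not the crate's actual definitions:

```rust
use serde::{Deserialize, Serialize};

/// Illustrative sketch only: field names are hypothetical, not the
/// actual definitions in the `ep` crate.
#[derive(Serialize, Deserialize)]
struct Example {
    // Basic fields from the human-readable eval markdown format.
    name: String,
    editing_history: Vec<String>,
    cursor_offset: usize,
    expected_patch: String,

    // Optional fields, populated by successive pipeline steps.
    context: Option<String>,    // filled in by `context`
    prompt: Option<String>,     // filled in by `format-prompt`
    prediction: Option<String>, // filled in by `predict`
    score: Option<f64>,         // filled in by `score`
}
```

Each command reads one or more of these (a markdown file, or one JSON
object per line) and writes the same shape back out with additional
fields populated.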
### To do
* [x] Adjust the teacher model output parsing to use the full buffer
contents
* [x] Move udiff to cli
* [x] Align `format-prompt` with Zeta2's production code
* [x] Change score output to assume same provider
* [x] Move pretty reporting to `eval` command
* [x] Store cursor point in addition to cursor offset
* [x] Rename `edit_prediction_cli2` -> `edit_prediction_cli` (nuke the
old one)
Release Notes:
- N/A
---------
Co-authored-by: Oleksiy Syvokon <oleksiy@zed.dev>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Ben Kunkle <ben@zed.dev>
This PR partially implements a knowledge distillation data pipeline.
`zeta distill` takes a dataset of chronologically ordered commits and
generates synthetic predictions with a teacher model (one-shot Claude
Sonnet).
`zeta distill --batches cache.db` enables the Message Batches API. On
the first run, this command will collect all LLM requests and upload
them as a batch to Anthropic. On subsequent runs, it will check the
batch status; if ready, it will download the results and put them into
the local cache.
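
A minimal sketch of the two-phase flow this implies, assuming a state
recorded per batch in the cache (all names here are hypothetical):

```rust
/// Illustrative states for the batch cache; not the actual implementation.
enum BatchState {
    /// No batch recorded yet: collect every LLM request and upload them
    /// to Anthropic as one Message Batch.
    NotSubmitted,
    /// A batch was uploaded but is still processing; re-running the
    /// command later polls it again.
    Pending { batch_id: String },
    /// The batch finished: download the results into the local cache.
    Ready { batch_id: String },
}

fn distill_step(state: BatchState) {
    match state {
        BatchState::NotSubmitted => {
            // First run: gather all LLM requests and submit the batch.
        }
        BatchState::Pending { batch_id } => {
            // Nothing to do yet; a later `zeta distill --batches cache.db`
            // invocation checks this batch again.
            let _ = batch_id;
        }
        BatchState::Ready { batch_id } => {
            // Store the downloaded results in the local cache.
            let _ = batch_id;
        }
    }
}

fn main() {
    distill_step(BatchState::NotSubmitted);
}
```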
Release Notes:
- N/A
---------
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Ben Kunkle <ben@zed.dev>