This PR partially implements a knowledge distillation data pipeline. `zeta distill` takes a dataset of chronologically ordered commits and generates synthetic predictions with a teacher model (one-shot Claude Sonnet). `zeta distill --batches cache.db` enables the Message Batches API: on the first run, the command collects all LLM requests and uploads them as a batch to Anthropic; on subsequent runs, it checks the batch status and, once the batch is ready, downloads the results and stores them in the local cache.

Release Notes:

- N/A

---------

Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Ben Kunkle <ben@zed.dev>
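
The `--batches` flow described above amounts to a small idempotent state machine driven by the local cache. A minimal Rust sketch of that lifecycle (all names are hypothetical illustrations, not the actual zeta code or Anthropic client API):

```rust
// Hypothetical sketch of the `--batches` lifecycle: each run advances the
// cached state by at most one stage. The real zeta code and the Anthropic
// Message Batches client differ; this only illustrates the run-to-run flow.

#[derive(Debug, PartialEq)]
enum BatchState {
    /// No batch recorded in the cache yet: collect requests and upload.
    NotStarted,
    /// A batch was uploaded on an earlier run and may still be processing.
    InProgress { batch_id: String },
    /// Results have been downloaded into the local cache.
    Cached,
}

/// One invocation of the command: takes the state read from the cache and
/// whether the remote batch has finished, returns the new state plus a
/// description of what this run did.
fn step(state: BatchState, remote_ready: bool) -> (BatchState, &'static str) {
    match state {
        BatchState::NotStarted => (
            BatchState::InProgress { batch_id: "batch_0".into() },
            "collected requests and uploaded a batch",
        ),
        BatchState::InProgress { batch_id } => {
            if remote_ready {
                (BatchState::Cached, "downloaded results into the cache")
            } else {
                (BatchState::InProgress { batch_id }, "batch still processing")
            }
        }
        BatchState::Cached => (BatchState::Cached, "serving predictions from cache"),
    }
}

fn main() {
    // First run uploads; second run finds the batch unfinished; third run,
    // with the batch ready, fills the cache.
    let (s1, m1) = step(BatchState::NotStarted, false);
    println!("{m1}");
    let (s2, m2) = step(s1, false);
    println!("{m2}");
    let (s3, m3) = step(s2, true);
    println!("{m3}");
    assert_eq!(s3, BatchState::Cached);
}
```

The point of the cache-backed state is that the command stays safe to re-run: no duplicate uploads, and once results are cached every later run is purely local.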