This PR implements the `zeta-cli eval` command. It will:
- Run the edit prediction model if there are no cached results
- Compute precision/recall/F1 for context retrieval at the line level:
each line is counted as a true positive (retrieved and expected), a
false positive (retrieved but not expected), or a false negative
(expected but not retrieved)
- Compute similar metrics for edit predictions
- Pretty-print results, highlighting the difference between actual and
expected output when printing to a TTY
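
For reviewers, the line-level scoring described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the `Line` key, the `Scores` struct, and the `lines` helper are hypothetical names, and returning `0.0` when a denominator is zero is an assumed convention.

```rust
use std::collections::HashSet;
use std::ops::RangeInclusive;

/// Hypothetical key for one line of context: (file path, line number).
type Line = (String, u32);

/// Line-level counts for one example (hypothetical type, not from the PR).
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Scores {
    true_positives: usize,
    false_positives: usize,
    false_negatives: usize,
}

impl Scores {
    /// Classify every line by comparing the retrieved set with the expected set.
    fn new(expected: &HashSet<Line>, actual: &HashSet<Line>) -> Self {
        Scores {
            // Retrieved and expected.
            true_positives: actual.intersection(expected).count(),
            // Retrieved but not expected.
            false_positives: actual.difference(expected).count(),
            // Expected but not retrieved.
            false_negatives: expected.difference(actual).count(),
        }
    }

    /// TP / (TP + FP); assumed to be 0.0 when nothing was retrieved.
    fn precision(&self) -> f64 {
        let retrieved = self.true_positives + self.false_positives;
        if retrieved == 0 {
            0.0
        } else {
            self.true_positives as f64 / retrieved as f64
        }
    }

    /// TP / (TP + FN); assumed to be 0.0 when nothing was expected.
    fn recall(&self) -> f64 {
        let relevant = self.true_positives + self.false_negatives;
        if relevant == 0 {
            0.0
        } else {
            self.true_positives as f64 / relevant as f64
        }
    }

    /// Harmonic mean of precision and recall.
    fn f1(&self) -> f64 {
        let (p, r) = (self.precision(), self.recall());
        if p + r == 0.0 {
            0.0
        } else {
            2.0 * p * r / (p + r)
        }
    }
}

/// Hypothetical helper: build the set of lines covered by a range in one file.
fn lines(path: &str, range: RangeInclusive<u32>) -> HashSet<Line> {
    range.map(|n| (path.to_string(), n)).collect()
}
```

With this sketch, an example that expects lines 1 to 4 of a file and retrieves lines 3 to 6 has two true positives, two false positives, and two false negatives, so precision, recall, and F1 are all 0.5.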

Other changes:
- `zeta-cli predict` accepts a `--format` argument with options `md`,
`json`, `diff`
- Code restructure

Release Notes:
- N/A
---------
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Agus Zubiaga <agus@zed.dev>