Commit Graph

3 Commits

Author SHA1 Message Date
Richard Feldman
afeb3d4fd9 Make eval more resilient to bad input from LLM (#29703)
I saw a slice panic (for begin > end) in a debug build of the eval. This
should just be a failed assertion, not a panic that takes out the whole
eval run!

Release Notes:

- N/A
2025-04-30 18:13:45 -04:00
Richard Feldman
d566864891 Make code block eval resilient to indentation (#29633)
This reduces spurious failures in the eval.

Release Notes:

- N/A
2025-04-30 02:13:13 +00:00
Richard Feldman
d7004030b3 Code block evals (#29619)
Add a targeted eval for code block formatting, and revise the system
prompt accordingly.

### Eval before, n=8

<img width="728" alt="eval before"
src="https://github.com/user-attachments/assets/552b6146-3d26-4eaa-86f9-9fc36c0cadf2"
/>

### Eval after prompt change, n=8 (excluding the new evals, so just
testing the prompt change)

<img width="717" alt="eval after"
src="https://github.com/user-attachments/assets/c78c7a54-4c65-470c-b135-8691584cd73e"
/>

Release Notes:

- N/A
2025-04-29 18:52:09 -04:00