Updates to the latest version of the SDK and schema crate. This was a bit painful because we needed to move to `#[non_exhaustive]` on all of these structs/enums, but future updates will be much easier. Also, since we depend on unstable features, I am pinning the version so that other update cycles don't accidentally introduce compilation errors.

Release Notes:

- N/A
Eval
This eval assumes the working directory is the root of the repository. Run it with:
cargo run -p eval
The eval optionally reads a .env file in crates/eval, which you can use to set environment variables such as API keys.
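As a rough sketch of what that file could look like, the snippet below writes a one-line .env from the repository root. The variable name `EXAMPLE_API_KEY` and its value are placeholders chosen for illustration; the variables the eval actually reads are not listed here, so substitute whatever your provider requires.

```shell
# Run from the repository root. This overwrites any existing crates/eval/.env.
mkdir -p crates/eval

# EXAMPLE_API_KEY is a hypothetical name, not a variable the eval is known to read.
cat > crates/eval/.env <<'EOF'
EXAMPLE_API_KEY=replace-me
EOF
```

Keeping keys in this file avoids exporting them in every shell session; take care not to commit it.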
Explorer Tool
The explorer tool generates a self-contained HTML view from one or more thread JSON files. It provides a visual interface to explore the agent thread, including tool calls and results. See ./docs/explorer.md for more details.
Usage
cargo run -p eval --bin explorer -- --input <path-to-json-files> --output <output-html-path>
Example:
cargo run -p eval --bin explorer -- --input ./runs/2025-04-23_15-53-30/fastmcp_bugifx/*/last.messages.json --output /tmp/explorer.html
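The `*` in the example is expanded by the shell, so the explorer receives one path per matching run directory. To preview which files a glob like this would pass to `--input`, something along these lines may help. `list_thread_files` is a hypothetical helper written for this illustration, not part of the repository.

```shell
# Hypothetical helper: prints each last.messages.json found exactly one
# directory level below the given run directory.
list_thread_files() {
  for f in "$1"/*/last.messages.json; do
    # In POSIX shells an unmatched glob stays literal, so skip paths that do not exist.
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
}

# Same run directory as in the example above.
list_thread_files ./runs/2025-04-23_15-53-30/fastmcp_bugifx
```

If nothing is printed, the glob matched no files and the explorer would be handed the unexpanded pattern instead of real paths.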