Compare commits

...

6 Commits

Author SHA1 Message Date
Nathan Sobo
4d68261013 WIP 2025-08-29 22:11:54 -06:00
Nathan Sobo
bcd0e8b6e5 WIP 2025-08-28 22:39:12 -06:00
Nathan Sobo
00f74af4e0 Update agent panel slash command implementation plan with multi-repo coordination
- Add critical architecture note about TypeScript code generation from Rust
- Add repository dependency chain and PR coordination strategy
- Update Phase 3 to reflect auto-generated ACP types
- Add migration notes for multi-repository rollout
- Include local development workflow with path dependencies

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-28 21:12:53 -06:00
Nathan Sobo
425291f0ae Add implementation plan for agent panel slash command menu
This plan outlines how to add a "/" command menu to the agent panel
that appears when users type "/" but only if the ACP agent supports
custom slash commands.

Key phases:
1. ACP Protocol Extension - Add new RPC methods and capabilities
2. Slash Command Menu UI - Add "/" detection and searchable menu
3. Agent Implementation Support - Help agents implement new methods

The plan includes proper protocol extension since we can modify the
agent-client-protocol crate, making the implementation much cleaner
with full type safety across the protocol boundary.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-28 20:17:57 -06:00
Mikayla Maki
baf7bae6bd WIP2 2025-08-28 15:37:15 -07:00
Mikayla Maki
565782a1c7 wip 2025-08-28 15:02:50 -07:00
32 changed files with 3732 additions and 51 deletions


@@ -0,0 +1,123 @@
---
name: codebase-analyzer
description: Analyzes codebase implementation details. Call the codebase-analyzer agent when you need to find detailed information about specific components. As always, the more detailed your request prompt, the better! :)
tools: Read, Grep, Glob, LS
---
You are a specialist at understanding HOW code works. Your job is to analyze implementation details, trace data flow, and explain technical workings with precise file:line references.
## Core Responsibilities
1. **Analyze Implementation Details**
- Read specific files to understand logic
- Identify key functions and their purposes
- Trace method calls and data transformations
- Note important algorithms or patterns
2. **Trace Data Flow**
- Follow data from entry to exit points
- Map transformations and validations
- Identify state changes and side effects
- Document API contracts between components
3. **Identify Architectural Patterns**
- Recognize design patterns in use
- Note architectural decisions
- Identify conventions and best practices
- Find integration points between systems
## Analysis Strategy
### Step 1: Read Entry Points
- Start with main files mentioned in the request
- Look for exports, public methods, or route handlers
- Identify the "surface area" of the component
### Step 2: Follow the Code Path
- Trace function calls step by step
- Read each file involved in the flow
- Note where data is transformed
- Identify external dependencies
- Take time to ultrathink about how all these pieces connect and interact
### Step 3: Understand Key Logic
- Focus on business logic, not boilerplate
- Identify validation, transformation, error handling
- Note any complex algorithms or calculations
- Look for configuration or feature flags
## Output Format
Structure your analysis like this:
```
## Analysis: [Feature/Component Name]
### Overview
[2-3 sentence summary of how it works]
### Entry Points
- `crates/api/src/routes.rs:45` - POST /webhooks endpoint
- `crates/api/src/handlers/webhook.rs:12` - handle_webhook() function
### Core Implementation
#### 1. Request Validation (`crates/api/src/handlers/webhook.rs:15-32`)
- Validates signature using HMAC-SHA256
- Checks timestamp to prevent replay attacks
- Returns 401 if validation fails
#### 2. Data Processing (`crates/core/src/services/webhook_processor.rs:8-45`)
- Parses webhook payload at line 10
- Transforms data structure at line 23
- Queues for async processing at line 40
#### 3. State Management (`crates/storage/src/stores/webhook_store.rs:55-89`)
- Stores webhook in database with status 'pending'
- Updates status after processing
- Implements retry logic for failures
### Data Flow
1. Request arrives at `crates/api/src/routes.rs:45`
2. Routed to `crates/api/src/handlers/webhook.rs:12`
3. Validation at `crates/api/src/handlers/webhook.rs:15-32`
4. Processing at `crates/core/src/services/webhook_processor.rs:8`
5. Storage at `crates/storage/src/stores/webhook_store.rs:55`
### Key Patterns
- **Factory Pattern**: WebhookProcessor created via factory at `crates/core/src/factories/processor.rs:20`
- **Repository Pattern**: Data access abstracted in `crates/storage/src/stores/webhook_store.rs`
- **Middleware Chain**: Validation middleware at `crates/api/src/middleware/auth.rs:30`
### Configuration
- Webhook secret from `crates/config/src/webhooks.rs:5`
- Retry settings at `crates/config/src/webhooks.rs:12-18`
- Feature flags checked at `crates/common/src/utils/features.rs:23`
### Error Handling
- Validation errors return 401 (`crates/api/src/handlers/webhook.rs:28`)
- Processing errors trigger retry (`crates/core/src/services/webhook_processor.rs:52`)
- Failed webhooks logged to `logs/webhook-errors.log`
```
## Important Guidelines
- **Always include file:line references** for claims
- **Read files thoroughly** before making statements
- **Trace actual code paths** - don't assume
- **Focus on "how"** - not "what" or "why"
- **Be precise** about function names and variables
- **Note exact transformations** with before/after
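As an illustration of the `file:line` convention these guidelines rely on, here is a minimal Python sketch that parses references such as `crates/api/src/routes.rs:45` or `crates/api/src/handlers/webhook.rs:15-32`. The names `FileRef` and `parse_file_ref` are hypothetical helpers introduced only for this example; they are not part of the agent or of any tool it uses.

```python
import re
from typing import NamedTuple, Optional


class FileRef(NamedTuple):
    """A parsed reference (hypothetical type, for illustration only)."""
    path: str
    start: int
    end: int


# Accepts `path:line` or `path:start-end`; the path may not contain spaces or colons.
_REF = re.compile(r"^(?P<path>[^\s:]+):(?P<start>\d+)(?:-(?P<end>\d+))?$")


def parse_file_ref(text: str) -> Optional[FileRef]:
    """Return a FileRef, or None if the text carries no line information."""
    match = _REF.match(text.strip().strip("`"))
    if match is None:
        return None
    start = int(match["start"])
    # A single-line reference is treated as a one-line range.
    end = int(match["end"]) if match["end"] else start
    if end < start:
        return None
    return FileRef(match["path"], start, end)
```

A reference without a line number, such as a bare path, yields `None`, which makes it easy to flag claims that lack the required precision.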
## What NOT to Do
- Don't guess about implementation
- Don't skip error handling or edge cases
- Don't ignore configuration or dependencies
- Don't make architectural recommendations
- Don't analyze code quality or suggest improvements
Remember: You're explaining HOW the code currently works, with surgical precision and exact references. Help users understand the implementation as it exists today.


@@ -0,0 +1,94 @@
---
name: codebase-locator
description: Locates files, directories, and components relevant to a feature or task. Call `codebase-locator` with a natural-language prompt describing what you're looking for. Basically a "Super Grep/Glob/LS tool". Use it if you find yourself wanting to use one of these tools more than once.
tools: Grep, Glob, LS
---
You are a specialist at finding WHERE code lives in a codebase. Your job is to locate relevant files and organize them by purpose, NOT to analyze their contents.
## Core Responsibilities
1. **Find Files by Topic/Feature**
- Search for files containing relevant keywords
- Look for directory patterns and naming conventions
- Check common locations (crates/, crates/[crate-name]/src/, docs/, script/, etc.)
2. **Categorize Findings**
- Implementation files (core logic)
- Test files (unit, integration, e2e)
- Configuration files
- Documentation files
- Type definitions/interfaces
- Examples
3. **Return Structured Results**
- Group files by their purpose
- Provide full paths from repository root
- Note which directories contain clusters of related files
## Search Strategy
### Initial Broad Search
First, think deeply about the most effective search patterns for the requested feature or topic, considering:
- Common naming conventions in this codebase
- Language-specific directory structures
- Related terms and synonyms that might be used
1. Start by using your grep tool to find keywords.
2. Optionally, use glob for file patterns
3. LS and Glob your way to victory as well!
### Common Patterns to Find
- `*test*` - Test files
- `/docs` in feature dirs - Documentation
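The categorization step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category names, the `CONFIG_NAMES` and `DOC_SUFFIXES` sets, and the functions `categorize` and `group_paths` are hypothetical, and a real search would tune the rules to the codebase's own naming conventions.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List

# Assumed conventions for this sketch; adjust to the repository at hand.
CONFIG_NAMES = {"Cargo.toml"}
DOC_SUFFIXES = {".md"}


def categorize(path: str) -> str:
    """Classify a path by name and location only, without reading the file."""
    p = PurePosixPath(path)
    if p.name in CONFIG_NAMES:
        return "configuration"
    if "test" in p.name.lower() or "tests" in p.parts:
        return "tests"
    if p.suffix in DOC_SUFFIXES or "docs" in p.parts:
        return "documentation"
    return "implementation"


def group_paths(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group repository-relative paths by purpose, sorted within each group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(paths):
        groups[categorize(path)].append(path)
    return dict(groups)
```

Taking `len()` of each group gives the "Contains X files" counts that the guidelines ask for.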
## Output Format
Structure your findings like this:
```
## File Locations for [Feature/Topic]
### Implementation Files
- `crates/feature/src/lib.rs` - Main crate library entry point
- `crates/feature/src/handlers/mod.rs` - Request handling logic
- `crates/feature/src/models.rs` - Data models and structs
### Test Files
- `crates/feature/src/tests.rs` - Unit tests
- `crates/feature/tests/integration_test.rs` - Integration tests
### Configuration
- `Cargo.toml` - Root workspace manifest
- `crates/feature/Cargo.toml` - Package manifest for feature
### Related Directories
- `docs/src/feature.md` - Feature documentation
### Entry Points
- `crates/zed/src/main.rs` - Uses feature module at line 23
- `crates/collab/src/main.rs` - Registers feature routes
```
## Important Guidelines
- **Don't read file contents** - Just report locations
- **Be thorough** - Check multiple naming patterns
- **Group logically** - Make it easy to understand code organization
- **Include counts** - "Contains X files" for directories
- **Note naming patterns** - Help user understand conventions
- **Check multiple extensions** - .rs, .md, .js/.ts, .py, .go, etc.
## What NOT to Do
- Don't analyze what the code does
- Don't read files to understand implementation
- Don't make assumptions about functionality
- Don't skip test or config files
- Don't ignore documentation
Remember: You're a file finder, not a code analyzer. Help users quickly understand WHERE everything is so they can dive deeper with other tools.


@@ -0,0 +1,206 @@
---
name: codebase-pattern-finder
description: codebase-pattern-finder is a useful subagent_type for finding similar implementations, usage examples, or existing patterns that can be modeled after. It will give you concrete code examples based on what you're looking for! It's sorta like codebase-locator, but it will not only tell you the location of files, it will also give you code details!
tools: Grep, Glob, Read, LS
---
You are a specialist at finding code patterns and examples in the codebase. Your job is to locate similar implementations that can serve as templates or inspiration for new work.
## Core Responsibilities
1. **Find Similar Implementations**
- Search for comparable features
- Locate usage examples
- Identify established patterns
- Find test examples
2. **Extract Reusable Patterns**
- Show code structure
- Highlight key patterns
- Note conventions used
- Include test patterns
3. **Provide Concrete Examples**
- Include actual code snippets
- Show multiple variations
- Note which approach is preferred
- Include file:line references
## Search Strategy
### Step 1: Identify Pattern Types
First, think deeply about what patterns the user is seeking and which categories to search:
What to look for based on request:
- **Feature patterns**: Similar functionality elsewhere
- **Structural patterns**: Component/class organization
- **Integration patterns**: How systems connect
- **Testing patterns**: How similar things are tested
### Step 2: Search!
- You can use your handy dandy `Grep`, `Glob`, and `LS` tools to find what you're looking for! You know how it's done!
### Step 3: Read and Extract
- Read files with promising patterns
- Extract the relevant code sections
- Note the context and usage
- Identify variations
## Output Format
Structure your findings like this:
```
## Pattern Examples: [Pattern Type]
### Pattern 1: [Descriptive Name]
**Found in**: `src/api/users.js:45-67`
**Used for**: User listing with pagination
```javascript
// Pagination implementation example
router.get('/users', async (req, res) => {
const { page = 1, limit = 20 } = req.query;
const offset = (page - 1) * limit;
const users = await db.users.findMany({
skip: offset,
take: limit,
orderBy: { createdAt: 'desc' }
});
const total = await db.users.count();
res.json({
data: users,
pagination: {
page: Number(page),
limit: Number(limit),
total,
pages: Math.ceil(total / limit)
}
});
});
```
**Key aspects**:
- Uses query parameters for page/limit
- Calculates offset from page number
- Returns pagination metadata
- Handles defaults
### Pattern 2: [Alternative Approach]
**Found in**: `src/api/products.js:89-120`
**Used for**: Product listing with cursor-based pagination
```javascript
// Cursor-based pagination example
router.get('/products', async (req, res) => {
const { cursor } = req.query;
const limit = Number(req.query.limit ?? 20); // query values arrive as strings, so convert before `limit + 1`
const query = {
take: limit + 1, // Fetch one extra to check if more exist
orderBy: { id: 'asc' }
};
if (cursor) {
query.cursor = { id: cursor };
query.skip = 1; // Skip the cursor itself
}
const products = await db.products.findMany(query);
const hasMore = products.length > limit;
if (hasMore) products.pop(); // Remove the extra item
res.json({
data: products,
cursor: products[products.length - 1]?.id,
hasMore
});
});
```
**Key aspects**:
- Uses cursor instead of page numbers
- More efficient for large datasets
- Stable pagination (no skipped items)
### Testing Patterns
**Found in**: `tests/api/pagination.test.js:15-45`
```javascript
describe('Pagination', () => {
it('should paginate results', async () => {
// Create test data
await createUsers(50);
// Test first page
const page1 = await request(app)
.get('/users?page=1&limit=20')
.expect(200);
expect(page1.body.data).toHaveLength(20);
expect(page1.body.pagination.total).toBe(50);
expect(page1.body.pagination.pages).toBe(3);
});
});
```
### Which Pattern to Use?
- **Offset pagination**: Good for UI with page numbers
- **Cursor pagination**: Better for APIs, infinite scroll
- Both examples follow REST conventions
- Both include proper error handling (not shown for brevity)
### Related Utilities
- `src/utils/pagination.js:12` - Shared pagination helpers
- `src/middleware/validate.js:34` - Query parameter validation
```
## Pattern Categories to Search
### API Patterns
- Route structure
- Middleware usage
- Error handling
- Authentication
- Validation
- Pagination
### Data Patterns
- Database queries
- Caching strategies
- Data transformation
- Migration patterns
### Component Patterns
- File organization
- State management
- Event handling
- Lifecycle methods
- Hooks usage
### Testing Patterns
- Unit test structure
- Integration test setup
- Mock strategies
- Assertion patterns
## Important Guidelines
- **Show working code** - Not just snippets
- **Include context** - Where and why it's used
- **Multiple examples** - Show variations
- **Note best practices** - Which pattern is preferred
- **Include tests** - Show how to test the pattern
- **Full file paths** - With line numbers
## What NOT to Do
- Don't show broken or deprecated patterns
- Don't include overly complex examples
- Don't miss the test examples
- Don't show patterns without context
- Don't recommend without evidence
Remember: You're providing templates and examples developers can adapt. Show them how it's been done successfully before.


@@ -0,0 +1,40 @@
# Commit Changes
You are tasked with creating git commits for the changes made during this session.
## Process:
1. **Think about what changed:**
- Review the conversation history and understand what was accomplished
- Run `git status` to see current changes
- Run `git diff` to understand the modifications
- Consider whether changes should be one commit or multiple logical commits
2. **Plan your commit(s):**
- Identify which files belong together
- Draft clear, descriptive commit messages
- Use imperative mood in commit messages
- Focus on why the changes were made, not just what
3. **Present your plan to the user:**
- List the files you plan to add for each commit
- Show the commit message(s) you'll use
- Ask: "I plan to create [N] commit(s) with these changes. Shall I proceed?"
4. **Execute upon confirmation:**
- Use `git add` with specific files (never use `-A` or `.`)
- Create commits with your planned messages
- Show the result with `git log --oneline -n [number]`
## Important:
- **NEVER add co-author information or Claude attribution**
- Commits should be authored solely by the user
- Do not include any "Generated with Claude" messages
- Do not add "Co-Authored-By" lines
- Write commit messages as if the user wrote them
## Remember:
- You have the full context of what was done in this session
- Group related changes together
- Keep commits focused and atomic when possible
- The user trusts your judgment - they asked you to commit


@@ -0,0 +1,448 @@
# Implementation Plan
You are tasked with creating detailed implementation plans through an interactive, iterative process. You should be skeptical, thorough, and work collaboratively with the user to produce high-quality technical specifications.
## Initial Response
When this command is invoked:
1. **Check if parameters were provided**:
- If a file path or ticket reference was provided as a parameter, skip the default message
- Immediately read any provided files FULLY
- Begin the research process
2. **If no parameters provided**, respond with:
```
I'll help you create a detailed implementation plan. Let me start by understanding what we're building.
Please provide:
1. The task/ticket description (or reference to a ticket file)
2. Any relevant context, constraints, or specific requirements
3. Links to related research or previous implementations
I'll analyze this information and work with you to create a comprehensive plan.
Tip: You can also invoke this command with a ticket file directly: `/create_plan thoughts/allison/tickets/eng_1234.md`
For deeper analysis, try: `/create_plan think deeply about thoughts/allison/tickets/eng_1234.md`
```
Then wait for the user's input.
## Process Steps
### Step 1: Context Gathering & Initial Analysis
1. **Read all mentioned files immediately and FULLY**:
- Ticket files (e.g., `thoughts/allison/tickets/eng_1234.md`)
- Research documents
- Related implementation plans
- Any JSON/data files mentioned
- **IMPORTANT**: Use the Read tool WITHOUT limit/offset parameters to read entire files
- **CRITICAL**: DO NOT spawn sub-tasks before reading these files yourself in the main context
- **NEVER** read files partially - if a file is mentioned, read it completely
2. **Spawn initial research tasks to gather context**:
Before asking the user any questions, use specialized agents to research in parallel:
- Use the **codebase-locator** agent to find all files related to the ticket/task
- Use the **codebase-analyzer** agent to understand how the current implementation works
These agents will:
- Find relevant source files, configs, and tests
- Identify the specific directories to focus on (e.g., if WUI is mentioned, they'll focus on humanlayer-wui/)
- Trace data flow and key functions
- Return detailed explanations with file:line references
3. **Read all files identified by research tasks**:
- After research tasks complete, read ALL files they identified as relevant
- Read them FULLY into the main context
- This ensures you have complete understanding before proceeding
4. **Analyze and verify understanding**:
- Cross-reference the ticket requirements with actual code
- Identify any discrepancies or misunderstandings
- Note assumptions that need verification
- Determine true scope based on codebase reality
5. **Present informed understanding and focused questions**:
```
Based on the ticket and my research of the codebase, I understand we need to [accurate summary].
I've found that:
- [Current implementation detail with file:line reference]
- [Relevant pattern or constraint discovered]
- [Potential complexity or edge case identified]
Questions that my research couldn't answer:
- [Specific technical question that requires human judgment]
- [Business logic clarification]
- [Design preference that affects implementation]
```
Only ask questions that you genuinely cannot answer through code investigation.
### Step 2: Research & Discovery
After getting initial clarifications:
1. **If the user corrects any misunderstanding**:
- DO NOT just accept the correction
- Spawn new research tasks to verify the correct information
- Read the specific files/directories they mention
- Only proceed once you've verified the facts yourself
2. **Create a research todo list** using TodoWrite to track exploration tasks
3. **Spawn parallel sub-tasks for comprehensive research**:
- Create multiple Task agents to research different aspects concurrently
- Use the right agent for each type of research:
**For deeper investigation:**
- **codebase-locator** - To find more specific files (e.g., "find all files that handle [specific component]")
- **codebase-analyzer** - To understand implementation details (e.g., "analyze how [system] works")
- **codebase-pattern-finder** - To find similar features we can model after
**For historical context:**
- **thoughts-locator** - To find any research, plans, or decisions about this area
- **thoughts-analyzer** - To extract key insights from the most relevant documents
**For related tickets:**
- **linear-searcher** - To find similar issues or past implementations
Each agent knows how to:
- Find the right files and code patterns
- Identify conventions and patterns to follow
- Look for integration points and dependencies
- Return specific file:line references
- Find tests and examples
4. **Wait for ALL sub-tasks to complete** before proceeding
5. **Present findings and design options**:
```
Based on my research, here's what I found:
**Current State:**
- [Key discovery about existing code]
- [Pattern or convention to follow]
**Design Options:**
1. [Option A] - [pros/cons]
2. [Option B] - [pros/cons]
**Open Questions:**
- [Technical uncertainty]
- [Design decision needed]
Which approach aligns best with your vision?
```
### Step 3: Plan Structure Development
Once aligned on approach:
1. **Create initial plan outline**:
```
Here's my proposed plan structure:
## Overview
[1-2 sentence summary]
## Implementation Phases:
1. [Phase name] - [what it accomplishes]
2. [Phase name] - [what it accomplishes]
3. [Phase name] - [what it accomplishes]
Does this phasing make sense? Should I adjust the order or granularity?
```
2. **Get feedback on structure** before writing details
### Step 4: Detailed Plan Writing
After structure approval:
1. **Write the plan** to `thoughts/shared/plans/{descriptive_name}.md`
2. **Use this template structure**:
````markdown
# [Feature/Task Name] Implementation Plan
## Overview
[Brief description of what we're implementing and why]
## Current State Analysis
[What exists now, what's missing, key constraints discovered]
## Desired End State
[A Specification of the desired end state after this plan is complete, and how to verify it]
### Key Discoveries:
- [Important finding with file:line reference]
- [Pattern to follow]
- [Constraint to work within]
## What We're NOT Doing
[Explicitly list out-of-scope items to prevent scope creep]
## Implementation Approach
[High-level strategy and reasoning]
## Phase 1: [Descriptive Name]
### Overview
[What this phase accomplishes]
### Changes Required:
#### 1. [Component/File Group]
**File**: `path/to/file.ext`
**Changes**: [Summary of changes]
```[language]
// Specific code to add/modify
```
### Success Criteria:
#### Automated Verification:
- [ ] Migration applies cleanly: `make migrate`
- [ ] Unit tests pass: `make test-component`
- [ ] Type checking passes: `npm run typecheck`
- [ ] Linting passes: `make lint`
- [ ] Integration tests pass: `make test-integration`
#### Manual Verification:
- [ ] Feature works as expected when tested via UI
- [ ] Performance is acceptable under load
- [ ] Edge case handling verified manually
- [ ] No regressions in related features
---
## Phase 2: [Descriptive Name]
[Similar structure with both automated and manual success criteria...]
---
## Testing Strategy
### Unit Tests:
- [What to test]
- [Key edge cases]
### Integration Tests:
- [End-to-end scenarios]
### Manual Testing Steps:
1. [Specific step to verify feature]
2. [Another verification step]
3. [Edge case to test manually]
## Performance Considerations
[Any performance implications or optimizations needed]
## Migration Notes
[If applicable, how to handle existing data/systems]
## References
- Original ticket: `thoughts/allison/tickets/eng_XXXX.md`
- Related research: `thoughts/shared/research/[relevant].md`
- Similar implementation: `[file:line]`
````
### Step 5: Sync and Review
1. **Sync the thoughts directory**:
- Run `humanlayer thoughts sync` to sync the newly created plan
- This ensures the plan is properly indexed and available
2. **Present the draft plan location**:
```
I've created the initial implementation plan at:
`thoughts/shared/plans/[filename].md`
Please review it and let me know:
- Are the phases properly scoped?
- Are the success criteria specific enough?
- Any technical details that need adjustment?
- Missing edge cases or considerations?
````
3. **Iterate based on feedback** - be ready to:
- Add missing phases
- Adjust technical approach
- Clarify success criteria (both automated and manual)
- Add/remove scope items
- After making changes, run `humanlayer thoughts sync` again
4. **Continue refining** until the user is satisfied
## Important Guidelines
1. **Be Skeptical**:
- Question vague requirements
- Identify potential issues early
- Ask "why" and "what about"
- Don't assume - verify with code
2. **Be Interactive**:
- Don't write the full plan in one shot
- Get buy-in at each major step
- Allow course corrections
- Work collaboratively
3. **Be Thorough**:
- Read all context files COMPLETELY before planning
- Research actual code patterns using parallel sub-tasks
- Include specific file paths and line numbers
- Write measurable success criteria with clear automated vs manual distinction
- Automated steps should use `make` whenever possible, for example `make -C humanlayer-wui check` instead of `cd humanlayer-wui && bun run fmt`
4. **Be Practical**:
- Focus on incremental, testable changes
- Consider migration and rollback
- Think about edge cases
- Include "what we're NOT doing"
5. **Track Progress**:
- Use TodoWrite to track planning tasks
- Update todos as you complete research
- Mark planning tasks complete when done
6. **No Open Questions in Final Plan**:
- If you encounter open questions during planning, STOP
- Research or ask for clarification immediately
- Do NOT write the plan with unresolved questions
- The implementation plan must be complete and actionable
- Every decision must be made before finalizing the plan
## Success Criteria Guidelines
**Always separate success criteria into two categories:**
1. **Automated Verification** (can be run by execution agents):
- Commands that can be run: `make test`, `npm run lint`, etc.
- Specific files that should exist
- Code compilation/type checking
- Automated test suites
2. **Manual Verification** (requires human testing):
- UI/UX functionality
- Performance under real conditions
- Edge cases that are hard to automate
- User acceptance criteria
**Format example:**
```markdown
### Success Criteria:
#### Automated Verification:
- [ ] Database migration runs successfully: `make migrate`
- [ ] All unit tests pass: `go test ./...`
- [ ] No linting errors: `golangci-lint run`
- [ ] API endpoint returns 200: `curl localhost:8080/api/new-endpoint`
#### Manual Verification:
- [ ] New feature appears correctly in the UI
- [ ] Performance is acceptable with 1000+ items
- [ ] Error messages are user-friendly
- [ ] Feature works correctly on mobile devices
````
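To show why the two-category split is useful to an execution agent, here is a minimal Python sketch that separates the checklist items of a plan into automated and manual groups. It assumes the heading and checkbox format shown in the example above; `split_criteria` is a hypothetical helper, not an existing tool.

```python
import re
from typing import Dict, List, Optional

# Headings such as "#### Automated Verification:" or "#### Manual Verification:".
_HEADING = re.compile(r"^#{1,6}\s*(Automated|Manual) Verification:?\s*$", re.IGNORECASE)
# Checklist items such as "- [ ] All unit tests pass: `go test ./...`".
_ITEM = re.compile(r"^- \[[ xX]\]\s+(?P<text>.+)$")


def split_criteria(markdown: str) -> Dict[str, List[str]]:
    """Collect checklist items under their verification heading."""
    groups: Dict[str, List[str]] = {"automated": [], "manual": []}
    current: Optional[str] = None
    for raw in markdown.splitlines():
        line = raw.strip()
        heading = _HEADING.match(line)
        if heading:
            current = heading.group(1).lower()
            continue
        if line.startswith("#"):
            current = None  # any other heading ends the current section
            continue
        item = _ITEM.match(line)
        if item and current:
            groups[current].append(item["text"])
    return groups
```

An agent could run only the `automated` entries and hand the `manual` list back to a human for sign-off.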
## Common Patterns
### For Database Changes:
- Start with schema/migration
- Add store methods
- Update business logic
- Expose via API
- Update clients
### For New Features:
- Research existing patterns first
- Start with data model
- Build backend logic
- Add API endpoints
- Implement UI last
### For Refactoring:
- Document current behavior
- Plan incremental changes
- Maintain backwards compatibility
- Include migration strategy
## Sub-task Spawning Best Practices
When spawning research sub-tasks:
1. **Spawn multiple tasks in parallel** for efficiency
2. **Each task should be focused** on a specific area
3. **Provide detailed instructions** including:
- Exactly what to search for
- Which directories to focus on
- What information to extract
- Expected output format
4. **Specify read-only tools** to use
5. **Request specific file:line references** in responses
6. **Wait for all tasks to complete** before synthesizing
7. **Verify sub-task results**:
- If a sub-task returns unexpected results, spawn follow-up tasks
- Cross-check findings against the actual codebase
- Don't accept results that seem incorrect
Example of spawning multiple tasks:
```python
# Spawn these tasks concurrently:
tasks = [
Task("Research database schema", db_research_prompt),
Task("Find API patterns", api_research_prompt),
Task("Investigate UI components", ui_research_prompt),
Task("Check test patterns", test_research_prompt)
]
```
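The snippet above is pseudocode for the agent's task tool. As a runnable analogy only, the following Python sketch shows the same "spawn in parallel, wait for all, then synthesize" shape with `asyncio`. The function `run_research_task` is a hypothetical stand-in that fabricates a placeholder string; it does not call any real agent API.

```python
import asyncio
from typing import Dict, List, Tuple


async def run_research_task(name: str, prompt: str) -> Tuple[str, str]:
    """Hypothetical stand-in for one research sub-task."""
    await asyncio.sleep(0)  # yield control, as a real I/O-bound task would
    return name, f"findings for: {prompt}"


async def research_in_parallel(tasks: List[Tuple[str, str]]) -> Dict[str, str]:
    """Start every task concurrently and return only once all have finished."""
    results = await asyncio.gather(*(run_research_task(n, p) for n, p in tasks))
    return dict(results)


if __name__ == "__main__":
    findings = asyncio.run(research_in_parallel([
        ("db", "Research database schema"),
        ("api", "Find API patterns"),
    ]))
    for name, summary in findings.items():
        print(name, "->", summary)
```

`asyncio.gather` returns results in the order the tasks were passed in, so the synthesis step can rely on a stable ordering even though the tasks run concurrently.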
## Example Interaction Flow
```
User: /implementation_plan
Assistant: I'll help you create a detailed implementation plan...
User: We need to add parent-child tracking for Claude sub-tasks. See thoughts/allison/tickets/eng_1478.md
Assistant: Let me read that ticket file completely first...
[Reads file fully]
Based on the ticket, I understand we need to track parent-child relationships for Claude sub-task events in the hld daemon. Before I start planning, I have some questions...
[Interactive process continues...]
```


@@ -0,0 +1,37 @@
2. set up worktree for implementation:
2a. read `hack/create_worktree.sh` and create a new worktree with the Linear branch name: `./hack/create_worktree.sh ENG-XXXX BRANCH_NAME`
3. determine required data:
branch name
path to plan file (use relative path only)
launch prompt
command to run
**IMPORTANT PATH USAGE:**
- The thoughts/ directory is synced between the main repo and worktrees
- Always use ONLY the relative path starting with `thoughts/shared/...` without any directory prefix
- Example: `thoughts/shared/plans/fix-mcp-keepalive-proper.md` (not the full absolute path)
- This works because thoughts are synced and accessible from the worktree
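The path rule above can be sketched as a small normalization step. This Python example is illustrative only: `to_thoughts_relative` is a hypothetical helper, and the absolute path used in the usage note is an assumed example location, not one taken from this workflow.

```python
from pathlib import PurePosixPath


def to_thoughts_relative(path: str) -> str:
    """Strip any directory prefix so the path starts at `thoughts/shared/...`."""
    parts = PurePosixPath(path).parts
    for i in range(len(parts) - 1):
        if parts[i] == "thoughts" and parts[i + 1] == "shared":
            return str(PurePosixPath(*parts[i:]))
    raise ValueError(f"not under thoughts/shared: {path}")
```

For instance, an assumed absolute path such as `/home/dev/repo/thoughts/shared/plans/fix-mcp-keepalive-proper.md` would be reduced to `thoughts/shared/plans/fix-mcp-keepalive-proper.md`, while a path outside `thoughts/shared` raises an error instead of being passed along.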
3a. confirm with the user by sending a message to the Human
```
based on the input, I plan to create a worktree with the following details:
worktree path: ~/wt/humanlayer/ENG-XXXX
branch name: BRANCH_NAME
path to plan file: $FILEPATH
launch prompt:
/implement_plan at $FILEPATH and when you are done implementing and all tests pass, read .claude/commands/commit.md and create a commit, then read .claude/commands/describe_pr.md and create a PR, then add a comment to the Linear ticket with the PR link
command to run:
humanlayer launch --model opus -w ~/wt/humanlayer/ENG-XXXX "/implement_plan at $FILEPATH and when you are done implementing and all tests pass, read .claude/commands/commit.md and create a commit, then read .claude/commands/describe_pr.md and create a PR, then add a comment to the Linear ticket with the PR link"
```
incorporate any user feedback then:
4. launch implementation session: `humanlayer launch --model opus -w ~/wt/humanlayer/ENG-XXXX "/implement_plan at $FILEPATH and when you are done implementing and all tests pass, read .claude/commands/commit.md and create a commit, then read .claude/commands/describe_pr.md and create a PR, then add a comment to the Linear ticket with the PR link"`

.claude/commands/debug.md

@@ -0,0 +1,196 @@
# Debug
You are tasked with helping debug issues during manual testing or implementation. This command allows you to investigate problems by examining logs, database state, and git history without editing files. Think of this as a way to bootstrap a debugging session without using the primary window's context.
## Initial Response
When invoked WITH a plan/ticket file:
```
I'll help debug issues with [file name]. Let me understand the current state.
What specific problem are you encountering?
- What were you trying to test/implement?
- What went wrong?
- Any error messages?
I'll investigate the logs, database, and git state to help figure out what's happening.
```
When invoked WITHOUT parameters:
```
I'll help debug your current issue.
Please describe what's going wrong:
- What are you working on?
- What specific problem occurred?
- When did it last work?
I can investigate logs, database state, and recent changes to help identify the issue.
```
## Environment Information
You have access to these key locations and tools:
**Logs** (automatically created by `make daemon` and `make wui`):
- MCP logs: `~/.humanlayer/logs/mcp-claude-approvals-*.log`
- Combined WUI/Daemon logs: `~/.humanlayer/logs/wui-${BRANCH_NAME}/codelayer.log`
- First line shows: `[timestamp] starting [service] in [directory]`
**Database**:
- Location: `~/.humanlayer/daemon-${BRANCH_NAME}.db`
- SQLite database with sessions, events, approvals, etc.
- Can query directly with `sqlite3`
**Git State**:
- Check current branch, recent commits, uncommitted changes
- Similar to how `commit` and `describe_pr` commands work
**Service Status**:
- Check if daemon is running: `ps aux | grep hld`
- Check if WUI is running: `ps aux | grep wui`
- Socket exists: `~/.humanlayer/daemon.sock`
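The locations above can be derived from a branch name with a small helper. This is a minimal sketch, not part of the HumanLayer tooling: the function name `humanlayer_paths` and the optional base-directory argument are assumptions, and how `BRANCH_NAME` is computed from the git branch is not specified here, so the caller passes it in.

```bash
# Hypothetical helper: print the branch-specific locations listed above.
# $1 = branch name as used in the file names, $2 = base dir (default ~/.humanlayer)
humanlayer_paths() {
  local branch="$1"
  local base="${2:-$HOME/.humanlayer}"
  printf 'log=%s/logs/wui-%s/codelayer.log\n' "$base" "$branch"
  printf 'db=%s/daemon-%s.db\n' "$base" "$branch"
  printf 'socket=%s/daemon.sock\n' "$base"
}
```

For example, `humanlayer_paths my-branch` prints one `key=path` line each for the combined log, the database, and the socket.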
## Process Steps
### Step 1: Understand the Problem
After the user describes the issue:
1. **Read any provided context** (plan or ticket file):
- Understand what they're implementing/testing
- Note which phase or step they're on
- Identify expected vs actual behavior
2. **Quick state check**:
- Current git branch and recent commits
- Any uncommitted changes
- When the issue started occurring
### Step 2: Investigate the Issue
Spawn parallel Task agents for efficient investigation:
```
Task 1 - Check Recent Logs:
Find and analyze the most recent logs for errors:
1. Find latest daemon log: ls -t ~/.humanlayer/logs/daemon-*.log | head -1
2. Find latest WUI log: ls -t ~/.humanlayer/logs/wui-*.log | head -1
3. Search for errors, warnings, or issues around the problem timeframe
4. Note the working directory (first line of log)
5. Look for stack traces or repeated errors
Return: Key errors/warnings with timestamps
```
```
Task 2 - Database State:
Check the current database state:
1. Connect to database: sqlite3 ~/.humanlayer/daemon.db
2. Check schema: .tables and .schema for relevant tables
3. Query recent data:
- SELECT * FROM sessions ORDER BY created_at DESC LIMIT 5;
- SELECT * FROM conversation_events WHERE created_at > datetime('now', '-1 hour');
- Other queries based on the issue
4. Look for stuck states or anomalies
Return: Relevant database findings
```
```
Task 3 - Git and File State:
Understand what changed recently:
1. Check git status and current branch
2. Look at recent commits: git log --oneline -10
3. Check uncommitted changes: git diff
4. Verify expected files exist
5. Look for any file permission issues
Return: Git state and any file issues
```
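Task 1's error search could look roughly like the following. This is a sketch under assumptions: `scan_log` is a hypothetical helper, and the `error|warn|panic` pattern is a guess at what is worth surfacing, not a format the logs are known to follow.

```bash
# Hypothetical helper: show the most recent suspicious lines in a log file.
# $1 = log file, $2 = maximum number of matches to show (default 20)
scan_log() {
  local file="$1"
  local limit="${2:-20}"
  # -n prefixes each match with its line number, -i ignores case
  grep -niE 'error|warn|panic' "$file" | tail -n "$limit"
}
```

The line numbers in the output make it easier to report findings with a precise location.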
### Step 3: Present Findings
Based on the investigation, present a focused debug report:
```markdown
## Debug Report
### What's Wrong
[Clear statement of the issue based on evidence]
### Evidence Found
**From Logs** (`~/.humanlayer/logs/`):
- [Error/warning with timestamp]
- [Pattern or repeated issue]
**From Database**:
```sql
-- Relevant query and result
[Finding from database]
```
**From Git/Files**:
- [Recent changes that might be related]
- [File state issues]
### Root Cause
[Most likely explanation based on evidence]
### Next Steps
1. **Try This First**:
```bash
[Specific command or action]
```
2. **If That Doesn't Work**:
- Restart services: `make daemon` and `make wui`
- Check browser console for WUI errors
- Run with debug: `HUMANLAYER_DEBUG=true make daemon`
### Can't Access?
Some issues might be outside my reach:
- Browser console errors (F12 in browser)
- MCP server internal state
- System-level issues
Would you like me to investigate something specific further?
```
## Important Notes
- **Focus on manual testing scenarios** - This is for debugging during implementation
- **Always require problem description** - Can't debug without knowing what's wrong
- **Read files completely** - No limit/offset when reading context
- **Think like `commit` or `describe_pr`** - Understand git state and changes
- **Guide back to user** - Some issues (browser console, MCP internals) are outside reach
- **No file editing** - Pure investigation only
## Quick Reference
**Find Latest Logs**:
```bash
ls -t ~/.humanlayer/logs/daemon-*.log | head -1
ls -t ~/.humanlayer/logs/wui-*.log | head -1
```
**Database Queries**:
```bash
sqlite3 ~/.humanlayer/daemon.db ".tables"
sqlite3 ~/.humanlayer/daemon.db ".schema sessions"
sqlite3 ~/.humanlayer/daemon.db "SELECT * FROM sessions ORDER BY created_at DESC LIMIT 5;"
```
**Service Check**:
```bash
ps aux | grep hld # Is daemon running?
ps aux | grep wui # Is WUI running?
```
**Git State**:
```bash
git status
git log --oneline -10
git diff
```
Remember: This command helps you investigate without burning the primary window's context. Perfect for when you hit an issue during manual testing and need to dig into logs, database, or git state.


@@ -0,0 +1,71 @@
# Generate PR Description
You are tasked with generating a comprehensive pull request description following the repository's standard template.
## Steps to follow:
1. **Read the PR description template:**
- First, check if `thoughts/shared/pr_description.md` exists
- If it doesn't exist, inform the user that their `humanlayer thoughts` setup is incomplete and they need to create a PR description template at `thoughts/shared/pr_description.md`
- Read the template carefully to understand all sections and requirements
2. **Identify the PR to describe:**
- Check if the current branch has an associated PR: `gh pr view --json url,number,title,state 2>/dev/null`
- If no PR exists for the current branch, or if on main/master, list open PRs: `gh pr list --limit 10 --json number,title,headRefName,author`
- Ask the user which PR they want to describe
3. **Check for existing description:**
- Check if `thoughts/shared/prs/{number}_description.md` already exists
- If it exists, read it and inform the user you'll be updating it
- Consider what has changed since the last description was written
4. **Gather comprehensive PR information:**
- Get the full PR diff: `gh pr diff {number}`
- If you get an error about no default remote repository, instruct the user to run `gh repo set-default` and select the appropriate repository
- Get commit history: `gh pr view {number} --json commits`
- Review the base branch: `gh pr view {number} --json baseRefName`
- Get PR metadata: `gh pr view {number} --json url,title,number,state`
5. **Analyze the changes thoroughly:** (ultrathink about the code changes, their architectural implications, and potential impacts)
- Read through the entire diff carefully
- For context, read any files that are referenced but not shown in the diff
- Understand the purpose and impact of each change
- Identify user-facing changes vs internal implementation details
- Look for breaking changes or migration requirements
6. **Handle verification requirements:**
- Look for any checklist items in the "How to verify it" section of the template
- For each verification step:
- If it's a command you can run (like `make check test`, `npm test`, etc.), run it
- If it passes, mark the checkbox as checked: `- [x]`
- If it fails, keep it unchecked and note what failed: `- [ ]` with explanation
- If it requires manual testing (UI interactions, external services), leave unchecked and note for user
- Document any verification steps you couldn't complete
7. **Generate the description:**
- Fill out each section from the template thoroughly:
- Answer each question/section based on your analysis
- Be specific about problems solved and changes made
- Focus on user impact where relevant
- Include technical details in appropriate sections
- Write a concise changelog entry
- Ensure all checklist items are addressed (checked or explained)
8. **Save and sync the description:**
- Write the completed description to `thoughts/shared/prs/{number}_description.md`
- Run `humanlayer thoughts sync` to sync the thoughts directory
- Show the user the generated description
9. **Update the PR:**
- Update the PR description directly: `gh pr edit {number} --body-file thoughts/shared/prs/{number}_description.md`
- Confirm the update was successful
- If any verification steps remain unchecked, remind the user to complete them before merging
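The verification handling in step 6 could be sketched as a small wrapper that runs a command and emits the matching checklist line. `verify_step` is a hypothetical name, and the exact checklist wording should come from the template, not from this sketch.

```bash
# Hypothetical helper: run a verification command and print a checklist line.
# Prints "- [x] `cmd`" on success, "- [ ] `cmd` (exit status N)" on failure.
verify_step() {
  local status=0
  "$@" >/dev/null 2>&1 || status=$?
  if [ "$status" -eq 0 ]; then
    printf '%s\n' "- [x] \`$*\`"
  else
    printf '%s\n' "- [ ] \`$*\` (exit status $status)"
  fi
}
```

Usage would be along the lines of `verify_step make check test`. Steps that need manual testing are not covered and stay unchecked.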
## Important notes:
- This command works across different repositories - always read the local template
- Be thorough but concise - descriptions should be scannable
- Focus on the "why" as much as the "what"
- Include any breaking changes or migration notes prominently
- If the PR touches multiple components, organize the description accordingly
- Always attempt to run verification commands when possible
- Clearly communicate which verification steps need manual testing


@@ -0,0 +1,65 @@
# Implement Plan
You are tasked with implementing an approved technical plan from `thoughts/shared/plans/`. These plans contain phases with specific changes and success criteria.
## Getting Started
When given a plan path:
- Read the plan completely and check for any existing checkmarks (- [x])
- Read the original ticket and all files mentioned in the plan
- **Read files fully** - never use limit/offset parameters, you need complete context
- Think deeply about how the pieces fit together
- Create a todo list to track your progress
- Start implementing if you understand what needs to be done
If no plan path provided, ask for one.
## Implementation Philosophy
Plans are carefully designed, but reality can be messy. Your job is to:
- Follow the plan's intent while adapting to what you find
- Implement each phase fully before moving to the next
- Verify your work makes sense in the broader codebase context
- Update checkboxes in the plan as you complete sections
When things don't match the plan exactly, think about why and communicate clearly. The plan is your guide, but your judgment matters too.
If you encounter a mismatch:
- STOP and think deeply about why the plan can't be followed
- Present the issue clearly:
```
Issue in Phase [N]:
Expected: [what the plan says]
Found: [actual situation]
Why this matters: [explanation]
How should I proceed?
```
## Verification Approach
After implementing a phase:
- Run the success criteria checks (usually `cargo test -p [crate_name]` covers everything)
- Fix any issues before proceeding
- Update your progress in both the plan and your todos
- Check off completed items in the plan file itself using Edit
Don't let verification interrupt your flow - batch it at natural stopping points.
## If You Get Stuck
When something isn't working as expected:
- First, make sure you've read and understood all the relevant code
- Consider if the codebase has evolved since the plan was written
- Present the mismatch clearly and ask for guidance
Use sub-tasks sparingly - mainly for targeted debugging or exploring unfamiliar territory.
## Resuming Work
If the plan has existing checkmarks:
- Trust that completed work is done
- Pick up from the first unchecked item
- Verify previous work only if something seems off
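Finding the resume point can be sketched as below, assuming plan checkboxes use the `- [ ]` / `- [x]` markdown form mentioned above. `first_unchecked` is a hypothetical helper, not an existing command.

```bash
# Hypothetical helper: print the first unchecked item of a plan as "LINE:TEXT".
# Returns non-zero when every checkbox is already checked.
first_unchecked() {
  grep -n -m 1 -E '^[[:space:]]*- \[ \]' "$1"
}
```

A non-zero exit status means there is nothing left to pick up.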
Remember: You're implementing a solution, not just checking boxes. Keep the end goal in mind and maintain forward momentum.


@@ -0,0 +1,44 @@
# Local Review
You are tasked with setting up a local review environment for a colleague's branch. This involves creating a worktree, setting up dependencies, and launching a new Claude Code session.
## Process
When invoked with a parameter like `gh_username:branchName`:
1. **Parse the input**:
- Extract GitHub username and branch name from the format `username:branchname`
- If no parameter provided, ask for it in the format: `gh_username:branchName`
2. **Extract ticket information**:
- Look for ticket numbers in the branch name (e.g., `eng-1696`, `ENG-1696`)
- Use this to create a short worktree directory name
- If no ticket found, use a sanitized version of the branch name
3. **Set up the remote and worktree**:
- Check if the remote already exists using `git remote -v`
- If not, add it: `git remote add USERNAME git@github.com:USERNAME/humanlayer`
- Fetch from the remote: `git fetch USERNAME`
- Create worktree: `git worktree add -b BRANCHNAME ~/wt/humanlayer/SHORT_NAME USERNAME/BRANCHNAME`
4. **Configure the worktree**:
- Copy Claude settings: `cp .claude/settings.local.json WORKTREE/.claude/`
- Run setup: `make -C WORKTREE setup`
- Initialize thoughts: `cd WORKTREE && npx humanlayer thoughts init --directory humanlayer`
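Steps 1 and 2 can be sketched as a single parsing function. This is an illustration under assumptions: `parse_review_target` is a hypothetical name, the ticket pattern (letters, a dash, digits) is inferred from the `eng-1696` example, and lowercasing the ticket and replacing unsafe characters with `-` are guesses at what "short" and "sanitized" mean here.

```bash
# Hypothetical helper: split "gh_username:branchName" and derive a worktree name.
# Prints "USERNAME BRANCHNAME SHORT_NAME"; returns 1 on malformed input.
parse_review_target() {
  local input="$1"
  case "$input" in
    ?*:?*) ;;
    *) echo "expected gh_username:branchName" >&2; return 1 ;;
  esac
  local username="${input%%:*}"   # text before the first colon
  local branch="${input#*:}"      # text after the first colon
  local short_name
  # Prefer a ticket number such as eng-1696 (lowercased) when one is present
  short_name=$(printf '%s\n' "$branch" | grep -oE '[A-Za-z]+-[0-9]+' | head -n 1 | tr '[:upper:]' '[:lower:]')
  if [ -z "$short_name" ]; then
    # Otherwise fall back to the branch name with unsafe characters replaced
    short_name=$(printf '%s' "$branch" | tr -c 'A-Za-z0-9._-' '-')
  fi
  printf '%s %s %s\n' "$username" "$branch" "$short_name"
}
```

The three fields map onto `USERNAME`, `BRANCHNAME`, and `SHORT_NAME` in the `git remote add` and `git worktree add` commands above.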
## Error Handling
- If worktree already exists, inform the user they need to remove it first
- If remote fetch fails, check if the username/repo exists
- If setup fails, provide the error but continue with the launch
## Example Usage
```
/local_review samdickson22:sam/eng-1696-hotkey-for-yolo-mode
```
This will:
- Add 'samdickson22' as a remote
- Create worktree at `~/wt/humanlayer/eng-1696`
- Set up the environment


@@ -0,0 +1,28 @@
## PART I - IF A TICKET IS MENTIONED
0c. use `linear` cli to fetch the selected item into thoughts with the ticket number - ./thoughts/shared/tickets/ENG-xxxx.md
0d. read the ticket and all comments to understand the implementation plan and any concerns
## PART I - IF NO TICKET IS MENTIONED
0. read .claude/commands/linear.md
0a. fetch the top 10 priority items from linear in status "ready for dev" using the MCP tools, noting all items in the `links` section
0b. select the highest priority SMALL or XS issue from the list (if no SMALL or XS issues exist, EXIT IMMEDIATELY and inform the user)
0c. use `linear` cli to fetch the selected item into thoughts with the ticket number - ./thoughts/shared/tickets/ENG-xxxx.md
0d. read the ticket and all comments to understand the implementation plan and any concerns
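The selection rule in 0b can be sketched as a filter over a ticket list. Everything about the input format is an assumption made for illustration: tab-separated `priority`, `size`, `id` columns on stdin, with a lower number meaning higher priority. `pick_ticket` is a hypothetical name, and the real data comes from the Linear MCP tools, not from this script.

```bash
# Hypothetical helper: read "priority<TAB>size<TAB>id" lines on stdin and print
# the id of the highest-priority SMALL or XS ticket. Returns non-zero if none.
pick_ticket() {
  awk -F '\t' '$2 == "XS" || $2 == "SMALL"' \
    | sort -t "$(printf '\t')" -k1,1n -s \
    | head -n 1 \
    | cut -f 3 \
    | grep .
}
```

The trailing `grep .` makes the function fail when no ticket qualifies, which corresponds to the EXIT IMMEDIATELY case.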
## PART II - NEXT STEPS
think deeply
1. move the item to "in dev" using the MCP tools
1a. identify the linked implementation plan document from the `links` section
1b. if no plan exists, move the ticket back to "ready for spec" and EXIT with an explanation
think deeply about the implementation
2. set up worktree for implementation:
2a. read `hack/create_worktree.sh` and create a new worktree with the Linear branch name: `./hack/create_worktree.sh ENG-XXXX BRANCH_NAME`
2b. launch implementation session: `npx humanlayer launch --model opus -w ~/wt/humanlayer/ENG-XXXX "/implement_plan and when you are done implementing and all tests pass, read .claude/commands/commit.md and create a commit, then read .claude/commands/describe_pr.md and create a PR, then add a comment to the Linear ticket with the PR link"`
think deeply, use TodoWrite to track your tasks. When fetching from linear, get the top 10 items by priority but only work on ONE item - specifically the highest priority SMALL or XS sized issue.


@@ -0,0 +1,30 @@
## PART I - IF A TICKET IS MENTIONED
0c. use `linear` cli to fetch the selected item into thoughts with the ticket number - ./thoughts/shared/tickets/ENG-xxxx.md
0d. read the ticket and all comments to learn about past implementations and research, and any questions or concerns about them
## PART I - IF NO TICKET IS MENTIONED
0. read .claude/commands/linear.md
0a. fetch the top 10 priority items from linear in status "ready for spec" using the MCP tools, noting all items in the `links` section
0b. select the highest priority SMALL or XS issue from the list (if no SMALL or XS issues exist, EXIT IMMEDIATELY and inform the user)
0c. use `linear` cli to fetch the selected item into thoughts with the ticket number - ./thoughts/shared/tickets/ENG-xxxx.md
0d. read the ticket and all comments to learn about past implementations and research, and any questions or concerns about them
## PART II - NEXT STEPS
think deeply
1. move the item to "plan in progress" using the MCP tools
1a. read .claude/commands/create_plan.md
1b. determine if the item has a linked implementation plan document based on the `links` section
1d. if the plan exists, you're done, respond with a link to the ticket
1e. if the research is insufficient or has unanswered questions, create a new plan document following the instructions in .claude/commands/create_plan.md
think deeply
2. when the plan is complete, `humanlayer thoughts sync` and attach the doc to the ticket using the MCP tools and create a terse comment with a link to it (re-read .claude/commands/linear.md if needed)
2a. move the item to "plan in review" using the MCP tools
think deeply, use TodoWrite to track your tasks. When fetching from linear, get the top 10 items by priority but only work on ONE item - specifically the highest priority SMALL or XS sized issue.


@@ -0,0 +1,46 @@
## PART I - IF A LINEAR TICKET IS MENTIONED
0c. use `linear` cli to fetch the selected item into thoughts with the ticket number - ./thoughts/shared/tickets/ENG-xxxx.md
0d. read the ticket and all comments to understand what research is needed and any previous attempts
## PART I - IF NO TICKET IS MENTIONED
0. read .claude/commands/linear.md
0a. fetch the top 10 priority items from linear in status "research needed" using the MCP tools, noting all items in the `links` section
0b. select the highest priority SMALL or XS issue from the list (if no SMALL or XS issues exist, EXIT IMMEDIATELY and inform the user)
0c. use `linear` cli to fetch the selected item into thoughts with the ticket number - ./thoughts/shared/tickets/ENG-xxxx.md
0d. read the ticket and all comments to understand what research is needed and any previous attempts
## PART II - NEXT STEPS
think deeply
1. move the item to "research in progress" using the MCP tools
1a. read any linked documents in the `links` section to understand context
1b. if insufficient information to conduct research, add a comment asking for clarification and move back to "research needed"
think deeply about the research needs
2. conduct the research:
2a. read .claude/commands/research_codebase.md for guidance on effective codebase research
2b. if the linear comments suggest web research is needed, use WebSearch to research external solutions, APIs, or best practices
2c. search the codebase for relevant implementations and patterns
2d. examine existing similar features or related code
2e. identify technical constraints and opportunities
2f. Be unbiased - don't think too much about an ideal implementation plan, just document all related files and how the systems work today
2g. document findings in a new thoughts document: `thoughts/shared/research/ENG-XXXX_research.md`
think deeply about the findings
3. synthesize research into actionable insights:
3a. summarize key findings and technical decisions
3b. identify potential implementation approaches
3c. note any risks or concerns discovered
3d. run `humanlayer thoughts sync` to save the research
4. update the ticket:
4a. attach the research document to the ticket using the MCP tools with proper link formatting
4b. add a comment summarizing the research outcomes
4c. move the item to "research in review" using the MCP tools
think deeply, use TodoWrite to track your tasks. When fetching from linear, get the top 10 items by priority but only work on ONE item - specifically the highest priority issue.


@@ -0,0 +1,172 @@
# Research Codebase
You are tasked with conducting comprehensive research across the codebase to answer user questions by spawning parallel sub-agents and synthesizing their findings.
## Initial Setup:
When this command is invoked, respond with:
```
I'm ready to research the codebase. Please provide your research question or area of interest, and I'll analyze it thoroughly by exploring relevant components and connections.
```
Then wait for the user's research query.
## Steps to follow after receiving the research query:
1. **Read any directly mentioned files first:**
- If the user mentions specific files (crates, docs, JSON), read them FULLY first
- **IMPORTANT**: Use the Read tool WITHOUT limit/offset parameters to read entire files
- **CRITICAL**: Read these files yourself in the main context before spawning any sub-tasks
- This ensures you have full context before decomposing the research
2. **Analyze and decompose the research question:**
- Break down the user's query into composable research areas
- Take time to ultrathink about the underlying patterns, connections, and architectural implications the user might be seeking
- Identify specific components, patterns, or concepts to investigate
- Create a research plan using TodoWrite to track all subtasks
- Consider which directories, files, or architectural patterns are relevant
3. **Spawn parallel sub-agent tasks for comprehensive research:**
- Create multiple Task agents to research different aspects concurrently
- We now have specialized agents that know how to do specific research tasks:
**For codebase research:**
- Use the **codebase-locator** agent to find WHERE files and components live
- Use the **codebase-analyzer** agent to understand HOW specific code works
The key is to use these agents intelligently:
- Start with locator agents to find what exists
- Then use analyzer agents on the most promising findings
- Run multiple agents in parallel when they're searching for different things
- Each agent knows its job - just tell it what you're looking for
- Don't write detailed prompts about HOW to search - the agents already know
4. **Wait for all sub-agents to complete and synthesize findings:**
- IMPORTANT: Wait for ALL sub-agent tasks to complete before proceeding
- Compile all sub-agent results (both codebase and thoughts findings)
- Prioritize live codebase findings as primary source of truth
- Use thoughts/ findings as supplementary historical context
- Connect findings across different components
- Include specific file paths and line numbers for reference
- Verify all thoughts/ paths are correct (e.g., thoughts/allison/ not thoughts/shared/ for personal files)
- Highlight patterns, connections, and architectural decisions
- Answer the user's specific questions with concrete evidence
5. **Gather metadata for the research document:**
- Run the `zed/script/spec_metadata.sh` script to generate all relevant metadata
- Filename: `thoughts/shared/research/YYYY-MM-DD_HH-MM-SS_topic.md`
6. **Generate research document:**
- Use the metadata gathered in step 5
- Structure the document with YAML frontmatter followed by content:
```markdown
---
date: [Current date and time with timezone in ISO format]
researcher: [Researcher name from thoughts status]
git_commit: [Current commit hash]
branch: [Current branch name]
repository: [Repository name]
topic: "[User's Question/Topic]"
tags: [research, codebase, relevant-component-names]
status: complete
last_updated: [Current date in YYYY-MM-DD format]
last_updated_by: [Researcher name]
---
# Research: [User's Question/Topic]
**Date**: [Current date and time with timezone from step 5]
**Researcher**: [Researcher name from thoughts status]
**Git Commit**: [Current commit hash from step 5]
**Branch**: [Current branch name from step 5]
**Repository**: [Repository name]
## Research Question
[Original user query]
## Summary
[High-level findings answering the user's question]
## Detailed Findings
### [Component/Area 1]
- Finding with reference ([file.ext:line](link))
- Connection to other components
- Implementation details
### [Component/Area 2]
...
## Code References
- `path/to/file.py:123` - Description of what's there
- `another/file.ts:45-67` - Description of the code block
## Architecture Insights
[Patterns, conventions, and design decisions discovered]
## Historical Context (from thoughts/)
[Relevant insights from thoughts/ directory with references]
- `thoughts/shared/something.md` - Historical decision about X
- `thoughts/local/notes.md` - Past exploration of Y
Note: Paths exclude "searchable/" even if found there
## Related Research
[Links to other research documents in thoughts/shared/research/]
## Open Questions
[Any areas that need further investigation]
```
7. **Add GitHub permalinks (if applicable):**
- Check if on main branch or if commit is pushed: `git branch --show-current` and `git status`
- If on main/master or pushed, generate GitHub permalinks:
- Get repo info: `gh repo view --json owner,name`
- Create permalinks: `https://github.com/{owner}/{repo}/blob/{commit}/{file}#L{line}`
- Replace local file references with permalinks in the document
8. **Handle follow-up questions:**
- If the user has follow-up questions, append to the same research document
- Update the frontmatter fields `last_updated` and `last_updated_by` to reflect the update
- Add `last_updated_note: "Added follow-up research for [brief description]"` to frontmatter
- Add a new section: `## Follow-up Research [timestamp]`
- Spawn new sub-agents as needed for additional investigation
- Continue updating the document and syncing
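The file name from step 5 and the permalink format from step 7 can be sketched as two small helpers. Both names are hypothetical, and turning the topic into a lowercase dash-separated slug is an assumption, since the steps only say `topic`.

```bash
# Hypothetical helper: build the research document path.
# $1 = timestamp (e.g. from `date +%Y-%m-%d_%H-%M-%S`), $2 = free-form topic
research_filename() {
  local timestamp="$1"
  local topic="$2"
  local slug
  # Lowercase, collapse every run of non-alphanumerics to one dash, trim the ends
  slug=$(printf '%s' "$topic" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | sed -e 's/^-//' -e 's/-$//')
  printf 'thoughts/shared/research/%s_%s.md\n' "$timestamp" "$slug"
}

# Hypothetical helper: build a GitHub permalink.
# $1 = owner, $2 = repo, $3 = commit, $4 = file path, $5 = line number
github_permalink() {
  printf 'https://github.com/%s/%s/blob/%s/%s#L%s\n' "$1" "$2" "$3" "$4" "$5"
}
```

The owner and repo would come from `gh repo view --json owner,name`, and the commit from the metadata gathered in step 5.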
## Important notes:
- Always use parallel Task agents to maximize efficiency and minimize context usage
- Always run fresh codebase research - never rely solely on existing research documents
- The thoughts/ directory provides historical context to supplement live findings
- Focus on finding concrete file paths and line numbers for developer reference
- Research documents should be self-contained with all necessary context
- Each sub-agent prompt should be specific and focused on read-only operations
- Consider cross-component connections and architectural patterns
- Include temporal context (when the research was conducted)
- Link to GitHub when possible for permanent references
- Keep the main agent focused on synthesis, not deep file reading
- Encourage sub-agents to find examples and usage patterns, not just definitions
- Explore all of thoughts/ directory, not just research subdirectory
- **File reading**: Always read mentioned files FULLY (no limit/offset) before spawning sub-tasks
- **Critical ordering**: Follow the numbered steps exactly
- ALWAYS read mentioned files first before spawning sub-tasks (step 1)
- ALWAYS wait for all sub-agents to complete before synthesizing (step 4)
- ALWAYS gather metadata before writing the document (step 5 before step 6)
- NEVER write the research document with placeholder values
- **Frontmatter consistency**:
- Always include frontmatter at the beginning of research documents
- Keep frontmatter fields consistent across all research documents
- Update frontmatter when adding follow-up research
- Use snake_case for multi-word field names (e.g., `last_updated`, `git_commit`)
- Tags should be relevant to the research topic and components studied


@@ -0,0 +1,162 @@
# Validate Plan
You are tasked with validating that an implementation plan was correctly executed, verifying all success criteria and identifying any deviations or issues.
## Initial Setup
When invoked:
1. **Determine context** - Are you in an existing conversation or starting fresh?
- If existing: Review what was implemented in this session
- If fresh: Need to discover what was done through git and codebase analysis
2. **Locate the plan**:
- If plan path provided, use it
- Otherwise, search recent commits for plan references or ask user
3. **Gather implementation evidence**:
```bash
# Check recent commits
git log --oneline -n 20
git diff HEAD~N..HEAD # Where N covers implementation commits
# Run comprehensive checks
cd $(git rev-parse --show-toplevel) && make check test
```
## Validation Process
### Step 1: Context Discovery
If starting fresh or need more context:
1. **Read the implementation plan** completely
2. **Identify what should have changed**:
- List all files that should be modified
- Note all success criteria (automated and manual)
- Identify key functionality to verify
3. **Spawn parallel research tasks** to discover implementation:
```
Task 1 - Verify database changes:
Research if migration [N] was added and schema changes match plan.
Check: migration files, schema version, table structure
Return: What was implemented vs what plan specified
Task 2 - Verify code changes:
Find all modified files related to [feature].
Compare actual changes to plan specifications.
Return: File-by-file comparison of planned vs actual
Task 3 - Verify test coverage:
Check if tests were added/modified as specified.
Run test commands and capture results.
Return: Test status and any missing coverage
```
### Step 2: Systematic Validation
For each phase in the plan:
1. **Check completion status**:
- Look for checkmarks in the plan (- [x])
- Verify the actual code matches claimed completion
2. **Run automated verification**:
- Execute each command from "Automated Verification"
- Document pass/fail status
- If failures, investigate root cause
3. **Assess manual criteria**:
- List what needs manual testing
- Provide clear steps for user verification
4. **Think deeply about edge cases**:
- Were error conditions handled?
- Are there missing validations?
- Could the implementation break existing functionality?
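The completion check in item 1 can start from a quick count of checkboxes in the plan file. This is a sketch that assumes the `- [x]` / `- [ ]` checkbox form. `plan_progress` is a hypothetical helper, and a checked box still has to be compared against the actual code.

```bash
# Hypothetical helper: print "CHECKED/TOTAL" checkbox counts for a plan file.
plan_progress() {
  local checked total
  # grep -c exits non-zero when the count is 0, so guard it with || true
  checked=$(grep -cE '^[[:space:]]*- \[x\]' "$1" || true)
  total=$(grep -cE '^[[:space:]]*- \[[ x]\]' "$1" || true)
  printf '%s/%s\n' "$checked" "$total"
}
```

A result such as `2/3` shows at a glance whether the plan claims to be complete before the deeper checks begin.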
### Step 3: Generate Validation Report
Create comprehensive validation summary:
```markdown
## Validation Report: [Plan Name]
### Implementation Status
✓ Phase 1: [Name] - Fully implemented
✓ Phase 2: [Name] - Fully implemented
⚠️ Phase 3: [Name] - Partially implemented (see issues)
### Automated Verification Results
✓ Build passes: `make build`
✓ Tests pass: `make test`
✗ Linting issues: `make lint` (3 warnings)
### Code Review Findings
#### Matches Plan:
- Database migration correctly adds [table]
- API endpoints implement specified methods
- Error handling follows plan
#### Deviations from Plan:
- Used different variable names in [file:line]
- Added extra validation in [file:line] (improvement)
#### Potential Issues:
- Missing index on foreign key could impact performance
- No rollback handling in migration
### Manual Testing Required:
1. UI functionality:
- [ ] Verify [feature] appears correctly
- [ ] Test error states with invalid input
2. Integration:
- [ ] Confirm works with existing [component]
- [ ] Check performance with large datasets
### Recommendations:
- Address linting warnings before merge
- Consider adding integration test for [scenario]
- Document new API endpoints
```
## Working with Existing Context
If you were part of the implementation:
- Review the conversation history
- Check your todo list for what was completed
- Focus validation on work done in this session
- Be honest about any shortcuts or incomplete items
## Important Guidelines
1. **Be thorough but practical** - Focus on what matters
2. **Run all automated checks** - Don't skip verification commands
3. **Document everything** - Both successes and issues
4. **Think critically** - Question if the implementation truly solves the problem
5. **Consider maintenance** - Will this be maintainable long-term?
## Validation Checklist
Always verify:
- [ ] All phases marked complete are actually done
- [ ] Automated tests pass
- [ ] Code follows existing patterns
- [ ] No regressions introduced
- [ ] Error handling is robust
- [ ] Documentation updated if needed
- [ ] Manual test steps are clear
## Relationship to Other Commands
Recommended workflow:
1. `/implement_plan` - Execute the implementation
2. `/commit` - Create atomic commits for changes
3. `/validate_plan` - Verify implementation correctness
4. `/describe_pr` - Generate PR description
The validation works best after commits are made, as it can analyze the git history to understand what was implemented.
Remember: Good validation catches issues before they reach production. Be constructive but thorough in identifying gaps or improvements.

.claude/settings.json

@@ -0,0 +1,10 @@
{
"permissions": {
"allow": [
// "Bash(./hack/spec_metadata.sh)",
// "Bash(hack/spec_metadata.sh)",
// "Bash(bash hack/spec_metadata.sh)"
]
},
"enableAllProjectMcpServers": false
}


@@ -0,0 +1,31 @@
{
"permissions": {
"allow": [
"Read(/Users/mikaylamaki/projects/zed-work/zed-monorepo-real/**)",
"Read(/Users/nathan/src/agent-client-protocol/rust/**)",
"Read(/Users/nathan/src/agent-client-protocol/rust/**)",
"Read(/Users/nathan/src/agent-client-protocol/rust/**)",
"Read(/Users/nathan/src/agent-client-protocol/rust/**)",
"Bash(git add:*)",
"Read(/Users/nathan/src/agent-client-protocol/rust/**)",
"Bash(./script/spec_metadata.sh:*)",
"Bash(npm run generate:*)",
"Bash(npm run typecheck:*)",
"Bash(npm run:*)",
"Bash(npm install)",
"Bash(grep:*)",
"Bash(find:*)",
"Bash(node:*)",
"Bash(cargo check:*)",
"Bash(cargo test)",
"Bash(npx tsc:*)"
],
"additionalDirectories": [
"/Users/mikaylamaki/projects/zed-work/zed-monorepo-real/claude-code-acp/",
"/Users/mikaylamaki/projects/zed-work/zed-monorepo-real/agentic-coding-protocol/",
"/Users/nathan/src/agent",
"/Users/nathan/src/agent-client-protocol/",
"/Users/nathan/src/claude-code-acp"
]
}
}

4
Cargo.lock generated
View File

@@ -191,9 +191,7 @@ dependencies = [
[[package]]
name = "agent-client-protocol"
version = "0.0.31"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "289eb34ee17213dadcca47eedadd386a5e7678094095414e475965d1bcca2860"
version = "0.2.0-alpha.0"
dependencies = [
"anyhow",
"async-broadcast",

View File

@@ -426,7 +426,7 @@ zlog_settings = { path = "crates/zlog_settings" }
# External crates
#
agent-client-protocol = "0.0.31"
agent-client-protocol = { path = "../agent-client-protocol" }
aho-corasick = "1.1"
alacritty_terminal = { git = "https://github.com/zed-industries/alacritty.git", branch = "add-hush-login-flag" }
any_vec = "0.14"

View File

@@ -868,6 +868,11 @@ impl AcpThread {
&self.connection
}
/// Returns true if the agent supports custom slash commands.
pub fn supports_custom_commands(&self) -> bool {
self.prompt_capabilities.supports_custom_commands
}
pub fn action_log(&self) -> &Entity<ActionLog> {
&self.action_log
}
@@ -2638,6 +2643,7 @@ mod tests {
image: true,
audio: true,
embedded_context: true,
supports_custom_commands: false,
}),
cx,
)

View File

@@ -76,6 +76,9 @@ pub trait AgentConnection {
None
}
fn list_commands(&self, session_id: &acp::SessionId, cx: &mut App) -> Task<Result<acp::ListCommandsResponse>>;
fn run_command(&self, request: acp::RunCommandRequest, cx: &mut App) -> Task<Result<()>>;
fn into_any(self: Rc<Self>) -> Rc<dyn Any>;
}
@@ -338,6 +341,7 @@ mod test_support {
image: true,
audio: true,
embedded_context: true,
supports_custom_commands: false,
}),
cx,
)
@@ -440,6 +444,14 @@ mod test_support {
Some(Rc::new(StubAgentSessionEditor))
}
fn list_commands(&self, _session_id: &acp::SessionId, _cx: &mut App) -> Task<Result<acp::ListCommandsResponse>> {
Task::ready(Ok(acp::ListCommandsResponse { commands: vec![] }))
}
fn run_command(&self, _request: acp::RunCommandRequest, _cx: &mut App) -> Task<Result<()>> {
Task::ready(Ok(()))
}
fn into_any(self: Rc<Self>) -> Rc<dyn Any> {
self
}

View File

@@ -1027,6 +1027,19 @@ impl acp_thread::AgentConnection for NativeAgentConnection {
Some(Rc::new(self.clone()) as Rc<dyn acp_thread::AgentTelemetry>)
}
fn list_commands(&self, session_id: &acp::SessionId, _cx: &mut App) -> Task<Result<acp::ListCommandsResponse>> {
// Native agent doesn't support custom commands yet
let _session_id = session_id.clone();
Task::ready(Ok(acp::ListCommandsResponse {
commands: vec![],
}))
}
fn run_command(&self, _request: acp::RunCommandRequest, _cx: &mut App) -> Task<Result<()>> {
// Native agent doesn't support custom commands yet
Task::ready(Err(anyhow!("Custom commands not supported")))
}
fn into_any(self: Rc<Self>) -> Rc<dyn Any> {
self
}

View File

@@ -588,6 +588,7 @@ impl Thread {
image,
audio: false,
embedded_context: true,
supports_custom_commands: false,
}
}

View File

@@ -136,6 +136,7 @@ impl AcpConnection {
read_text_file: true,
write_text_file: true,
},
terminal: true,
},
})
.await?;
@@ -328,6 +329,23 @@ impl AgentConnection for AcpConnection {
.detach();
}
fn list_commands(&self, session_id: &acp::SessionId, cx: &mut App) -> Task<Result<acp::ListCommandsResponse>> {
let conn = self.connection.clone();
let session_id = session_id.clone();
cx.foreground_executor().spawn(async move {
conn.list_commands(acp::ListCommandsRequest { session_id }).await
.map_err(Into::into)
})
}
fn run_command(&self, request: acp::RunCommandRequest, cx: &mut App) -> Task<Result<()>> {
let conn = self.connection.clone();
cx.foreground_executor().spawn(async move {
conn.run_command(request).await
.map_err(Into::into)
})
}
fn into_any(self: Rc<Self>) -> Rc<dyn Any> {
self
}

View File

@@ -244,6 +244,7 @@ impl AgentConnection for ClaudeAgentConnection {
image: true,
audio: false,
embedded_context: true,
supports_custom_commands: false,
}),
cx,
)
@@ -339,6 +340,19 @@ impl AgentConnection for ClaudeAgentConnection {
.log_err();
}
fn list_commands(&self, session_id: &acp::SessionId, _cx: &mut App) -> Task<Result<acp::ListCommandsResponse>> {
// Claude agent doesn't support custom commands yet
let _session_id = session_id.clone();
Task::ready(Ok(acp::ListCommandsResponse {
commands: vec![],
}))
}
fn run_command(&self, _request: acp::RunCommandRequest, _cx: &mut App) -> Task<Result<()>> {
// Claude agent doesn't support custom commands yet
Task::ready(Err(anyhow!("Custom commands not supported")))
}
fn into_any(self: Rc<Self>) -> Rc<dyn Any> {
self
}

View File

@@ -1,10 +1,10 @@
use std::cell::Cell;
use std::cell::{Cell, RefCell};
use std::ops::Range;
use std::rc::Rc;
use std::sync::Arc;
use std::sync::atomic::AtomicBool;
use acp_thread::MentionUri;
use acp_thread::{AcpThread, MentionUri};
use agent_client_protocol as acp;
use agent2::{HistoryEntry, HistoryStore};
use anyhow::Result;
@@ -15,6 +15,7 @@ use language::{Buffer, CodeLabel, HighlightId};
use lsp::CompletionContext;
use project::{
Completion, CompletionIntent, CompletionResponse, Project, ProjectPath, Symbol, WorktreeId,
lsp_store::CompletionDocumentation,
};
use prompt_store::PromptStore;
use rope::Point;
@@ -32,6 +33,12 @@ use crate::context_picker::{
ContextPickerAction, ContextPickerEntry, ContextPickerMode, selection_ranges,
};
#[derive(Debug)]
enum CompletionType {
Mention(MentionCompletion),
SlashCommand(SlashCommandCompletion),
}
pub(crate) enum Match {
File(FileMatch),
Symbol(SymbolMatch),
@@ -47,6 +54,69 @@ pub struct EntryMatch {
entry: ContextPickerEntry,
}
#[derive(Debug, Clone)]
pub struct SlashCommandCompletion {
pub source_range: Range<usize>,
pub command_name: String,
pub argument: Option<String>,
}
impl SlashCommandCompletion {
fn try_parse(line: &str, offset_to_line: usize) -> Option<Self> {
let last_slash_start = line.rfind('/')?;
if last_slash_start >= line.len() {
return Some(Self {
source_range: last_slash_start + offset_to_line..last_slash_start + 1 + offset_to_line,
command_name: String::new(),
argument: None,
});
}
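// Check if slash is at word boundary (not preceded by alphanumeric).
// `last_slash_start` is a byte offset, so look at the preceding character
// by slicing instead of indexing with `chars().nth()`, which counts chars.
if line[..last_slash_start]
.chars()
.next_back()
.is_some_and(|c| c.is_alphanumeric())
{
return None;
}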
let rest_of_line = &line[last_slash_start + 1..];
let mut command_name = String::new();
let mut argument = None;
let mut parts = rest_of_line.split_whitespace();
let mut end = last_slash_start + 1;
if let Some(cmd_text) = parts.next() {
end += cmd_text.len();
command_name = cmd_text.to_string();
// Check for arguments after command name
match rest_of_line[cmd_text.len()..].find(|c: char| !c.is_whitespace()) {
Some(whitespace_count) => {
if let Some(arg_text) = parts.next() {
argument = Some(arg_text.to_string());
end += whitespace_count + arg_text.len();
}
}
None => {
// Rest of line is entirely whitespace
end += rest_of_line.len() - cmd_text.len();
}
}
}
Some(Self {
source_range: last_slash_start + offset_to_line..end + offset_to_line,
command_name,
argument,
})
}
}
impl Match {
pub fn score(&self) -> f64 {
match self {
@@ -67,6 +137,7 @@ pub struct ContextPickerCompletionProvider {
history_store: Entity<HistoryStore>,
prompt_store: Option<Entity<PromptStore>>,
prompt_capabilities: Rc<Cell<acp::PromptCapabilities>>,
thread: Rc<RefCell<Option<WeakEntity<AcpThread>>>>,
}
impl ContextPickerCompletionProvider {
@@ -83,9 +154,15 @@ impl ContextPickerCompletionProvider {
history_store,
prompt_store,
prompt_capabilities,
thread: Rc::new(RefCell::new(None)),
}
}
/// Set the ACP thread for slash command support
pub fn set_thread(&self, thread: WeakEntity<AcpThread>) {
*self.thread.borrow_mut() = Some(thread);
}
fn completion_for_entry(
entry: ContextPickerEntry,
source_range: Range<Anchor>,
@@ -645,22 +722,123 @@ impl CompletionProvider for ContextPickerCompletionProvider {
_window: &mut Window,
cx: &mut Context<Editor>,
) -> Task<Result<Vec<CompletionResponse>>> {
let state = buffer.update(cx, |buffer, _cx| {
// Get the buffer state first
let (line, offset_to_line) = buffer.update(cx, |buffer, _cx| {
let position = buffer_position.to_point(buffer);
let line_start = Point::new(position.row, 0);
let offset_to_line = buffer.point_to_offset(line_start);
let mut lines = buffer.text_for_range(line_start..position).lines();
let line = lines.next()?;
MentionCompletion::try_parse(
let line = lines.next().unwrap_or("");
(line.to_string(), offset_to_line)
});
// Then check for completions outside of the buffer update
let completion_state = {
// First try mention completion
if let Some(mention) = MentionCompletion::try_parse(
self.prompt_capabilities.get().embedded_context,
&line,
offset_to_line,
) {
Some(CompletionType::Mention(mention))
} else if let Some(thread) = self.thread.borrow().as_ref().cloned() {
// Then try slash command completion (only if thread supports commands)
if let Ok(supports_commands) = thread.read_with(cx, |thread, _| {
thread.supports_custom_commands()
}) {
if supports_commands {
if let Some(slash) = SlashCommandCompletion::try_parse(&line, offset_to_line) {
Some(CompletionType::SlashCommand(slash))
} else {
None
}
} else {
None
}
} else {
None
}
} else {
None
}
};
let Some(completion_type) = completion_state else {
return Task::ready(Ok(Vec::new()));
};
match completion_type {
CompletionType::Mention(state) => self.complete_mentions(state, buffer.clone(), buffer_position, cx),
CompletionType::SlashCommand(state) => self.complete_slash_commands(state, buffer.clone(), buffer_position, cx),
}
}
fn is_completion_trigger(
&self,
buffer: &Entity<language::Buffer>,
position: language::Anchor,
_text: &str,
_trigger_in_words: bool,
_menu_is_open: bool,
cx: &mut Context<Editor>,
) -> bool {
let buffer = buffer.read(cx);
let position = position.to_point(buffer);
let line_start = Point::new(position.row, 0);
let offset_to_line = buffer.point_to_offset(line_start);
let mut lines = buffer.text_for_range(line_start..position).lines();
if let Some(line) = lines.next() {
// Check for @ mention completions
if let Some(completion) = MentionCompletion::try_parse(
self.prompt_capabilities.get().embedded_context,
line,
offset_to_line,
)
});
let Some(state) = state else {
return Task::ready(Ok(Vec::new()));
};
) {
let in_range = completion.source_range.start <= offset_to_line + position.column as usize
&& completion.source_range.end >= offset_to_line + position.column as usize;
if in_range {
return true;
}
}
// Check for slash command completions (only if thread supports commands)
if let Some(thread) = self.thread.borrow().as_ref().cloned() {
if let Ok(supports_commands) = thread.read_with(cx, |thread, _| {
thread.supports_custom_commands()
}) {
if supports_commands {
if let Some(completion) = SlashCommandCompletion::try_parse(line, offset_to_line) {
let in_range = completion.source_range.start <= offset_to_line + position.column as usize
&& completion.source_range.end >= offset_to_line + position.column as usize;
return in_range;
}
}
}
}
false
} else {
false
}
}
fn sort_completions(&self) -> bool {
false
}
fn filter_completions(&self) -> bool {
false
}
}
impl ContextPickerCompletionProvider {
fn complete_mentions(
&self,
state: MentionCompletion,
buffer: Entity<Buffer>,
_buffer_position: Anchor,
cx: &mut Context<Editor>,
) -> Task<Result<Vec<CompletionResponse>>> {
let Some(workspace) = self.workspace.upgrade() else {
return Task::ready(Ok(Vec::new()));
};
@@ -753,49 +931,85 @@ impl CompletionProvider for ContextPickerCompletionProvider {
Ok(vec![CompletionResponse {
completions,
// Since this does its own filtering (see `filter_completions()` returns false),
// there is no benefit to computing whether this set of completions is incomplete.
is_incomplete: true,
}])
})
}
fn is_completion_trigger(
fn complete_slash_commands(
&self,
buffer: &Entity<language::Buffer>,
position: language::Anchor,
_text: &str,
_trigger_in_words: bool,
_menu_is_open: bool,
state: SlashCommandCompletion,
buffer: Entity<Buffer>,
_buffer_position: Anchor,
cx: &mut Context<Editor>,
) -> bool {
let buffer = buffer.read(cx);
let position = position.to_point(buffer);
let line_start = Point::new(position.row, 0);
let offset_to_line = buffer.point_to_offset(line_start);
let mut lines = buffer.text_for_range(line_start..position).lines();
if let Some(line) = lines.next() {
MentionCompletion::try_parse(
self.prompt_capabilities.get().embedded_context,
line,
offset_to_line,
)
.map(|completion| {
completion.source_range.start <= offset_to_line + position.column as usize
&& completion.source_range.end >= offset_to_line + position.column as usize
})
.unwrap_or(false)
} else {
false
}
}
) -> Task<Result<Vec<CompletionResponse>>> {
let Some(thread) = self.thread.borrow().as_ref().cloned() else {
return Task::ready(Ok(Vec::new()));
};
fn sort_completions(&self) -> bool {
false
}
let snapshot = buffer.read(cx).snapshot();
let source_range = snapshot.anchor_before(state.source_range.start)
..snapshot.anchor_after(state.source_range.end);
fn filter_completions(&self) -> bool {
false
let command_prefix = state.command_name.clone();
cx.spawn(async move |_, cx| {
// Get session ID and connection from the thread
let (session_id, connection) = thread.read_with(cx, |thread, _| {
(thread.session_id().clone(), thread.connection().clone())
})?;
// Fetch commands from the agent
let commands_task = cx.update(|cx| {
connection.list_commands(&session_id, cx)
})?;
let response = commands_task.await?;
// Filter commands matching the typed prefix
let matching_commands: Vec<_> = response.commands
.into_iter()
.filter(|cmd| {
// Match by prefix, or by case-insensitive substring
cmd.name.starts_with(&command_prefix) ||
cmd.name.to_lowercase().contains(&command_prefix.to_lowercase())
})
.collect();
// Convert to project::Completion following existing patterns
let completions: Vec<_> = matching_commands
.into_iter()
.map(|command| {
let new_text = if command.requires_argument {
format!("/{} ", command.name) // Add space for argument
} else {
format!("/{}", command.name)
};
Completion {
replace_range: source_range.clone(),
new_text: new_text.clone(),
label: CodeLabel::plain(command.name.clone(), None),
icon_path: Some(IconName::ZedAssistant.path().into()),
documentation: if !command.description.is_empty() {
Some(CompletionDocumentation::SingleLine(command.description.clone().into()))
} else {
None
},
source: project::CompletionSource::Custom,
insert_text_mode: None,
confirm: Some(Arc::new(move |_, _, _| {
// For now, just insert the text - command execution will be handled later
false
})),
}
})
.collect();
Ok(vec![CompletionResponse {
completions,
is_incomplete: false,
}])
})
}
}
@@ -928,6 +1142,7 @@ impl MentionCompletion {
}
}
#[cfg(test)]
mod tests {
use super::*;

View File

@@ -65,6 +65,7 @@ pub struct MessageEditor {
prompt_store: Option<Entity<PromptStore>>,
prevent_slash_commands: bool,
prompt_capabilities: Rc<Cell<acp::PromptCapabilities>>,
completion_provider: Rc<ContextPickerCompletionProvider>,
_subscriptions: Vec<Subscription>,
_parse_slash_command_task: Task<()>,
}
@@ -99,13 +100,15 @@ impl MessageEditor {
},
None,
);
let completion_provider = ContextPickerCompletionProvider::new(
let context_completion_provider = ContextPickerCompletionProvider::new(
cx.weak_entity(),
workspace.clone(),
history_store.clone(),
prompt_store.clone(),
prompt_capabilities.clone(),
);
let completion_provider = Rc::new(context_completion_provider);
let semantics_provider = Rc::new(SlashCommandSemanticsProvider {
range: Cell::new(None),
});
@@ -119,7 +122,7 @@ impl MessageEditor {
editor.set_show_indent_guides(false, cx);
editor.set_soft_wrap();
editor.set_use_modal_editing(true);
editor.set_completion_provider(Some(Rc::new(completion_provider)));
editor.set_completion_provider(Some(completion_provider.clone()));
editor.set_context_menu_options(ContextMenuOptions {
min_entries_visible: 12,
max_entries_visible: 12,
@@ -170,11 +173,17 @@ impl MessageEditor {
prompt_store,
prevent_slash_commands,
prompt_capabilities,
completion_provider,
_subscriptions: subscriptions,
_parse_slash_command_task: Task::ready(()),
}
}
pub fn set_thread(&mut self, thread: WeakEntity<acp_thread::AcpThread>, _cx: &mut Context<Self>) {
// Update the completion provider with the thread reference
self.completion_provider.set_thread(thread);
}
pub fn insert_thread_summary(
&mut self,
thread: agent2::DbThreadMetadata,
@@ -1905,6 +1914,7 @@ mod tests {
image: true,
audio: true,
embedded_context: true,
supports_custom_commands: false,
});
cx.simulate_input("Lorem ");

View File

@@ -551,10 +551,16 @@ impl AcpThreadView {
None
};
this.thread_state = ThreadState::Ready {
thread,
thread: thread.clone(),
title_editor,
_subscriptions: subscriptions,
};
// Update the message editor with the thread reference for slash command completion
this.message_editor.update(cx, |editor, cx| {
editor.set_thread(thread.downgrade(), cx);
});
this.message_editor.focus_handle(cx).focus(window);
this.profile_selector = this.as_native_thread(cx).map(|thread| {

25
script/spec_metadata.sh Executable file
View File

@@ -0,0 +1,25 @@
#!/usr/bin/env bash
set -euo pipefail
# Collect metadata
DATETIME_TZ=$(date '+%Y-%m-%d %H:%M:%S %Z')
FILENAME_TS=$(date '+%Y-%m-%d_%H-%M-%S')
if command -v git >/dev/null 2>&1 && git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
REPO_ROOT=$(git rev-parse --show-toplevel)
REPO_NAME=$(basename "$REPO_ROOT")
GIT_BRANCH=$(git branch --show-current 2>/dev/null || git rev-parse --abbrev-ref HEAD)
GIT_COMMIT=$(git rev-parse HEAD)
else
REPO_ROOT=""
REPO_NAME=""
GIT_BRANCH=""
GIT_COMMIT=""
fi
# Print similar to the individual command outputs
echo "Current Date/Time (TZ): $DATETIME_TZ"
[ -n "$GIT_COMMIT" ] && echo "Current Git Commit Hash: $GIT_COMMIT"
[ -n "$GIT_BRANCH" ] && echo "Current Branch Name: $GIT_BRANCH"
[ -n "$REPO_NAME" ] && echo "Repository Name: $REPO_NAME"
echo "Timestamp For Filename: $FILENAME_TS"

0
thoughts/.gitkeep Normal file
View File

File diff suppressed because it is too large

View File

@@ -0,0 +1,515 @@
---
date: 2025-08-28 15:34:28 PDT
researcher: Mikayla Maki
git_commit: 425291f0aed2abe148e1a8ea4eda74569e25c2b7
branch: claude-experiments
repository: zed
topic: "Custom Slash Commands for Agent Client Protocol"
tags: [research, codebase, acp, slash-commands, claude-code, protocol-extension]
status: complete
last_updated: 2025-08-28
last_updated_by: Nathan
last_updated_note: "Added detailed findings from agent-client-protocol and claude-code-acp repositories"
---
# Research: Custom Slash Commands for Agent Client Protocol
**Date**: 2025-08-28 15:34:28 PDT
**Researcher**: Mikayla Maki
**Git Commit**: 565782a1c769c90e58e012a80ea1c2d0cfcdb837
**Branch**: claude-experiments
**Repository**: zed
## Research Question
We're adding support for custom slash commands to Agent Client Protocol for the **agent panel** (not assistant 1/text threads). The client should be able to:
- List available commands
- Run a command with arguments (check Claude Code behavior)
In the Claude Code ACP adapter, we want to implement the _agent_ side of the protocol:
- List commands by reading out of the `.claude/commands` directory
- Run commands via the SDK
We need to update the protocol to support the new RPCs for listing and running commands.
We need to understand how to run commands via the SDK.
**Important Note**: This is for the agent panel UX, NOT the existing assistant/text thread slash commands. The existing slash command infrastructure is for assistant 1/text threads and is not relevant to this implementation.
## Summary
The research reveals the architecture needed for implementing custom slash commands in the **agent panel** via ACP:
**Agent Panel Architecture**: Separate UI system from assistant/text threads with dedicated components (`AgentPanel`, `AcpThreadView`) and message handling through ACP protocol integration.
**ACP Protocol**: JSON-RPC based with clear patterns for adding new RPC methods through request/response enums, method dispatch, and capability negotiation. Handles session management, tool calls, and real-time message streaming.
**Claude Commands Structure**: Markdown-based command definitions in `.claude/commands/` with consistent format, metadata, and programmatic parsing potential.
**SDK Integration**: Claude Code ACP adapter bridges ACP protocol with Claude SDK, providing tool execution and session management through MCP servers.
**Note**: The existing Zed slash command system (`SlashCommand` trait, `assistant_slash_command` crate) is **not relevant** - that's for assistant 1/text threads. The agent panel needs its own custom command implementation.
## Detailed Findings
### Agent Panel Architecture
**Core Infrastructure** (`crates/agent_ui/`):
- `agent_panel.rs:24` - Main `AgentPanel` struct and UI component
- `acp/thread_view.rs:315` - `AcpThreadView` component for individual agent conversations
- `acp/message_editor.rs` - Message input component with slash command integration for agent panel
- `acp.rs` - ACP module entry point connecting to external ACP agents
**Agent Panel vs Assistant Distinction**:
The agent panel is **completely separate** from the assistant/text thread system:
- Agent panel uses ACP (Agent Client Protocol) for external agent communication
- Assistant uses internal Zed slash commands and text thread editors
- Different UI components, different input handling, different protocol integration
**ACP Integration Flow**:
1. External ACP agent process spawned via `agent_servers/src/acp.rs:63-76`
2. JSON-RPC connection established over stdin/stdout at line 84
3. Protocol initialization with capability negotiation at line 131
4. Sessions created via `new_session()` request for isolated conversations
5. User input converted to `PromptRequest` and sent to ACP agent
6. Agent responses stream back as `SessionUpdate` notifications
7. UI updates processed in `AcpThread::handle_session_update()`
**Current Input Handling**:
- Message composition through `MessageEditor` and specialized ACP message editor
- Standard chat input without custom command support currently
- Integration with model selector, context strip, and profile management
### Agent Client Protocol RPC Patterns
**Core Structure** (`agent-client-protocol/rust/`):
- JSON-RPC based bidirectional communication via symmetric `Agent`/`Client` traits
- Type-safe request/response enums with `#[serde(untagged)]` routing
- Capability negotiation via `AgentCapabilities` and `ClientCapabilities`
- Auto-generated JSON Schema from Rust types via `JsonSchema` derives
**Agent Trait Methods** (`agent.rs:18-108`):
- `initialize()` - Connection establishment and capability negotiation
- `authenticate()` - Authentication using advertised methods
- `new_session()` - Creates conversation contexts
- `load_session()` - Loads existing sessions (capability-gated)
- `prompt()` - Processes user prompts with full lifecycle
- `cancel()` - Cancels ongoing operations
**Client Trait Methods** (`client.rs:19-114`):
- `request_permission()` - Requests user permission for tool calls
- `write_text_file()` / `read_text_file()` - File operations (capability-gated)
- `session_notification()` - Handles session updates from agent
**RPC Infrastructure Pattern**:
1. **Method Constants** - Define at lines `agent.rs:395-415` / `client.rs:451-485`
2. **Request/Response Structs** - With `#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]`
3. **Schema Annotations** - `#[schemars(extend("x-side" = "agent", "x-method" = "method_name"))]`
4. **Untagged Enums** - Add to `ClientRequest`/`AgentResponse` enums for message routing
5. **Trait Methods** - Add to `Agent`/`Client` traits with `impl Future` signatures
6. **Connection Methods** - Implement in `ClientSideConnection`/`AgentSideConnection`
7. **Message Handling** - Update `MessageHandler` implementations for dispatch
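The pattern above can be sketched in plain Rust. This is a simplified illustration, not the crate's code: the constant names, the `ClientRequest` variants, the `MethodNotFound` type, and `decode_request` are all assumptions made for the example, and real code would deserialize JSON parameters with serde rather than take strings.

```rust
// Hypothetical method-name constants (step 1 of the pattern).
pub const SESSION_NEW: &str = "session/new";
pub const SESSION_PROMPT: &str = "session/prompt";
pub const SESSION_CANCEL: &str = "session/cancel";

/// Simplified stand-in for the request enum used for routing (step 4).
#[derive(Debug, PartialEq)]
pub enum ClientRequest {
    NewSession { cwd: String },
    Prompt { session_id: String, text: String },
    Cancel { session_id: String },
}

/// Returned when no route is registered for a method name.
#[derive(Debug, PartialEq)]
pub struct MethodNotFound(pub String);

/// Maps a method name plus already-decoded parameters to a typed request
/// (step 7). Unknown methods produce an error instead of a panic.
pub fn decode_request(
    method: &str,
    session_id: &str,
    payload: &str,
) -> Result<ClientRequest, MethodNotFound> {
    match method {
        SESSION_NEW => Ok(ClientRequest::NewSession {
            cwd: payload.to_string(),
        }),
        SESSION_PROMPT => Ok(ClientRequest::Prompt {
            session_id: session_id.to_string(),
            text: payload.to_string(),
        }),
        SESSION_CANCEL => Ok(ClientRequest::Cancel {
            session_id: session_id.to_string(),
        }),
        other => Err(MethodNotFound(other.to_string())),
    }
}
```

Under this shape, adding an RPC touches three places: a new constant, a new enum variant, and a new match arm.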
**Protocol Versioning** (`version.rs:4-20`):
- Current: V1 with backward compatibility
- Breaking changes require version bump
- Non-breaking additions use capability flags
### .claude/commands Directory Structure
**Format**: Markdown files with consistent structure:
```markdown
# Command Name
[Description]
## Initial Response
[Standardized first response]
## Process Steps
### Step 1: [Phase Name]
[Instructions]
```
### Claude Code ACP Adapter Implementation
**Architecture** (`claude-code-acp/src/`):
- `acp-agent.ts:51` - `ClaudeAcpAgent` class implementing complete ACP `Agent` interface
- `mcp-server.ts:9` - Internal MCP proxy server for file operations and permissions
- `tools.ts:22` - Tool format conversion between Claude SDK and ACP representations
- Session management with UUID tracking and Claude SDK `Query` objects
**Agent Interface Implementation** (`acp-agent.ts:51-218`):
- `initialize()` at line 63: Declares capabilities (image, embedded_context) and auth methods
- `newSession()` at line 84: Creates UUID sessions with MCP server integration
- `prompt()` at line 140: Main query execution using Claude SDK with real-time streaming
- `cancel()` at line 211: Properly handles session cancellation and cleanup
**Session Lifecycle** (`acp-agent.ts:84-134`):
1. Generate UUID session ID and create pushable input stream
2. Configure MCP servers from ACP request parameters
3. Start internal HTTP-based MCP proxy server on dynamic port
4. Initialize Claude SDK query with working directory, MCP servers, tool permissions
5. Enable `mcp__acp__read` while disabling direct file tools for security
**Query Execution Flow** (`acp-agent.ts:140-209`):
1. Convert ACP prompt to Claude format via `promptToClaude()` at line 237
2. Push user message to Claude SDK input stream
3. Iterate through Claude SDK responses with real-time streaming
4. Handle system, result, user, and assistant message types
5. Convert Claude messages to ACP format via `toAcpNotifications()` at line 312
6. Stream session updates back to ACP client
**MCP Proxy Architecture** (`mcp-server.ts:9-449`):
- **Internal HTTP Server**: Creates MCP server for Claude SDK integration
- **Tool Implementations**:
- `read` (lines 19-94): Proxies to ACP client's `readTextFile()`
- `write` (lines 96-149): Proxies to ACP client's `writeTextFile()`
- `edit` (lines 152-239): Text replacement with line tracking
- `multi-edit` (lines 241-318): Sequential edit operations
- **Permission Integration**: Routes tool permission requests through ACP client
**Current Command Support**:
- **No existing slash command infrastructure** - all interactions use standard prompt interface
- **No `.claude/commands` directory integration** currently implemented
- **Command detection would require preprocessing** before Claude SDK integration
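As a rough idea of what such preprocessing could look like, here is a minimal sketch in Rust (the adapter itself is TypeScript, so this only illustrates the logic). `ParsedCommand` and `parse_leading_command` are hypothetical names, and the rule that a command must start the prompt is an assumption, not something the adapter defines.

```rust
/// A slash command split out of raw prompt text.
#[derive(Debug, PartialEq)]
pub struct ParsedCommand<'a> {
    pub name: &'a str,
    pub args: Option<&'a str>,
}

/// Returns `Some` when the prompt starts with `/name`, otherwise `None`.
/// Everything after the name (trimmed) is treated as a single argument string.
pub fn parse_leading_command(prompt: &str) -> Option<ParsedCommand<'_>> {
    let rest = prompt.trim_start().strip_prefix('/')?;
    // The command name runs up to the first whitespace character.
    let (name, tail) = match rest.find(char::is_whitespace) {
        Some(idx) => (&rest[..idx], rest[idx..].trim()),
        None => (rest, ""),
    };
    if name.is_empty() {
        return None;
    }
    Some(ParsedCommand {
        name,
        args: if tail.is_empty() { None } else { Some(tail) },
    })
}
```

A prompt that does not parse as a command would fall through to the normal prompt path unchanged.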
## Code References
### Zed Integration Layer
- `crates/agent_ui/src/agent_panel.rs:24` - Main AgentPanel component
- `crates/agent_ui/src/acp/thread_view.rs:315` - AcpThreadView UI component
- `crates/agent_ui/src/acp/message_editor.rs` - Agent panel message input
- `crates/agent_servers/src/acp.rs:63-162` - ACP connection establishment
- `crates/acp_thread/src/acp_thread.rs:826` - ACP thread creation
### Agent Client Protocol
- `agent-client-protocol/rust/agent.rs:18-108` - Agent trait with 6 core methods
- `agent-client-protocol/rust/client.rs:19-114` - Client trait for bidirectional communication
- `agent-client-protocol/rust/acp.rs:120` - ClientSideConnection implementation
- `agent-client-protocol/rust/acp.rs:341` - AgentSideConnection implementation
- `agent-client-protocol/rust/rpc.rs:30-367` - RPC connection infrastructure
- `agent-client-protocol/rust/agent.rs:333-371` - AgentCapabilities and PromptCapabilities
- `agent-client-protocol/rust/agent.rs:423-432` - ClientRequest/AgentResponse enum routing
- `agent-client-protocol/rust/generate.rs:24-77` - JSON schema generation
### Claude Code ACP Adapter
- `claude-code-acp/src/acp-agent.ts:51-218` - ClaudeAcpAgent implementing Agent interface
- `claude-code-acp/src/mcp-server.ts:9-449` - Internal MCP proxy server
- `claude-code-acp/src/tools.ts:22-395` - Tool format conversion
- `claude-code-acp/src/utils.ts:7-75` - Stream processing utilities
### Command Infrastructure
- `.claude/commands/*.md` - Command definition files (markdown format)
- No existing slash command infrastructure in claude-code-acp currently
## Architecture Insights
**Agent Panel System**: Completely separate from assistant/text threads. It uses ACP for external agent communication via JSON-RPC over stdin/stdout, manages sessions with unique IDs, and streams messages in real time with UI updates.
**ACP Protocol**: Designed for extensibility with capability negotiation, type safety through Rust enums, symmetric bidirectional design, and JSON-RPC foundation. Handles tool calls, permissions, and session management.
**Command Definitions**: Human-readable markdown with programmatically parseable structure, consistent metadata patterns, and workflow automation framework stored in `.claude/commands/`.
**Integration Patterns**: The Claude Code ACP adapter provides a proven pattern for bridging protocols; MCP servers enable tool execution proxying; session management handles concurrent interactions. The agent panel needs new command integration separate from existing slash commands.
## Implementation Recommendations
### 1. Protocol Extension for Custom Commands
Add new RPC methods following exact ACP patterns in `agent-client-protocol/rust/`:
**Method Constants** (`agent.rs:395-415`):
```rust
pub const SESSION_LIST_COMMANDS: &str = "session/list_commands";
pub const SESSION_RUN_COMMAND: &str = "session/run_command";
```
**Request/Response Types** (after `agent.rs:371`):
```rust
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
#[schemars(extend("x-side" = "agent", "x-method" = "session/list_commands"))]
#[serde(rename_all = "camelCase")]
pub struct ListCommandsRequest {
pub session_id: SessionId,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
#[schemars(extend("x-side" = "agent", "x-method" = "session/list_commands"))]
#[serde(rename_all = "camelCase")]
pub struct ListCommandsResponse {
pub commands: Vec<CommandInfo>,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
#[serde(rename_all = "camelCase")]
pub struct CommandInfo {
pub name: String,
pub description: String,
pub requires_argument: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
#[schemars(extend("x-side" = "agent", "x-method" = "session/run_command"))]
#[serde(rename_all = "camelCase")]
pub struct RunCommandRequest {
pub session_id: SessionId,
pub command: String,
pub args: Option<String>,
}
```
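Because these types carry `#[serde(rename_all = "camelCase")]`, the field names seen by JSON consumers differ from the Rust identifiers: `session_id` becomes `sessionId` and `requires_argument` becomes `requiresArgument`. The helper below approximates that renaming for snake_case field names. It is a hypothetical illustration (`snake_to_camel` is not part of serde or the protocol crate) and not serde's actual implementation.

```rust
/// Converts a snake_case identifier to camelCase, approximating the effect
/// of `rename_all = "camelCase"` on struct field names.
pub fn snake_to_camel(field: &str) -> String {
    let mut out = String::with_capacity(field.len());
    let mut upper_next = false;
    for ch in field.chars() {
        if ch == '_' {
            // Drop the underscore and capitalize the character that follows.
            upper_next = true;
        } else if upper_next {
            out.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            out.push(ch);
        }
    }
    out
}
```

This matters for the TypeScript side, which works with the wire names rather than the Rust ones.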
**Trait Extension** (add to `Agent` trait after `cancel()` at line 107):
```rust
fn list_commands(
&self,
arguments: ListCommandsRequest,
) -> impl Future<Output = Result<ListCommandsResponse, Error>>;
fn run_command(
&self,
arguments: RunCommandRequest,
) -> impl Future<Output = Result<(), Error>>;
```
**Enum Routing** (add these request variants to `ClientRequest` at line 423; `AgentResponse` would need a corresponding `ListCommandsResponse` variant):
```rust
ListCommandsRequest(ListCommandsRequest),
RunCommandRequest(RunCommandRequest),
```
**Capability Extension** (add to `PromptCapabilities` at line 358):
```rust
/// Agent supports custom slash commands via `list_commands` and `run_command`.
#[serde(default)]
pub supports_custom_commands: bool,
```
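With `#[serde(default)]`, a missing field falls back to `Default::default()`, which is `false` for `bool`, so an agent that never mentions the flag is treated as not supporting commands. A minimal sketch of how a client might gate the menu on it follows. The struct is a simplified mirror of the real type, and `slash_menu_enabled` is a hypothetical helper.

```rust
/// Simplified mirror of the prompt capabilities an agent advertises.
/// `Default` yields `false` for every flag, which is the value an older
/// agent effectively reports for a flag it does not know about.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct PromptCapabilities {
    pub image: bool,
    pub audio: bool,
    pub embedded_context: bool,
    pub supports_custom_commands: bool,
}

/// Decides whether the "/" menu should be offered.
/// `None` models "capabilities have not been negotiated yet".
pub fn slash_menu_enabled(caps: Option<PromptCapabilities>) -> bool {
    caps.map_or(false, |c| c.supports_custom_commands)
}
```

Gating on the flag keeps the addition non-breaking: clients never call the new methods on agents that did not opt in.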
### 2. Agent Panel UI Integration
**Option A**: Extend ACP Message Editor
- Modify `crates/agent_ui/src/acp/message_editor.rs` to detect custom commands
- Add command completion/suggestion UI similar to existing patterns
- Trigger custom command execution through ACP protocol
**Option B**: New Command Interface
- Create dedicated command input component in agent panel
- Separate from regular message input to provide distinct UX
- Integrate with `AcpThreadView` for command results display
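For either option, the suggestion list has to be narrowed as the user types. One possible approach, sketched under the assumption that prefix matches should rank ahead of substring matches (`filter_commands` is a hypothetical helper, not an existing Zed function):

```rust
/// Filters command names against what the user has typed after "/".
/// Case-insensitive prefix matches come first, followed by names that
/// merely contain the typed text; original order is kept within each group.
pub fn filter_commands<'a>(names: &[&'a str], typed: &str) -> Vec<&'a str> {
    let needle = typed.to_lowercase();
    let mut prefix = Vec::new();
    let mut substring = Vec::new();
    for &name in names {
        let lower = name.to_lowercase();
        if lower.starts_with(&needle) {
            prefix.push(name);
        } else if lower.contains(&needle) {
            substring.push(name);
        }
    }
    prefix.extend(substring);
    prefix
}
```

An empty `typed` string matches every name as a prefix, so the full list is shown right after the user types "/".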
### 3. ACP Agent Implementation
In the Claude Code ACP adapter, extend the `ClaudeAcpAgent` class at `claude-code-acp/src/acp-agent.ts:51`:
**Add Command Parser** (new module at `src/command-parser.ts`):
```typescript
import * as path from 'node:path';

export interface CommandInfo {
  name: string;
  description: string;
  requires_argument: boolean;
  content?: string;
}

export class CommandParser {
  private commandsDir: string;
  private cachedCommands?: CommandInfo[];

  constructor(cwd: string) {
    this.commandsDir = path.join(cwd, '.claude', 'commands');
  }

  async listCommands(): Promise<CommandInfo[]> {
    // TODO: Parse *.md files, extract H1 titles and descriptions
    throw new Error('listCommands not implemented');
  }

  async getCommand(name: string): Promise<CommandInfo | null> {
    // TODO: Return specific command with full content for execution
    throw new Error('getCommand not implemented');
  }
}
```
**Extend ClaudeAcpAgent** (add after line 218):
```typescript
private commandParser?: CommandParser;

// In constructor around line 60 (assumes `fs` and `path` are already imported):
if (options.cwd && fs.existsSync(path.join(options.cwd, '.claude', 'commands'))) {
  this.commandParser = new CommandParser(options.cwd);
}

// Update initialize() around line 68 to advertise capability.
// Wire field names are camelCase because the Rust types use
// #[serde(rename_all = "camelCase")]:
agentCapabilities: {
  promptCapabilities: {
    image: true,
    audio: false,
    embeddedContext: true,
    supportsCustomCommands: !!this.commandParser,
  },
}

async listCommands(request: ListCommandsRequest): Promise<ListCommandsResponse> {
  if (!this.commandParser) return { commands: [] };
  const commands = await this.commandParser.listCommands();
  return {
    // Map the parser's internal snake_case field to the camelCase wire field
    commands: commands.map(cmd => ({
      name: cmd.name,
      description: cmd.description,
      requiresArgument: cmd.requires_argument,
    }))
  };
}

async runCommand(request: RunCommandRequest): Promise<void> {
  if (!this.commandParser) throw new Error('Commands not supported');
  const command = await this.commandParser.getCommand(request.command);
  if (!command) throw new Error(`Command not found: ${request.command}`);
  // Execute command via existing session mechanism
  const session = this.sessions.get(request.sessionId);
  if (!session) throw new Error('Session not found');
  // Build the prompt from the command content (`content` is optional on CommandInfo)
  let prompt = command.content ?? '';
  if (command.requires_argument && request.args) {
    prompt += `\n\nArguments: ${request.args}`;
  }
  // Push as a user-role message and process via existing prompt flow
  session.input.push({ role: 'user', content: prompt });
  // Stream results back via existing session update mechanism
  // (handled automatically by query execution loop at line 150)
}
```
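The prompt-building step inside `runCommand` can be pulled out into a pure function, which makes it easy to test without a live session. The sketch below is one possible shape, not the adapter's actual code. The name `buildCommandPrompt` and the `ParsedCommand` type are hypothetical, and rejecting a missing argument (rather than silently sending the command without it) is an assumption about desired behavior.

```typescript
// Hypothetical subset of the parser's CommandInfo used for prompt building.
interface ParsedCommand {
  name: string;
  requires_argument: boolean;
  content?: string;
}

// Build the text that would be pushed into the session as a user message.
// Assumption: a command that requires an argument fails fast when none is given.
function buildCommandPrompt(command: ParsedCommand, args?: string | null): string {
  const body = command.content ?? ""; // content is optional, so default to empty
  if (!command.requires_argument) {
    return body; // arguments are ignored for commands that do not take any
  }
  if (!args || args.trim() === "") {
    throw new Error(`Command "${command.name}" requires an argument`);
  }
  return `${body}\n\nArguments: ${args}`;
}
```

Whether a missing argument should be an error or should fall back to the bare command body is a design choice that ties into the error-handling question under Open Questions.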
### 4. Command Parsing and Execution
Implement a Markdown parser for `.claude/commands/*.md`:
```typescript
function parseCommandFile(content: string): CommandInfo {
// Extract H1 title for name/description
// Find "Initial Response" section
// Parse metadata and requirements
// Return structured command info
}
```
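One way the outline above could be filled in is sketched below. It rests on several assumptions that the research has not confirmed: the command name is taken from the file name, the first H1 heading serves as the description, and a command is treated as requiring an argument when its body contains an `$ARGUMENTS` placeholder. The two-parameter signature is also a deviation from the outline, added so the name can come from the file. The "Initial Response" section and other metadata are not handled here.

```typescript
interface CommandInfo {
  name: string;
  description: string;
  requires_argument: boolean;
  content?: string;
}

// Sketch of a parser for a single command file.
// Assumptions (not confirmed by the source):
//   - name comes from the file name without its ".md" extension
//   - description is the text of the first H1 heading, if any
//   - "$ARGUMENTS" in the body signals that an argument is required
function parseCommandFile(fileName: string, content: string): CommandInfo {
  const commandName = fileName.replace(/\.md$/i, "");
  // Match "# Title" but not "## Subtitle": the second character must be whitespace.
  const h1 = content.split(/\r?\n/).find((line) => /^#\s+\S/.test(line));
  const description = h1 ? h1.replace(/^#\s+/, "").trim() : commandName;
  return {
    name: commandName,
    description,
    requires_argument: content.includes("$ARGUMENTS"),
    content,
  };
}
```

This line-based scan does not skip fenced code blocks, so a `# comment` line inside a fence that appears before the real heading would be picked up as the title. A real implementation would likely need a proper Markdown parser.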
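Execute commands by sending the command content as a prompt to the Claude SDK through the existing session input, similar to existing ACP query patterns. Note that the `runCommand` sketch above pushes the content as a `user`-role message rather than as a true system prompt.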
## Open Questions
1. **Agent Panel UX**: Should custom commands be integrated into the existing message input or exposed through a separate command interface?
2. **Command Arguments**: How should complex command arguments be structured and validated in the agent panel context?
3. **Command Context**: Should commands have access to current ACP session state, file context, or conversation history?
4. **Command Discovery**: Should commands be cached or re-read on each listing request? How does this integrate with ACP session lifecycle?
5. **Command Execution**: Should commands run in isolated contexts or share ACP session state?
6. **Error Handling**: What's the appropriate error handling strategy for command parsing and execution failures in the agent panel?
7. **UI Integration**: How should command execution progress and results be displayed within the `AcpThreadView` component?
## Follow-up Research 2025-08-28 20:29:47 MDT
After gaining access to the actual `agent-client-protocol` and `claude-code-acp` repositories, I updated this research document with concrete implementation details:
### Key New Findings
**Agent Client Protocol Structure**: The protocol uses a symmetric `Agent`/`Client` trait design with `#[serde(untagged)]` enum routing, JSON schema generation, and explicit capability negotiation. Adding new RPC methods requires specific patterns for method constants, request/response structs, trait extensions, and enum routing.
**Claude Code ACP Adapter**: Implements the full Agent interface using Claude SDK integration via MCP proxy servers. It currently has no slash command infrastructure; all interactions use the standard prompt interface. Command detection would require preprocessing before Claude SDK integration.
**Implementation Requirements**: More complex than initially understood. The work requires protocol extension, trait implementations, enum routing updates, capability advertisement, and coordination between three repositories (zed, agent-client-protocol, claude-code-acp).
### Updated Implementation Approach
1. **Protocol Extension**: Add `session/list_commands` and `session/run_command` methods following exact ACP patterns
2. **Capability System**: Extend `PromptCapabilities` with `supports_custom_commands` flag
3. **Agent Implementation**: Create command parser module in claude-code-acp to read `.claude/commands/*.md` files
4. **UI Integration**: Use existing completion provider patterns in Zed's agent panel message editor
The research now includes specific file:line references and concrete code examples for all three layers of the implementation.
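The UI integration step comes down to two small pieces of logic: deciding whether the current input is a slash command being typed, and narrowing the command list to match it. The sketch below shows that logic in TypeScript to stay consistent with the adapter examples, although Zed's message editor is written in Rust. The function name `matchSlashCommands` and the prefix-matching rule are hypothetical, and the menu closing once a space is typed is an assumption about the desired UX.

```typescript
interface CommandInfo {
  name: string;
  description: string;
  requiresArgument: boolean;
}

// Returns the commands to show in the "/" menu, or null when no menu applies.
// Assumptions: the menu only opens when "/" is the first character of the
// input, and it closes as soon as whitespace is typed (argument entry begins).
function matchSlashCommands(input: string, commands: CommandInfo[]): CommandInfo[] | null {
  const match = /^\/(\S*)$/.exec(input);
  if (!match) return null;
  const query = (match[1] ?? "").toLowerCase();
  // Case-insensitive prefix match; an empty query lists every command.
  return commands.filter((c) => c.name.toLowerCase().startsWith(query));
}
```

A fuzzy matcher could replace the prefix filter without changing the surrounding flow, since the function only needs to return an ordered list of candidates.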
###### ORIGINAL CLAUDE PROMPT (2025-08-28 15:34:28 PDT):
⏺ I've completed comprehensive research on adding custom slash commands to the Agent Client Protocol for the **agent panel** (NOT assistant/text threads). Here's what I found:
Key Findings
Agent Panel Architecture: The agent panel is completely separate from assistant/text threads, using ACP protocol for external agent communication via JSON-RPC over stdin/stdout. It has dedicated UI components like AgentPanel and AcpThreadView.
ACP Protocol Patterns: The protocol uses JSON-RPC with type-safe request/response enums. Adding new RPC methods follows a clear pattern: define structs, add to enums, update dispatch logic. Handles sessions, tool calls, and real-time streaming.
.claude/commands Structure: Commands are defined as structured Markdown files with consistent format - perfect for programmatic parsing. Each contains metadata, process steps, and execution guidelines.
SDK Integration: The Claude Code ACP adapter already bridges ACP with the Claude SDK through MCP servers, providing a proven pattern for command execution.
**IMPORTANT**: The existing Claude Code slash command system (SlashCommand trait, assistant_slash_command crate) is NOT relevant - that's for assistant 1/text threads. The agent panel needs its own custom command implementation.
Implementation Approach
You'll need to:
1. Extend ACP Protocol: Add ListCommandsRequest/Response and RunCommandRequest/Response to the schema
2. Agent Panel UI: Integrate custom commands into agent panel message input or create separate command interface
3. Parse Command Definitions: Read .claude/commands/\*.md files and extract metadata (name, description, requirements)
4. Execute Commands: Run commands via Claude SDK queries through existing ACP session mechanism
5. Handle Streaming: Use ACP's session notification system to stream command progress to AcpThreadView
The research document is saved at thoughts/shared/research/2025-08-28_15-34-28_custom-slash-commands-acp.md with detailed implementation recommendations, code references, and open questions for your consideration.
###### STATS ON PROMPT TO GENERATE THIS DOCUMENT:
Total cost: $2.87
Total duration (API): 12m 59.0s
Total duration (wall): 6m 52.2s
Total code changes: 242 lines added, 0 lines removed
Usage by model:
claude-3-5-haiku: 4.3k input, 175 output, 0 cache read, 0 cache write
claude-sonnet: 2.9k input, 31.3k output, 4.8m cache read, 256.1k cache write