---
id: DE-005
slug: implement-spec-backfill
name: Delta - implement spec backfill
created: '2025-11-02'
updated: '2025-11-02'
status: complete
kind: delta
aliases: []
relations: []
tags: ["agent"]
applies_to:
  specs:
  - PROD-007
  requirements:
  - PROD-007.FR-001
  - PROD-007.FR-002
  - PROD-007.FR-003
  - PROD-007.FR-004
  - PROD-007.NF-001
  - PROD-007.NF-002
---

# DE-005 – implement spec backfill

```yaml supekku:delta.relationships@v1
schema: supekku.delta.relationships
version: 1
delta: DE-005
revision_links:
  introduces: []
  supersedes: []
specs:
  primary:
    - PROD-007
  collaborators:
    - SPEC-110
    - SPEC-112
requirements:
  implements:
    - PROD-007.FR-001
    - PROD-007.FR-002
    - PROD-007.FR-003
    - PROD-007.FR-004
    - PROD-007.NF-001
    - PROD-007.NF-002
  updates: []
  verifies: []
phases:
  - id: IP-005.PHASE-01
```

## 1. Summary & Context
- **Product Spec(s)**: [PROD-007](../../specify/product/PROD-007/PROD-007.md) – Agent spec backfill workflow
- **Technical Spec(s)**: To be created - will define CLI commands, detection logic, agent workflow
- **Implementation Plan**: [IP-005](./IP-005.md) – Phase 01 ~90% complete (see Section 10)
- **Change Drivers**: PROD-001/RISK-007 (no workflow for completing existing/incomplete specs)

## 2. Motivation

After `spec-driver sync`, users have dozens or hundreds of auto-generated stub specs. Manually completing these is tedious (2+ hours per spec), error-prone (YAML syntax, conventions), and blocks achieving comprehensive spec coverage. This delta implements agent-assisted backfill to reduce completion time to <10 minutes per spec while maintaining quality through validation.

Key pain points addressed:
- Stub specs provide no value without completion
- Manual YAML editing is error-prone
- Teams can't achieve >20% spec coverage due to effort required
- No tooling to leverage existing code documentation (contracts) for spec completion

## 3. Scope & Objectives

**Primary Outcomes**:
- PROD-007.FR-001: Agent workflow completes stub specs with minimal user questions (≤3)
- PROD-007.FR-005: CLI detects stub vs. modified content; auto-replaces stub, requires `--force` for modified
- PROD-007.FR-006: CLI provides `show template` command for agents to reference structure
- PROD-007.FR-003/FR-004: Batch mode processes multiple specs with configurable automation

**Operational Constraints**:
- Must not overwrite manually-created content (critical - data loss risk)
- Batch mode must complete 10 specs in <15 min (automated mode)
- Zero breaking changes to existing CLI commands

**Dependencies**:
- Contracts must be generated (via sync) for optimal results
- SpecRegistry must support reading/writing updated specs

## 4. Out of Scope

- Creating new specs from scratch (PROD-001 handles this)
- Editing arbitrary spec content (use text editor)
- AI-generated architecture diagrams
- Non-Python language support (future enhancement)
- Real-time collaborative editing

## 5. Approach Overview

**System Touchpoints**:
- `supekku/cli/show.py` - Add `show template` command
- `supekku/cli/backfill.py` - New module for backfill commands
- `supekku/scripts/lib/specs/completion.py` - New module for completion logic
- `.claude/commands/supekku.backfill.md` - New agent command
- `supekku/scripts/lib/specs/registry.py` - Existing registry for read/write

**Key Changes**:
1. **Template retrieval** (PROD-007.FR-006): `show.py` gets `template` subcommand returning Jinja2 template markdown
2. **Stub detection** (PROD-007.FR-005): Compare spec content after frontmatter to template; match → stub, diff → modified
3. **Completion logic** (PROD-007.FR-001, FR-002): Read spec + contracts, identify gaps, fill sections, validate
4. **Batch mode** (PROD-007.FR-003, FR-004): Glob matching, progress reporting, error isolation
5. **Agent command** (all requirements): Orchestrates workflow with user prompts
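
Key Change 2 can be illustrated with a minimal sketch, assuming frontmatter is a leading block delimited by `---` lines and that the rendered template is available as a string. The function names (`split_frontmatter`, `is_unmodified_stub`, `may_overwrite`) are hypothetical and not taken from the codebase.

```python
from __future__ import annotations


def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a markdown document into (frontmatter, body).

    Assumes frontmatter is a leading block delimited by '---' lines.
    Documents without such a block are returned with empty frontmatter.
    """
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for index in range(1, len(lines)):
            if lines[index].strip() == "---":
                frontmatter = "\n".join(lines[1:index])
                body = "\n".join(lines[index + 1:])
                return frontmatter, body
    return "", text


def is_unmodified_stub(spec_text: str, rendered_template: str) -> bool:
    """Return True only when the spec body matches the template body."""
    _, spec_body = split_frontmatter(spec_text)
    _, template_body = split_frontmatter(rendered_template)
    # Only surrounding whitespace is ignored; any other difference counts
    # as a manual modification, which is the conservative choice.
    return spec_body.strip() == template_body.strip()


def may_overwrite(spec_text: str, rendered_template: str, force: bool = False) -> bool:
    """Stubs may be replaced automatically; modified specs need force=True."""
    return force or is_unmodified_stub(spec_text, rendered_template)
```

Treating every non-whitespace difference as "modified" errs toward false negatives (a stub reported as modified), which costs only an extra `--force`, rather than false positives, which would cost user content.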

**Migration / Rollout Notes**:
- No migration needed (net-new functionality)
- Can roll out incrementally: template retrieval → single spec → batch mode

## 6. Verification Strategy

**Requirements Coverage**:
- PROD-007.FR-001: VT-001 - End-to-end agent workflow test (stub spec → backfilled spec)
- PROD-007.FR-002: VT-002 - Manual content preservation test (partial spec not overwritten)
- PROD-007.FR-003/FR-004: VT-003 - Batch mode test (10 specs, interactive vs automated)
- PROD-007.FR-005: VT-004 - Stub detection test (auto-replace stub, require `--force` for modified)
- PROD-007.FR-006: VT-005 - Template retrieval test (both product and tech kinds)
- PROD-007.NF-001: VA-001 - Performance test (10 specs <15min, 50 specs <30min)
- PROD-007.NF-002: VA-002 - Usability test (≤3 questions per spec in 80%+ of cases)
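
The VA-002 target (≤3 questions per spec in 80%+ of cases) reduces to a simple ratio. The sketch below shows one way to compute it from recorded question counts; the function name and the sample data are illustrative assumptions, not measured results.

```python
def share_within_limit(question_counts, limit=3):
    """Return the fraction of specs that needed at most `limit` questions."""
    if not question_counts:
        raise ValueError("at least one session is required")
    within = sum(1 for count in question_counts if count <= limit)
    return within / len(question_counts)


# Hypothetical question counts from ten backfill sessions.
sample_counts = [1, 2, 3, 4, 0, 3, 2, 5, 1, 3]
meets_target = share_within_limit(sample_counts) >= 0.8
```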

**Planned Artefacts**:
- VT-001 through VT-005: Unit and integration tests
- VA-001: Performance benchmarks
- VA-002: User testing sessions with question count tracking

**Acceptance Criteria**:
1. All VT tests passing
2. VA-001: Batch performance within targets
3. VA-002: Question count ≤3 per spec in 80%+ cases
4. Real-world validation: 5 users successfully backfill batches of 10+ specs
5. Zero manual content overwrite incidents in testing

## 7. Risks & Mitigations

**Risk**: Stub detection incorrectly identifies modified content as stub
- **Likelihood**: Low (exact string matching after frontmatter)
- **Impact**: Critical (data loss)
- **Mitigation**: Conservative detection logic; comprehensive test suite; `--dry-run` mode shows what would change

**Risk**: Contract quality varies; backfill quality suffers
- **Likelihood**: Medium (contracts depend on code quality/comments)
- **Impact**: Medium (incomplete/inaccurate specs)
- **Mitigation**: Mark low-confidence sections for review; allow user to provide additional context

**Risk**: Batch mode performance doesn't meet targets (10 specs in 15min)
- **Likelihood**: Medium (agent latency unpredictable)
- **Impact**: Low (feature still useful if slower)
- **Mitigation**: Optimize by batching API calls; parallel processing where safe; clear progress indicators

**Risk**: Users expect perfect AI completion; disappointed by limitations
- **Likelihood**: High (AI hype)
- **Impact**: Low (feature still valuable)
- **Mitigation**: Clear documentation of capabilities/limits; mark assumptions in generated specs

## 8. Follow-ups & Tracking

**Future Phases / Deltas**:
- Phase 2: Sub-agent delegation for parallel batch processing (performance optimization)
- Future: Support for non-Python languages (Go, TypeScript); TBD whether backfill is language-agnostic or language-dependent
- Future: Relationship inference beyond contracts (semantic analysis)

**Backlog Items**:
- Create tech spec defining implementation details (CLI API, detection algorithms, etc.)
- Design stub detection algorithm with edge case handling
- Create `.claude/commands/supekku.backfill.md` agent command
- Implement `supekku/cli/backfill.py` module
- Ensure tech specs are auto-created with status `stub`

**Open Decisions / Questions**:
- **Q1**: Batch mode default: interactive or automated? **Leaning**: Interactive (safer)
- **Q2**: Handle incomplete contracts: skip spec or partial completion? **Leaning**: Partial + mark for review

## 9. Implementation Notes

**Environment Setup**:
```bash
# Development
uv sync
just test  # Run all tests
just lint  # Lint code

# Testing backfill workflow
uv run spec-driver sync  # Generate stub specs
uv run spec-driver show template tech  # Retrieve template
uv run spec-driver backfill spec SPEC-123  # Single spec
uv run spec-driver backfill batch "SPEC-*"  # Batch mode (deferred to Phase 02)
```

**Key Design Decisions**:
1. **Stub detection**: `status == 'stub'` and line count < 30
2. **Contracts location**: `specify/{kind}/{spec-id}/contracts/*.md`
3. **Progress reporting**: Print to stdout in real-time (not just at end)
4. **Error handling**: Isolate per-spec errors; continue batch on failure
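
Decisions 3 and 4 combine naturally in a single loop. The following is a sketch only, assuming a caller-supplied `process_spec` callable; `run_batch` and the output format are hypothetical, not the project's API.

```python
from __future__ import annotations

import sys
from typing import Callable, Iterable


def run_batch(
    spec_ids: Iterable[str],
    process_spec: Callable[[str], None],
    out=sys.stdout,
) -> dict[str, str]:
    """Process each spec in turn; return a mapping of failed spec id -> error."""
    ids = list(spec_ids)
    failures: dict[str, str] = {}
    for position, spec_id in enumerate(ids, start=1):
        prefix = f"[{position}/{len(ids)}] {spec_id}"
        try:
            process_spec(spec_id)
        except Exception as exc:  # isolate: one bad spec must not stop the batch
            failures[spec_id] = str(exc)
            print(f"{prefix}: FAILED ({exc})", file=out, flush=True)
        else:
            # flush=True makes progress visible as it happens, not at the end
            print(f"{prefix}: ok", file=out, flush=True)
    succeeded = len(ids) - len(failures)
    print(f"{succeeded} succeeded, {len(failures)} failed", file=out, flush=True)
    return failures
```

Returning the failures instead of raising lets the caller decide the exit code after the whole batch has run.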

**Reference Documentation**:
- PROD-007: Product spec defining user requirements
- PROD-001: Parent spec for spec creation workflow (shared patterns)
- `.claude/commands/supekku.specify.md`: Similar agent command pattern

## 10. Implementation Notes (2025-11-02)

### Phase 01 Status: ~90% Complete

**Completed Work**:
- ✅ Core backfill infrastructure (Tasks 1.1-1.5)
  - `show template` command
  - Stub detection (status + line count)
  - `backfill spec` CLI command
  - Agent workflow documentation
- ✅ Auto-spec improvements (Tasks 1.5.1-1.5.3)
  - Sync creates `status: 'stub'` specs
  - Migrated 16 existing stubs
  - Status theming (stub=mid-grey, draft=light-grey)
  - `--prune` safety (requires `--force` for non-stubs)

**Remaining**: Task 1.6 (integration testing & dogfooding)

**Key Decisions Made**:
- Status-based stub detection (primary) + line count fallback (pragmatic)
- CLI does mechanics, agent does intelligence
- Batch mode deferred to Phase 02
- No programmatic completion module needed
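
The first decision can be read in more than one way. The sketch below assumes one reading: an explicit status is trusted, and the line count is consulted only when no status is present. The function name is hypothetical, and the 30-line threshold is taken from the design decision in Section 9.

```python
from __future__ import annotations

STUB_LINE_THRESHOLD = 30  # from the "lines < 30" rule in Section 9


def looks_like_stub(status: str | None, text: str) -> bool:
    """Classify a spec as a stub.

    Assumed reading of "status primary, line count fallback": an explicit
    status is trusted; the line count is used only when status is absent.
    """
    if status:
        return status == "stub"
    return len(text.splitlines()) < STUB_LINE_THRESHOLD
```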

**Issues Discovered**:
- ISSUE-009: Status fields lack enum validation across all entities

**Files Added**:
- `.claude/commands/supekku.backfill.md` - Agent workflow
- `supekku/scripts/migrate_stub_status.py` - One-time migration

**See**: `phases/phase-01.md` Section 12 for detailed handover notes
