REFACTOR.md – Authoring Brief (Revised for Lean & Refactor Principles)¶
This document is the canonical, action-oriented, self-contained, and phased roadmap for the vexy_json refactor sprint. It integrates the detailed refactor playbook and quality principles from REFACTOR_PROMPT.md
and the minimalization/dead code removal guidance from LEAN.md
. It is written for a technically strong engineer new to this repository.
1. Executive Summary¶
The vexy_json codebase is a monolithic Rust crate implementing a forgiving JSON parser, CLI, and WASM module. Its tightly coupled structure, legacy/dead code, and lack of clear boundaries hinder maintainability, performance, and extensibility. This refactor will:
- Decouple components into a Cargo workspace of focused crates.
- Remove dead/legacy code and minimize dependencies.
- Feature-gate optional components (WASM, Serde, CLI).
- Enforce production-grade, review-friendly, and performance-aware practices.
- Improve documentation, developer experience, and CI/CD quality gates.
Upon completion, vexy_json will be a lean, maintainable, and extensible parser suite, with robust testing, clear architecture, and a minimal core suitable for embedding or distribution.
2. Guiding Principles¶
2.1. Production-grade Quality & Lean Minimalism¶
- Write clean, idiomatic, boring Rust. Avoid clever macros.
- Remove all dead/legacy code (see Section 4).
- Minimize dependencies; only use well-audited crates.
- Feature-gate all optional functionality (WASM, Serde, CLI).
- No public API breakage unless unavoidable and documented.
2.2. Parity With Reference Implementation¶
- Maintain 100% compatibility with the JavaScript
the reference implementation
test suite unless deviations are documented.
2.3. Incremental, Review-friendly Commits¶
- Refactor in small, atomic, test-passing commits.
- Each PR must be reviewable, CI-green, and benchmarked.
2.4. Minimal Public-API Breakage¶
- Downstream code and WASM builds must not break.
- Breaking changes require CHANGELOG entries and semver bumps.
2.5. Performance Awareness¶
- No >3% regression on Criterion benchmarks unless justified.
- Document and benchmark all performance-impacting changes.
2.6. Great DX¶
- Improve docs, examples, and error messages as code is touched.
- Run
./build.sh
locally before pushing.
2.7. Security & Safety First¶
- Eliminate all
unsafe
code. - Remove all
unwrap
/expect
unless justified and documented.
3. Architectural Re-design¶
3.1. Workspace Structure¶
Refactor into a Cargo workspace with these crates:
- vexy_json-core: Core parser, lexer, value types, errors. No I/O, CLI, or WASM logic.
- vexy_json-cli: CLI wrapper, feature-gated.
- vexy_json-wasm: WASM bindings, feature-gated.
- vexy_json-serde: Serde integration, feature-gated.
- test-utils: Shared test helpers.
- examples/, benches/: Kept for development, excluded from lean/core builds.
3.2. Minimal Core¶
The minimal, embeddable crate consists of only:
src/lib.rs
src/parser.rs
src/lexer.rs
src/value.rs
src/error.rs
All other files are optional, feature-gated, or excluded from minimal builds.
4. Refactor Playbook (Phased Steps)¶
4.1. Phase 1: On-boarding & Baseline¶
- Clone repo, run
./build.sh
, ensure reproducible build. - Review
docs/internal/CLAUDE.md
,IMPLEMENTATION_SUMMARY.md
,PLAN.md
. - Run and record baseline benchmarks.
- Create
refactor/phase-1-module-layout
branch.
4.2. Phase 4: Lexer Simplification¶
- Remove config duplication; config only in parser. (Completed)
- Evaluate
logos
crate for lexer; benchmark and adopt if beneficial. (Completed) - Ensure canonical token stream; add property tests. (Completed)
4.3. Phase 5: Parser Refactor¶
- Introduce
ParserState
struct. (Completed) - Remove tail recursion; use explicit stack. (Completed - addressed by
max_depth
inParserOptions
) - Improve error reporting with
Span
. - Add config validation.
- Add property-based round-trip tests.
4.4. Phase 6: Error & Result Type Revamp¶
- Use
thiserror
for error enums. - Provide error source chains.
- Export
ParseResult<T = Value>
alias.
4.5. Phase 7: WASM & Serde Bindings¶
- Regenerate WASM with latest
wasm-bindgen
. - Expose JS-friendly API.
- Feature-gate all bindings.
4.6. Phase 8: Benchmark & CI Pipeline¶
- Move benches to
benches/
root. - Add CI matrix for Rust toolchains and WASM.
- Add
cargo udeps
andcargo deny
checks.
4.7. Phase 9: Documentation & DX¶
- Update code comments to explain "why".
- Auto-generate docs in CI; deploy to GitHub Pages.
- Write migration guide if any
pub
items are renamed.
4.8. Phase 10: Release Planning¶
- Bump version to
0.2.0
(semver). - Update
CHANGELOG.md
with highlights.
5. Technical Debt Catalogue & Fix Plan¶
ID | File / Module | Issue / Impact / Fix (summary) | Effort |
---|---|---|---|
P0 | src/parser.rs |
Monolithic, complex logic. Rewrite as Pratt/recursive descent parser. | L |
P0 | src/main.rs:95 |
Custom JSON formatter. Use serde_json . |
S |
P1 | src/parser.rs:313 |
Parser calculates token positions. Lexer should emit spans. | M |
P1 | src/main.rs:45 |
CLI pre-processes input. Move logic to lexer. | S |
P1 | everywhere | Inconsistent error handling. Eliminate Error::Custom . |
M |
P2 | tests/ |
Lack of property-based testing. Add proptest . |
M |
P2 | src/lib.rs |
Tests inside lib. Move to tests/ . |
S |
6. Testing & Quality Gates¶
- Coverage Baseline: Measure with
cargo-tarpaulin
. - Target Coverage:
vexy_json-core
≥95%, CLI ≥80%, WASM ≥90%. - Testing Pyramid: Unit, integration, property-based, and performance tests.
- CI Workflow: Format, lint, test, coverage, bench, build artifacts.
- Deliverable Checklist per PR:
./build.sh
green locally.- All tests & benches pass on CI.
- Coverage ≥90% for touched code.
- Docs updated for public API changes.
- CHANGELOG entry under Unreleased.
7. Migration Strategy¶
- Create
refactor/workspace
branch. - Convert to Cargo workspace; create new crate structure.
- Migrate core files first; re-export from old crate for compatibility.
- Add
--refactor-parser
CLI flag for dual-track testing. - Run CI on both old and new implementations until cut-over.
- Tag before each major step for rollback.
8. Performance Targets¶
- Parsing Throughput: 10MB in <100ms (release build).
- Performance Parity: Within 3% of old parser, within 10% of
serde_json
. - WASM: 1MB in <50ms in browser.
- Use
cargo-flamegraph
andpprof
for profiling.
9. Documentation & DX¶
- API docs auto-generated and deployed.
- Examples for CLI, core, WASM.
- Updated README with badges.
- CONTRIBUTING.md with workflow, style, PR checklist.
10. Timeline & Milestones¶
Week | Deliverable | Success Metric |
---|---|---|
1-2 | Workspace setup & vexy_json-core created |
CI green, core builds, dead code removed. |
3-4 | Lexer refactored, emits spans | Token struct has span, parser updated. |
5-8 | New parser implemented | Property tests pass. |
9-10 | CLI/WASM migrated to new parser | All integration tests pass. |
11 | Old parser removed, final cleanup | No breaking changes in public API. |
12 | Docs updated, refactor branch merged | Branch merged to main. |
11. Acceptance Criteria¶
- All CLI flags and behaviors preserved.
- Public Rust API is identical or a superset, verified with
cargo-public-api diff
. - WASM bundle size ≤300KB gzipped.
- CI pipeline completes in <12 minutes.
- Test coverage for core ≥95%.
- No performance regressions on benchmarks.
- Only minimal, essential files in core crate.
12. Open Questions & Assumptions¶
Question | Owner | Due Date |
---|---|---|
What is the Minimum Supported Rust Version (MSRV) for this project? | @engineer-1 | Week 1 |
Are there any clients depending on the exact error messages? | @product | Week 1 |
What is the long-term support plan for JSON-C style comments (# )? |
@product | Week 2 |
13. Final Notes¶
Treat this refactor as paving the road for long-term maintainability and minimalism, not chasing micro-optimizations. When in doubt, choose readability and simplicity, but back up decisions with benchmark data. Periodically re-audit for dead/unused modules and update this plan as new feature gates or refactors are added.