Met with CCExtractor mentors, ramped up on the existing report format, and scoped the v1.0 JSON schema fields against downstream parser needs.
Read existing -out=report text emitter top-to-bottom. Drafted schema field list with mentors. Confirmed PAT enumeration was a known multi-program edge case before coding even started.
Added --report-format CLI flag, wired cJSON into the report emitter, scaffolded the v1.0 schema envelope (version, generator, media[]).
Replaced program-counter walk with PAT-based enumeration. Verified against 25 multi-program files that previously reported wrong program IDs.
Extending has_any_captions to OR across CEA-608, CEA-708, DVB, and Teletext rather than just CEA-608/708. Validating against teletext-only DVB streams in the test suite.
Lock schema fields. Run full 172-file test suite, confirm 100% valid JSON, sign off on v1.0 before midterm.
Add per-stream codec, bitrate, and language metadata to the JSON output. Keep field set conservative — no breaking changes after v1.0 lock.
Define a typed errors[] array. Decode failures, parse warnings, and stream gaps each get a stable error code. Prep midterm evaluation submission.
Midterm evaluation week (Jul 6 – Jul 10). Submit progress report. Sync with mentors on remaining scope.
Document jq-based parsing patterns for common CI checks: caption presence, error count, stream count mismatches.
Author the schema reference doc. Verify the output parses cleanly in Python (pydantic), TypeScript (zod), and Rust (serde) — three reference consumers.
Add JSON-output assertions to the existing testsuite. Catch silent schema drift before it ships.
Mentor walkthrough of the PR. Address feedback. Confirm v1.0 schema is frozen.
Final report draft. PR ready for upstream merge. Final submission window opens Aug 17 – Aug 24.