Schema Documentation Feedback #1542

@MrChocolateMoose

Overview

I've read through the schema validation documentation a few times and have also tried some more advanced activities, such as having Copilot (with premium models like Codex 5) automatically generate schemas based on the documentation.

The documentation is pretty comprehensive and awesome as far as technical documentation goes.

I can happily say that many Copilot-based use cases, where you describe the schema you want and it creates it, work in a single shot (which is magical 🪄), but I noticed some stumbling that inevitably traces back to the documentation. I decided to give the schema docs a human peer review and point out various stumbling points for robots.

Feedback

  • Better organization of the document headers
    • Modeling Example is linked to Field Configuration but appears at the same header level
    • Better tie-in between the regex pattern restrictions section and the string type (via links or by colocating them). Copilot was tripped up on this and took 2-3 attempts to associate the two
    • Two schema configuration sections exist at the same header level
    • Some sections should be sub-headers but aren't (e.g. schema configuration => JSON File Configuration, Python Configuration)
    • Local/Network Validation could be in a more centralized concept section for validation
    • The error messages / schema violations section "feels" out of place and may be better suited near the end, or in a section about actually running Sphinx with everything you've created above
    • A clearer split between Schema and Field (with Type System) might be helpful, since they are two separate but interrelated concepts
  • Better hierarchical drill down with linkage on what a schemas.json file can contain:
    • Schema Components describes many of the top-level concepts, but never defines the "schemas": [ ] section. Sections whose content belongs inside that schemas section would benefit greatly from a clause saying so. Right now, much of this knowledge is only implicit in the examples and has to be pieced together.
    • Better linkage or hierarchical flow, in the style of a technical reference manual, for schemas.json: drill down through each layer, describing the properties it can contain, their data types, and their unique constraints. The path from the type system to the schema JSON, to schema components, to supported data types is not explicit, either through links or through directly nested section hierarchies. If you read the document a few times, it makes sense. Robots, however, stumble on harder tasks because of this, and it was the number one reason why Copilot wouldn't get it right on the first shot: it would read that JSON Schemas are being used and then fail to conceptualize the full schema and the limitations it should apply. It would assume that broader JSON Schema concepts applied to a section where they didn't, and only learn otherwise once it had to double- or triple-check its work and found the sections mentioned later.
  • A more approachable tests section. I learned that there are a lot of examples, which is how most people learn and why so many were included in the docs. Unfortunately, I am not sure that (a) Copilot would pick up on them, or (b) people would try to find the tests and their snapshot results. Something like sphinx-test-reports might be a better way to associate them, but that would be a big undertaking.
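
To make the "hierarchical drill down" request concrete, here is a minimal Python sketch of the kind of layer-by-layer outline I have in mind. It is not the project's API: the `outline` and `check_top_level` helpers are hypothetical, and the only structural fact taken from the docs is the top-level "schemas" list. The `id` key inside the entry is a made-up placeholder, not a claim about what an entry must contain.

```python
import json

# Hypothetical minimal schemas.json content. Only the top-level "schemas"
# list comes from the feedback above; the "id" key is an illustrative
# assumption.
EXAMPLE = """
{
  "schemas": [
    {"id": "example-rule"}
  ]
}
"""


def outline(node, depth=0):
    """Return indented lines naming each key (or list index) and its Python type."""
    lines = []
    indent = "  " * depth
    if isinstance(node, dict):
        for key, value in node.items():
            lines.append(f"{indent}{key}: {type(value).__name__}")
            lines.extend(outline(value, depth + 1))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            lines.append(f"{indent}[{index}]: {type(item).__name__}")
            lines.extend(outline(item, depth + 1))
    return lines


def check_top_level(doc):
    """Report whether the document has a top-level "schemas" list."""
    problems = []
    if "schemas" not in doc:
        problems.append('missing top-level "schemas" key')
    elif not isinstance(doc["schemas"], list):
        problems.append('"schemas" must be a list')
    return problems


doc = json.loads(EXAMPLE)
print("\n".join(outline(doc)))
print(check_top_level(doc) or "top level looks fine")
```

A reference section that reads like this outline, with one nested heading per layer plus the allowed properties and types at each level, is the explicit containment information that both people and Copilot currently have to infer from the examples.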
