Dataset Validation Script #5

@ffl096

We should provide a validation script that ensures datasets are in the specified format. A non-exhaustive list of things to validate:

General structure

  • The dataset must be an (optionally gzipped) plain text file.
  • The dataset metadata (a JSON object) must be on the first line.
  • Any node-related line must appear before edges.
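
A minimal sketch of how these structural checks could look in Python. The issue does not fix how node and edge lines are told apart, so `is_edge` is a hypothetical callback supplied by the caller; `open_dataset` and `check_structure` are illustrative names, not part of any existing code.

```python
import gzip
import json
from typing import Callable, Iterable, List


def open_dataset(path: str):
    """Open a plain-text or gzipped dataset in text mode."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":  # gzip magic number
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def check_structure(lines: Iterable[str],
                    is_edge: Callable[[str], bool]) -> List[str]:
    """Return a list of structural errors (empty if none were found).

    `is_edge` is an assumption: it must decide whether a line describes
    an edge, since the line format is not specified in this issue.
    """
    errors: List[str] = []
    it = iter(lines)
    first = next(it, None)
    if first is None:
        return ["dataset is empty"]

    # The first line must hold the metadata as a JSON object.
    try:
        meta = json.loads(first)
        if not isinstance(meta, dict):
            errors.append("line 1: metadata must be a JSON object")
    except json.JSONDecodeError as exc:
        errors.append(f"line 1: metadata is not valid JSON ({exc.msg})")

    # Every node-related line must come before the first edge.
    seen_edge = False
    for lineno, line in enumerate(it, start=2):
        if not line.strip():
            continue  # blank lines are ignored in this sketch
        if is_edge(line):
            seen_edge = True
        elif seen_edge:
            errors.append(f"line {lineno}: node line appears after an edge")
    return errors
```

With a file on disk this might be called as `check_structure(open_dataset(path), is_edge)`, where `is_edge` encodes whatever the format specification says about edge lines.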

Attribute-related

  • Dates and datetimes (whenever a time attribute appears) should be in ISO 8601 format. The timezone must be specified.
  • weight attributes must be floats or integers.
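
The attribute checks could be sketched as follows. This rests on several assumptions: `check_time` and `check_weight` are hypothetical helper names, values are assumed to be already decoded (e.g. from JSON), date-only values are accepted without a timezone because a plain date carries no offset, and Python's `fromisoformat` only covers a subset of ISO 8601, so a real validator may need a stricter or broader parser.

```python
from datetime import date, datetime
from typing import Optional


def check_time(value) -> Optional[str]:
    """Return an error message, or None if the value is acceptable."""
    if not isinstance(value, str):
        return "time value must be a string"
    # Assumption: a date-only value needs no timezone.
    try:
        date.fromisoformat(value)
        return None
    except ValueError:
        pass
    # Older Python versions reject a trailing "Z", so map it to an offset.
    text = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return f"{value!r} is not in ISO 8601 format"
    if parsed.tzinfo is None:
        return f"{value!r} does not specify a timezone"
    return None


def check_weight(value) -> Optional[str]:
    """Return an error message, or None if the weight is numeric."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return f"weight {value!r} must be a float or integer"
    return None
```

Returning a message instead of raising lets the script collect every problem in one pass and report them together.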

Suggestions

The validation script should issue improvement suggestions that are not necessarily errors but good practice to follow:

  • A node is given explicitly but has no explicit attributes and also appears in an edge (i.e., it can be inferred implicitly).
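
One way this suggestion could be produced, as a sketch only: it assumes the dataset has already been parsed into a mapping from node identifier to attribute dictionary and a sequence of `(source, target)` pairs. Both data shapes and the function name `suggest_redundant_nodes` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def suggest_redundant_nodes(nodes: Dict[str, dict],
                            edges: Iterable[Tuple[str, str]]) -> List[str]:
    """Suggest dropping explicit nodes that an edge already implies."""
    # Collect every node identifier that occurs as an edge endpoint.
    in_edges = set()
    for source, target in edges:
        in_edges.add(source)
        in_edges.add(target)

    # A node without attributes that is an endpoint adds no information.
    return [
        f"node {node_id!r} has no attributes and appears in an edge; "
        "it can be omitted"
        for node_id, attrs in nodes.items()
        if not attrs and node_id in in_edges
    ]
```

Isolated nodes without attributes are deliberately not reported, since they cannot be inferred from any edge.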
