From 11ddbe5fb4273627d80a448d9ee756c29494eb6d Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Mon, 27 Oct 2025 13:41:16 +0100
Subject: [PATCH 1/5] docs: :memo: add `Extensions` to config guide

---
 docs/guide/config.qmd | 65 ++++++++++++++++++++++++++++++-------------
 1 file changed, 45 insertions(+), 20 deletions(-)

diff --git a/docs/guide/config.qmd b/docs/guide/config.qmd
index be137171..3f73b121 100644
--- a/docs/guide/config.qmd
+++ b/docs/guide/config.qmd
@@ -11,8 +11,8 @@ are available:
 - `version`: The version of Data Package standard to check against.
   Defaults to `v2`.
 - `exclusions`: A list of checks to exclude.
-- `custom_checks`: The list of custom checks to run in addition to the
-  checks defined in the standard.
+- `extensions`: The list of extensions, which are additional checks
+  that supplement those specified by the Data Package standard.
 - `strict`: Whether to run recommended checks in addition to required
   ones. Defaults to `False`.
 
@@ -88,13 +88,17 @@ the package and resource properties, and the resource `path` doesn't
 point to a data file. However, as we have defined exclusions for all of
 these, the function will flag no issues.
 
-## Adding custom checks
+## Adding extensions
 
-It is possible to create custom checks in addition to the ones defined
-in the Data Package standard.
+It is possible to add extensions in addition to the ones defined in the
+Data Package standard. There are currently two types of extensions
+supported: `CustomCheck` and `RequiredCheck`. You can use as many
+`CustomCheck`s and `RequiredCheck`s as you want to fit your needs.
+
+### Custom checks
 
 Let's say your organisation only accepts Data Packages licensed under
-MIT. You can express this requirement in a `CustomCheck` as follows:
+MIT. You can express this `CustomCheck` as follows:
 
 ```{python}
 license_check = cdp.CustomCheck(
@@ -108,24 +112,18 @@ license_check = cdp.CustomCheck(
 )
 ```
 
-Here's a breakdown of what each argument does:
-
-- `type`: An identifier for your custom check. This is what will show
-  up in error messages and what you will use if you want to exclude
-  your check. Each `CustomCheck` should have a unique `type`.
-- `jsonpath`: The location of the field or fields the custom check
-  applies to, expressed in [JSON
-  path](https://en.wikipedia.org/wiki/JSONPath) notation. This check
-  applies to the `name` field of all package licenses.
-- `message`: The message that is shown when the check is violated.
-- `check`: A function that expresses the custom check. It takes the
-  value at the `jsonpath` location as input and returns true if the
-  check is met, false if it isn't.
+For more details on what each parameter means, see the
+[`CustomCheck`](/docs/reference/custom_check.qmd) documentation.
+Specific to this example, the `type` is setting the identifier of the
+check to `only-mit` and the `jsonpath` is indicating to only check the
+`name` property of each license in the `licenses` property of the Data
+Package.
 
 To register your custom checks with the `check()` function, you add them
 to the `Config` object passed to the function:
 
 ```{python}
+#| eval: false
 package_properties = {
     "name": "woolly-dormice",
     "title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",
@@ -147,13 +145,40 @@ package_properties = {
     ],
 }
 
-config = cdp.Config(custom_checks=[license_check])
+config = cdp.Config(extensions=cdp.Extensions(custom_checks=[license_check]))
 cdp.check(properties=package_properties, config=config)
 ```
 
 We can see that the custom check was applied: `check()` returned one
 issue flagging the first license attached to the Data Package.
 
+### Required checks
+
+You can also set specific properties in the `datapackage.json` file as
+required, even though it isn't required by the Data Package standard.
+For example, if you want to make the `description` field of Data Package
+a required field, you can define a `RequiredCheck` like this:
+
+```{python}
+#| eval: false
+description_required = cdp.RequiredCheck(
+    jsonpath="$.description",
+    message="The 'description' field is required in the Data Package properties.",
+)
+```
+
+See the [`RequiredCheck`](/docs/reference/required_check.qmd)
+documentation for more details on its parameters.
+
+To use this `RequiredCheck` in the `Config` object passed to `check()`,
+it would look like:
+
+```{python}
+#| eval: false
+config = cdp.Config(extensions=cdp.Extensions(required_checks=[description_required]))
+cdp.check(properties=package_properties, config=config)
+```
+
 ## Strict mode
 
 The Data Package standard has both requirements and recommendations. By

From 3c64e02e861fe2bd7e16ceb92aa2e61149d29dd2 Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Mon, 27 Oct 2025 13:56:33 +0100
Subject: [PATCH 2/5] docs: :memo: moved from guide into docstring

---
 src/check_datapackage/custom_check.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/check_datapackage/custom_check.py b/src/check_datapackage/custom_check.py
index 8e55a18a..10717eff 100644
--- a/src/check_datapackage/custom_check.py
+++ b/src/check_datapackage/custom_check.py
@@ -22,10 +22,10 @@ class CustomCheck:
         check (Callable[[Any], bool]): A function that expresses the custom check.
             It takes the value at the `jsonpath` location as input and returns true if
             the check is met, false if it isn't.
-        type (str): The type of the custom check (e.g., a JSON schema type such as
-            "required", "type", "pattern", or "format", or a custom type). It will be
-            shown in error messages and can be used in an `Exclusion` object to exclude
-            the check. Each custom check should have a unique `type`.
+        type (str): An identifier for your custom check. This will show up in the
+            message as well as what you will use if you want to also exclude it
+            with the `type` argument of an `Exclusion` object. Each custom check should
+            have a unique `type`.
 
     Examples:
         ```{python}

From c87449d4e09e7f09fd26922868752de02c26ce17 Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Mon, 27 Oct 2025 16:21:23 +0100
Subject: [PATCH 3/5] docs: :pencil2: minor edits from review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Signe Kirk Brødbæk
---
 docs/guide/config.qmd | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/docs/guide/config.qmd b/docs/guide/config.qmd
index 3f73b121..713073bd 100644
--- a/docs/guide/config.qmd
+++ b/docs/guide/config.qmd
@@ -11,7 +11,7 @@ are available:
 - `version`: The version of Data Package standard to check against.
   Defaults to `v2`.
 - `exclusions`: A list of checks to exclude.
-- `extensions`: The list of extensions, which are additional checks
+- `extensions`: A list of extensions, which are additional checks
   that supplement those specified by the Data Package standard.
 - `strict`: Whether to run recommended checks in addition to required
   ones. Defaults to `False`.
@@ -90,10 +90,10 @@ these, the function will flag no issues.
 
 ## Adding extensions
 
-It is possible to add extensions in addition to the ones defined in the
-Data Package standard. There are currently two types of extensions
-supported: `CustomCheck` and `RequiredCheck`. You can use as many
-`CustomCheck`s and `RequiredCheck`s as you want to fit your needs.
+It is possible to add checks in addition to the ones defined in the
+Data Package standard. We call these additional checks *extensions*. There are currently two types of extensions
+supported: `CustomCheck` and `RequiredCheck`. You can add as many
+`CustomCheck`s and `RequiredCheck`s to your `Config` as you want to fit your needs.
 
 ### Custom checks
 
@@ -154,8 +154,8 @@ issue flagging the first license attached to the Data Package.
 
 ### Required checks
 
-You can also set specific properties in the `datapackage.json` file as
-required, even though it isn't required by the Data Package standard.
+You can also set specific properties in the `datapackage.json` file to be
+required, even when they aren't required by the Data Package standard with a `RequiredCheck`.
 For example, if you want to make the `description` field of Data Package
 a required field, you can define a `RequiredCheck` like this:
 
@@ -170,8 +170,7 @@ description_required = cdp.RequiredCheck(
 See the [`RequiredCheck`](/docs/reference/required_check.qmd)
 documentation for more details on its parameters.
 
-To use this `RequiredCheck` in the `Config` object passed to `check()`,
-it would look like:
+To apply this `RequiredCheck`, it should be added to the `Config` object passed to `check()` like shown below:
 
 ```{python}
 #| eval: false

From 46d18659b83102c70e1738b0565682759a20e373 Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Mon, 27 Oct 2025 16:22:43 +0100
Subject: [PATCH 4/5] revert: :rewind: revert docstring description

---
 src/check_datapackage/custom_check.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/check_datapackage/custom_check.py b/src/check_datapackage/custom_check.py
index 10717eff..8e55a18a 100644
--- a/src/check_datapackage/custom_check.py
+++ b/src/check_datapackage/custom_check.py
@@ -22,10 +22,10 @@ class CustomCheck:
         check (Callable[[Any], bool]): A function that expresses the custom check.
             It takes the value at the `jsonpath` location as input and returns true if
             the check is met, false if it isn't.
-        type (str): An identifier for your custom check. This will show up in the
-            message as well as what you will use if you want to also exclude it
-            with the `type` argument of an `Exclusion` object. Each custom check should
-            have a unique `type`.
+        type (str): The type of the custom check (e.g., a JSON schema type such as
+            "required", "type", "pattern", or "format", or a custom type). It will be
+            shown in error messages and can be used in an `Exclusion` object to exclude
+            the check. Each custom check should have a unique `type`.
 
     Examples:
         ```{python}

From 150af9c6add72d63d090ee2f0dcda2fbaa0cecf4 Mon Sep 17 00:00:00 2001
From: "Luke W. Johnston"
Date: Tue, 28 Oct 2025 14:05:33 +0100
Subject: [PATCH 5/5] docs: :art: ran formatter

---
 docs/guide/config.qmd | 48 +++++++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/docs/guide/config.qmd b/docs/guide/config.qmd
index 508344f2..490700b1 100644
--- a/docs/guide/config.qmd
+++ b/docs/guide/config.qmd
@@ -11,18 +11,19 @@ are available:
 - `version`: The version of Data Package standard to check against.
   Defaults to `v2`.
 - `exclusions`: A list of checks to exclude.
-- `extensions`: A list of extensions, which are additional checks
-  that supplement those specified by the Data Package standard.
+- `extensions`: A list of extensions, which are additional checks that
+  supplement those specified by the Data Package standard.
 - `strict`: Whether to include "SHOULD" checks in addition to "MUST"
   checks. Defaults to `False`.
 
 ::: callout-important
 The Data Package standard uses language from [RFC
-2119](https://www.ietf.org/rfc/rfc2119.txt) to define its specifications.
-They use "MUST" for required properties and "SHOULD" for properties that
-should be included but are not strictly required. We try to match this
-language in `check-datapackage` by using the terms "MUST" and "SHOULD",
-though we also use "required" for "MUST" in our documentation.
+2119](https://www.ietf.org/rfc/rfc2119.txt) to define its
+specifications. They use "MUST" for required properties and "SHOULD" for
+properties that should be included but are not strictly required. We try
+to match this language in `check-datapackage` by using the terms "MUST"
+and "SHOULD", though we also use "required" for "MUST" in our
+documentation.
 :::
 
 ## Excluding checks
@@ -99,10 +100,11 @@ these, the function will flag no issues.
 
 ## Adding extensions
 
-It is possible to add checks in addition to the ones defined in the
-Data Package standard. We call these additional checks *extensions*. There are currently two types of extensions
-supported: `CustomCheck` and `RequiredCheck`. You can add as many
-`CustomCheck`s and `RequiredCheck`s to your `Config` as you want to fit your needs.
+It is possible to add checks in addition to the ones defined in the Data
+Package standard. We call these additional checks *extensions*. There
+are currently two types of extensions supported: `CustomCheck` and
+`RequiredCheck`. You can add as many `CustomCheck`s and `RequiredCheck`s
+to your `Config` as you want to fit your needs.
 
 ### Custom checks
 
@@ -163,10 +165,11 @@ issue flagging the first license attached to the Data Package.
 
 ### Required checks
 
-You can also set specific properties in the `datapackage.json` file to be
-required, even when they aren't required by the Data Package standard with a `RequiredCheck`.
-For example, if you want to make the `description` field of Data Package
-a required field, you can define a `RequiredCheck` like this:
+You can also set specific properties in the `datapackage.json` file to
+be required, even when they aren't required by the Data Package standard
+with a `RequiredCheck`. For example, if you want to make the
+`description` field of Data Package a required field, you can define a
+`RequiredCheck` like this:
 
 ```{python}
 #| eval: false
@@ -179,7 +182,8 @@ description_required = cdp.RequiredCheck(
 See the [`RequiredCheck`](/docs/reference/required_check.qmd)
 documentation for more details on its parameters.
 
-To apply this `RequiredCheck`, it should be added to the `Config` object passed to `check()` like shown below:
+To apply this `RequiredCheck`, it should be added to the `Config` object
+passed to `check()` like shown below:
 
 ```{python}
 #| eval: false
@@ -190,12 +194,12 @@ cdp.check(properties=package_properties, config=config)
 ## Strict mode
 
 The Data Package standard includes properties that "MUST" and "SHOULD"
-be included and/or have a specific format in a compliant Data Package. By default, `check()` only
-the `check()` function only includes "MUST" checks. To include "SHOULD" checks,
-set the `strict` argument to `True`. For example,
-the `name` field of a Data Package "SHOULD" not contain special
-characters. So running `check()` in strict mode (`strict=True`) on the following
-properties would output an issue.
+be included and/or have a specific format in a compliant Data Package.
+By default, `check()` only the `check()` function only includes "MUST"
+checks. To include "SHOULD" checks, set the `strict` argument to `True`.
+For example, the `name` field of a Data Package "SHOULD" not contain
+special characters. So running `check()` in strict mode (`strict=True`)
+on the following properties would output an issue.
 
 ```{python}
 #| eval: false