65 changes: 45 additions & 20 deletions docs/guide/config.qmd
@@ -11,8 +11,8 @@ are available:
- `version`: The version of Data Package standard to check against.
Defaults to `v2`.
- `exclusions`: A list of checks to exclude.
- `custom_checks`: The list of custom checks to run in addition to the
checks defined in the standard.
- `extensions`: The list of extensions, which are additional checks
that supplement those specified by the Data Package standard.
- `strict`: Whether to run recommended checks in addition to required
ones. Defaults to `False`.

@@ -88,13 +88,17 @@ the package and resource properties, and the resource `path` doesn't
point to a data file. However, as we have defined exclusions for all of
these, the function will flag no issues.

## Adding custom checks
## Adding extensions

It is possible to create custom checks in addition to the ones defined
in the Data Package standard.
It is possible to add extensions that supplement the checks defined in
the Data Package standard. Two types of extensions are currently
supported: `CustomCheck` and `RequiredCheck`. You can use as many of
each as you need.

### Custom checks

Let's say your organisation only accepts Data Packages licensed under
MIT. You can express this requirement in a `CustomCheck` as follows:
MIT. You can express this requirement as a `CustomCheck`:

```{python}
license_check = cdp.CustomCheck(
@@ -108,24 +112,18 @@ license_check = cdp.CustomCheck(
)
```

Here's a breakdown of what each argument does:

- `type`: An identifier for your custom check. This is what will show
up in error messages and what you will use if you want to exclude
your check. Each `CustomCheck` should have a unique `type`.
- `jsonpath`: The location of the field or fields the custom check
applies to, expressed in [JSON
path](https://en.wikipedia.org/wiki/JSONPath) notation. This check
applies to the `name` field of all package licenses.
- `message`: The message that is shown when the check is violated.
- `check`: A function that expresses the custom check. It takes the
value at the `jsonpath` location as input and returns true if the
check is met, false if it isn't.
For more details on what each parameter means, see the
[`CustomCheck`](/docs/reference/custom_check.qmd) documentation. In
this example, `type` sets the identifier of the check to `only-mit`,
and `jsonpath` restricts the check to the `name` property of each
license in the `licenses` property of the Data Package.
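
As a rough illustration of what the `check` callable does, the sketch
below applies a predicate to each license name in plain Python. It does
not use `check_datapackage` itself: `is_mit`, `failing_indices`, and the
sample `licenses` list are hypothetical names and data, and the
library's own JSONPath handling may differ.

```python
from typing import Any, Callable


def is_mit(license_name: Any) -> bool:
    """Predicate shaped like a `check` function: True when the check is met."""
    return license_name == "MIT"


def failing_indices(values: list[Any], check: Callable[[Any], bool]) -> list[int]:
    """Return the positions of the values that do not satisfy `check`."""
    return [index for index, value in enumerate(values) if not check(value)]


# Hypothetical sample data, standing in for the values found at the `jsonpath`.
licenses = [{"name": "odc-pddl"}, {"name": "MIT"}]
names = [item.get("name") for item in licenses]
failures = failing_indices(names, is_mit)
print(failures)
```

Only the first entry fails the predicate, so only that position is
reported. This is how a single failing license can produce a single issue.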

To register your custom checks with the `check()` function, you add them
to the `Config` object passed to the function:

```{python}
#| eval: false
package_properties = {
"name": "woolly-dormice",
"title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",
@@ -147,13 +145,40 @@ package_properties = {
],
}

config = cdp.Config(custom_checks=[license_check])
config = cdp.Config(extensions=cdp.Extensions(custom_checks=[license_check]))
cdp.check(properties=package_properties, config=config)
```

We can see that the custom check was applied: `check()` returned one
issue flagging the first license attached to the Data Package.

### Required checks

You can also mark specific properties in the `datapackage.json` file as
required, even though they aren't required by the Data Package standard.
For example, if you want to make the `description` field of a Data
Package required, you can define a `RequiredCheck` like this:

```{python}
#| eval: false
description_required = cdp.RequiredCheck(
jsonpath="$.description",
message="The 'description' field is required in the Data Package properties.",
)
```

See the [`RequiredCheck`](/docs/reference/required_check.qmd)
documentation for more details on its parameters.
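
Conceptually, a required check on `$.description` asks whether that
property is present at the top level of the package properties. The
plain-Python sketch below shows only that idea. `is_missing` is a
hypothetical helper, not part of `check_datapackage`, and it tests key
presence only; how the library treats empty or null values is not
covered here.

```python
from typing import Any


def is_missing(properties: dict[str, Any], field: str) -> bool:
    """Return True when a top-level field is absent from the properties."""
    return field not in properties


# Hypothetical minimal properties without a `description`.
package = {"name": "woolly-dormice"}
missing = is_missing(package, "description")
print(missing)
```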

To use this `RequiredCheck`, add it to the `Config` object passed to
`check()`:

```{python}
#| eval: false
config = cdp.Config(extensions=cdp.Extensions(required_checks=[description_required]))
cdp.check(properties=package_properties, config=config)
```

## Strict mode

The Data Package standard has both requirements and recommendations. By
8 changes: 4 additions & 4 deletions src/check_datapackage/custom_check.py
@@ -22,10 +22,10 @@ class CustomCheck:
check (Callable[[Any], bool]): A function that expresses the custom check.
It takes the value at the `jsonpath` location as input and
returns true if the check is met, false if it isn't.
type (str): The type of the custom check (e.g., a JSON schema type such as
"required", "type", "pattern", or "format", or a custom type). It will be
shown in error messages and can be used in an `Exclusion` object to exclude
the check. Each custom check should have a unique `type`.
type (str): An identifier for your custom check. It is shown in error
messages and is the value to pass as the `type` argument of an
`Exclusion` object if you want to exclude the check. Each custom check
should have a unique `type`.

Examples:
```{python}