Consistency check script #114

CIGbalance · 2025-10-16T10:00:48Z

Added a preliminary script for checking consistency, along with a README detailing the envisioned GitHub workflow.

Main question for review: Should I be checking for anything else? Is there anything that is too strict?

Closes #123

kvdblom · 2025-11-19T10:01:03Z

utils/validate_yaml.py

+    "textual description",
+]
+
+UNIQUE_FIELDS = ["name", "reference", "implementation"]


This triggers an error when reference is '' (empty), which is probably not what we want. Thinking about it: a reference can also introduce multiple problems, so it does not need to be unique in any case? What do you think?

Similarly, I expect the same happens for the implementation field, which may also not need to be unique, because a single package may implement multiple problems/benchmarks. (I guess it might depend a bit on how specific we want the reference to the implementation to be, but it probably cannot always be specific enough to be unique.)

Yeah, I added it since I was hoping this could weed out instances where the same problem was added multiple times, just with a different name. But you are right, there are several valid reasons why the reference/implementation could be the same. I am going to change it to a warning instead.
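A minimal sketch of the error/warning split discussed here, assuming name stays strictly unique while reference and implementation only produce warnings. The helper name check_novelty, the UNIQUE_WARNING_FIELDS contents, and the message wording are illustrative assumptions, not the code in this PR.

```python
# Hypothetical split of the unique fields; not the PR's actual constants.
UNIQUE_FIELDS = ["name"]                                 # duplicates are errors
UNIQUE_WARNING_FIELDS = ["reference", "implementation"]  # duplicates are only warnings


def check_novelty(new_entry, existing_entries):
    """Return (errors, warnings) for values that already occur in existing_entries."""
    errors, warnings = [], []
    for field in UNIQUE_FIELDS + UNIQUE_WARNING_FIELDS:
        value = new_entry.get(field)
        if not value:
            # Empty or missing values are not compared, so '' never counts as a duplicate.
            continue
        if any(entry.get(field) == value for entry in existing_entries):
            message = f"Field '{field}' with value '{value}' already exists."
            (errors if field in UNIQUE_FIELDS else warnings).append(message)
    return errors, warnings
```

With this split, two problems that share a reference still pass the check, and a reviewer only sees a warning to look at.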

kvdblom · 2025-11-19T10:03:58Z

utils/validate_yaml.py

+import sys
+
+# Define the required fields your YAML must have
+REQUIRED_FIELDS = [


We have the same list of fields in yaml_to_html.py (called default_columns). We should probably maintain it in a single place, and/or let one inherit the other?

Great catch, I am going to import it from that file.

kvdblom · 2025-11-19T10:13:54Z

utils/validate_yaml.py

+
+
+def check_fields(data):
+    if len(data) != len(REQUIRED_FIELDS):


I think this should test that there are at least this many fields. I would explicitly want people to add new fields for interesting properties we do not collect yet. Then:

Properties not in the REQUIRED_FIELDS should then be checked against a (to be created) NOT_REQUIRED_FIELDS which would contain all other fields (might be empty for now).

A message should be returned listing the new fields (found in neither the required nor the not-required list), to be verified by an OPL maintainer as actually new (not just a new name for an existing property), and then either added to the not-required list or fixed (or requested as a change on the PR) to have the correct existing name.

Ideally all other checks are still done before such a list is returned, so we know everything else already passes the checks, and verifying new fields (or maybe other similar things) is all that needs to be done.

Ok, I removed the length check because it doesn't really add anything in this case.
I am now adding warnings for unknown fields that the reviewers can check.
Adding to the optional fields variable will be done on merge in the merging script, so this is not part of this PR. For now there are just no optional fields.
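The behaviour described in this reply could look roughly like the sketch below: missing required fields are errors, and fields in neither list are reported as warnings for a reviewer to check. The field names are placeholders, and OPTIONAL_FIELDS and report are assumed names rather than the ones used in validate_yaml.py.

```python
# Placeholder field names; the real REQUIRED_FIELDS list is defined in the PR.
REQUIRED_FIELDS = ["name", "textual description", "reference", "implementation"]
OPTIONAL_FIELDS = []  # assumed name; empty for now, as described above


def check_fields(entry):
    """Return (missing, unknown) field names for one problem entry."""
    missing = [f for f in REQUIRED_FIELDS if f not in entry]
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    unknown = [f for f in entry if f not in known]
    return missing, unknown


def report(entry):
    """Print findings as GitHub Actions workflow commands; return True if there are no errors."""
    missing, unknown = check_fields(entry)
    for f in missing:
        print(f"::error::Missing required field '{f}'.")
    for f in unknown:
        print(f"::warning::Unknown field '{f}'; a maintainer should verify it is new.")
    return not missing
```

Because unknown fields only produce warnings, all other checks still run and decide the result, which matches the wish that verifying new fields is the only manual step left.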

kvdblom · 2025-11-20T15:03:55Z

Something else that came to mind that I did not think about or try yesterday:

What happens if multiple new problems/benchmarks are added at the same time (in a single yaml file)?

CIGbalance · 2025-11-27T10:23:55Z

Something else that came to mind that I did not think about or try yesterday:
* What happens if multiple new problems/benchmarks are added at the same time (in a single yaml file)?

Yes, very good point. I changed this now to test for each entry and not assume it is only one.

Note: I also changed the format of the prints. With the new syntax, they should be picked up automatically by the GitHub Action. This is to be tested in PR #127.
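As an illustration of checking every entry rather than assuming a single one, here is a small sketch. It assumes the loaded YAML is either one mapping or a list of mappings. The names as_entries, validate_all, and require_name are hypothetical. The printed `::error::` prefix is the GitHub Actions workflow-command syntax mentioned in the note.

```python
def as_entries(data):
    """Normalise loaded YAML data to a list of entries (a single mapping becomes a list of one)."""
    if data is None:
        return []
    if isinstance(data, dict):
        return [data]
    return list(data)


def validate_all(data, validate_entry):
    """Run validate_entry on every entry; return True only if all entries pass."""
    # A list (not a generator) is built so that every entry is checked and
    # reported, even after an earlier entry has already failed.
    results = [validate_entry(i, entry) for i, entry in enumerate(as_entries(data))]
    return all(results)


def require_name(index, entry):
    """Example per-entry check (hypothetical): the entry must have a non-empty name."""
    if not entry.get("name"):
        print(f"::error::Entry {index}: field 'name' is missing or empty.")
        return False
    return True
```

Calling validate_all(data, require_name) on a file with several new problems then reports each failing entry by its index.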

kvdblom · 2025-12-03T10:19:08Z

utils/validate_yaml.py

+    # Load existing problems
+    read_status, existing_data = read_data(PROBLEMS_FILE)
+    if read_status != 0:
+        print("::eror::Could not read existing problems for novelty check.")


typo: eror -> error

kvdblom · 2025-12-03T10:43:04Z

utils/validate_yaml.py

+        print("::eror::Could not read existing problems for novelty check.")
+        return False
+    assert existing_data is not None
+    for field in UNIQUE_FIELDS or UNIQUE_WARNING_FIELDS:


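This does not seem to work: the expression UNIQUE_FIELDS or UNIQUE_WARNING_FIELDS evaluates to UNIQUE_FIELDS whenever that list is non-empty, so the loop ends up only checking the UNIQUE_FIELDS.
Change to: UNIQUE_FIELDS + UNIQUE_WARNING_FIELDS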

An edge case I'm not entirely sure what to do with: Fields that are left empty also trigger this warning, e.g.: ::warning::Field 'reference' with value '' already exists. Consider choosing a unique value.
Maybe that is fine, but maybe it would be nicer to ignore (not give a warning) those cases.

This also made me realise/check: If UNIQUE_FIELDS are left empty (i.e., field exists, but has no value), they currently don't raise an error. It would probably be good if they did, because currently I can add a problem without a name.

This is probably the only key thing, the rest is minor and/or future problems.
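The two points above can be reproduced in a few lines. The list contents are assumptions about how the fields are split in this PR, and empty_unique_fields is a hypothetical helper showing one way to reject a blank name.

```python
UNIQUE_FIELDS = ["name"]                                 # assumed contents
UNIQUE_WARNING_FIELDS = ["reference", "implementation"]  # assumed contents

# `or` returns its first truthy operand, so a non-empty first list wins outright.
fields_with_or = UNIQUE_FIELDS or UNIQUE_WARNING_FIELDS
# `+` concatenates, so both lists end up in the loop.
fields_with_plus = UNIQUE_FIELDS + UNIQUE_WARNING_FIELDS


def empty_unique_fields(entry):
    """Return the UNIQUE_FIELDS that are missing or blank in entry."""
    return [f for f in UNIQUE_FIELDS if not str(entry.get(f) or "").strip()]
```

A validator could raise an error for every field returned by empty_unique_fields and skip the duplicate warning for blank values in UNIQUE_WARNING_FIELDS.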

kvdblom · 2025-12-03T10:47:14Z

utils/README.md

+The intended way of adding a new problem to the repository is thus as follows:
+
+* Change the [new_problem.yaml](new_problem.yaml) template file to fit the new problem.
+* Create a PR which modifies with the changes (for example with a fork).


The sentence reads oddly; maybe remove 'modifies'?

kvdblom · 2025-12-03T10:49:00Z

utils/README.md

+
+* On PR creation and commits to the PR, the [validate_yaml.py](validate_yaml.py) script is run to check that the YAML file is valid and consistent. It is expecting the changes to be in the [new_problem.yaml](new_problem.yaml) file.
+* Then the PR should be reviewed manually.
+* When the PR is merged into the main branch, a second script runs (which doesn't exist yet), that adds the content of [new_problem.yaml](new_problem.yaml) to the [problems.yaml](../problems.yaml) file, and returns it to its previous version.


I don't understand what you mean by "and returns it to its previous version".

kvdblom · 2025-12-03T10:57:00Z

utils/new_problem.yaml

Maybe not an issue for now, but when the new template is ready, there may be REQUIRED_FIELDS that are subfields of other fields. E.g., something like this:

variables:
  dimensionality: scalable
  variable type: continuous

These would not pass the current checks.
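One possible way to support such subfields, sketched under the assumption that required fields are written as paths with "/" between a parent field and its subfield. REQUIRED_PATHS, has_path, and missing_paths are hypothetical names, and the path notation is not part of the current template.

```python
# Hypothetical required paths; "/" separates a parent field from its subfield.
REQUIRED_PATHS = ["name", "variables/dimensionality", "variables/variable type"]


def has_path(entry, path, sep="/"):
    """Return True if the nested key path exists in entry."""
    node = entry
    for key in path.split(sep):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def missing_paths(entry):
    """Return the required paths that are absent from entry."""
    return [p for p in REQUIRED_PATHS if not has_path(entry, p)]
```

Top-level fields are simply paths of length one, so the same check would cover both the current flat template and a nested one.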

CIGbalance added 2 commits October 16, 2025 11:12

First basic validity check

8693177

unsaved changes

47cdb6e

CIGbalance marked this pull request as ready for review November 19, 2025 09:34

CIGbalance requested review from Dvermetten and kvdblom November 19, 2025 09:35

CIGbalance mentioned this pull request Nov 19, 2025

Add an automatic pipeline for contributing new problems to the repository #122

Open

kvdblom requested changes Nov 19, 2025

kvdblom mentioned this pull request Nov 19, 2025

Identify and implement missing steps for processing google form output to YAML on github #128

Open

requested changes

8f7c4be

CIGbalance requested a review from kvdblom November 27, 2025 10:25

kvdblom mentioned this pull request Nov 27, 2025

Feat/ga checks #127

Open

kvdblom requested changes Dec 3, 2025

3 participants
