Unify the model parameter validation code introduced in PR 2092 with the DataValidator classes#2107
Unify the model parameter validation code introduced in PR 2092 with the DataValidator classes#2107GernotMaier wants to merge 13 commits intoilluminator_southfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors model-parameter validation so that the same DataValidator-based logic is used regardless of whether parameters originate from files, the DB, or the overwrite mechanism—aiming to preserve PR #2092 behavior while consolidating validation in simtools.data_model.
Changes:
- Moved/centralized overwrite-time model-parameter validation to
DataValidator.validate_model_parameter()and added schema helpers to fetch per-parameter type/unit by schema version. - Standardized dimensionless-unit handling via
value_conversion.is_dimensionless_unit()/normalize_dimensionless_unit()and updated unit-conversion behavior/tests accordingly. - Expanded unit tests for heterogeneous-list validation and schema helper behavior; updated mock DB parameter fixtures to match schema expectations.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/simtools/model/model_parameter.py |
Routes overwrite validation through DataValidator and resolves schema versions/types/units from data_model.schema. |
src/simtools/data_model/validate_data.py |
Adds heterogeneous-list validation flow and adjusts unit conversion outputs (now returning unit strings). |
src/simtools/data_model/schema.py |
Adds helpers to load per-parameter schemas and extract type/unit with normalization. |
src/simtools/utils/value_conversion.py |
Introduces shared helpers to detect/normalize dimensionless unit markers. |
src/simtools/simtel/simtel_config_reader.py |
Uses the new dimensionless-unit helper when applying schema units. |
src/simtools/db/db_handler.py |
Normalizes dimensionless units to None when inserting parameters. |
src/simtools/data_model/model_data_writer.py |
Normalizes dimensionless units for model-parameter outputs and before writing. |
tests/unit_tests/* |
Updates expectations for new validation/error behavior; adds coverage for new helpers and heterogeneous lists. |
tests/resources/mock_db/mock_parameters.json |
Updates mock parameter units/values to align with schema-driven validation (but currently contains an inconsistency; see comments). |
orelgueta
left a comment
There was a problem hiding this comment.
Thanks for this!
Minor comments, but at least the num_gains tests should be looked at, unless I really misunderstood something.
src/simtools/data_model/schema.py
Outdated
| def get_parameter_type_from_schema(par_name, schema_version): | ||
| """ | ||
| Get parameter type from schema file for a specific schema version. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| par_name: str | ||
| Name of the parameter. | ||
| schema_version: str | ||
| Schema version to look up. | ||
|
|
||
| Returns | ||
| ------- | ||
| str or list | ||
| Type of the parameter (string for simple types, list for heterogeneous types). | ||
| """ | ||
| return _get_parameter_attribute_from_schema(par_name, schema_version, "type") | ||
|
|
||
|
|
||
| def get_parameter_unit_from_schema(par_name, schema_version): | ||
| """ | ||
| Get parameter unit from schema file for a specific schema version. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| par_name: str | ||
| Name of the parameter. | ||
| schema_version: str | ||
| Schema version to look up. | ||
|
|
||
| Returns | ||
| ------- | ||
| str or list or None | ||
| Unit of the parameter (string for simple types, list for heterogeneous types, | ||
| None for dimensionless parameters). | ||
| """ | ||
| return _get_parameter_attribute_from_schema(par_name, schema_version, "unit") |
There was a problem hiding this comment.
These are quite trivial functions with a lot of documentation just to avoid opening the API to using the private function _get_parameter_attribute_from_schema. Is there a good reason not to make that one public and give as examples the type and unit in the documentation?
There was a problem hiding this comment.
Agree - changed it to a public get_parameter_attribute_from_schema function only.
(story behind is that these where two long function until I realised that they are doing essentially the same).
| tel_model = copy.deepcopy(telescope_model_lst) | ||
|
|
||
| changes = {"num_gains": {"value": 4, "version": "2.0.0"}} | ||
| changes = {"num_gains": {"value": 2, "version": "2.0.0"}} |
There was a problem hiding this comment.
Since it is originally 2, aren't you worried that this change to the test makes it less robust? If the changing of the value failed, this test will not catch it because it will still be two gains.
There was a problem hiding this comment.
Yes! Note that the change was necessary as the schema is not used for the validation: num_gains = 4 are not allowed anymore.
There was a problem hiding this comment.
I changed these values now to make sure we always test a change to a different value. Please check if you agree.
|
|
||
| # Simple value (not a dict with 'value' or 'version' keys) | ||
| changes = {"num_gains": 5} | ||
| changes = {"num_gains": 2} |
There was a problem hiding this comment.
Same comment as above about the change of the number of gains.
| side_effect=FileNotFoundError("schema file missing"), | ||
| ) | ||
|
|
||
| tel_model.overwrite_model_parameter("num_gains", 2) |
There was a problem hiding this comment.
Again, should test different number of gains.
| "num_gains": { | ||
| "version": "1.0.0", | ||
| "value": 999, | ||
| "value": 2, |
| "num_gains": { | ||
| "version": "1.0.0", | ||
| "value": 10, | ||
| "value": 2, |
| """Test overwrite_model_parameter with parameter_version but no value.""" | ||
| tel_model = copy.deepcopy(telescope_model_lst) | ||
|
|
||
| # Mock get_model_parameter to return parameter dict | ||
| mocker.patch.object(tel_model.db, "get_model_parameter", return_value=num_gains_dict) | ||
|
|
||
| # Call with only parameter_version (no value) | ||
| tel_model.overwrite_model_parameter("num_gains", value=None, parameter_version="2.0.0") | ||
|
|
||
| # Verify the parameter was updated | ||
| assert tel_model.parameters["num_gains"]["value"] == 999 | ||
| assert tel_model.parameters["num_gains"]["parameter_version"] == "2.0.0" |
There was a problem hiding this comment.
Here the overwrite worked without going through the validation because the num_gains value was changed to 999 which shouldn't be allowed
|
|
I made a couple of changes and made an additional comment. From my point of view, this specific PR can be merged, but you should take a look at those changes before we do. |



This PR should provide exactly the same functionality as PR #2092, but with model parameter validation entirely moved to the data_validator module. This should ensure the model parameters are validated with the same code - independently if it is a parameter dict from file, from DB, or provided through the overwrite model parameter mechanism.
The handling of values without units was inconsistent (
"",None,dimensionless) and I tried to fix this - introduced a single function with the logic invalue_conversion.normalize_dimensionless_unit.