Skip to content

Unify the model parameter validation code introduced in PR 2092 with the DataValidator classes#2107

Open
GernotMaier wants to merge 13 commits intoilluminator_southfrom
illuminator_south_use_schema_module
Open

Unify the model parameter validation code introduced in PR 2092 with the DataValidator classes#2107
GernotMaier wants to merge 13 commits intoilluminator_southfrom
illuminator_south_use_schema_module

Conversation

@GernotMaier
Copy link
Copy Markdown
Contributor

@GernotMaier GernotMaier commented Apr 2, 2026

This PR should provide exactly the same functionality as PR #2092, but with model parameter validation entirely moved to the data_validator module. This should ensure the model parameters are validated with the same code - independently if it is a parameter dict from file, from DB, or provided through the overwrite model parameter mechanism.

The handling of values without units was inconsistent ("", None, dimensionless) and I tried to fix this - introduced a single function with the logic in value_conversion.normalize_dimensionless_unit.

@GernotMaier GernotMaier self-assigned this Apr 2, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors model-parameter validation so that the same DataValidator-based logic is used regardless of whether parameters originate from files, the DB, or the overwrite mechanism—aiming to preserve PR #2092 behavior while consolidating validation in simtools.data_model.

Changes:

  • Moved/centralized overwrite-time model-parameter validation to DataValidator.validate_model_parameter() and added schema helpers to fetch per-parameter type/unit by schema version.
  • Standardized dimensionless-unit handling via value_conversion.is_dimensionless_unit() / normalize_dimensionless_unit() and updated unit-conversion behavior/tests accordingly.
  • Expanded unit tests for heterogeneous-list validation and schema helper behavior; updated mock DB parameter fixtures to match schema expectations.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/simtools/model/model_parameter.py Routes overwrite validation through DataValidator and resolves schema versions/types/units from data_model.schema.
src/simtools/data_model/validate_data.py Adds heterogeneous-list validation flow and adjusts unit conversion outputs (now returning unit strings).
src/simtools/data_model/schema.py Adds helpers to load per-parameter schemas and extract type/unit with normalization.
src/simtools/utils/value_conversion.py Introduces shared helpers to detect/normalize dimensionless unit markers.
src/simtools/simtel/simtel_config_reader.py Uses the new dimensionless-unit helper when applying schema units.
src/simtools/db/db_handler.py Normalizes dimensionless units to None when inserting parameters.
src/simtools/data_model/model_data_writer.py Normalizes dimensionless units for model-parameter outputs and before writing.
tests/unit_tests/* Updates expectations for new validation/error behavior; adds coverage for new helpers and heterogeneous lists.
tests/resources/mock_db/mock_parameters.json Updates mock parameter units/values to align with schema-driven validation (but currently contains an inconsistency; see comments).

@GernotMaier GernotMaier marked this pull request as ready for review April 2, 2026 13:09
@GernotMaier GernotMaier requested a review from orelgueta April 2, 2026 13:09
Copy link
Copy Markdown
Contributor

@orelgueta orelgueta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!

Minor comments, but at least the num_gains tests should be looked at, unless I really misunderstood something.

Comment on lines +485 to +521
def get_parameter_type_from_schema(par_name, schema_version):
"""
Get parameter type from schema file for a specific schema version.

Parameters
----------
par_name: str
Name of the parameter.
schema_version: str
Schema version to look up.

Returns
-------
str or list
Type of the parameter (string for simple types, list for heterogeneous types).
"""
return _get_parameter_attribute_from_schema(par_name, schema_version, "type")


def get_parameter_unit_from_schema(par_name, schema_version):
"""
Get parameter unit from schema file for a specific schema version.

Parameters
----------
par_name: str
Name of the parameter.
schema_version: str
Schema version to look up.

Returns
-------
str or list or None
Unit of the parameter (string for simple types, list for heterogeneous types,
None for dimensionless parameters).
"""
return _get_parameter_attribute_from_schema(par_name, schema_version, "unit")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are quite trivial functions with a lot of documentation just to avoid opening the API to using the private function _get_parameter_attribute_from_schema. Is there a good reason not to make that one public and give as examples the type and unit in the documentation?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree - changed it to a public get_parameter_attribute_from_schema function only.

(story behind is that these where two long function until I realised that they are doing essentially the same).

tel_model = copy.deepcopy(telescope_model_lst)

changes = {"num_gains": {"value": 4, "version": "2.0.0"}}
changes = {"num_gains": {"value": 2, "version": "2.0.0"}}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is originally 2, aren't you worried that this change to the test makes it less robust? If the changing of the value failed, this test will not catch it because it will still be two gains.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Note that the change was necessary as the schema is not used for the validation: num_gains = 4 are not allowed anymore.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed these values now to make sure we always test a change to a different value. Please check if you agree.


# Simple value (not a dict with 'value' or 'version' keys)
changes = {"num_gains": 5}
changes = {"num_gains": 2}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above about the change of the number of gains.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

side_effect=FileNotFoundError("schema file missing"),
)

tel_model.overwrite_model_parameter("num_gains", 2)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, should test different number of gains.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

"num_gains": {
"version": "1.0.0",
"value": 999,
"value": 2,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, number of gains

"num_gains": {
"version": "1.0.0",
"value": 10,
"value": 2,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_gains

Comment on lines 522 to 533
"""Test overwrite_model_parameter with parameter_version but no value."""
tel_model = copy.deepcopy(telescope_model_lst)

# Mock get_model_parameter to return parameter dict
mocker.patch.object(tel_model.db, "get_model_parameter", return_value=num_gains_dict)

# Call with only parameter_version (no value)
tel_model.overwrite_model_parameter("num_gains", value=None, parameter_version="2.0.0")

# Verify the parameter was updated
assert tel_model.parameters["num_gains"]["value"] == 999
assert tel_model.parameters["num_gains"]["parameter_version"] == "2.0.0"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the overwrite worked without going through the validation because the num_gains value was changed to 999 which shouldn't be allowed

@ctao-sonarqube
Copy link
Copy Markdown

ctao-sonarqube bot commented Apr 2, 2026

@orelgueta
Copy link
Copy Markdown
Contributor

I made a couple of changes and made an additional comment. From my point of view, this specific PR can be merged, but you should take a look at those changes before we do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants