Skip to content

Conversation

@sandervh14
Copy link
Contributor

Story Title

train/selection/validation split error

Changes made

This proposes a fix for the floating point rounding error in equality checking of the train_prop + selection_prop + validation_prop against 1.0.
The unit test is a little counter-intuitive since it needs to check if a ValueError is NOT raised, but it does carefully test the condition.

How does the solution address the problem

The fixed is based on math.isclose().

Linked issues

Resolves #33

…ain_prop + selection_prop + validation_prop against 1.0.
@sandervh14 sandervh14 requested a review from JanBenisek March 16, 2021 21:15
@sandervh14 sandervh14 linked an issue Mar 16, 2021 that may be closed by this pull request
@JanBenisek JanBenisek added this to the v1.0.2 milestone Mar 17, 2021
Copy link
Contributor

@JanBenisek JanBenisek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I commented, please keep only one test but add more test cases to it.

selection_prop,
error_msg)

def test_train_selection_validation_split_error_correct_prop_no_rounding_error(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, seems like we already have a unit test for that behavior here:

def test_train_selection_validation_split(self, train_prop: float,

I'd suggest keep only one of the tests (perhaps the original one), BUT add more possible values to test.

"""

if train_prop + selection_prop + validation_prop != 1.0:
if not math.isclose(train_prop + selection_prop + validation_prop, 1.0):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this solution, good idea!

@sandervh14 sandervh14 requested a review from JanBenisek April 1, 2021 08:28
@JanBenisek JanBenisek merged commit d315590 into develop Apr 1, 2021
@JanBenisek JanBenisek deleted the issue-#33-data-split-error branch April 1, 2021 08:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

train/selection/validation split error

2 participants