Contributor

@chezou chezou commented Dec 10, 2024

Description

Fix `ValueError: YtY and Y should have the same number of columns`, raised when fitting `ImplicitALSWrapperModel` on GPU with `fit_features_together=True`.
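For readers without a GPU, the shape rule behind this error can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the code changed in this PR: `calculate_yty` below is a CPU stand-in for the solver call seen in the traceback, `n_feature_columns` is a hypothetical number of extra columns added when features are fit together, and the idea that the buffer was sized from the configured factor count is a reading of the PR title rather than something shown in this conversation.

```python
import numpy as np


def calculate_yty(Y: np.ndarray, YtY: np.ndarray, regularization: float) -> None:
    """CPU stand-in (assumption): write Y^T Y + regularization * I into YtY in place."""
    n_factors = Y.shape[1]
    if YtY.shape != (n_factors, n_factors):
        # Same message as the error reported in the failing test log.
        raise ValueError("YtY and Y should have the same number of columns")
    YtY[:] = Y.T @ Y + regularization * np.eye(n_factors, dtype=Y.dtype)


model_factors = 32     # latent factors configured on the base model, as in the test
n_feature_columns = 2  # hypothetical: extra columns when features are fit together
n_items = 3

rng = np.random.default_rng(0)
Y = rng.random((n_items, model_factors + n_feature_columns), dtype=np.float32)

# Sizing the buffer from the configured factor count ignores the feature columns.
bad_buffer = np.zeros((model_factors, model_factors), dtype=np.float32)
try:
    calculate_yty(Y, bad_buffer, 0.01)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

# Sizing the buffer from the matrix that is actually passed in always matches.
good_buffer = np.zeros((Y.shape[1], Y.shape[1]), dtype=np.float32)
calculate_yty(Y, good_buffer, 0.01)
```

Deriving the buffer size from `Y.shape[1]` keeps the two shapes tied together, so the check cannot fail regardless of how many feature columns are appended.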

Before

➜  RecTools git:(fix-als-gpu) poetry run pytest tests/models/test_implicit_als.py
================================================= test session starts ==================================================
platform linux -- Python 3.12.3, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/aki/src/RecTools
configfile: setup.cfg
plugins: cov-5.0.0, subtests-0.12.1, typeguard-4.4.1, mock-3.14.0
collected 77 items

tests/models/test_implicit_als.py .....................F.......................................................  [100%]

======================================================= FAILURES =======================================================
____________________ TestImplicitALSWrapperModel.test_happy_path_with_features[True-expected0-True] ____________________

self = <tests.models.test_implicit_als.TestImplicitALSWrapperModel object at 0x7fa8d0655af0>
fit_features_together = True
expected =   user_id item_id  rank
0      u1      i2     1
1      u3      i3     1
2      u3      i2     2
use_gpu = True
dataset_w_features = Dataset(user_id_map=IdMap(external_ids=array(['u1', 'u2', 'u3'], dtype=object)), item_id_map=IdMap(external_ids=array(...tored elements in Compressed Sparse Row format>, names=(('f1', '__is_direct_feature'), ('f2', '__is_direct_feature'))))

    @pytest.mark.parametrize(
        "fit_features_together,expected",
        (
            (
                True,
                pd.DataFrame(
                    {
                        Columns.User: ["u1", "u3", "u3"],
                        Columns.Item: ["i2", "i3", "i2"],
                        Columns.Rank: [1, 1, 2],
                    }
                ),
            ),
            (
                False,
                pd.DataFrame(
                    {
                        Columns.User: ["u1", "u3", "u3"],
                        Columns.Item: ["i2", "i2", "i3"],
                        Columns.Rank: [1, 1, 2],
                    }
                ),
            ),
        ),
    )
    def test_happy_path_with_features(
        self, fit_features_together: bool, expected: pd.DataFrame, use_gpu: bool, dataset_w_features: Dataset
    ) -> None:
        dataset = dataset_w_features

        # In case of big number of iterations there are differences between CPU and GPU results
        base_model = AlternatingLeastSquares(factors=32, num_threads=2, use_gpu=use_gpu)
        self._init_model_factors_inplace(base_model, dataset)

>       model = ImplicitALSWrapperModel(model=base_model, fit_features_together=fit_features_together).fit(dataset)

tests/models/test_implicit_als.py:244:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
rectools/models/base.py:306: in fit
    self._fit(dataset, *args, **kwargs)
rectools/models/implicit_als.py:195: in _fit
    self._fit_model_for_epochs(dataset, self.model.iterations)
rectools/models/implicit_als.py:211: in _fit_model_for_epochs
    fit_als_with_features_together_inplace(
rectools/models/implicit_als.py:459: in fit_als_with_features_together_inplace
    _fit_combined_factors_on_gpu_inplace(
rectools/models/implicit_als.py:618: in _fit_combined_factors_on_gpu_inplace
    model.solver.calculate_yty(Y, _YtY, model.regularization)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   ValueError: YtY and Y should have the same number of columns

_cuda.pyx:272: ValueError
=================================================== warnings summary ===================================================
tests/models/test_implicit_als.py: 20 warnings
  /home/aki/src/RecTools/rectools/dataset/identifiers.py:60: FutureWarning: unique with argument that is not not a Series, Index, ExtensionArray, or np.ndarray is deprecated and will raise in a future version.
    unq_values = pd.unique(values)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================== short test summary info ================================================
FAILED tests/models/test_implicit_als.py::TestImplicitALSWrapperModel::test_happy_path_with_features[True-expected0-True] - ValueError: YtY and Y should have the same number of columns
====================================== 1 failed, 76 passed, 20 warnings in 11.30s ======================================

After

➜  RecTools git:(fix-als-gpu) poetry run pytest tests/models/test_implicit_als.py
================================================= test session starts ==================================================
platform linux -- Python 3.12.3, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/aki/src/RecTools
configfile: setup.cfg
plugins: cov-5.0.0, subtests-0.12.1, typeguard-4.4.1, mock-3.14.0
collected 77 items

tests/models/test_implicit_als.py .............................................................................  [100%]

=================================================== warnings summary ===================================================
tests/models/test_implicit_als.py: 20 warnings
  /home/aki/src/RecTools/rectools/dataset/identifiers.py:60: FutureWarning: unique with argument that is not not a Series, Index, ExtensionArray, or np.ndarray is deprecated and will raise in a future version.
    unq_values = pd.unique(values)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================================== 77 passed, 20 warnings in 9.53s ============================================

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Optimization

How Has This Been Tested?

Before submitting a PR, please check yourself against the following list. It would save us quite a lot of time.

  • Have you read the contribution guide?
  • Have you updated the relevant docstrings? We're using Numpy format, please double-check yourself
  • Does your change require any new tests?
  • Have you updated the changelog file?


codecov bot commented Dec 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (9b3992e) to head (f9d317e).
Report is 99 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##              main      #228     +/-   ##
===========================================
  Coverage   100.00%   100.00%             
===========================================
  Files           45        59     +14     
  Lines         2242      3876   +1634     
===========================================
+ Hits          2242      3876   +1634     


@chezou chezou changed the title from "Fix Implicit ALS matrix zero assignment size" to "Fix Implicit ALS matrix zero assignment size on GPU" Dec 11, 2024
Collaborator

@blondered blondered left a comment

Missed this. Thank you!

@blondered blondered merged commit eabfad1 into MobileTeleSystems:main Dec 11, 2024
8 checks passed
@chezou chezou deleted the fix-als-gpu branch December 11, 2024 08:52