TST use global_random_seed in sklearn/_loss/tests/test_glm_distribution.py #23741

Merged
merged 1 commit from test-glm-distribution into scikit-learn:main on Jun 23, 2022

Conversation

marenwestermann (Member)

Reference Issues/PRs

#22827

What does this implement/fix? Explain your changes.

I used global_random_seed in the function test_deviance_derivative. This caused the following three test failures, all in the absolute-error assertion at the end of the test (see details below). I therefore increased the tolerance from 1e-6 to 3e-6; I'm not sure whether this is desired, though.
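
For reference, the change itself is small; a minimal sketch of the updated test (assuming the test previously used a hard-coded seed; the full body appears verbatim in the failure output below):

    def test_deviance_derivative(family, global_random_seed):
        # global_random_seed is scikit-learn's pytest fixture; the seed(s)
        # it yields are controlled by the SKLEARN_TESTS_GLOBAL_RANDOM_SEED
        # environment variable used in the command below.
        rng = np.random.RandomState(global_random_seed)
        ...
        # tolerance relaxed from 1e-6 to 3e-6 so that all seeds pass
        assert abs(err) < 3e-6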

(sklearn-dev) ➜  scikit-learn git:(main) ✗ SKLEARN_TESTS_GLOBAL_RANDOM_SEED="all" pytest sklearn/_loss/tests/test_glm_distribution.py -k test_deviance_derivative
============================================================================================= test session starts ==============================================================================================
platform darwin -- Python 3.10.2, pytest-7.0.1, pluggy-1.0.0
rootdir: /Users/maren/open_source/scikit-learn, configfile: setup.cfg
plugins: xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 917 items / 17 deselected / 900 selected

sklearn/_loss/tests/test_glm_distribution.py ..............................................................................................................FF...F....................................... [ 17%]
........................................................................................................................................................................................................ [ 39%]
........................................................................................................................................................................................................ [ 61%]
........................................................................................................................................................................................................ [ 83%]
.................................................................................................................................................                                                        [100%]

=================================================================================================== FAILURES ===================================================================================================
________________________________________________________________________________ test_deviance_derivative[12-GammaDistribution] ________________________________________________________________________________

family = <sklearn._loss.glm_distribution.GammaDistribution object at 0x142dc4e80>, global_random_seed = 12

    @pytest.mark.parametrize(
        "family",
        [
            NormalDistribution(),
            PoissonDistribution(),
            GammaDistribution(),
            InverseGaussianDistribution(),
            TweedieDistribution(power=-2.5),
            TweedieDistribution(power=-1),
            TweedieDistribution(power=1.5),
            TweedieDistribution(power=2.5),
            TweedieDistribution(power=-4),
        ],
        ids=lambda x: x.__class__.__name__,
    )
    def test_deviance_derivative(family, global_random_seed):
        """Test deviance derivative for different families."""
        rng = np.random.RandomState(global_random_seed)
        y_true = rng.rand(10)
        # make data positive
        y_true += np.abs(y_true.min()) + 1e-2

        y_pred = y_true + np.fmax(rng.rand(10), 0.0)

        dev = family.deviance(y_true, y_pred)
        assert isinstance(dev, float)
        dev_derivative = family.deviance_derivative(y_true, y_pred)
        assert dev_derivative.shape == y_pred.shape

        err = check_grad(
            lambda y_pred: family.deviance(y_true, y_pred),
            lambda y_pred: family.deviance_derivative(y_true, y_pred),
            y_pred,
        ) / np.linalg.norm(dev_derivative)
>       assert abs(err) < 1e-6
E       assert 1.262523602238264e-06 < 1e-06
E        +  where 1.262523602238264e-06 = abs(1.262523602238264e-06)

sklearn/_loss/tests/test_glm_distribution.py:123: AssertionError
___________________________________________________________________________ test_deviance_derivative[12-InverseGaussianDistribution] ___________________________________________________________________________

family = <sklearn._loss.glm_distribution.InverseGaussianDistribution object at 0x142dc4ee0>, global_random_seed = 12

    (same test body as in the first failure above)
>       assert abs(err) < 1e-6
E       assert 2.712845682880818e-06 < 1e-06
E        +  where 2.712845682880818e-06 = abs(2.712845682880818e-06)

sklearn/_loss/tests/test_glm_distribution.py:123: AssertionError
______________________________________________________________________________ test_deviance_derivative[12-TweedieDistribution3] _______________________________________________________________________________

family = <sklearn._loss.glm_distribution.TweedieDistribution object at 0x142dc5060>, global_random_seed = 12

    (same test body as in the first failure above)
>       assert abs(err) < 1e-6
E       assert 2.415069957951342e-06 < 1e-06
E        +  where 2.415069957951342e-06 = abs(2.415069957951342e-06)

sklearn/_loss/tests/test_glm_distribution.py:123: AssertionError
================================================================================= 3 failed, 897 passed, 17 deselected in 0.69s =================================================================================
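
For context (not part of the PR): scipy.optimize.check_grad returns the 2-norm of the difference between an analytical gradient and a finite-difference approximation of it, so the reported error depends on the evaluation point and hence on the random seed; seed 12 simply draws points where the finite-difference error lands slightly above 1e-6. A minimal self-contained illustration with a hypothetical function (not from the PR):

    import numpy as np
    from scipy.optimize import check_grad

    def f(x):
        # simple scalar objective: f(x) = sum(x**2)
        return np.sum(x**2)

    def grad_f(x):
        # analytical gradient of f
        return 2 * x

    rng = np.random.RandomState(12)
    x0 = rng.rand(10)
    # 2-norm of the difference between the finite-difference and analytical
    # gradients at x0; small but nonzero, and it varies with the seed
    print(check_grad(f, grad_f, x0))

Note that SKLEARN_TESTS_GLOBAL_RANDOM_SEED also accepts a single seed (e.g. SKLEARN_TESTS_GLOBAL_RANDOM_SEED="12"), which makes it easy to reproduce just the failing cases.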

Any other comments?

ping @rth

rth (Member) left a comment:

Thanks @marenwestermann!

@rth merged commit 9dbbdbd into scikit-learn:main on Jun 23, 2022
@marenwestermann deleted the test-glm-distribution branch on June 24, 2022 at 12:10
ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request on Jul 11, 2022:
TST use global_random_seed in sklearn/_loss/tests/test_glm_distribution.py (scikit-learn#23741)

Co-authored-by: Maren Westermann <maren.westermann@free-now.com>