
ENH Add periodic extrapolation to SplineTransformer #19483


Merged
53 commits merged on Mar 17, 2021

Conversation

mlondschien
Contributor

Reference Issues/PRs

Extends #18368.

What does this implement/fix? Explain your changes.

This PR adds the option for periodic extrapolation (with periodic splines) to the SplineTransformer.
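For illustration, a minimal usage sketch of the new option (the feature values and knot positions below are hypothetical, not taken from the PR):

import numpy as np
from sklearn.preprocessing import SplineTransformer

# Day-of-year-like feature; with extrapolation="periodic" the period is the
# distance between the first and last knot (here 0 to 365).
X = np.arange(365)[:, None]
spline = SplineTransformer(
    degree=3,
    knots=np.linspace(0, 365, 13)[:, None],
    extrapolation="periodic",
)
Xt = spline.fit_transform(X)
# Values one period apart map to the same spline features.
assert np.allclose(spline.transform([[0]]), spline.transform([[365]]))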

Any other comments?

Thank you @lorentzenchr for adding the SplineTransformer! I often use periodic splines, e.g. to model effects for the day of the year, and believe this feature will be very helpful. Please let me know if we should expand the documentation / examples.

@mlondschien mlondschien changed the title Splines periodic Add periodic extrapolation to SplineTransformer Feb 18, 2021
@ogrisel
Member

ogrisel commented Feb 19, 2021

If supporting old versions of scipy is too much of a hassle, I think it's fine to raise a NotImplementedError with a message explaining that scipy needs to be upgraded to use this new feature.

@ogrisel
Member

ogrisel commented Feb 19, 2021

Alternatively, if the missing code in scipy is a rather self-contained pure-Python function, feel free to backport it to sklearn/utils/fixes.py with a comment crediting its provenance.

@mlondschien
Contributor Author

mlondschien commented Feb 19, 2021

If supporting old versions of scipy is too much of a hassle, I think it's fine to raise a NotImplementedError with a message explaining that scipy needs to be upgraded to use this new feature.

Alternatively, if the missing code in scipy is a rather self-contained pure-Python function, feel free to backport it to sklearn/utils/fixes.py with a comment crediting its provenance.

The only code needed to support scipy < 1.0.0 is the following (basically a copy of the code in scipy):

n = spl.t.size - spl.k - 1
X[:, i] = spl.t[spl.k] + (X[:, i] - spl.t[spl.k]) % (spl.t[n] - spl.t[spl.k])

IMO this is fine, but raising a NotImplementedError works as well. Adding a backport to sklearn/utils/fixes requires more code than the above and might be too much added complexity for my taste.
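For readers less familiar with scipy's spline internals, a small self-contained sketch of what that wrap-around does (the helper name is made up; spl stands for a fitted scipy.interpolate.BSpline):

import numpy as np

def _wrap_into_period(x, spl):
    # Map x into the spline's base interval [t[k], t[n]]; newer scipy versions
    # perform this wrap-around internally for extrapolate="periodic".
    n = spl.t.size - spl.k - 1
    period = spl.t[n] - spl.t[spl.k]
    return spl.t[spl.k] + (x - spl.t[spl.k]) % period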

@ogrisel
Member

ogrisel commented Feb 19, 2021

Alright, I am fine with keeping this as it is then.

@ogrisel
Member

ogrisel commented Feb 19, 2021

Could you please extend the spline example with a new section on modeling periodic data (with synthetic data with a mix of yearly and weekly signals)? You can use pandas to generate such fake data.

https://scikit-learn.org/dev/auto_examples/linear_model/plot_polynomial_interpolation.html#sphx-glr-auto-examples-linear-model-plot-polynomial-interpolation-py

Note: as I have no experience with using splines myself, that would help me review the PR by seeing how to use this new feature correctly and assessing its correctness.
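For reference, one possible way to generate such synthetic periodic data with pandas (all names and parameters here are illustrative, not taken from the final example):

import numpy as np
import pandas as pd

rng = np.random.RandomState(42)
index = pd.date_range("2019-01-01", "2020-12-31", freq="D")
dayofyear = index.dayofyear.to_numpy()
dayofweek = index.dayofweek.to_numpy()

# Mix of a yearly and a weekly signal plus noise.
y = (
    np.sin(2 * np.pi * dayofyear / 365.25)
    + 0.5 * np.sin(2 * np.pi * dayofweek / 7)
    + 0.1 * rng.randn(len(index))
)
df = pd.DataFrame(
    {"dayofyear": dayofyear, "dayofweek": dayofweek, "y": y}, index=index
)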

@mlondschien
Contributor Author

I've added the test and an example. I used a synthetic example as I believe it better visualizes how periodic splines work. I am open to other suggestions though. How do I check whether the example renders correctly?

@mlondschien mlondschien requested a review from ogrisel February 22, 2021 10:30
@mlondschien
Contributor Author

Since you implemented the transformer, it would be very helpful if you could have a look @lorentzenchr.

@lorentzenchr
Member

Since you implemented the transformer, it would be very helpful if you could have a look

@mlondschien I will. Thanks for working on this extension.

Member

@lorentzenchr lorentzenchr left a comment


On the test side, I'm missing a test that the extrapolation works. Let's say the period length is 1 and fitting is on x in [0, 1]; then transform([0, 1]) == transform([1, 2]), in particular at the knots.

Maybe one should state more clearly that the periodicity is given either by the manually specified knots or by the range of the training data. It is neither detected automatically, nor can you specify the periodicity directly (with a parameter).
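A minimal sketch of that test idea, assuming the default uniform knots so that the period learned from x in [0, 1] is 1:

import numpy as np
from numpy.testing import assert_allclose
from sklearn.preprocessing import SplineTransformer

X = np.linspace(0, 1, 101)[:, None]
spline = SplineTransformer(degree=3, extrapolation="periodic").fit(X)
# Shifting the input by one full period must not change the features,
# in particular at the knots themselves.
assert_allclose(spline.transform(X), spline.transform(X + 1), atol=1e-12)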

mlondschien and others added 2 commits February 26, 2021 19:12
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
@mlondschien
Contributor Author

Thanks @lorentzenchr for having a look.

On the test side, I'm missing a test that the extrapolation works. Let's say the period length is 1 and fitting is on x in [0, 1]; then transform([0, 1]) == transform([1, 2]), in particular at the knots.

This is done in this test, even if not as explicitly as you suggest. However, this (and the test you suggested) would pass for any periodic continuation of splines (e.g. normal B-splines with extrapolate="periodic"). I can add a test that asserts that transformers with knots at [1, 2, 3, 4, 5] and [2, 3, 4, 5, 6] yield the same result up to a permutation of coordinates.
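A rough sketch of that permutation check (the degree and the assertion style are my own choice; only membership of columns is asserted, not a particular ordering):

import numpy as np
from sklearn.preprocessing import SplineTransformer

X = np.linspace(1.0, 5.0, 50)[:, None]
knots_a = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
knots_b = knots_a + 1.0  # [2, 3, 4, 5, 6]: the same knots on a circle of period 4

Xt_a = SplineTransformer(
    degree=3, knots=knots_a, extrapolation="periodic"
).fit_transform(X)
Xt_b = SplineTransformer(
    degree=3, knots=knots_b, extrapolation="periodic"
).fit_transform(X)

# Every column obtained with knots_a matches some column obtained with knots_b.
for column in Xt_a.T:
    assert any(np.allclose(column, other) for other in Xt_b.T)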

Maybe one should state more clearly that the periodicity is given either by the manually specified knots or by the range of the training data. It is neither detected automatically, nor can you specify the periodicity directly (with a parameter).

I have tried to make this more explicit in the example. Where / how would you specify this in the docstring? One could add

If 'periodic', periodic splines with a periodicity equal to the distance between the first and last knot are used. Periodic splines enforce equal function values and derivatives at the first and last knots.

to the extrapolation parameter. Maybe this could be made more concise.

Note that this statement is no longer correct. Would you like to point out the different number of splines for periodic splines or is this a minor detail?

@lorentzenchr
Member

Periodic b-splines turn out to be surprisingly complicated. But we're close to the finish.

@mlondschien
Contributor Author

mlondschien commented Mar 2, 2021

I've expanded the example to suggest that users provide knots explicitly if the periodicity is known. I've also tried to make the smoothness test more self-explanatory.

Yes to spare compute resources.

This makes sense. I would have expected the tests to never finish (i.e. yellow dot) instead of failing though.

Actually it might be worth adding a test that checks the periodicity extrapolation of the spline transformer (without sticking it into a pipeline with a regressor) with 2 input features, e.g. feature 0 with values in [-1, 1] and feature 1 with values in [0, 5] in the training set. And then checking the transformed features we get on a test set with values in [1, 2] for feature 0 and [5, 10] for feature 1.

Sounds reasonable. I've added such a test.
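A compact sketch of such a check, with training ranges chosen so that the learned periods are exactly 2 and 5 (details are mine, not the committed test):

import numpy as np
from numpy.testing import assert_allclose
from sklearn.preprocessing import SplineTransformer

# Feature 0 spans [-1, 1] (period 2), feature 1 spans [0, 5] (period 5).
X_train = np.c_[np.linspace(-1, 1, 100), np.linspace(0, 5, 100)]
spline = SplineTransformer(degree=3, extrapolation="periodic").fit(X_train)

rng = np.random.RandomState(0)
X_test = np.c_[rng.uniform(-1, 1, 20), rng.uniform(0, 5, 20)]
# Shifting each feature by one full period must reproduce the transformed
# features of the unshifted test set.
assert_allclose(
    spline.transform(X_test),
    spline.transform(X_test + np.array([2.0, 5.0])),
    atol=1e-12,
)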

Periodic b-splines turn out to be surprisingly complicated. But we're close to the finish.

Thanks for investing the time to review.

@ogrisel
Member

ogrisel commented Mar 2, 2021

I pushed ca92728 to improve test_spline_transformer_periodic_splines_smoothness to test continuity on the raw diff, prior to dividing by delta (to get the derivative). I also changed the assertion to compare the scalar abs max to tol, to get more informative values in the pytest assertion error message in case of failure.

# Before: bound the absolute value of the (d-1)-th derivative itself
assert (np.abs(dXt) < tol).all()
# After: check continuity via the raw diff and its scalar abs max
diff = np.diff(dXt, axis=0)
assert np.abs(diff).max() < tol
Contributor Author

@mlondschien mlondschien Mar 2, 2021


Note that this (and below) depends on tol / delta ~ 10, i.e. if we reduce (or increase) the number of rows in X substantially, this test will fail.

Member


I don't understand why, though. Continuity should always be checked on the raw diff, assuming that the steps are fine-grained enough, no?

Otherwise we "check continuity" by bounding the absolute value of the derivative itself which seems wrong to me.

Contributor Author


Given continuity, the "raw diff" scales with the order of the stepsize (delta); it will be roughly of the same order. Above, your stepsize is ~1e-3, so checking against tol=1e-2 is fine. If we increase the stepsize to 0.1, the diff will be of order diff ~ 0.1 > tol = 1e-2.

@mlondschien
Contributor Author

@lorentzenchr, if you could have another (final 😉?) look, that would be great.

@lorentzenchr
Member

@mlondschien This PR looks very good and, if necessary, we can improve some minor things in later PRs. I have 2 remaining issues, a small one:

  • Add the periodic case to test_spline_transformer_unity_decomposition.

And a large one, concerning test_spline_transformer_periodic_splines_smoothness: tol=1e-2 is not a very strict test. Also, we don't need to test continuity at all values of X, just at the boundaries b of the period. Something like

eps = 1e-8
for b in [0, 8, 16]:
    Xnew = np.linspace(b - degree * eps, b + degree * eps, num=2 * degree + 1)[:, None]
    Xt = transformer.fit_transform(Xnew).squeeze()
    assert Xt[degree - 1] == pytest.approx(Xt[degree + 1])  # compare at b-eps and b+eps
    # TODO: compare left n-th derivative at b-eps with right n-th derivative at b+eps

What do you think?

@mlondschien
Contributor Author

  • Add the periodic case to test_spline_transformer_unity_decomposition.

Sure!

tol=1e-2 is not a very strict test. Also, we don't need to test continuity at all values of X, just at the boundaries b of the period. Something like

eps = 1e-8
for b in [0, 8, 16]:
    Xnew = np.linspace(b - degree * eps, b + degree * eps, num=2 * degree + 1)[:, None]
    Xt = transformer.fit_transform(Xnew).squeeze()
    assert Xt[degree - 1] == pytest.approx(Xt[degree + 1])  # compare at b-eps and b+eps
    # TODO: compare left n-th derivative at b-eps with right n-th derivative at b+eps

What do you think?

IMO the test from before, i.e. with X.shape = (1000, 1), was strict enough, as we checked that the diff was an order of magnitude above the tolerance for the last iteration, where we expect a discontinuity. This makes sure that we did not choose too large a tolerance. However, if you prefer, I can also adjust the test to only check continuity at the boundary, choose a smaller stepsize there and thus a smaller tolerance. We could then also test continuity of the constant extrapolation and C1 smoothness of the linear extrapolation in the same test via parametrization.

@mlondschien
Contributor Author

I have adjusted (but not pushed) the smoothness test as proposed by you @lorentzenchr. See here:

import numpy as np
import pytest
from numpy.testing import assert_allclose
from sklearn.preprocessing import SplineTransformer


@pytest.mark.parametrize("degree", [3, 5])
def test_spline_transformer_periodic_splines_smoothness(degree):
    """Test that spline transformation is smooth at first / last knot."""
    X = np.linspace(0, 8, 10)[:, None]

    transformer = SplineTransformer(
        degree=degree,
        extrapolation="periodic",
        knots=[[0.0], [1.0], [3.0], [4.0], [5.0], [8.0]],
    )
    transformer.fit(X)

    eps = 1e-8

    for b in [0, 8, 16]:
        dXt = transformer.transform(
            np.linspace(
                b - degree * eps,
                b + degree * eps,
                2 * degree + 1,
            )[:, None]
        )

        for d in range(degree):
            assert_allclose(dXt[0, :], dXt[-1, :], atol=10 * eps, rtol=0)
            dXt = np.diff(dXt, axis=0) / eps

This fails due to problems with precision. The original dXt has a precision of around 1e-16, and so does np.diff(dXt, axis=0). In the second iteration, due to dividing by eps, dXt has a precision of around 1e-8. As you can imagine, the test fails for degree >= 3.

Possible options:

  • Use lower precision here, such that eps ** degree >> 1e-16. For degree = 5, this implies eps > 1e-3.
  • Go back to the original test.

Let me know what you think @lorentzenchr. If you had something different in mind that does not have this problem, please let me know. I don't believe that we should hold off on merging this PR due to details about tolerances / precision. Feel free to add any changes you would like to the smoothness test onto this PR.

@lorentzenchr
Member

What about looking at b-eps and b+eps with eps=1e-12, but then using a different step, i.e. h=1e-3, for the numerical derivatives? I'm really only interested in the 2 values at b-eps and b+eps. In addition, one can use higher order formulas for numerical derivatives, see https://en.wikipedia.org/wiki/Finite_difference_coefficient and https://github.com/lorentzenchr/scikit-learn/blob/4e007384e7eb4e86553a1bb44a220d9a7ce48134/sklearn/_loss/tests/test_loss.py#L89-L104.
Ideally, one would use backward differences at b-eps and forward differences at b+eps.
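For what it's worth, a small sketch of that idea using scipy.misc.central_diff_weights (available in the scipy versions current at the time of this PR; the weights correspond to the table on the Wikipedia page, and the helper name is made up):

import numpy as np
from scipy.misc import central_diff_weights

def numerical_derivative(f, x0, n=2, h=1e-3, npoints=7):
    # Central finite-difference approximation of the n-th derivative of f at
    # x0, using an npoints-wide stencil with spacing h.
    weights = central_diff_weights(npoints, ndiv=n)
    xs = x0 + h * (np.arange(npoints) - npoints // 2)
    return np.dot(weights, f(xs)) / h ** n

print(numerical_derivative(np.sin, 1.0, n=2))  # close to -sin(1.0) ~ -0.84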

@mlondschien
Contributor Author

In addition, one can use higher order formulas for numerical derivatives

I could not find an implementation for forward / backward diff weights in scipy. Are you fine with scipy.misc.central_diff_weights?

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@lorentzenchr
Member

I could not find an implementation for forward / backward diff weights in scipy. Are you fine with scipy.misc.central_diff_weights?

It's not my intention to make this PR more complex. If you could use https://en.wikipedia.org/wiki/Finite_difference_coefficient, that'd be great. Using scipy.misc.central_diff_weights might also do the job. So I'd suggest using that and adding a comment so that a future PR could improve this with forward and backward differences.

@mlondschien
Contributor Author

It's not my intention to make this PR more complex.

To be frank, IMO it's too late for this. I have probably spent as much time on the smoothness test as on the periodic splines themselves.

If you could use https://en.wikipedia.org/wiki/Finite_difference_coefficient, that'd be great. Using scipy.misc.central_diff_weights might also do the job. So I'd suggest using that and adding a comment so that a future PR could improve this with forward and backward differences.

How would you go about using the coefficients from Wikipedia? Just copy them over?

I have tried using (central) finite differences as suggested by you. See the code in the details below. The test still fails for high degrees / small tolerances. IMO this is still due to precision. According to Wikipedia, the d-th derivative can be approximated as the sum of products of weights and function evaluations, divided by the stepsize (eps) to the d-th power. This is also how scipy.misc.derivative is implemented. According to Wikipedia, if I use coefficients of order (accuracy) o, the approximation will be correct up to a term of order the stepsize to the o-th power. However, IIUC, this assumes infinite precision in the function evaluations themselves. In our use case, the function evaluations have a precision of around 1e-16, so using a stepsize of eps, any approximation of the d-th derivative will have a precision of at best 1e-16 / eps ** d.

See the following example:

In [1]: from scipy.misc import derivative
   ...: import numpy as np
   ...: 
   ...: def f(x):
   ...:     return np.sin(x)
   ...: 

In [2]: derivative(f, 0, dx=1e-8, n=1, order=7)
Out[2]: 1.0000000000000002

In [3]: derivative(f, 0, dx=1e-8, n=2, order=7)
Out[3]: 7.237830359838991e-09

In [4]: derivative(f, 0, dx=1e-8, n=3, order=7)
Out[4]: 9.098986738083303

In [5]: derivative(f, 0, dx=1e-4, n=3, order=7)
Out[5]: -0.9999999011365072

In [6]: derivative(f, 0, dx=1e-4, n=5, order=7)
Out[6]: 5.421010862427521

In [7]: derivative(f, 0, dx=1e-2, n=5, order=7)
Out[7]: 0.999966567882815

Note how the third derivative of f using a stepsize of 1e-8 is complete garbage. Increasing the stepsize to 1e-4 produces a much more reasonable result.

I believe that chasing tighter tolerances is pointless. IMO the test as I had implemented it originally, or as it is currently, is sufficient for the use case. I have added a small commit onto @ogrisel's suggestion to make the tolerance dependent on delta. I am not willing to invest a significant amount of time into the test. If you are unhappy with the tolerance, please add a commit on top to fix this. One option would be to only test up to degree=2 with stepsize eps=1e-8, or up to degree=4 with stepsize eps=1e-4. I believe that asserting that the extrapolation is smooth for higher degrees is more important than a strict tolerance.

I sincerely hope that this test does not prevent us from merging the PR. Periodic splines are a feature that improves on the status quo. The current tests are also much more extensive than the existing ones, e.g. there are no tests for continuity / differentiability at the border for constant / linear extrapolation.

import numpy as np
import pytest
from numpy.testing import assert_allclose
from scipy.misc import central_diff_weights
from sklearn.preprocessing import SplineTransformer


@pytest.mark.parametrize("degree", [3, 5])
def test_spline_transformer_periodic_splines_smoothness(degree):
    """Test that spline transformation is smooth at first / last knot."""
    X_fit = np.linspace(-2, 10, 10)[:, None]

    transformer = SplineTransformer(
        degree=degree,
        extrapolation="periodic",
        knots=[[0.0], [1.0], [3.0], [4.0], [5.0], [8.0]],
    ).fit(X_fit)

    eps = 1e-8

    # We use scipy.misc.central_diff_weights to compute higher-order
    # derivatives of sufficient accuracy. See also
    # https://en.wikipedia.org/wiki/Finite_difference_coefficient
    weights = {
        d: central_diff_weights(2 * d + 1, d) for d in range(1, degree + 1)
    }

    for b in [0, 8, 16]:
        X_left = np.linspace(
            b - (2 * degree + 1) * eps,
            b - eps,
            2 * degree + 1,
        )[:, None]
        X_right = np.linspace(
            b + eps,
            b + (2 * degree + 1) * eps,
            2 * degree + 1,
        )[:, None]
        Xt_left = transformer.transform(X_left)
        Xt_right = transformer.transform(X_right)

        for d in range(1, degree + 1):
            dXt_left = np.matmul(
                Xt_left[(degree - d) : (degree + d + 1), :].T,
                weights[d],
            ) / (eps ** d)
            dXt_right = np.matmul(
                Xt_right[(degree - d) : (degree + d + 1), :].T,
                weights[d],
            ) / (eps ** d)
            assert_allclose(dXt_left, dXt_right, rtol=0, atol=10)

@ogrisel
Member

ogrisel commented Mar 17, 2021

I agree, let's merge. Exploring how to best test derivatives in scikit-learn in general can be done in another PR.

Member

@lorentzenchr lorentzenchr left a comment


LGTM. @mlondschien Thank you so much. This is a very useful extension. And sorry for being such a p.i.t.a. about the continuity test.

@lorentzenchr lorentzenchr changed the title Add periodic extrapolation to SplineTransformer ENH Add periodic extrapolation to SplineTransformer Mar 17, 2021
@lorentzenchr
Member

lorentzenchr commented Mar 17, 2021

@mlondschien Also thanks for being frank and patient.

@lorentzenchr lorentzenchr merged commit 95b7c68 into scikit-learn:main Mar 17, 2021
@ogrisel
Member

ogrisel commented Mar 17, 2021

Thank you very much for the very nice contribution @mlondschien!

@glemaitre glemaitre mentioned this pull request Apr 22, 2021