Support orthogonal polynomial features (via QR decomposition) in PolynomialFeatures #31223


Closed
cottnich opened this issue Apr 18, 2025 · 7 comments
Labels: Needs Decision - Include Feature · New Feature


@cottnich

Describe the workflow you want to enable

I want to introduce support for orthogonal polynomial features via QR decomposition in PolynomialFeatures, closely mirroring the behavior of R's poly() function.

In regression modeling, using orthogonal polynomials often improves numerical stability and reduces multicollinearity among the polynomial terms.

As an example of what the difference looks like in R:

# fits a raw degree-3 polynomial without an orthogonal basis
model_raw <- lm(y ~ I(x) + I(x^2) + I(x^3), data = data)
# equivalently: model_raw <- lm(y ~ poly(x, 3, raw = TRUE), data = data)

# fits the same degree-3 polynomial using an orthogonal basis
model_poly <- lm(y ~ poly(x, 3), data = data)

This behavior cannot currently be replicated with scikit-learn's PolynomialFeatures, which only produces the raw monomial terms. As a result, transitioning from R to Python often leads to discrepancies in model behavior and performance.

Describe your proposed solution

I propose extending PolynomialFeatures with a new parameter:

PolynomialFeatures(..., method="raw")

Accepted values:

  • "raw" (default): retains existing behavior, returning standard raw terms
  • "qr": applies QR decomposition to each feature to generate orthogonal polynomial features.

Because R's poly() only operates on 1D input vectors, my thought was to apply QR decomposition feature by feature when the input is multi-dimensional. Each column is processed independently, mirroring R's approach.
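
For concreteness, here is a rough 1D sketch of the transform I have in mind, mirroring R's poly() (the ortho_poly helper is illustrative only, not the drafted implementation):

import numpy as np

def ortho_poly(x, degree):
    # Raw basis [1, x, x^2, ..., x^degree] on the centered feature.
    x = np.asarray(x, dtype=float)
    X = np.vander(x - x.mean(), degree + 1, increasing=True)
    # Thin QR: Q has orthonormal columns spanning the same space as X.
    Q, R = np.linalg.qr(X)
    # Flip signs where R's diagonal is negative (matching R's convention),
    # then drop the constant first column.
    Q = Q * np.sign(np.diag(R))
    return Q[:, 1:]

# Feature-wise application to multi-dimensional input, as proposed:
X = np.random.default_rng(0).uniform(size=(100, 2))
X_orth = np.hstack([ortho_poly(X[:, j], 3) for j in range(X.shape[1])])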

This feature would interact with other parameters as follows:

  • include_bias: When method="qr", the orthogonal polynomial basis inherently includes a transformed first column, but this column is not a plain column of ones, so include_bias=True (which appends a column of ones) becomes redundant or misleading. One option is to force include_bias=False when method="qr" and always return only the orthogonal columns; another is to raise a warning.

  • interaction_only: This would be incompatible with method="qr" since the QR-based transformation does not naturally support selective inclusion of interaction terms.

Describe alternatives you've considered, if relevant

Currently, users must implement the QR decomposition manually when orthogonal polynomials are needed. This is a common pattern in statistical workflows but lacks "off the shelf" support in any major Python library. This feature would eliminate the need to do the decomposition manually and would improve workflows for researchers who are used to R's statistical tools.

Additional context

This idea stemmed from a broader effort to convert statistical modeling pipelines from R to Python, where discrepancies in regression results were traced to the lack of orthogonal polynomial support in PolynomialFeatures.

I have drafted and tested a 1D implementation of this feature but wanted feedback on whether this idea aligns with scikit-learn's scope before moving on. In particular, I'd appreciate input on:

  • Acceptability of feature-wise orthogonalization for multi-feature input.
  • Preferred parameter naming (e.g., method="qr" vs. orthogonal=True).
  • Compatibility decisions around parameters like include_bias and interaction_only.
@cottnich cottnich added Needs Triage Issue requires triage New Feature labels Apr 18, 2025
@ogrisel
Member

ogrisel commented Apr 25, 2025

Thanks for your proposal.

Because R's poly() only operates on 1D input vectors, my thought was to apply QR decomposition feature by feature when the input is multi-dimensional. Each column is processed independently, mirroring R's approach.

That sounds quite different from the current behavior of PolynomialFeatures, which always considers interactions between input features. I wonder if implementing the two approaches in the same class makes sense. Maybe we could implement this as a new transformer, but before doing so, I would rather make sure that people actually need this feature, since it would mean maintaining a new estimator and adding to the cognitive overhead of choosing from too many estimators and options.

Could you please publish prototype code to a gist.github.com or another small repo or notebook? It would be great if you could highlight a simple regression task for which this kind of feature engineering is a game changer (in terms of predictive performance, model size, fitting speed, or numerical stability of the fit) compared to what is already available in scikit-learn.

It would be great to compare to:

  • SplineTransformer()
  • PolynomialFeatures()
  • make_pipeline(SplineTransformer(), PolynomialFeatures(interaction_only=True))

and all of the above followed by a PCA step.
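
Something like the following skeleton could serve as a starting point (the toy target and the downstream Ridge estimator are placeholders):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, SplineTransformer

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1]) + rng.normal(scale=0.1, size=500)

candidates = {
    "splines": make_pipeline(SplineTransformer(), Ridge()),
    "polynomials": make_pipeline(PolynomialFeatures(degree=3), Ridge()),
    "splines + interactions": make_pipeline(
        SplineTransformer(),
        PolynomialFeatures(degree=2, interaction_only=True),
        Ridge(),
    ),
}
# ...and each of the above with a PCA step before the estimator, e.g.
# make_pipeline(SplineTransformer(), PCA(n_components=0.99), Ridge()).
for name, pipe in candidates.items():
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: R^2 = {scores.mean():.3f} +/- {scores.std():.3f}")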

BTW: maybe @lorentzenchr would like to share his views on such a new feature in scikit-learn.

@ogrisel
Member

ogrisel commented Apr 25, 2025

I also noticed that formulaic provides an implementation of the poly function that seems to follow the R-style orthogonalization convention by default.

Note that it is possible to wrap formulaic as a scikit-learn transformer, but it requires copy-pasting some boilerplate code.

Rather than replicating such features in scikit-learn, it might be more fruitful to see if the formulaic authors would be open to the idea of adding first-class integration with the scikit-learn API/pipelines directly into formulaic, or via a new extension package. If so, we could extend one of scikit-learn's examples on polynomial / spline features to show how to use formulaic with scikit-learn and go beyond the options implemented by default in scikit-learn.
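
If I read formulaic's documentation correctly, usage would look roughly like this (untested sketch):

import pandas as pd
from formulaic import model_matrix

df = pd.DataFrame({"x": [0.1, 0.5, 0.9, 1.3, 1.7]})

# Degree-3 orthogonal polynomial basis; "- 1" drops the intercept column.
train_basis = model_matrix("poly(x, 3) - 1", df)

# The fitted spec remembers the orthogonalization, so new data is projected
# onto the training basis instead of being re-orthogonalized from scratch.
new_df = pd.DataFrame({"x": [0.2, 1.0]})
new_basis = train_basis.model_spec.get_model_matrix(new_df)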

@ogrisel ogrisel added Needs Decision - Include Feature Requires decision regarding including feature and removed Needs Triage Issue requires triage labels Apr 25, 2025
@lorentzenchr
Member

@cottnich What is your motivation?

As a result transitioning from R to Python often leads to discrepancies in model behavior and performance.

Without penalty, the predicted values are the same for all design matrices related to orthogonal transformations.
B-Splines are almost always preferred to pure polynomials of features.
If you want to stay as close as possible to R's lm and glm, I recommend using https://www.statsmodels.org .
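
A quick numerical check of the first point, assuming plain unpenalized least squares:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
x = rng.uniform(-1, 1, size=200)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.1, size=200)

# Raw monomials vs. an orthonormal basis of the same column space.
X_raw = np.vander(x, 4, increasing=True)[:, 1:]
Q, _ = np.linalg.qr(np.vander(x - x.mean(), 4, increasing=True))
X_orth = Q[:, 1:]

pred_raw = LinearRegression().fit(X_raw, y).predict(X_raw)
pred_orth = LinearRegression().fit(X_orth, y).predict(X_orth)
np.testing.assert_allclose(pred_raw, pred_orth)  # same fitted values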

@alexshtf

There are existing orthogonal polynomial bases, such as the Legendre basis, which are already orthogonal and don't need any transformation at all. We can also support interactions using tensor-product bases.
Do we really need the heavy computational overhead of "orthogonalizing" the features at fit time?
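
For example, with numpy's built-ins (sketch; the Legendre basis is orthogonal with respect to the uniform measure on [-1, 1], so the sampled columns are only approximately orthogonal, and the feature must be rescaled first):

import numpy as np

x = np.linspace(0.0, 10.0, 100)
# Rescale the feature to [-1, 1], where the Legendre basis lives.
x_scaled = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
basis = np.polynomial.legendre.legvander(x_scaled, deg=3)  # columns P_0..P_3

# Tensor-product interactions between two features:
y_scaled = x_scaled[::-1]  # placeholder second feature
basis_2d = np.polynomial.legendre.legvander2d(x_scaled, y_scaled, [3, 3])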

@lorentzenchr
Member

Again, what is your motivation? Why do you want orthogonal polynomials?

@alexshtf

alexshtf commented May 21, 2025

I don't. The OP does. But I can guess why.

You can fit high-degree polynomials without the well-known "bad effects" of overfitting. Moreover, you can easily prune models by simply removing the tail of the coefficients and keeping a low-degree polynomial. This works because the basis behaves like a spectrum, where higher-degree polynomials act like frequency components, so this pruning is just denoising.
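
Roughly (sketch with a Legendre expansion; the degrees are picked arbitrarily):

import numpy as np

rng = np.random.RandomState(0)
x = rng.uniform(-1, 1, size=300)
y = np.sin(3 * x) + rng.normal(scale=0.2, size=300)

# Fit a deliberately high-degree Legendre expansion...
coef = np.polynomial.legendre.legfit(x, y, deg=20)
# ...then prune by zeroing the high-frequency tail of the spectrum.
pruned = coef.copy()
pruned[6:] = 0.0

y_full = np.polynomial.legendre.legval(x, coef)
y_pruned = np.polynomial.legendre.legval(x, pruned)  # denoised low-degree fit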

What I don't understand is why this belongs in scikit-learn rather than the OP's own research repository, where they do their research on polynomials. This library is easily extensible: just inherit from the right transformer base class and you can put your new super-duper polynomial basis into your pipeline.

@cottnich
Author

Thank you for all the responses, and my apologies for not checking on this thread sooner. I actually wasn't aware that the formulaic library handles this; I assumed it was like statsmodels.formula, which doesn't do any additional computation.
