Closed
Description
Describe the workflow you want to enable
This is a feature request to improve the performance of QuantileTransformer. On large dense dataframes (30M+ rows, 500 columns) it takes ~60 minutes to fit and uses a very large amount of memory when transforming. It also does not support sample_weight. Ideally it would be as fast as CatBoost's Pool.quantize method, which does many of the same computations in a fraction of the time:
https://catboost.ai/docs/en/concepts/python-reference_pool_quantized
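For reference, here is a minimal sketch of the workflow in question, scaled down to synthetic data so it runs quickly. The array shape, `n_quantiles`, and `subsample` values are illustrative assumptions, not the settings behind the timings above. The `subsample`, `n_quantiles`, and `copy` parameters are the knobs that currently exist for limiting fit cost and memory; none of them adds `sample_weight` support.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# Synthetic stand-in for the real data (assumed shape; the report is about
# 30M+ rows and 500 columns, which is far too large for a quick example).
rng = np.random.default_rng(0)
X = rng.normal(size=(50_000, 20)).astype(np.float32)

qt = QuantileTransformer(
    n_quantiles=100,        # fewer quantiles means smaller lookup tables per column
    subsample=10_000,       # estimate quantiles from a random subset of rows
    output_distribution="uniform",
    random_state=0,
    copy=False,             # ask to transform in place instead of copying X
)

# With copy=False, X itself may be overwritten by the transformed values.
X_t = qt.fit_transform(X)
print(X_t.shape, float(X_t.min()), float(X_t.max()))
```

Even with subsampling during fit, the transform step still processes every row of every column, which is where most of the time and memory goes at the scale described here.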
Describe your proposed solution
See the CatBoost source code behind the quantized Pool: https://catboost.ai/docs/en/concepts/python-reference_pool_quantized
Describe alternatives you've considered, if relevant
No response
Additional context
No response