Description
During a project I have been working a lot with the scikit-learn and xgboost source code, and I have been missing L2 regularization in scikit-learn, since my experience with xgboost is that including it gives performance improvements. Their implementation looks like the following:
```cpp
// Leaf calculation - param.h line 285
dw = -sum_grad / (sum_hess + p.reg_lambda);

// Gain calculation / impurity_improvement - param.h line 250
gain = Sqr(sum_grad) / (sum_hess + p.reg_lambda);
```
There are some additional terms for L1 regularization, maximum improvement, etc., but the code above seems to be enough given scikit-learn's hyperparameters. The full implementation is at https://github.com/dmlc/xgboost/blob/master/src/tree/param.h
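
To make the effect concrete, here is a minimal sketch in plain Python (not the xgboost source; the formulas are transcribed directly from the snippet above) showing how `reg_lambda` shrinks leaf values and lowers split gains:

```python
def leaf_value(sum_grad, sum_hess, reg_lambda=1.0):
    # dw = -sum_grad / (sum_hess + reg_lambda); a larger reg_lambda
    # shrinks the leaf weight toward zero.
    return -sum_grad / (sum_hess + reg_lambda)

def gain(sum_grad, sum_hess, reg_lambda=1.0):
    # Sqr(sum_grad) / (sum_hess + reg_lambda); a larger reg_lambda
    # lowers the score of every candidate split.
    return sum_grad ** 2 / (sum_hess + reg_lambda)

# For squared-error loss, grad = prediction - target and hess = 1 per
# sample, so sum_hess is just the number of samples in the leaf.
print(leaf_value(sum_grad=-10.0, sum_hess=20.0, reg_lambda=0.0))  # 0.5
print(leaf_value(sum_grad=-10.0, sum_hess=20.0, reg_lambda=5.0))  # 0.4
```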
In scikit-learn, the leaf calculation path seems to be handled by adding self.reg_lambda to all the denominators and adding it as an attribute to the class, while the impurity improvement seems to be handled by changing proxy_impurity_improvement in all the classes in _criterion.pyx and adding an extra hyperparameter to it.
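
As a rough illustration of that change, here is a hedged Python sketch of an MSE-style criterion with reg_lambda added to the denominators. The names (sum_left, weighted_n_left, proxy_impurity_improvement) mirror the Cython attributes in _criterion.pyx, but this is an illustration of the idea, not a drop-in patch:

```python
class L2RegularizedMSE:
    """Sketch of an MSE-style criterion with xgboost-like L2 regularization."""

    def __init__(self, reg_lambda=0.0):
        # New hyperparameter, analogous to xgboost's reg_lambda.
        self.reg_lambda = reg_lambda

    def proxy_impurity_improvement(self, sum_left, sum_right,
                                   weighted_n_left, weighted_n_right):
        # The same proxy quantity sklearn's MSE criterion maximizes,
        # with reg_lambda added to each denominator, mirroring
        # xgboost's gain formula.
        return (sum_left ** 2 / (weighted_n_left + self.reg_lambda)
                + sum_right ** 2 / (weighted_n_right + self.reg_lambda))

    def leaf_value(self, sum_total, weighted_n_node_samples):
        # Regularized leaf value: the node mean shrunk toward zero.
        return sum_total / (weighted_n_node_samples + self.reg_lambda)
```

With reg_lambda=0.0 this reduces to the unregularized behavior, so the change would be backward compatible by default.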
Does this seem reasonable or relevant?