Another possibility is to threshold the output of predict_proba differently, choosing the cut-off so that the resulting decisions maximize whatever metric you have defined.
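To make the thresholding idea concrete, here is a minimal sketch in plain Python. The helper names (f_beta, best_threshold), the candidate grid, and the toy data are all made up for illustration and are not part of scikit-learn. In practice the positive-class probabilities would come from something like clf.predict_proba(X_val)[:, 1] on a held-out validation split, and the metric would be whichever one you actually care about.

```python
def f_beta(y_true, y_pred, beta=1.0):
    """F-beta score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denom if denom else 0.0


def best_threshold(y_true, proba_pos, metric=f_beta, candidates=None):
    """Return (threshold, score) that maximizes `metric` over candidate cut-offs."""
    if candidates is None:
        # Hypothetical default grid: 0.01, 0.02, ..., 0.99
        candidates = [i / 100 for i in range(1, 100)]
    best_t, best_score = 0.5, float("-inf")
    for t in candidates:
        # Predict the positive class whenever its probability reaches the cut-off
        y_pred = [1 if p >= t else 0 for p in proba_pos]
        score = metric(y_true, y_pred)
        if score > best_score:  # strict '>' keeps the lowest of tied thresholds
            best_t, best_score = t, score
    return best_t, best_score


# Toy validation labels and positive-class probabilities (invented data)
y_val = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
proba_val = [0.05, 0.10, 0.20, 0.15, 0.30, 0.45, 0.35, 0.40, 0.60, 0.25]

threshold, score = best_threshold(y_val, proba_val)
default_score = f_beta(y_val, [1 if p >= 0.5 else 0 for p in proba_val])
print(threshold, score, default_score)
```

The threshold should be picked on data that was not used for the final evaluation (a validation fold, or out-of-bag estimates), otherwise the reported metric will be optimistic. Once chosen, apply it to the test-set probabilities instead of calling predict, which uses the default majority decision.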

On 03/15/2016 07:44 AM, Mamun Rashid wrote:
Hi All,
I asked this question a couple of weeks ago on the list. I have a two-class problem where the positive class (Class 1) and the negative class (Class 0) are imbalanced. I also care much less about the negative class. So I specified both a class weight (to the random forest classifier) and a sample weight (to the fit function) to give more importance to my positive class.

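import numpy as np
from sklearn.ensemble import RandomForestClassifier
# Per-class weights: 0 = negative class, 1 = positive class
cl_weight = {0: weight1, 1: weight2}
clf = RandomForestClassifier(n_estimators=400, max_depth=None, min_samples_split=2, random_state=0, oob_score=True, class_weight=cl_weight, criterion="gini")
# Per-sample weights: up-weight every positive example
sample_weight = np.array([weight if m == 1 else 1 for m in df_tr[label_column]])
y_pred = clf.fit(X_tr, y_tr, sample_weight=sample_weight).predict(X_te)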
Despite specifying dramatically different class weights, I do not observe much difference. Example: cl_weight = {0:0.001, 1:0.999} versus cl_weight = {0:0.50, 1:0.50}. Am I passing the class weight correctly? Here are two folds from each of these two runs.
## cl_weight = {0:0.001, 1:0.999}
Fold_1 Confusion Matrix
        0     1
0    1681    26
1     636   149

Fold_5 Confusion Matrix
        0     1
0    1670    15
1     734   160

## cl_weight = {0:0.50, 1:0.50}
Fold_1 Confusion Matrix
        0     1
0    1690    15
1     630   163

Fold_5 Confusion Matrix
        0     1
0    1676    14
1     709   170

Thanks,
Mamun


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
