Description
From the discussion on the mailing list: http://sourceforge.net/mailarchive/message.php?msg_id=29883772
"I think we could have classes=None
constructor parameter in SGDClassifier and possibly many other
classifiers. When provided we would not use the traditional
self.classes_ = np.unique(y) idiom already implemented in some
classifiers of the project (but not all).
+1 also for raising a ValueError exception when classes != None and if
the y provided at fit time has some values not in classes.
However we need to check with some benchmarks that this integrity
check is not too costly.
This constructor parameter could be overridden by a fit_param to
preserve backward compat, especially for classifier models with a
partial_fit method.
The expected behavior for a classifier that is passed a non-None
classes constructor param would be to never predict a class value.
In the case of the predict_proba method, the missing fit-time class
probabilities should be 0.0.
This protocol (including expected exception types and error messages)
should be formalized as a series of common tests in
sklearn/tests/test_common.py and redundant bookkeeping code should be
factorized in sklearn/base.py's ClassifierMixin class IMHO."
-Olivier Grisel
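
To make the proposal concrete, here is a minimal sketch of the two pieces of bookkeeping described in the quote: resolving the class list (with the ValueError check) and padding predict_proba output with 0.0 columns for classes missing at fit time. The helper names resolve_classes and pad_proba are hypothetical and are not part of scikit-learn; the sketch also assumes the labels are sortable and that the fitted classes are a subset of the declared ones.

```python
import numpy as np


def resolve_classes(y, classes=None):
    """Return the sorted class labels to use when fitting.

    Hypothetical helper: falls back to np.unique(y) when classes is None,
    otherwise checks that every label in y is listed in classes.
    """
    y = np.asarray(y)
    if classes is None:
        return np.unique(y)
    classes = np.unique(classes)  # sorted and de-duplicated
    unexpected = np.setdiff1d(y, classes)  # labels in y but not in classes
    if unexpected.size > 0:
        raise ValueError(
            "y contains labels not in classes: %s" % unexpected.tolist()
        )
    return classes


def pad_proba(proba, fitted_classes, classes):
    """Expand probabilities to one column per declared class.

    Hypothetical helper: columns for classes that were absent at fit
    time stay at 0.0. Assumes classes is sorted and contains every
    entry of fitted_classes.
    """
    proba = np.asarray(proba, dtype=float)
    classes = np.asarray(classes)
    full = np.zeros((proba.shape[0], classes.shape[0]))
    # classes is sorted, so searchsorted maps each fitted class to its column.
    columns = np.searchsorted(classes, fitted_classes)
    full[:, columns] = proba
    return full


# Usage: class 1 is declared but never appears in y.
y = [0, 2, 2, 0]
classes = resolve_classes(y, classes=[0, 1, 2])
padded = pad_proba([[0.3, 0.7]], np.unique(y), classes)
print(padded.tolist())  # [[0.3, 0.0, 0.7]]
```

The check here relies on np.setdiff1d, which sorts its inputs, so its cost grows with the size of y; whether that is acceptable is exactly what the benchmarks mentioned in the quote would need to settle.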