MAINT: Remove redundant code in MLP #14815
Conversation
A minimal example would be something like,

```python
from sklearn.neural_network.multilayer_perceptron import MLPClassifier
from sklearn.neural_network._base import ACTIVATIONS, DERIVATIVES

ACTIVATIONS['identity2'] = lambda x: x

def inplace_identity_derivative2(Z, delta):
    pass

DERIVATIVES['identity2'] = inplace_identity_derivative2

mlp = MLPClassifier(activation='identity2')
mlp.fit([[0, 1], [1, 1]], [0, 1])
```

Changing public variables in the scikit-learn namespace is not great, but as a somewhat experimental solution it should work (also it doesn't cost much).
Sure, why not?
I'm very worried about this one. Which one of the following two are we doing?
I kinda don't like not documenting it, because we'll force ourselves to deprecate the behavior if we want to change it anyway. Which means: why not document it and let people use it then?
We don't have to document anything if we don't judge it necessary. The one-line change to the MLP file could be done just under the justification of removing redundant code, unless you think it's controversial. It happens that you can then provide custom activations by adding to these dict variables. Is it worth explicitly disallowing that? I don't know. Technically I can locally import those inside the methods that use them.
We could pass a function that returns the activation function and the derivative. The problem is that activations and derivatives are not computed at the same time (meaning it would be slower), and it also adds a bit more complexity to the code.
I would prefer that.
I meant passing two functions, but I'm happy with the private variable solution as well.
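For concreteness, a rough sketch of what passing two callables might look like. This is purely hypothetical: `MLPClassifier` does not accept callables for `activation` in any release, and the leaky-ReLU helpers below are my own, written only to mirror the in-place conventions of the private `_base` helpers.

```python
import numpy as np

# Hypothetical sketch only: MLPClassifier does not accept callables for
# `activation`; this just illustrates the "two functions" idea above.

def leaky_relu(X):
    # forward pass: modify the layer output X in place (slope 0.01 for x < 0),
    # mirroring how the private ACTIVATIONS helpers work
    X[X < 0] *= 0.01

def leaky_relu_derivative(Z, delta):
    # backward pass: Z holds the already-activated values; scale the
    # back-propagated error `delta` in place, like the DERIVATIVES helpers
    delta[Z < 0] *= 0.01

# hypothetical usage:
# mlp = MLPClassifier(activation=(leaky_relu, leaky_relu_derivative))
```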
Thanks @rth
Are we going to start recommending that users do this in our issue board?
```diff
@@ -97,6 +97,10 @@ def _forward_pass(self, activations):
         activations : list, length = n_layers - 1
             The ith element of the list holds the values of the ith layer.
         """
+        # local import to allow custom activations set via this
+        # private variable in _base
+        from ._base import ACTIVATIONS
```
I'm confused, why do we need a local import exactly?
First, to remove `ACTIVATIONS` and `DERIVATIVES` from a public import path. Second, to allow monkeypatching them. If you import them at the top level, they're imported together with `MLPClassifier`, and then changing `neural_network._base.ACTIVATIONS` no longer has an effect on `MLPClassifier`.
I don't understand the second part: `_base.ACTIVATIONS` is a singleton dict, it's never copied anywhere?
You are right. Actually, the above example will work even with a top-level import. The problem is pytest monkeypatching: if I don't do local imports, tests using it will fail, since I monkeypatch `ACTIVATIONS` instead of just updating it, to be sure not to affect the global state or other tests.
I removed the comment as it was not accurate, but I guess in any case it doesn't hurt not to expose those variables at the top level?
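The pytest detail above comes down to plain Python name binding, nothing scikit-learn-specific. A self-contained analogy, where the `alias` name stands in for the `from ._base import ACTIVATIONS` binding:

```python
# Stand-alone illustration of the binding semantics discussed above.
ACTIVATIONS = {"identity": lambda x: x}

# `from module import ACTIVATIONS` behaves like this alias: it binds a name
# to the dict object itself, not to the module attribute.
alias = ACTIVATIONS

# Mutating the shared dict is visible through every alias, which is why the
# example at the top of this PR works even with a top-level import.
ACTIVATIONS["double"] = lambda x: 2 * x
assert alias["double"](3) == 6

# Rebinding the name (roughly what monkeypatch.setattr does on the module)
# points it at a brand-new dict; the alias still sees the old one, so the
# patched entries never reach code that imported the dict at the top level.
ACTIVATIONS = {"triple": lambda x: 3 * x}
assert "triple" not in alias
```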
Can we add a test that doesn't require monkeypatching? I'd be fine with `del ACTIVATIONS[key]` at the end of the test. It's not great, but having a local import so that we can write a test in a certain way sounds like a strange design to me.
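A sketch of what such a test could look like; the test name and the `try`/`finally` cleanup are my own, and recent scikit-learn releases may reject activation strings outside the built-in set during parameter validation, so this assumes the lookup goes through the dicts as in this PR.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.neural_network._base import ACTIVATIONS, DERIVATIVES


def test_custom_activation_via_private_dicts():
    # Register a throwaway activation following the in-place conventions
    # of the private _base helpers.
    def inplace_identity2(X):
        pass  # identity: leave X unchanged

    def inplace_identity2_derivative(Z, delta):
        pass  # derivative of the identity is 1: leave delta unchanged

    ACTIVATIONS["identity2"] = inplace_identity2
    DERIVATIVES["identity2"] = inplace_identity2_derivative
    try:
        clf = MLPClassifier(activation="identity2", max_iter=50)
        clf.fit(np.array([[0.0, 1.0], [1.0, 1.0]]), [0, 1])
        assert hasattr(clf, "coefs_")
    finally:
        # clean up so the module-level dicts don't leak into other tests
        del ACTIVATIONS["identity2"]
        del DERIVATIVES["identity2"]
```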
> It's not great, but having a local import so that we can write a test in a certain way sounds like a strange design to me
Fair enough, I can also just revert 7519f14 and we will be back where we started but with an example in the issue using the private API :)
Allowing this via a private API, implying that we are not officially supporting it and that it can change in the future, is not such a bad compromise. It still leaves people the ability to experiment if they are so inclined. Documenting it would make this mechanism a bit more public, and I'm not sure that is something we want. How do you think we should handle it?
This is not helping me work with custom activation functions in scikit-learn.
@ArijitN5 The PR isn't even merged yet, so it's expected that the code snippet doesn't work. Also note that even though we are making this possible, we do not recommend it. You would be using a private API that is subject to change without deprecation.
Could we consider (privately) accepting a tuple of callables instead of hacking the dict?
I'd be happy to publicly accept a tuple of callables, why not?
The problem is that a tuple of callables doesn't generalize well to the case when more than 2 callables are needed (e.g. GLM, #14720 (comment)), so it's not a general solution to this problem.
Hence the idea of keeping it private. It's possible, but not recommended.
OK, so I reverted the latest changes, and can remove the test if it bothers you. So there would be only a 2-line code simplification PR. The example in the PR still uses the private API. To summarize, if we were to do this as part of the public API, we could a) add a
Can't say I'm too enthusiastic about any of them. Putting all constraints aside, I would probably go with e) or c) (with a named tuple).
I'd be fine with not having the test, and I'm also fine with C or E. After these discussions I'm also just happy with the PR as-is, but ideally I'd open an issue to have one of your suggestions implemented after this one.
I'm OK with the PR as-is too. If we ever write an example we might want to use
I removed the test and changed the PR title. Should be good to go. Will try to open an issue about a possible public API implementation later today.
Merging, thanks @rth. I'm OK with a public API, but I think we decided not to add more features to the NN code during the meeting? So it might just not be worth it.
Closes #14807
Closes #11952
Alternative to #10665
This makes sure that it's possible to use a custom activation function by registering new ones in `sklearn.neural_network.multilayer_perceptron.{ACTIVATIONS,DERIVATIVES}`. It was almost working before, except for a manual check of the supported activations (instead of using the `ACTIVATIONS` variable), so this mostly adds a test. Not sure if we want to document it (and if so, where).
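For anyone who wants to experiment with this, a sketch of registering a less trivial activation (softsign) through the private dicts. The helper names and the toy data are my own; the in-place signatures follow the existing `_base` helpers, this remains an unsupported private API, and recent scikit-learn releases may reject unknown activation strings during parameter validation.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.neural_network._base import ACTIVATIONS, DERIVATIVES

def inplace_softsign(X):
    # softsign(x) = x / (1 + |x|), applied in place like the _base helpers
    np.divide(X, 1.0 + np.abs(X), out=X)

def inplace_softsign_derivative(Z, delta):
    # Z holds the already-activated values, for which f'(x) = (1 - |Z|) ** 2;
    # scale the back-propagated error in place
    delta *= (1.0 - np.abs(Z)) ** 2

ACTIVATIONS["softsign"] = inplace_softsign
DERIVATIVES["softsign"] = inplace_softsign_derivative

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = [0, 1, 1, 0]
mlp = MLPClassifier(activation="softsign", hidden_layer_sizes=(8,), max_iter=2000)
mlp.fit(X, y)
print(mlp.score(X, y))
```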