MAINT: Remove redundant code in MLP #14815


Merged: 7 commits merged into scikit-learn:master on Aug 29, 2019

Conversation

@rth (Member) commented Aug 26, 2019

Closes #14807
Closes #11952

Alternative to #10665

This makes sure that it's possible to use a custom activation function by registering new ones in sklearn.neural_network.multilayer_perceptron.{ACTIVATIONS,DERIVATIVES}.

It was almost working before, except for a manual check of the supported activations (instead of using the ACTIVATIONS variable), so this mostly adds a test.

Not sure if we want to document it (and if so where).

@rth (Member, Author) commented Aug 26, 2019

A minimal example would be something like,

from sklearn.neural_network.multilayer_perceptron import MLPClassifier
from sklearn.neural_network._base import ACTIVATIONS, DERIVATIVES

# The activation is applied in place to the layer output.
ACTIVATIONS['identity2'] = lambda x: x

def inplace_identity_derivative2(Z, delta):
    # Derivatives update `delta` in place; for identity there is nothing to do.
    pass

DERIVATIVES['identity2'] = inplace_identity_derivative2

mlp = MLPClassifier(activation='identity2')
mlp.fit([[0, 1], [1, 1]], [0, 1])

Changing public variables in the scikit-learn namespace is not great, but as a somewhat experimental solution it should work (and it doesn't cost much).

@jnothman (Member) left a review comment:

Sure, why not?

@adrinjalali (Member) commented:

I'm very worried about this one. Which one of the following two are we doing?

  • Allow people to use this API, which is public. If we're doing this, then we should document it in neural_networks_supervised.rst at least, and add a whats_new entry. But that means we're actually happy about this solution; are we? I'd rather let them pass a function instead of having to add the function to the dictionary.
  • We want a better solution, and we'll probably implement it, but later. That means we would have to deprecate this solution, which is just dirty if we already know we don't like it.

I kind of don't like not documenting it, since we'll force ourselves to deprecate the behavior if we want to change it anyway. In that case, why not document it and let people use it?

@rth (Member, Author) commented Aug 27, 2019

> Allow people to use this API, which is public. If we're doing this, then we should document it in neural_networks_supervised.rst at least, and add a whats_new entry.

We don't have to document anything if we don't judge it necessary. The one-line change to the MLP file could be justified simply as a removal of redundant code, unless you think it's controversial. It just happens that you can then provide a custom activation by adding to these dict variables. Is it worth explicitly disallowing that? I don't know. Technically ACTIVATIONS and DERIVATIVES are defined in sklearn.neural_network._base, which is private; it's just that I can't monkeypatch it there (that doesn't work).

I can locally import those inside MLPClassifier and remove the top-level imports in multilayer_perceptron.py (which would also allow using this mechanism by only changing private variables) if you think that would be preferable?

> I'd rather let them pass a function instead of having to add the function to the dictionary.

We could pass a function that returns the activation function and the derivative. The problem is that the activation and the derivative are not computed at the same time (meaning it would be slower), and it would also add a bit more complexity to the code.
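
For illustration only, a rough sketch of that hypothetical option (a factory returning both callables); neither the factory nor a callable-accepting activation parameter exists in scikit-learn, and the names below are made up:

import numpy as np

def my_activation_factory():
    # Hypothetical factory returning an (activation, derivative) pair.

    def activation(X):
        # Like the built-in activations, modify the layer output in place.
        np.maximum(X, 0, out=X)  # a ReLU-style example
        return X

    def derivative(Z, delta):
        # Like the built-in derivatives, update the error term `delta` in place.
        delta[Z == 0] = 0

    return activation, derivative

# Hypothetical usage; MLPClassifier does not accept a callable here today:
# mlp = MLPClassifier(activation=my_activation_factory)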

@adrinjalali (Member) commented:

> I can locally import those inside MLPClassifier and remove the top-level imports in multilayer_perceptron.py (which would also allow using this mechanism by only changing private variables) if you think that would be preferable?

I would prefer that.

> We could pass a function that returns the activation function and the derivative. The problem is that the activation and the derivative are not computed at the same time (meaning it would be slower), and it would also add a bit more complexity to the code.

I meant passing two functions, but I'm happy with the private variable solution as well.

@adrinjalali (Member) left a review comment:

Thanks @rth

@thomasjpfan (Member) commented:

Are we going to start recommending that users do this on our issue board?

@@ -97,6 +97,10 @@ def _forward_pass(self, activations):
activations : list, length = n_layers - 1
The ith element of the list holds the values of the ith layer.
"""
+ # local import to allow custom activations set via this
+ # private variable in _base
+ from ._base import ACTIVATIONS
Member commented:

I'm confused, why do we need a local import exactly?

@rth (Member, Author) commented Aug 27, 2019

First, to remove ACTIVATIONS and DERIVATIVES from a public import path. Second, to allow monkeypatching them: if they are imported at the top level, they are imported together with MLPClassifier, and then changing neural_network._base.ACTIVATIONS no longer has an effect on MLPClassifier.

Member commented:

I don't understand the second part: _base.ACTIVATIONS is a singleton dict; it's never copied anywhere?

Member Author commented:

You are right. Actually, the above example will work even with a top-level import. The problem is pytest monkeypatching: if I don't do local imports, tests using it will fail, since I monkeypatch ACTIVATIONS instead of just updating it, to be sure not to affect the global state or other tests.

I removed the comment as it was not accurate, but I guess in any case it doesn't hurt not to expose those variables at the top level?
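
As a minimal illustration of the pytest behaviour being described (not the actual test in this PR; 'identity2' is just a placeholder key):

from sklearn.neural_network import _base

def test_setitem_mutates_shared_dict(monkeypatch):
    # Mutates the one shared ACTIVATIONS dict, so every module that did
    # `from ._base import ACTIVATIONS` sees the new entry; pytest removes
    # it again after the test.
    monkeypatch.setitem(_base.ACTIVATIONS, 'identity2', lambda x: x)
    assert 'identity2' in _base.ACTIVATIONS

def test_setattr_rebinds_module_attribute(monkeypatch):
    # Rebinds _base.ACTIVATIONS to a brand-new dict. Code that imported the
    # name at the top level still holds a reference to the old dict and
    # never sees this replacement.
    monkeypatch.setattr(_base, 'ACTIVATIONS', {'identity2': lambda x: x})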

Member commented:

Can we add a test that doesn't require monkeypatching? I'd be fine with del ACTIVATIONS[key] at the end of the test. It's not great, but having a local import just so that we can write a test in a certain way sounds like a strange design to me.
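
Something along these lines, as a rough sketch of that suggestion (not the exact test in this PR; 'identity2' and the tiny toy dataset are just placeholders):

from sklearn.neural_network import MLPClassifier
from sklearn.neural_network._base import ACTIVATIONS, DERIVATIVES

def test_custom_activation():
    # Register a throwaway activation, then clean up so the module-level
    # dicts are left untouched for other tests.
    ACTIVATIONS['identity2'] = lambda x: x
    DERIVATIVES['identity2'] = lambda Z, delta: None  # identity: nothing to do
    try:
        clf = MLPClassifier(activation='identity2', max_iter=5)
        clf.fit([[0, 1], [1, 1]], [0, 1])
    finally:
        del ACTIVATIONS['identity2']
        del DERIVATIVES['identity2']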

Member Author commented:

> It's not great, but having a local import just so that we can write a test in a certain way sounds like a strange design to me.

Fair enough. I can also just revert 7519f14, and we will be back where we started, but with an example in the issue using the private API :)

@rth (Member, Author) commented Aug 27, 2019

> Are we going to start recommending that users do this on our issue board?

Allowing this via a private API, implying that we are not officially supporting it and that it can change in the future, is not such a bad compromise, and it still leaves people the ability to experiment if they are so inclined. Documenting it would make this mechanism a bit more public; I'm not sure that is something we want. How do you think we should handle it?

@officialarijit commented:

> [quotes the minimal example and code snippet from the comment above]

This is not helping me work with custom activation functions in scikit-learn.
Please suggest what process I should follow.

@NicolasHug (Member) commented:

@ArijitN5 The PR isn't even merged yet, so it's expected that the code snippet doesn't work.

Also note that even though we are making this possible, we do not recommend it. You would be using a private API that is subject to change without deprecation.

@NicolasHug (Member) commented:

Could we consider (privately) accepting a tuple of callables instead of hacking the dict?

@adrinjalali (Member) commented:

> Could we consider (privately) accepting a tuple of callables instead of hacking the dict?

I'd be happy to publicly accept a tuple of callables, why not?

@rth (Member, Author) commented Aug 28, 2019

> I'd be happy to publicly accept a tuple of callables, why not?

The problem is that a tuple of callables doesn't generalize well to the case where more than two callables are needed (e.g. GLM, #14720 (comment)), so it's not a general solution to this problem.

@NicolasHug (Member) commented:

Hence the idea of keeping it private: it's possible, but not recommended.

@rth (Member, Author) commented Aug 28, 2019

OK, so I reverted the latest changes and can remove the test if it bothers you. Then there would only be a two-line code simplification in this PR. The example in the PR still uses the private API.

To summarize, if we were to do this as part of the public API, we could:

a) add an activation_derivative=None parameter, and accept a callable there and for activation.
b) have some registration mechanism for ACTIVATIONS (e.g. register_activation('name', activation, derivative)).
c) accept a tuple of callables, (activation_fn, derivative_fn).
d) accept a callable returning such a tuple.
e) accept an instance (or a named tuple) of the form:

class Activation:
    def __call__(...):
        ...

    def derivative(...):
        ...

Can't say I'm too enthusiastic about any of them. Putting all constraints aside, I would probably go with e) or c) (with a named tuple).
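
As a rough sketch of c) with a named tuple (the Activation tuple and an MLPClassifier that accepts it are hypothetical; nothing like this exists in the code base):

from collections import namedtuple
import numpy as np

# Hypothetical container pairing an activation with its in-place derivative.
Activation = namedtuple('Activation', ['function', 'derivative'])

def relu(X):
    np.maximum(X, 0, out=X)
    return X

def relu_derivative(Z, delta):
    delta[Z == 0] = 0

my_relu = Activation(function=relu, derivative=relu_derivative)

# Hypothetical usage; not something MLPClassifier supports:
# MLPClassifier(activation=my_relu)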

@adrinjalali (Member) commented:

I'd be fine with not having the test, and I'm also fine with c) or e). After these discussions I'm also just happy with the PR as is, but ideally I'd open an issue to have one of your suggestions implemented after this one.

@NicolasHug (Member) commented:

I'm OK with the PR as-is too.

If we ever write an example, we might want to use _base.ACTIVATIONS instead of the public path.

@rth changed the title from "Allow custom activation in MLP" to "MAINT: Remove redundant code in MLP" on Aug 29, 2019
@rth (Member, Author) commented Aug 29, 2019

I removed the test and changed the PR title. Should be good to go.

Will try to open an issue about a possible public API implementation later today.

@NicolasHug (Member) commented:

Merging, thanks @rth

I'm OK with a public API, but I think we decided during the meeting not to add more features to the NN code? So it might just not be worth it.

@NicolasHug merged commit 64ed843 into scikit-learn:master on Aug 29, 2019
@rth deleted the custom-activation branch on Apr 2, 2020
@rth mentioned this pull request on Apr 2, 2020