Fixed mask shape to suit an arbitrary number of feature dimensions #785

Open · wants to merge 3 commits into master

Conversation


@TobyPDE TobyPDE commented Dec 23, 2016

If you use the CustomRecurrentLayer to perform a recurrent convolution, masking does not work properly, because the mask variable assumes the input to have the shape (batch_size, seq_len, num_features). This PR fixes that bug and makes masking work with an arbitrary number of feature dimensions.
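For concreteness, a minimal sketch of the kind of setup this affects, following the recurrent-convolution pattern from Lasagne's CustomRecurrentLayer documentation and tests; the sizes below are made up for illustration:

import lasagne.layers as L

# illustrative sizes only
n_batch, seq_len, n_channels, width, height = 2, 5, 3, 8, 8
n_out_filters, filter_shape = 7, (3, 3)

# 5-D input: (batch_size, seq_len, channels, width, height)
l_in = L.InputLayer((n_batch, seq_len, n_channels, width, height))
# mask: (batch_size, seq_len)
l_mask = L.InputLayer((n_batch, seq_len))

# input-to-hidden and hidden-to-hidden are convolutions, so the hidden
# state has spatial feature dimensions instead of a flat feature axis
l_in_to_hid = L.Conv2DLayer(
    L.InputLayer((None, n_channels, width, height)),
    n_out_filters, filter_shape, pad='same')
l_hid_to_hid = L.Conv2DLayer(
    L.InputLayer((None, n_out_filters, width, height)),
    n_out_filters, filter_shape, pad='same')

# before this PR the mask was only expanded to (seq_len, batch_size, 1),
# which does not line up with a convolutional hidden state
l_rec = L.CustomRecurrentLayer(l_in, l_in_to_hid, l_hid_to_hid,
                               mask_input=l_mask)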

@f0k f0k (Member) left a comment

Thanks, good catch! There's still a problem with your implementation, though; see my comments. Once you're done, could you please squash your commits into one? We'd like to have a clean history to make it easier to track changes. (Let me know if you need help with squashing.) Thank you!

# (seq_len, batch_size, 1, ..., 1)
#                       |---N---|
num_feature_dims = input.ndim - 2
mask = mask.dimshuffle((1, 0) + ('x',) * num_feature_dims)
@f0k f0k (Member) commented:
Good catch! To make the implementation simpler, though, could you instead do:

if mask is not None:
    # expand mask from (seq_len, batch_size) to cover all feature dimensions
    mask = T.shape_padright(mask, input.ndim - 2)

This is easier to read and also doesn't require a long comment to explain!

Reading again, your implementation is actually wrong: the mask needs to match the dimensionality of the hidden states, not of the input. So instead of input.ndim - 2, use len(self.input_to_hidden.output_shape) - 2. Putting this together:

if mask is not None:
    # expand mask from (seq_len, batch_size) to cover all hidden dimensions
    hid_ndim = len(self.input_to_hidden.output_shape)
    mask = T.shape_padright(mask, hid_ndim - 2)

Could you update your PR accordingly? This would be great!
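For reference, a quick sketch of what T.shape_padright does to a (seq_len, batch_size) mask, assuming a working Theano install; the sizes here are arbitrary:

import numpy as np
import theano
import theano.tensor as T

mask = T.matrix('mask')             # (seq_len, batch_size)
padded = T.shape_padright(mask, 3)  # appends 3 broadcastable axes

f = theano.function([mask], padded)
out = f(np.ones((4, 2), dtype=theano.config.floatX))
print(out.shape)                    # (4, 2, 1, 1, 1)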

    n_out_filters, filter_shape, pad='same')
l_hid_to_hid = lasagne.layers.Conv2DLayer(
    lasagne.layers.InputLayer((None, n_out_filters, width, height)),
    n_out_filters, filter_shape, pad='same')
@f0k f0k (Member) commented:
Could you modify this test such that the hidden dimensionality is different from the input dimensionality? This should uncover that your current fix of the mask dimensionality is wrong. I'm not sure there is a sensible use case for this; maybe just reshape l_in_to_hid to 5 dimensions and then use a DenseLayer(..., num_leading_axes=-1) for hid_to_hid so the dimensionality stays at 5.
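A rough sketch of one way such a test setup could look, assuming the master-branch ReshapeLayer and DenseLayer (with num_leading_axes); the shapes are invented for illustration and this is not the actual test code:

import lasagne.layers as L

n_batch, seq_len, n_channels, width, height = 2, 5, 3, 8, 8
n_out_filters, filter_shape = 7, (3, 3)

l_in = L.InputLayer((n_batch, seq_len, n_channels, width, height))
l_mask = L.InputLayer((n_batch, seq_len))

# input-to-hidden: convolve, then reshape so the hidden state is 5-D
# (batch, filters, width, height/2, 2) while the per-step input is 4-D
l_in_to_hid = L.Conv2DLayer(
    L.InputLayer((None, n_channels, width, height)),
    n_out_filters, filter_shape, pad='same')
l_in_to_hid = L.ReshapeLayer(
    l_in_to_hid, (-1, n_out_filters, width, height // 2, 2))

# hidden-to-hidden: a DenseLayer over the last axis only, so the 5-D
# hidden shape is preserved from one step to the next
l_hid_to_hid = L.DenseLayer(
    L.InputLayer((None, n_out_filters, width, height // 2, 2)),
    num_units=2, num_leading_axes=-1)

l_rec = L.CustomRecurrentLayer(l_in, l_in_to_hid, l_hid_to_hid,
                               mask_input=l_mask)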

@f0k f0k added this to the v0.2 milestone Feb 21, 2018