Fixed mask shape to suit arbitrary number of feature dimensions #785
base: master
Conversation
Thanks, good catch! There's still a problem with your implementation, though; see my comments. Once you're done, could you please squash your commits into one? We'd like to keep a clean history to make it easier to track changes. (Let me know if you need help with squashing.) Thank you!
# (seq_len, batch_size, 1, ..., 1)
#                       |---N---|
num_feature_dims = input.ndim - 2
mask = mask.dimshuffle((1, 0) + ('x',) * num_feature_dims)
Good catch! To make the implementation simpler, though, could you instead do:
if mask is not None:
    # expand mask from (seq_len, batch_size) to cover all feature dimensions
    mask = T.shape_padright(mask, input.ndim - 2)
This is easier to read and also doesn't require a long comment to explain!
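For reference, a minimal sketch of what shape_padright does (the shapes here are illustrative, not the layer's actual variables):

import theano.tensor as T

mask = T.matrix('mask')             # e.g. (seq_len, batch_size)
padded = T.shape_padright(mask, 2)  # appends two broadcastable axes: (seq_len, batch_size, 1, 1)
print(padded.ndim)                  # 4
print(padded.broadcastable)         # (False, False, True, True)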
Reading again, your implementation is actually wrong -- the mask needs to match the dimensionality of the hidden states, not the input states. So instead of input.ndim - 2, use len(self.input_to_hidden.output_shape) - 2. Putting this together:
if mask is not None:
    # expand mask from (seq_len, batch_size) to cover all hidden dimensions
    hid_ndim = len(self.input_to_hidden.output_shape)
    mask = T.shape_padright(mask, hid_ndim - 2)
Could you update your PR accordingly? This would be great!
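To illustrate why the hidden dimensionality is the right one, here is a hedged sketch with hypothetical shapes (the real step function lives inside the layer's scan): once the mask is padded to hid_ndim - 2 extra axes, it broadcasts over every feature axis of the hidden states:

import theano.tensor as T

hid = T.tensor5('hid')                            # (seq_len, batch, channels, width, height)
mask = T.matrix('mask')                           # (seq_len, batch)
hid_ndim = 5                                      # stands in for len(self.input_to_hidden.output_shape)
mask = T.shape_padright(mask, hid_ndim - 2)       # (seq_len, batch, 1, 1, 1)
masked = T.switch(mask, hid, T.zeros_like(hid))   # broadcasts over the three feature axes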
        n_out_filters, filter_shape, pad='same')
    l_hid_to_hid = lasagne.layers.Conv2DLayer(
        lasagne.layers.InputLayer((None, n_out_filters, width, height)),
        n_out_filters, filter_shape, pad='same')
Could you modify this test such that the hidden dimensionality is different from the input dimensionality? This should uncover that your current fix of the mask dimensionality is wrong. I'm not sure there's a sensible use case for this -- maybe just reshape l_in_to_hid to 5 dimensions and then use a DenseLayer(..., num_leading_dims=-1) for hid_to_hid so the dimensionality stays at 5.
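Something along these lines might work as the modified test. This is only a hedged sketch: the names and shapes are made up, and it assumes the DenseLayer argument for keeping leading axes is spelled num_leading_axes (referred to as num_leading_dims in the comment above):

import lasagne

n_channels, width, height = 3, 4, 4
n_out_filters, filter_shape = 7, (3, 3)

# Input-to-hidden: a convolution as before, reshaped to 5 dimensions so that
# the hidden dimensionality differs from the 4-dim input steps.
l_in_to_hid = lasagne.layers.Conv2DLayer(
    lasagne.layers.InputLayer((None, n_channels, width, height)),
    n_out_filters, filter_shape, pad='same')
l_in_to_hid = lasagne.layers.ReshapeLayer(
    l_in_to_hid, ([0], n_out_filters, width, height, 1))

# Hidden-to-hidden: a DenseLayer acting only on the last axis keeps the
# hidden states 5-dimensional.
l_hid_to_hid = lasagne.layers.DenseLayer(
    lasagne.layers.InputLayer((None, n_out_filters, width, height, 1)),
    num_units=1, num_leading_axes=-1)

l_in = lasagne.layers.InputLayer((None, None, n_channels, width, height))
l_mask = lasagne.layers.InputLayer((None, None))
l_rec = lasagne.layers.CustomRecurrentLayer(
    l_in, l_in_to_hid, l_hid_to_hid, mask_input=l_mask)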
If you use the CustomRecurrentLayer to perform a recurrent convolution, masking does not work properly, because the mask variable assumes the input to have the shape (batch_size, seq_len, num_features). This PR fixes that bug and makes masking work with an arbitrary number of feature dimensions.
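As a rough, simplified illustration of the problem (plain numpy with hypothetical shapes; the real code operates on Theano variables inside scan):

import numpy as np

batch_size, n_filters, width, height = 2, 7, 4, 4
hid = np.ones((batch_size, n_filters, width, height))    # one step of recurrent-convolution hidden states
mask_step = np.array([1.0, 0.0])                         # one step of the mask, shape (batch_size,)

# Before the fix, the mask only gained a single trailing axis, as if there were
# exactly one feature dimension; that cannot broadcast against 4-dim states:
#   np.where(mask_step[:, None], hid, 0)   ->  ValueError
# After the fix, the mask gets one broadcastable axis per feature dimension:
mask_step = mask_step.reshape((batch_size,) + (1,) * (hid.ndim - 1))
out = np.where(mask_step, hid, 0)                        # broadcasts cleanly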