Commit 4f7d884

Author: Grégoire
Commit message: add reference in the intro and other typos
Parent: c0ab16e

2 files changed: 23 additions & 20 deletions

doc/intro.txt

Lines changed: 3 additions & 0 deletions
@@ -49,6 +49,9 @@ from energy models:
 Building towards including the Contractive auto-encoders tutorial, we have the code for now:
 * `Contractive auto-encoders`_ code - There is some basic doc in the code.
 
+Recurrent neural networks with word embeddings and context window:
+* :ref:`Semantic Parsing of Speech using Recurrent Net <rnnslu>`
+
 Energy-based recurrent neural network (RNN-RBM):
 * :ref:`Modeling and generating sequences of polyphonic music <rnnrbm>`

doc/rnnslu.txt

Lines changed: 20 additions & 20 deletions
@@ -178,7 +178,7 @@ Using Theano, it gives::
 (nv+1, de)).astype(theano.config.floatX)) # add one for PADDING at the end
 
 idxs = T.imatrix() # as many columns as words in the context window and as many lines as words in the sentence
-x, _ = theano.scan(fn = lambda idx: embeddings[idx].flatten(), sequences = idxs)
+x, _ = theano.scan(fn=lambda idx: embeddings[idx].flatten(), sequences=idxs)
 
 The x symbolic variable corresponds to a matrix of shape (number of words in the
 sentences, dimension of the embedding space X context window size).
@@ -193,7 +193,7 @@ Let's compile a theano function to do so
 [-1, 0, 1, 2, 3, 4,-1],
 [ 0, 1, 2, 3, 4,-1,-1],
 [ 1, 2, 3, 4,-1,-1,-1]]
->>> f = theano.function( inputs=[idxs], outputs=x)
+>>> f = theano.function(inputs=[idxs], outputs=x)
 >>> f(cx)
 array([[-0.08088442, 0.08458307, 0.05064092, ..., 0.06876887,
 -0.06648078, -0.15192257],
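
Note: the ``cx`` matrix fed to ``f`` above is the per-sentence context-window index matrix. A minimal sketch of how such a matrix can be built is given below; the helper name ``context_window``, its signature and the use of ``-1`` as the padding index (which selects the extra PADDING row of the embedding matrix) are illustrative assumptions based on the matrix shown above, not the tutorial's exact helper::

    import numpy

    # Illustrative sketch; the tutorial's own helper may differ.
    def context_window(sentence_indices, window_size, padding_index=-1):
        # Build a (sentence length, window_size) int32 matrix: each row holds a
        # word index surrounded by its neighbours; positions falling outside the
        # sentence are filled with the padding index.
        assert window_size % 2 == 1, 'the window must be centred on a word'
        half = window_size // 2
        padded = [padding_index] * half + list(sentence_indices) + [padding_index] * half
        return numpy.asarray([padded[i:i + window_size]
                              for i in range(len(sentence_indices))], dtype='int32')

    # A 5-word sentence with a window of 7 reproduces the last rows of cx above:
    # context_window([0, 1, 2, 3, 4], 7)[2:] ->
    #     [[-1, 0, 1, 2, 3, 4, -1], [0, 1, 2, 3, 4, -1, -1], [1, 2, 3, 4, -1, -1, -1]]
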
@@ -250,25 +250,25 @@ It gives the following code::
 de :: dimension of the word embeddings
 cs :: word window context size
 '''
-self.emb = theano.shared(name='embeddings', value=0.2 * numpy.random.uniform(-1.0, 1.0,\
+self.emb = theano.shared(name='embeddings', value=0.2 * numpy.random.uniform(-1.0, 1.0,
 (ne+1, de)).astype(theano.config.floatX)) # add one for PADDING at the end
-self.Wx = theano.shared(name='Wx', value=0.2 * numpy.random.uniform(-1.0, 1.0,\
+self.Wx = theano.shared(name='Wx', value=0.2 * numpy.random.uniform(-1.0, 1.0,
 (de * cs, nh)).astype(theano.config.floatX))
-self.Wh = theano.shared(name='Wh', value=0.2 * numpy.random.uniform(-1.0, 1.0,\
+self.Wh = theano.shared(name='Wh', value=0.2 * numpy.random.uniform(-1.0, 1.0,
 (nh, nh)).astype(theano.config.floatX))
-self.W = theano.shared(name='W', value=0.2 * numpy.random.uniform(-1.0, 1.0,\
+self.W = theano.shared(name='W', value=0.2 * numpy.random.uniform(-1.0, 1.0,
 (nh, nc)).astype(theano.config.floatX))
 self.bh = theano.shared(name='bh', value=numpy.zeros(nh, dtype=theano.config.floatX))
 self.b = theano.shared(name='b', value=numpy.zeros(nc, dtype=theano.config.floatX))
 self.h0 = theano.shared(name='h0', value=numpy.zeros(nh, dtype=theano.config.floatX))
 
 # bundle
-self.params = [ self.emb, self.Wx, self.Wh, self.W, self.bh, self.b, self.h0 ]
+self.params = [self.emb, self.Wx, self.Wh, self.W, self.bh, self.b, self.h0]
 
 Then we integrate the way to build the input from the embedding matrix::
 
 idxs = T.imatrix() # as many columns as context window size/lines as words in the sentence
-x, _ = theano.scan(fn = lambda idx: self.emb[idx].flatten(), sequences = idxs)
+x, _ = theano.scan(fn=lambda idx: self.emb[idx].flatten(), sequences=idxs)
 y = T.ivector('y') # label
 
 We use the scan operator to construct the recursion, works like a charm::
@@ -278,33 +278,33 @@ We use the scan operator to construct the recursion, works like a charm::
 s_t = T.nnet.softmax(T.dot(h_t, self.W) + self.b)
 return [h_t, s_t]
 
-[h, s], _ = theano.scan(fn=recurrence, \
-sequences=x, outputs_info=[self.h0, None], \
+[h, s], _ = theano.scan(fn=recurrence,
+sequences=x, outputs_info=[self.h0, None],
 n_steps=x.shape[0])
 
-p_y_given_x_sentence = s[:,0,:]
+p_y_given_x_sentence = s[:, 0, :]
 y_pred = T.argmax(p_y_given_x_sentence, axis=1)
 
 Theano will then compute all the gradients automatically to maximize the log-likelihood::
 
 lr = T.scalar('lr')
 nll = -T.mean(T.log(p_y_given_x_sentence)[T.arange(x.shape[0]),y])
 gradients = T.grad( nll, self.params )
-updates = OrderedDict(( p, p-lr*g ) for p, g in zip( self.params , gradients))
+updates = OrderedDict((p, p - lr*g) for p, g in zip(self.params, gradients))
 
 Next compile those functions::
 
 self.classify = theano.function(inputs=[idxs], outputs=y_pred)
 
-self.train = theano.function( inputs = [idxs, y, lr],
-outputs = nll,
-updates = updates )
+self.train = theano.function(inputs=[idxs, y, lr],
+outputs=nll,
+updates=updates)
 
 We keep the word embeddings on the unit sphere by normalizing them after each update::
 
-self.normalize = theano.function( inputs = [],
-updates = {self.emb:\
-self.emb/T.sqrt((self.emb**2).sum(axis=1)).dimshuffle(0,'x')})
+self.normalize = theano.function(inputs=[],
+updates = {self.emb:
+self.emb / T.sqrt((self.emb**2).sum(axis=1)).dimshuffle(0, 'x')})
 
 And that's it!
 
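Note: "And that's it!" closes the model definition. Below is a minimal sketch of how the three compiled functions might be driven per sentence; the class name ``ElmanRnn``, the ``context_window`` helper and the data variables are illustrative assumptions, not the tutorial's training script::

    # Illustrative sketch of a training loop; names are assumptions.
    rnn = ElmanRnn(nh=100, nc=n_classes, ne=vocab_size, de=100, cs=7)
    for epoch in range(n_epochs):
        for sentence, labels in zip(train_sentences, train_labels):
            cwords = context_window(sentence, 7)      # (n_words, cs) int32 index matrix
            rnn.train(cwords, labels, learning_rate)  # one gradient step on the whole sentence
            rnn.normalize()                           # keep the embeddings on the unit sphere
        predictions = [rnn.classify(context_window(s, 7)) for s in valid_sentences]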

@@ -358,7 +358,7 @@ Results
 **Timing**
 
 Running experiments on ATIS using this `repository <https://github.com/mesnilgr/is13>`_
-will run one epoch in less than 40 seconds on bart1 processor using less than 200 Mo of RAM::
+will run one epoch in less than 40 seconds on i7 CPU 950 @ 3.07GHz using less than 200 Mo of RAM::
 
 [learning] epoch 0 >> 100.00% completed in 34.48 (sec) <<
 

@@ -378,7 +378,7 @@ After a few epochs, you obtain decent performance **94.48 % of F1 score**.::
 
 **Word Embedding Nearest Neighbors**
 
-We can check the k-nearest neighbors of the thus learned embeddings. L2 and
+We can check the k-nearest neighbors of the learned embeddings. L2 and
 cosine distance gave the same results so we plot them for the cosine distance.
 
 +------------------------------+------------------------------+------------------------------+------------------------------+------------------------------+------------------------------+------------------------------+------------------------------+------------------------------+------------------------------+
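
Note: a minimal sketch of the cosine-distance nearest-neighbour check behind this table; the embedding matrix ``emb`` (rows are word embeddings) and the ``index_of`` lookup are illustrative assumptions::

    import numpy

    # Illustrative sketch; variable names are assumptions.
    def nearest_neighbors(emb, word_index, k=5):
        # Cosine similarity between one embedding row and all others;
        # return the indices of the k most similar words.
        normed = emb / numpy.sqrt((emb ** 2).sum(axis=1))[:, None]
        sims = normed.dot(normed[word_index])
        order = numpy.argsort(-sims)
        return [i for i in order if i != word_index][:k]

    # e.g. nearest_neighbors(rnn.emb.get_value(), index_of['monday'])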
