Deep Learning - IIT Ropar - Unit 12 - Week 9


10/27/24, 1:13 PM Deep Learning - IIT Ropar - - Unit 12 - Week 9


Week 9 : Assignment 9

The due date for submitting this assignment has passed.
Due on 2024-09-25, 23:59 IST.
Assignment submitted on 2024-09-20, 12:08 IST

1) Let X be the co-occurrence matrix such that the (i, j)-th entry of X captures the PMI between the i-th and j-th word in the corpus. Every row of X corresponds to the representation of the i-th word in the corpus. Suppose each row of X is normalized (i.e., the L2 norm of each row is 1). Then the (i, j)-th entry of XX^T captures the: 1 point

PMI between word i and word j
Euclidean distance between word i and word j
Probability that word i
Cosine similarity between word i

Yes, the answer is correct.
Score: 1
Accepted Answers:
Cosine similarity between word i

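The accepted answer to question 1 can be checked numerically. The sketch below is only an illustration: the matrix `raw` holds made-up values standing in for PMI rows, and the helper names are hypothetical. It normalizes each row to unit L2 norm and compares every entry of XX^T with the cosine similarity computed from its definition.

```python
import math

def l2_normalize(row):
    """Scale a vector so that its L2 norm is 1."""
    norm = math.sqrt(sum(x * x for x in row))
    return [x / norm for x in row]

def cosine(u, v):
    """Cosine similarity computed directly from its definition."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical PMI-style rows, one per word (values are made up).
raw = [
    [2.0, 0.5, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 3.0, 4.0],
]
X = [l2_normalize(r) for r in raw]

# (XX^T)[i][j] is the dot product of row i and row j of X.
XXT = [[sum(a * b for a, b in zip(ri, rj)) for rj in X] for ri in X]

# Each entry should match the cosine similarity of the original rows.
matches = all(
    math.isclose(XXT[i][j], cosine(raw[i], raw[j]), abs_tol=1e-12)
    for i in range(len(raw))
    for j in range(len(raw))
)
print(matches)
```

Because every row has unit norm, the denominator of the cosine formula is 1, so the dot product alone is the cosine similarity.
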

2) You are given the one-hot representations of two words below: 1 point

CAR = [1, 0, 0, 0, 0], BUS = [0, 0, 0, 1, 0]

What is the Euclidean distance between CAR and BUS?

1.4142

Yes, the answer is correct.
Score: 1
Accepted Answers:
(Type: Range) 1.40,1.42

https://onlinecourses.nptel.ac.in/noc24_cs114/unit?unit=115&assessment=297

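The value in question 2 follows from the definition of Euclidean distance. A minimal sketch using the two vectors from the question (the function name `euclidean` is just a label chosen here):

```python
import math

CAR = [1, 0, 0, 0, 0]
BUS = [0, 0, 0, 1, 0]

def euclidean(u, v):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

distance = euclidean(CAR, BUS)
print(round(distance, 4))  # 1.4142
```

The two vectors differ in exactly two positions, so the distance is sqrt(2). The same holds for any pair of distinct one-hot vectors.
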
3) Suppose we are learning the representations of words using GloVe. If we observe that the cosine similarity between two representations v_i and v_j for words 'i' and 'j' is very high, which of the following statements is true? (parameters b_i = 0.02 and b_j = 0.07) 1 point

X_ij = 0.02
X_ij = 0.2
X_ij = 0.88
X_ij = 0

Yes, the answer is correct.
Score: 1
Accepted Answers:
X_ij = 0.88

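One way to see why the largest option is accepted in question 3: GloVe fits v_i . v_j + b_i + b_j to log X_ij. The sketch below assumes the natural logarithm and an exact fit, and it uses the dot product as a rough stand-in for cosine similarity (reasonable only if the vector norms are comparable). It computes the dot product implied by each option; only the ordering of the results matters here.

```python
import math

b_i, b_j = 0.02, 0.07           # bias values given in the question
options = [0.02, 0.2, 0.88, 0]  # candidate values of X_ij

def implied_dot(x_ij, b_i, b_j):
    """Dot product v_i . v_j implied by log(X_ij) = v_i . v_j + b_i + b_j."""
    if x_ij <= 0:
        return -math.inf  # log(0) is undefined; treated as minus infinity
    return math.log(x_ij) - b_i - b_j

implied = {x: implied_dot(x, b_i, b_j) for x in options}
best = max(implied, key=implied.get)
print(best)  # 0.88
```

Since the logarithm is increasing, the largest X_ij always yields the largest implied dot product, whatever the bias values are.
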
4) Which of the following is an advantage of the CBOW model compared to the Skip-gram model? 1 point

It is faster to train
It requires less memory
It performs better on rare words
All of the above

Yes, the answer is correct.
Score: 1
Accepted Answers:
It is faster to train

5) Which of the following is true about the input representation in the CBOW model? 1 point

Each word is represented as a one-hot vector
Each word is represented as a continuous vector
Each word is represented as a sequence of one-hot vectors
Each word is represented as a sequence of continuous vectors

Yes, the answer is correct.
Score: 1
Accepted Answers:
Each word is represented as a one-hot vector

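To illustrate what a one-hot input does in question 5: multiplying a one-hot vector by a weight matrix simply selects one row of that matrix. The matrix `W` below is a hypothetical 4-word, 2-dimensional embedding table with made-up values, not anything taken from the course.

```python
def one_hot(index, size):
    """Vector of zeros with a single 1 at position `index`."""
    return [1 if k == index else 0 for k in range(size)]

def vec_mat(x, W):
    """Row vector x (length V) times matrix W (V rows, d columns)."""
    return [sum(x[r] * W[r][c] for r in range(len(W))) for c in range(len(W[0]))]

# Hypothetical embedding matrix: V = 4 words, d = 2 dimensions.
W = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8],
]

x = one_hot(2, len(W))  # one-hot input for the word with index 2
hidden = vec_mat(x, W)  # equals row 2 of W
print(hidden)
```

So the one-hot input acts as a lookup: the continuous representation lives in the weights, not in the input itself.
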
6) What is the role of the softmax function in the skip-gram method? 1 point

To calculate the dot product between the target word and the context words
To transform the dot product into a probability distribution
To calculate the distance between the target word and the context words
To adjust the weights of the neural network during training

Yes, the answer is correct.
Score: 1
Accepted Answers:
To transform the dot product into a probability distribution

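The accepted answer to question 6 can be sketched in a few lines. The vectors below are hypothetical 2-dimensional examples; the point is only that softmax maps raw dot products to positive numbers that sum to 1 while preserving their order.

```python
import math

def softmax(scores):
    """Map arbitrary real scores to positive values that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical vectors: one target word and three candidate context words.
target = [1.0, 0.5]
contexts = [[0.9, 0.4], [-0.3, 0.2], [0.1, -1.0]]

scores = [dot(target, c) for c in contexts]  # raw dot products
probs = softmax(scores)                      # probability distribution
print(probs)
```

The context word with the largest dot product receives the largest probability.
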
7) Suppose we are learning the representations of words using GloVe. If we observe that the cosine similarity between two representations v_i and v_j for words 'i' and 'j' is very high, which of the following statements is true? (parameters b_i = 0.02 and b_j = 0.05) 1 point

X_ij = 0.03.
X_ij = 0.8.
X_ij = 0.35.
X_ij = 0.

Yes, the answer is correct.
Score: 1
Accepted Answers:
X_ij = 0.8.

8) What is the computational complexity of computing the softmax function in the output layer of a neural network? 1 point

O(n)
O(n^2)
O(n log n)
O(log n)

Yes, the answer is correct.
Score: 1
Accepted Answers:
O(n)

9) How does Hierarchical Softmax reduce the computational complexity of computing the softmax function? 1 point

It replaces the softmax function with a linear function
It uses a binary tree to approximate the softmax function
It uses a heuristic to compute the softmax function faster
It does not reduce the computational complexity of computing the softmax function

Yes, the answer is correct.
Score: 1
Accepted Answers:
It uses a binary tree to approximate the softmax function

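The binary-tree idea in question 9 can be sketched as follows. This is a simplified illustration, not the course's implementation: it assumes a vocabulary of 8 words placed at the leaves of a balanced tree, and it uses fixed made-up scores at the internal nodes, whereas a trained model would derive each score from a learned node vector and the hidden representation. A word's probability is the product of left/right decisions along its root-to-leaf path.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def word_probability(word, node_scores, vocab_size):
    """P(word) as a product of binary decisions on the root-to-leaf path.

    Nodes use heap numbering: the root is 1, the children of node n are
    2n and 2n + 1, and word w sits at leaf vocab_size + w.
    """
    leaf = vocab_size + word
    turns = bin(leaf)[3:]  # binary digits after the leading 1: 0 = left, 1 = right
    node, prob, steps = 1, 1.0, 0
    for t in turns:
        p_left = sigmoid(node_scores[node])
        prob *= p_left if t == "0" else 1.0 - p_left
        node = 2 * node + int(t)
        steps += 1
    return prob, steps

V = 8  # assumed vocabulary size (a power of two keeps the tree balanced)
# Hypothetical scores for internal nodes 1..7 (index 0 is unused).
node_scores = [None, 0.3, -1.2, 0.8, 0.1, -0.5, 2.0, -0.7]

results = [word_probability(w, node_scores, V) for w in range(V)]
total = sum(p for p, _ in results)
print(total)                   # the leaf probabilities sum to 1 (up to rounding)
print({s for _, s in results})
```

Each word needs only log2(8) = 3 sigmoid evaluations instead of a sum over all 8 outputs, which is where the saving over the O(n) full softmax comes from.
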
10) What is the disadvantage of using Hierarchical Softmax? 1 point

It requires more memory to store the binary tree
It is slower than computing the softmax function directly
It is less accurate than computing the softmax function directly
It is more prone to overfitting than computing the softmax function directly

No, the answer is incorrect.
Score: 0
Accepted Answers:
It requires more memory to store the binary tree
