Lecture#14: Word Embedding Process

Word Embedding

What is Word Embedding?
A word embedding is a learned representation for text where words that have the same meaning have a similar representation.
One of the benefits of using dense and low-dimensional vectors is computational: the majority of neural network toolkits do not play well with very high-dimensional, sparse vectors. … The main benefit of the dense representations is generalization power: if we believe some features may provide similar clues, it is worthwhile to provide a representation that is able to capture these similarities.
Word Embedding Applications

• Computing similar words
• Text classification
• Document clustering/grouping
• Feature extraction for text classification
• Natural language processing
Word Embedding in NLP
Word embedding, or word vectorization, is a methodology in NLP for mapping words or phrases from a vocabulary to corresponding vectors of real numbers, which are then used for word prediction and for measuring word similarity/semantics. The process of converting words into numbers is called vectorization. After the words are converted into vectors, we use techniques such as Euclidean distance or cosine similarity to identify similar words.
Word Embedding Algorithms
An embedding layer, for lack of a better name, is a word embedding that is learned jointly with a neural network model on a specific natural language processing task, such as language modeling or document classification. It requires that the document text be cleaned and prepared so that each word is one-hot encoded. The size of the vector space is specified as part of the model, such as 50, 100, or 300 dimensions. The vectors are initialized with small random numbers. The embedding layer is used on the front end of a neural network and is fit in a supervised way using the backpropagation algorithm. If a multilayer perceptron model is used, the word vectors are concatenated before being fed as input to the model. If a recurrent neural network is used, each word may be taken as one input in a sequence.
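As an illustration, here is a minimal sketch of such a layer (assuming TensorFlow/Keras; the vocabulary size, embedding dimension, and document length are placeholder values, not taken from the slides). The word vectors are concatenated and fed to a small classifier, and the embedding weights are updated by backpropagation along with the rest of the network:

```python
# Minimal sketch: an embedding layer learned jointly with a document classifier.
# vocab_size, embed_dim, and max_len are assumed placeholder values.
import tensorflow as tf
from tensorflow.keras import layers, models

vocab_size = 10000   # number of one-hot encoded words in the vocabulary
embed_dim = 100      # size of the learned word vectors (e.g. 50, 100, or 300)
max_len = 50         # words per (padded) input document

inputs = layers.Input(shape=(max_len,), dtype="int32")   # word indices
x = layers.Embedding(vocab_size, embed_dim)(inputs)      # (batch, max_len, embed_dim)
x = layers.Flatten()(x)                                   # concatenate the word vectors
outputs = layers.Dense(1, activation="sigmoid")(x)        # simple classification head
model = models.Model(inputs, outputs)

# Backpropagation updates the randomly initialized embedding vectors
# together with the classifier weights.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```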
Types of Word Embedding

Word Embedding
• Frequency-based
  •BOW
  •TF-IDF
• Prediction-based
  •Word2Vec
Word2vec
Word2vec is a widely used natural language processing
technique that uses a neural network to learn distributed
representations of words, also known as word embeddings.
These embeddings capture the semantics of a word in a
continuous vector space, such that similar words are close
together in the vector space. Word2vec has two main model
architectures: continuous bag-of-words (CBOW) and skip-gram.
CBOW predicts the current word based on the context of the
surrounding words, while skip-gram predicts the surrounding
words given the current word. Word2vec can be trained on a large
text dataset and is commonly used in various natural language
processing tasks, such as language translation, text classification,
and information retrieval.
Why Word2Vec

• BOW and TF-IDF techniques cannot capture semantic meaning. For example:
• Happy/Enjoy: BOW/TF-IDF cannot find the similarity between these two words.
• Google engineers developed this technique in 2013.
Vocabulary in Word2Vec
• In word embedding models such as word2vec, the vocabulary refers to the set of unique words on which the model has been trained. The vocabulary is typically created by preprocessing the input text data and selecting a subset of words to include based on certain criteria, such as frequency of occurrence or length.
• For example, the word2vec model can create the vocabulary by building a dictionary of all the unique words in the input text data and filtering out words that occur too infrequently or are too long. The vocabulary size is typically controlled by the parameter min_count, which specifies the minimum number of occurrences a word must have in the input data to be included in the vocabulary (see the sketch below).
• The vocabulary is then used to create the numerical vectors, also known as word embeddings, one vector per word in the vocabulary.
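For instance, a minimal Gensim sketch (the toy corpus reuses the example sentences that appear later in the lecture, and the parameter values are illustrative, not prescribed by the slides):

```python
# Minimal Gensim sketch: build a Word2Vec vocabulary with min_count filtering.
# The toy corpus and parameter values are illustrative placeholders.
from gensim.models import Word2Vec

sentences = [
    ["julie", "loves", "john", "more", "than", "linda", "loves", "john"],
    ["jane", "loves", "john", "more", "than", "julie", "loves", "john"],
]

model = Word2Vec(
    sentences,
    vector_size=100,   # dimensionality of the word embeddings
    window=5,          # context window size
    min_count=1,       # keep every word that occurs at least once
    sg=0,              # 0 = CBOW, 1 = skip-gram
)

print(model.wv.key_to_index)   # the learned vocabulary (word -> index)
print(model.wv["john"][:5])    # first values of one word's embedding vector
```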
Continuous Bag-of-Words (CBOW) Model

The continuous bag-of-words (CBOW) model is a neural network for natural language processing tasks such as language translation and text classification. It predicts a target word based on the context of the surrounding words and is trained on a large dataset of text using an optimization algorithm such as stochastic gradient descent. Once trained, the CBOW model generates numerical vectors, known as word embeddings, which capture the semantics of words in a continuous vector space and can be used in various NLP tasks. It is often combined with other techniques and models, such as the skip-gram model, and can be implemented using libraries like Gensim in Python.
Difference (BOW/TF-IDF vs. Word2Vec)

BOW/TF-IDF:
• Cannot capture semantic meaning or relations between words
• Very high-dimensional vectors, because most entries are zeros
• Sparse vectors
• Prone to overfitting

Word2Vec:
• Captures the semantic meaning of words
• Low-dimensional vectors, because every word gets real-valued entries
• Dense vectors
• Less prone to overfitting
Install these libraries and modules for Word2Vec

https://colab.research.google.com/dri...

https://github.com/campusx-official/g...
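A typical setup, assuming a standard Python environment (the package list below is an assumption, not copied from the linked notebook):

```python
# Typical setup for the Word2Vec examples (package names assumed, versions unpinned):
#   pip install gensim numpy
from gensim.models import Word2Vec   # Word2Vec / CBOW / skip-gram training
import numpy as np                    # vector arithmetic and similarity measures
```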
Let’s see an example
• Julie loves John more than Linda loves John
• Jane loves John more than Julie loves John

Counting how often each vocabulary word occurs in each sentence, the two count vectors are:

Item 1: [2, 0, 1, 1, 0, 2, 1, 1]
Item 2: [2, 1, 1, 0, 1, 1, 1, 1]
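With plain NumPy, their cosine similarity can be computed directly from the two count vectors (a small illustrative sketch):

```python
# Cosine similarity between the two count vectors from the example above.
import numpy as np

item1 = np.array([2, 0, 1, 1, 0, 2, 1, 1])
item2 = np.array([2, 1, 1, 0, 1, 1, 1, 1])

cosine = item1 @ item2 / (np.linalg.norm(item1) * np.linalg.norm(item2))
print(round(cosine, 3))   # roughly 0.82: the two sentences are quite similar
```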
               King   Queen   Man   Woman   Monkey
Gender (Male)  1      0       1     0       1
Wealth         1      1       0.7   0.3     0
Power          1      0.7     0.6   0.5     0
Weight         0.7    0.5     0.6   0.5     0.3
Speak          1      1       1     1       0

King - Man + Woman (per feature):
Gender: 1 - 1 + 0 = 0
Wealth: 1 - 0.7 + 0.3 = 0.6
Power:  1 - 0.6 + 0.5 = 0.9
Comparing the result with the remaining columns, it is closest to Queen, which is the familiar King - Man + Woman ≈ Queen analogy.
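The same arithmetic over the full feature vectors, as a NumPy sketch (the numbers are the toy feature values from the table above, not learned embeddings):

```python
# King - Man + Woman on the toy feature vectors from the table above.
import numpy as np

# Feature order: Gender(Male), Wealth, Power, Weight, Speak
king   = np.array([1.0, 1.0, 1.0, 0.7, 1.0])
queen  = np.array([0.0, 1.0, 0.7, 0.5, 1.0])
man    = np.array([1.0, 0.7, 0.6, 0.6, 1.0])
woman  = np.array([0.0, 0.3, 0.5, 0.5, 1.0])
monkey = np.array([1.0, 0.0, 0.0, 0.3, 0.0])

result = king - man + woman   # [0.0, 0.6, 0.9, 0.6, 1.0]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Excluding the query words themselves (as word2vec analogy queries do),
# the result vector is closest to Queen: King - Man + Woman ≈ Queen.
for name, vec in [("queen", queen), ("monkey", monkey)]:
    print(name, round(cosine(result, vec), 3))   # queen ≈ 0.96, monkey ≈ 0.11
```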
Why Cosine Similarity
• Counting common words, or computing the Euclidean distance, is the general approach used to match similar documents; both are based on the number of words the documents have in common.

• This approach does not work well: as documents grow longer, the number of common words tends to increase even if the documents talk about different topics. To overcome this flaw, the "Cosine Similarity" approach is used to find the similarity between the documents.
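A small sketch of that flaw, using made-up count vectors for a short and a long document about the same topic:

```python
# Why cosine similarity: a short and a long document about the same topic.
# The count vectors are made-up toy values for illustration.
import numpy as np

short_doc = np.array([2, 1, 0, 1])     # term counts in a short document
long_doc  = np.array([20, 10, 0, 10])  # same topic, ten times longer

euclidean = np.linalg.norm(short_doc - long_doc)
cosine = short_doc @ long_doc / (np.linalg.norm(short_doc) * np.linalg.norm(long_doc))

print(euclidean)  # large (about 22), suggesting the documents are very different
print(cosine)     # 1.0, because the vectors point in the same direction
```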
Types of Word2Vec

Word2Vec
• CBOW
• Skip-gram
Word2Vec
Word2Vec is a statistical method for efficiently learning a standalone word embedding from a text corpus. It was developed by Tomas Mikolov et al. at Google in 2013 as a response to make the neural-network-based training of embeddings more efficient, and it has since become the de facto standard for developing pre-trained word embeddings. It offers two model architectures:
•Continuous Bag-of-Words (CBOW) model
•Continuous Skip-Gram model
CBOW tries to predict a word on the basis of its neighbors, while skip-gram tries to predict the neighbors of a word.
• Word2Vec first sets the context window (see the sketch below):
• If the window size is 3:
  ____, Target, ____

• If the window size is 5:
  ____, ____, Target, ____, ____

• If the window size is 7:
  ____, ____, ____, Target, ____, ____, ____
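A small plain-Python sketch of how (context, target) pairs fall out of a chosen window size (the sentence reuses an earlier example; the helper function name is made up for illustration):

```python
# Generate (context words, target word) pairs for a given window size.
# Here window_size counts the target itself, matching the slide (3 -> 1 word each side).
def context_target_pairs(tokens, window_size=3):
    half = (window_size - 1) // 2   # words taken on each side of the target
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - half):i] + tokens[i + 1:i + 1 + half]
        pairs.append((context, target))
    return pairs

sentence = "julie loves john more than linda loves john".split()
for context, target in context_target_pairs(sentence, window_size=3):
    print(context, "->", target)
# e.g. ['julie', 'john'] -> loves
```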
What is a Continuous Bag of Words (CBOW)?

Continuous Bag of Words (CBOW) is a popular natural language


processing technique used to generate word embeddings. Word
embeddings are important for many NLP tasks because they capture
semantic and syntactic relationships between words in a language.
CBOW is a neural network-based algorithm that predicts a target word
given its surrounding context words. It is a type of “unsupervised”
learning, meaning that it can learn from unlabeled data, and it is often
used to pre-train word embeddings that can be used for various NLP
tasks such as sentiment analysis, text classification, and machine
translation.
Word2Vec
Implementation of CBOW Model
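A minimal illustrative CBOW sketch (assuming TensorFlow/Keras; the vocabulary size, window, and embedding dimension are placeholder values): it averages the context-word embeddings and predicts the target word with a softmax over the vocabulary.

```python
# Minimal CBOW sketch in Keras: average the context embeddings, predict the target.
# vocab_size, embed_dim, and window are assumed placeholder values.
import tensorflow as tf
from tensorflow.keras import layers, Model

vocab_size = 5000   # size of the vocabulary
embed_dim = 100     # dimensionality of the word embeddings
window = 2          # context words taken on each side of the target

context = layers.Input(shape=(2 * window,), dtype="int32")    # context word indices
embedded = layers.Embedding(vocab_size, embed_dim)(context)    # (batch, 2*window, embed_dim)
averaged = layers.GlobalAveragePooling1D()(embedded)           # mean of the context vectors
target_probs = layers.Dense(vocab_size, activation="softmax")(averaged)

cbow = Model(context, target_probs)
cbow.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
cbow.summary()
```

Training this on (context, target) pairs like those generated earlier fits the embedding matrix; after training, the weights of the Embedding layer are the word vectors.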
Thanks for Listening
