Lecture 1 Introduction
Language Processing
This lecture
• Introduction to Language
• What is NLP? Why is it important?
• NLP Applications
• Linguistic Knowledge
• Challenges
• NLP course: what will you learn from this course?
• Natural language vs. Artificial Language
• A vocabulary consists of a set of words
• A text is composed of a sequence of words from a vocabulary
• A language is constructed of a set of all possible texts
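A minimal sketch of these three definitions, assuming a toy four-word vocabulary and whitespace splitting (both are illustrative choices, not part of the lecture). The helper names `is_text_over` and `texts_of_length` are hypothetical.

```python
from itertools import product

# Toy vocabulary: a set of words (illustrative only).
vocabulary = {"john", "reads", "a", "book"}

def is_text_over(words, vocab):
    """Return True if every word in the sequence belongs to the vocabulary."""
    return all(w in vocab for w in words)

def texts_of_length(vocab, n):
    """Enumerate every word sequence of length n over the vocabulary."""
    return [list(seq) for seq in product(sorted(vocab), repeat=n)]

# A text is a sequence of words drawn from the vocabulary.
text = "john reads a book".split()

print(is_text_over(text, vocabulary))        # True
print(len(texts_of_length(vocabulary, 2)))   # 16 (4 words, 2 positions)
```

Under this formal view the language is the set of all such sequences of any length, so it is infinite even for a tiny vocabulary; the sketch only enumerates one fixed length.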
• What is NLP
• Wiki: Natural language processing (NLP) is a field of computer science,
artificial intelligence, and computational linguistics concerned with the
interactions between computers and human (natural) languages.
• Identify the structure and meaning of words, sentences, texts, and
conversations
• Deep understanding of broad language
NLP
NLP Applications
• Spell and Grammar Checking
• Checking spelling and grammar
• Suggesting alternatives for the errors
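A minimal sketch of the two steps above (check a word, then suggest alternatives), assuming a tiny hand-made word list. `DICTIONARY` and `check_word` are hypothetical names; the similarity matching comes from Python's standard `difflib` module, and a real checker would use a much larger lexicon and also handle grammar.

```python
from difflib import get_close_matches

# Hypothetical mini-dictionary; a real checker would load a large word list.
DICTIONARY = ["language", "processing", "natural", "grammar", "spelling"]

def check_word(word, dictionary=DICTIONARY, max_suggestions=3):
    """Return (is_correct, suggestions) for a single word."""
    lowered = word.lower()
    if lowered in dictionary:
        return True, []
    # Suggest dictionary words whose similarity ratio is at least 0.6.
    suggestions = get_close_matches(lowered, dictionary, n=max_suggestions, cutoff=0.6)
    return False, suggestions

print(check_word("Grammar"))    # (True, [])
print(check_word("langauge"))   # (False, ['language'])
```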
Optical Character Recognition
• Optical character recognition or optical character reader (OCR) is the
electronic or mechanical conversion of images of typed, handwritten
or printed text into machine-encoded text, whether from a scanned
document, a photo of a document, a scene photo, or subtitle
text superimposed on an image.
Word Prediction
• Predicting the next word that is highly probable to be typed by the user
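One simple way to approximate "highly probable next word" is to count which word most often follows the current one in some training text. The sketch below assumes a three-sentence toy corpus and a bigram (one word of context) model; `train_bigrams` and `predict_next` are hypothetical helper names, and real keyboards use far larger models.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training sentences."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def predict_next(following, word, k=1):
    """Return up to k most frequent continuations of `word` (empty if unseen)."""
    counts = following.get(word.lower(), Counter())
    return [w for w, _ in counts.most_common(k)]

# Toy corpus, for illustration only.
corpus = [
    "john reads a book",
    "john reads a paper",
    "mary reads a book",
]
model = train_bigrams(corpus)

print(predict_next(model, "a"))      # ['book']  ("book" follows "a" twice, "paper" once)
print(predict_next(model, "reads"))  # ['a']
```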
Information Retrieval
Linguistic knowledge
NLP and linguistics:
• Letters – a, b, c … z
• Words – combining letters to form words
• Phonetics and phonology - The study of linguistic sounds and their
relations to words
• Morphology - The study of the internal structure of words and how they can
be modified; parsing complex words into their components, e.g. Swahili
ninakula ("I am eating") splits into (ni)(na)(kula).
• Syntax - The study of the structural relationships between words in a
sentence
• Semantics - The study of the meaning of words, and how these combine to
form the meanings of sentences
Discourse
• The study of linguistic units larger than a single sentence
• Example: John reads a book. He borrowed it from his friend.
("He" and "it" can only be interpreted using the first sentence.)
Pragmatics
Challenges
• Word sense / meaning ambiguity
Ambiguity
• Ambiguous headlines:
• "Include Your Children When Baking Cookies" – as an ingredient?
• "Hospitals Are Sued by 7 Foot Doctors" – doctors with 7 feet?
• "Iraqi Head Seeks Arms" – a head seeking arms?
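To make word sense ambiguity concrete, here is a minimal sketch of a simplified Lesk-style heuristic: choose the sense whose dictionary gloss shares the most content words with the surrounding context. The glosses in `SENSES`, the stopword list, and the function names are hand-written assumptions for illustration, not entries from a real lexicon.

```python
# Hypothetical, hand-written glosses for two senses of "arms".
SENSES = {
    "arms": {
        "weapons": "weapons and military equipment used by an army or government",
        "limbs": "upper limbs of the human body between shoulder and hand",
    }
}

# Small illustrative stopword list.
STOPWORDS = {"a", "an", "and", "the", "of", "by", "or", "to", "for", "in", "its"}

def content_words(text):
    """Lowercase, strip basic punctuation, and drop stopwords."""
    return {w.strip(".,?!").lower() for w in text.split()} - STOPWORDS

def disambiguate(word, context):
    """Return (best_sense, scores), scoring each sense by gloss/context overlap."""
    context_set = content_words(context)
    scores = {
        sense: len(content_words(gloss) & context_set)
        for sense, gloss in SENSES[word].items()
    }
    return max(scores, key=scores.get), scores

print(disambiguate("arms", "The government seeks arms for its army")[0])  # weapons
print(disambiguate("arms", "He broke both arms and a shoulder")[0])       # limbs
print(disambiguate("arms", "Iraqi Head Seeks Arms")[1])  # {'weapons': 0, 'limbs': 0}
```

With only the four headline words as context, neither gloss overlaps, so the heuristic has nothing to decide on. That is the same lack of context that makes the headline ambiguous to a reader.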
What you will learn in this course
• The NLP pipeline - key components of text understanding and
• Core NLP techniques: tokenization, lemmatization, stemming, chunking,
sentence splitting, part-of-speech tagging, syntactic parsing
• Core NLP technologies: named entity recognition, co-reference resolution,
event extraction, language modelling
• Text analytics using Python, NLTK, spaCy
• Text classification and sentiment analysis
• Sentence representation – bag of words, tf-idf
• Building a simple text classifier
• Recent trends in NLP – word embeddings and neural networks
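As a preview of two items from this list, the sketch below shows tokenization followed by bag-of-words and tf-idf sentence representations, using only the Python standard library. It assumes a simple regex tokenizer and one common tf-idf variant (term frequency normalized by document length, idf = log(N / document frequency)); libraries such as NLTK or spaCy tokenize more carefully, and other tf-idf weightings exist. All function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and extract runs of letters, digits, or apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def bag_of_words(tokens):
    """Represent a document as unordered word counts."""
    return Counter(tokens)

def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = count of term in document / number of tokens in document
    idf = log(number of documents / number of documents containing term)
    """
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = bag_of_words(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Toy documents, for illustration only.
docs = ["John reads a book.", "Mary reads a paper.", "John borrowed a book."]

print(bag_of_words(tokenize(docs[0])))
for doc, w in zip(docs, tf_idf(docs)):
    print(doc, {term: round(value, 3) for term, value in w.items()})
```

A word such as "a" that occurs in every document gets weight 0, while a word unique to one document (for example "mary") gets the highest weight there, which is the intuition behind using tf-idf instead of raw counts.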