Qns
Which of the following techniques can be used for the purpose of keyword normalization, the
process of converting a keyword into its meaningful base form?
A. Lemmatization B. Levenshtein distance C. Morphing D. Stemming
4. For the string ‘mash’, identify which of the following sets of strings all have a Levenshtein
distance of 1 from it.
A. smash, mas, lash, mushy, hash B. bash, stash, lush, flash, dash
C. smash, mas, lash, mush, ash D. None of the above
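A minimal sketch for checking options like these by hand, assuming the standard unit-cost Levenshtein distance (insertion, deletion, and substitution each cost 1). The function names `levenshtein` and `all_at_distance_one` are illustrative and not part of the question.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance: insert, delete, and substitute each cost 1."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def all_at_distance_one(source: str, candidates: list[str]) -> bool:
    """True only if every candidate is exactly one edit away from source."""
    return all(levenshtein(source, c) == 1 for c in candidates)
```

Calling `all_at_distance_one("mash", [...])` on each option's list shows whether the whole set qualifies; a single string at distance 2 rules the option out.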
5. What is the part of speech tag for the word "quickly" in the sentence "She quickly finished her
homework."?
A. Adverb B. Verb C. Noun D. Adjective
6. ____________ ambiguity exists when a single word in a sentence has two or more possible
meanings.
7. _____________ is the process of removing the inflections of a word in order to map the word
to its root form.
10. The part of speech tag for the word "the" in the sentence "The cat slept on the mat." is
_______________
16. The Noisy Channel Model in NLP is only applicable to speech recognition and language
translation.
A. True B. False
17. Ambiguity in NLP is the state of being ambiguous usually with more than one interpretation
of a word, phrase or sentence.
A. True B. False
18. Models that assign probabilities to sequences of words are called language models.
A. True B. False
19. We normalize the counts of words in an n-gram model so that the resulting values fall
between 0 and 1.
A. True B. False
20. The part of speech tag for the word "happy" in the sentence "She was happy to see her
friends." is adjective.
A. True B. False
1. Imagine you are designing a system for a customer service chatbot. Your client wants the
chatbot to efficiently answer customer queries in natural language. Which application of Natural
Language Processing would be most suitable for this task?
A. Spam Detection B. Question Answering C. Sentiment Analysis D. Machine Translation
2. Suppose you are developing an NLP tool for a research project aiming to extract information
from research papers. What would be the primary purpose of implementing information
extraction in your NLP system?
A. Converting spoken words into text B. Detecting unwanted e-mails in a user's inbox
C. Extracting structured information from unstructured or semi-structured machine-readable
documents. D. Analyzing the attitude and emotional state of the sender
3. Assume that we modify the costs incurred for operations in calculating Levenshtein
distance, such that both the insertion and deletion operations incur a cost of 1 each, while
substitution incurs a cost of 2. Now, for the string ‘lash’, which of the following sets of strings
will have an edit distance of 1?
4. Suppose you are evaluating the performance of a language model on a test corpus containing
unseen words. What would be the likely outcome in terms of perplexity if the language model is
unsmoothed?
A. 0 B. Infinity C. Any non-zero value D. None of the above
5. Imagine you are analyzing the sentence "I am running late for my appointment." What part of
speech tag would you assign to the word "running" in this context?
A. Adverb B. Verb C. Noun D. Adjective
8. _______ bi-grams can be generated from the given sentence: India is my country.
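A short sketch of bi-gram generation, assuming simple whitespace tokenization. The `pad` flag is an illustrative addition: the count changes depending on whether sentence-boundary markers `<s>` and `</s>` are included, and the question does not say which convention applies.

```python
def bigrams(sentence: str, pad: bool = False) -> list[tuple[str, str]]:
    """Return adjacent token pairs; optionally add <s> and </s> markers first."""
    tokens = sentence.split()
    if pad:
        tokens = ["<s>"] + tokens + ["</s>"]
    # Pair every token with its successor: n tokens give n - 1 bi-grams.
    return list(zip(tokens, tokens[1:]))
```

For a sentence of n tokens this yields n - 1 bi-grams without padding and n + 1 with padding.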
9. The Markov assumption states that the probability of a word depends only on the
__________________ word.
10. The part of speech tag for the word "beautiful" in the sentence "The sunset was beautiful" is
____________
16. The Noisy Channel Model in NLP is only applicable to speech recognition and language
translation.
A. True B. False
17. The outputs of Lemmatization and Stemming for the same word might differ.
A. True B. False
20. In a match between India and Australia during the ICC World Cup 2023, the commentator
remarked, "The striker kicked the ball with incredible precision." The appropriate part of speech
tag for the word 'kicked' in this sentence is Verb.
A. True B. False
1. What is NLP? Discuss the various phases involved in the NLP process with a suitable example.
2. What do you mean by ambiguity in natural language? Explain with a suitable example.
4. Explain Damerau-Levenshtein edit distance. Compute the minimum edit distance between
actress, cress, caress, access, across, acres, acres with respect to acress.
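A minimal sketch of the restricted Damerau-Levenshtein distance (also called optimal string alignment), assuming unit cost for insertion, deletion, substitution, and transposition of two adjacent characters. The name `osa_distance` is illustrative; the unrestricted Damerau-Levenshtein algorithm differs slightly and is not shown here.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance allowing adjacent transpositions (optimal string alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition: the last two characters of each prefix are swapped.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

Looping over the candidate words and calling `osa_distance("acress", candidate)` gives the distance for each one; the transposition case is what separates this from plain Levenshtein distance.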
5. Construct a parse tree for the sentence “all the morning flights from Denver to Tampa leaving
before 10”.
6. Consider the following corpus C12 of 4 sentences. What is the total count of unique bi-
grams for which the likelihood will be estimated? Assume we do not perform any pre-
processing.
Calculate the perplexity of <s> they play in a big garden </s> assuming a bi-gram language
model.
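The corpus C12 is not reproduced here, so the following is only a sketch of the computation on a hypothetical two-sentence toy corpus (`toy_corpus` is an assumption, not the corpus from the question). It uses unsmoothed maximum-likelihood bi-gram estimates and counts every predicted token, including `</s>` but not `<s>`, when normalizing.

```python
import math
from collections import Counter


def train_bigram_counts(sentences):
    """Count contexts and bi-grams over whitespace-tokenized sentences."""
    contexts, pairs = Counter(), Counter()
    for s in sentences:
        tokens = s.split()
        contexts.update(tokens[:-1])          # every token that has a successor
        pairs.update(zip(tokens, tokens[1:]))
    return contexts, pairs


def perplexity(sentence, contexts, pairs):
    """Perplexity under an unsmoothed MLE bi-gram model."""
    tokens = sentence.split()
    n = len(tokens) - 1  # number of predicted tokens (everything after <s>)
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        count = pairs[(prev, word)]
        if count == 0:
            return math.inf  # an unseen bi-gram has probability 0
        log_prob += math.log(count / contexts[prev])
    return math.exp(-log_prob / n)


# Hypothetical toy corpus, used only to exercise the functions.
toy_corpus = [
    "<s> they play in a garden </s>",
    "<s> they play in a big garden </s>",
]
contexts, pairs = train_bigram_counts(toy_corpus)
```

With these counts, `perplexity("<s> they play in a big garden </s>", contexts, pairs)` is finite, while any sentence containing a bi-gram absent from the toy corpus returns `math.inf`.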
8. Given a corpus C3, the Maximum Likelihood Estimation (MLE) for the bigram “dried
berries” is 0.4 and the count of occurrence of the word “dried” is 680. For the same corpus
C3, the likelihood of “dried berries” after applying add-one smoothing is 0.05. What is the
vocabulary size of C3?
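One way to set up this calculation, assuming the usual add-one (Laplace) estimate P(w2 | w1) = (count(w1 w2) + 1) / (count(w1) + V). The function name `vocabulary_size` and its parameter names are illustrative.

```python
def vocabulary_size(mle: float, context_count: int, smoothed: float) -> int:
    """Solve the add-one formula for V given the MLE and smoothed estimates."""
    # The MLE is count(w1 w2) / count(w1), so recover the bi-gram count first.
    bigram_count = round(mle * context_count)
    # smoothed = (bigram_count + 1) / (context_count + V), rearranged for V.
    return round((bigram_count + 1) / smoothed - context_count)
```

Calling `vocabulary_size(0.4, 680, 0.05)` applies the two steps to the figures given in the question.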
9. Can neural units compute simple functions of input like AND, OR, and XOR? Justify.
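A small sketch that may help with the justification, assuming a single unit with a hard threshold activation (output 1 when the weighted sum plus bias is positive). The weights below are hand-picked for illustration and are not the only valid choice.

```python
def unit(inputs, weights, bias):
    """A single threshold unit: outputs 1 when w . x + b > 0, else 0."""
    return int(sum(w * x for w, x in zip(weights, inputs)) + bias > 0)


def AND(x1, x2):
    return unit((x1, x2), (1, 1), -1)  # positive only when both inputs are 1


def OR(x1, x2):
    return unit((x1, x2), (1, 1), 0)   # positive when at least one input is 1


def XOR(x1, x2):
    # XOR is not linearly separable, so no single threshold unit computes it.
    # Two hidden units (OR and AND) feeding one output unit are enough.
    h_or, h_and = OR(x1, x2), AND(x1, x2)
    return unit((h_or, h_and), (1, -2), 0)  # fires for OR but not AND
```

AND and OR each need one unit because a single line separates their positive and negative inputs; XOR needs the hidden layer.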