AMT302 QUESTION BANK - Format
QUESTION BANK
MODULE I
MODULE II
1.Explain the concept of word embeddings with an example.
2.What are the main differences between skip-gram and continuous bag of words (CBOW)?
3.State Bayes’ Theorem.
4.Explain the key stages of the NLP Pipeline.
5.Which techniques are used to convert text to numerical vectors?
6.How is text pre-processing carried out?
7.Explain the Feature Extraction techniques
8.Explain the different Distributed Representations.
9.Given the following data about documents and contents, use the tf-idf document
scoring method to retrieve the document for the query ‘Data Scientists’:
Doc 1: Ben studies about computers in Computer Lab.
Doc 2: Steve teaches at Brown University.
Doc 3: Data Scientists work on large datasets.
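
The scoring asked for in question 9 can be sketched as follows. This is a minimal illustration, not a prescribed solution: it assumes lowercase word tokenization, a length-normalized term frequency, an IDF of log10(N / df), and a document score equal to the sum of tf-idf over the query terms. The question does not fix a weighting scheme, so other variants are equally valid. The helper names (`tokenize`, `idf`, `score`) are hypothetical.

```python
import math
import re

docs = {
    "Doc 1": "Ben studies about computers in Computer Lab.",
    "Doc 2": "Steve teaches at Brown University.",
    "Doc 3": "Data Scientists work on large datasets",
}
query = "Data Scientists"


def tokenize(text):
    # Assumption: lowercase, alphabetic tokens only, no stemming or stop-word removal.
    return re.findall(r"[a-z]+", text.lower())


tokens = {name: tokenize(text) for name, text in docs.items()}
n_docs = len(tokens)


def idf(term):
    # Assumption: idf = log10(N / df); a term found in no document contributes 0.
    df = sum(1 for toks in tokens.values() if term in toks)
    return math.log10(n_docs / df) if df else 0.0


def score(name):
    # Sum of (normalized tf * idf) over the query terms.
    toks = tokens[name]
    return sum(toks.count(t) / len(toks) * idf(t) for t in tokenize(query))


scores = {name: score(name) for name in tokens}
best = max(scores, key=scores.get)

for name, value in scores.items():
    print(f"{name}: {value:.4f}")
print("Retrieved:", best)
```

Under these assumptions only Doc 3 contains either query term, so it is the only document with a non-zero score and is the one retrieved.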
1.The size of the corpus is 10,000,000 documents. If we assume 0.3 million
documents contain the term ‘cat’, find the TF-IDF score.
2.Differentiate between rule-based classification and machine-learning-based classification.
3.Consider a document containing 100 words wherein the word cat appears 3
times. Now, assume we have 10 million documents and the word cat appears in
one thousand of these. Compute the normalized tf and the tf-idf and compare
them.
4.Differentiate between Part-of-Speech (POS) tagging and Named Entity Recognition (NER).
5.Explain the applications of Text Classification.
6.Explain the pipeline for building text classification systems
7.Explain Sentiment Analysis with Logistic Regression.
8.Explain the key stages of the Information Extraction Pipeline.
9.Explain Sentiment Analysis with SVM.
10.Explain Named Entity Recognition using Sequence Labelling with an example.
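
The arithmetic in question 3 of this list (the word cat appearing 3 times in a 100-word document, and in one thousand of 10 million documents) can be sketched as below. The sketch assumes a base-10 logarithm for the IDF, which the question does not specify; with a natural logarithm the values would differ. The function names are hypothetical.

```python
import math


def normalized_tf(term_count, doc_length):
    # Term frequency normalized by the number of words in the document.
    return term_count / doc_length


def idf(total_docs, docs_with_term):
    # Assumption: base-10 logarithm of (total documents / documents containing the term).
    return math.log10(total_docs / docs_with_term)


tf_cat = normalized_tf(3, 100)          # 3 occurrences in a 100-word document
idf_cat = idf(10_000_000, 1_000)        # 10 million documents, 1,000 contain "cat"
tfidf_cat = tf_cat * idf_cat

print(f"tf = {tf_cat:.2f}, idf = {idf_cat:.2f}, tf-idf = {tfidf_cat:.2f}")
```

Under the base-10 assumption the normalized tf is 0.03, the idf is 4, and the tf-idf is 0.12, so the rarity of the term across the corpus scales the raw frequency up by a factor of four.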
MODULE IV
MODULE V