ML QB Odd 2023
QUESTIONS
UNIT NO- 1 :
Introduction to Machine Learning
TOPIC:1 : Human Learning and its Types, Machine Learning and its
Types
Sr. SHORT QUESTIONS/MCQ Marks
No
1. Identify correct groups of machine learning problems: 01
a) Given a transaction, label it as fraud or regular to learn fraud detection.
b) Given a set of movies found on a portal, group them into sets of movies with the
same genre.
c) Given a database of visitors, automatically discover the popularity of holiday place
segments and group visitors according to holiday place.
d) Find the pattern of purchasing electronics during Diwali. [LJIET]
UNIT NO- 2 :
Preparing to Model
TOPIC:1 : Basic Data Types, Exploring numerical data, Exploring
categorical data, Exploring relationships between variables
Sr. SHORT QUESTIONS/MCQ Marks
No
1. a. Divide data into two parts: training and testing 01
b. Find potential issues in data
c. Understand nature and quality of data
d. Do remediation
e. Train model based on training data
Arrange the given preprocessing activities in the correct order for supervised learning. [LJIET]
c, b, d, a, e
a, c, d, b, e
c, d, b, e, a
c, d, b, a, e
a, c, b, d, e
a, b, c, d, e
2. a. Discrete values can assume a finite or countably infinite number of values 01
b. Pin code is an example of a Nominal attribute
c. A Nominal attribute may or may not have a finite number of values.
d. Numerical attributes can have countably infinite values
e. Nominal and Ordinal attributes are in general discrete.
Which statement(s) is/are false? [LJIET]
only a
both a,c
both b,e
all three c, d, e
all three a, c, d
none
3. __________ is likely to get shifted drastically due to the presence of ____________. [LJIET] 01
Mean, Outlier
Machine Learning (3170724) 2023-24 Page 3
L.J. Institute of Engineering & Technology Semester: VII (2023-24)
Median, Outlier
Mean, Variance
Mean, Median
Variance, Outlier
Median, Variance
4. 1. Mechanism for one shot view and understand nature of data     a. Whisker                01
2. Box span from first to third quartile                            b. Box Plot
3. Can range up to 1.5 times IQR from bottom/top of box             c. Median
4. Denoted by line or band in box                                   d. Inter quartile range
[LJIET]
33, 46.5
35,49
31,44
31,49
35,44
32.5,45
7. To explore relationship between variables what can be used? 01
1. Box plot
2. Histogram
3. Scatter plot
4. Cross tab
5. PCA [LJIET]
3,4
1,2
1,2,3,4
5
1,5
1,2,4
8. The height of ___________ reflects total count of data elements whose value falls 01
specifically ________. [LJIET]
Bar, bin
Bin, bar
Histogram, mode
Mode, histogram
Bar, mode
Bin, mode
9. 1. Discrete values can assume only finite values 01
2. Employee ID is example of Nominal attribute
3. Nominal attribute may or may not have finite number of values.
4. Numerical attributes can have countably infinite values
5. Nominal and Ordinal attributes are in general discrete.
Which statement is/are false? [LJIET]
only a
both a,c
both b,e
all three c, d, e
all three a, c, d
none
10. Numerical attributes having fewer possible values can be treated as 01
________. [LJIET]
Categorical
Nominal
Ordinal
Discrete
Interval
Ratio
11. 1. A larger value of variance indicates more dispersion in data. 01
2. If the deviation between mean and median is significantly high, the chances of
outliers are less.
3. A larger difference between two quartile values indicates more data spread in the
respective quarter.
Identify the false statements [LJIET]
All 1,2,3
Both 1,2
Both 2,3
Only 1
Only 2
Only 3
12. What are the first and third quartiles of the data? 01
44, 12, 25, 71, 27, 59, 59, 38, 66 [LJIET]
26,62.5
27,59
27,44
Nominal
Interval
Ordinal
Ratio
Discrete
Binary
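For the quartile question above (question 12), the result depends on the quartile convention, which the question does not state. The following is a minimal sketch using Python's standard `statistics.quantiles` on the data from that question, comparing two common conventions rather than asserting which one the examiner intended.

```python
import statistics

# Data from question 12 above.
data = [44, 12, 25, 71, 27, 59, 59, 38, 66]

# "exclusive": for this data it matches taking the medians of the lower and
# upper halves with the overall median (44) left out of both halves.
q1_exc, _, q3_exc = statistics.quantiles(data, n=4, method="exclusive")

# "inclusive": treats the minimum and maximum as the 0th and 100th percentiles.
q1_inc, _, q3_inc = statistics.quantiles(data, n=4, method="inclusive")

print(q1_exc, q3_exc)  # 26.0 62.5
print(q1_inc, q3_inc)  # 27.0 59.0
```

Both pairs appear among the listed options, so the convention taught in the course decides which one is expected.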
Sr. DESCRIPTIVE QUESTIONS Marks
No
1. Define feature and explain the process of transforming numeric features to categorical 07
features with suitable example. (Jan 2023) [LJIET]
2. What is categorical data? Explain its types with examples. (Jan 2023) [LJIET] 04
3. How can we take care of outliers in data? (Dec 2021, Jun 2022) [LJIET] 03
4. What is outlier? How can we take care of outliers? (Jun 2023) [LJIET] 04
5. Explain Key elements of Machine Learning. Explain various function approximation 04
methods. (Jun 2022) [LJIET]
6. What are the main activities involved when you are preparing to start with modeling in 07
Machine Learning? [LJIET]
7. What are the basic data types in Machine Learning? Explain by giving examples of each 07
one of them. [LJIET]
8. What are the basic data types in machine learning? Give an example of each one of them. 03
(Jun 2023) [LJIET]
9. Explain in detail the different components of a Box Plot. State how outliers can be detected 07
using Box Plots. [LJIET]
10. State and explain the two ways in which we can explore the relationship between two 07
variables (attributes). [LJIET]
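The outlier and box-plot questions above can be illustrated with the 1.5 times IQR rule that the box-plot MCQ mentions. This is a minimal sketch on hypothetical data (the values below, including the extreme value 500, are invented for illustration), and the "inclusive" quartile convention is an assumption.

```python
import statistics

# Hypothetical data: nine ordinary values plus one extreme value (500).
values = [12, 25, 27, 38, 44, 59, 59, 66, 71, 500]

q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Whiskers may extend up to 1.5 * IQR beyond the box; points outside are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

# One simple remedy: drop the flagged points, then compare summary statistics.
cleaned = [v for v in values if v not in outliers]
print("outliers:", outliers)
print("mean with/without outlier:", statistics.mean(values), statistics.mean(cleaned))
print("median with/without outlier:", statistics.median(values), statistics.median(cleaned))
```

In this sketch the mean moves far more than the median once the extreme value is removed, which is why the median is usually preferred as a summary when outliers are suspected.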
Missing values
Outliers
Missing values and outliers
Error in data collection
Error in sample selection
Error in data collection and sample selection
2. Techniques for dimensionality reduction PCA stands for [LJIET] 01
Only 1
Only 2
Only 3
Both 1,3
Both 2,3
None of given
Sr. DESCRIPTIVE QUESTIONS Marks
No
1. What are the Techniques Provided in Data Preprocessing? Explain in brief (Jun 2022) 07
[LJIET]
2. Define Outliers. Can they be handled in a dataset? If yes, how? [LJIET] 05
3. Explain in details the various ways to address the missing values in a dataset. [LJIET] 07
4. What are the different techniques for data pre-processing? Explain in brief dimensionality 07
reduction and feature selection. [LJIET]
5. What are the different techniques for data pre-processing? Explain in brief. (Jun 2023) 03
[LJIET]
UNIT NO- 3 :
Modelling and Evaluation
TOPIC:1 : Selecting a model, Training a model, Model representation
and interpretability
Sr. SHORT QUESTIONS/MCQ Marks
No
1. A machine learning problem that does not include a target variable is called _________ 01
and that include target variable is called _________. [LJIET]
Model
Abstraction
Generalization
Training
Testing
Validation
3. Full form of SRSWS? [LJIET] 01
AUC
ROC
Residual
F-measure
Kappa-coefficient
purity
Sr. DESCRIPTIVE QUESTIONS Marks
No
1. Elaborate the cross validation in training a model. (Jan 2023) [LJIET] 07
2. Distinguish lazy vs eager learner with an example. (Jan 2023) [LJIET] 04
3. Explain the training of Predictive Model. (Jun 2022) [LJIET] 03
4. What is data sampling? Explain data sampling methods? (Jun 2022) [LJIET] 04
5. Explain K-fold cross validation method with suitable example. (Dec 2021) [LJIET] 07
6. Explain the process of K-fold-cross-validation method (Jun 2023) [LJIET] 04
7. Compare and contrast Descriptive and Predictive Models. [LJIET] 05
8. Explain in detail the methods to train a learning model. [LJIET] 07
9. What is underfitting and overfitting in the context of a Machine Learning Model? [LJIET] 07
10. Write about the bias-variance trade-off in the context of model fitting. [LJIET] 07
11. Differentiate between 05
a) Lazy Vs Eager Learners
b) Bagging Vs Boosting [LJIET]
12. What is sampling? Explain Bootstrap sampling. (Jun 2023) [LJIET] 04
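For the K-fold cross validation questions above, the splitting step can be sketched in plain Python. The function name `k_fold_indices` and the 10-record, 5-fold example are hypothetical. The sketch only produces index splits; it does not shuffle records or train a model, both of which a real workflow would normally add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]                # held-out fold
        train = indices[:start] + indices[start + size:]  # remaining k-1 folds
        yield train, test
        start += size


# Hypothetical example: 10 records, 5 folds. Each record is tested exactly once.
folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print("train:", train, "test:", test)
```

The model is trained k times, once per split, and the k evaluation scores are then averaged.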
(TP+TN)/(TP+FP+FN+TN)
(TP+FN)/(TP+FP+FN+TN)
(FP+TN)/(TP+FP+FN+TN)
(FP+FN)/(TP+FP+FN+TN)
(TP+TN)/(TP+FP-FN+TN)
(TP+TN)/(TP-FP+FN-TN)
3. Sensitivity [LJIET] 01
TP/(TP+FP)
TP/(TP+FP)
TP/(TP+FN)
FP/(TP+FP)
FP/(TP+FN)
(TP+TN)/TP
TN/(TN+FP)
6. __________ is a part of model preparation activity. [LJIET] 01
Remediate data
Cross validation
Confusion matrix
ROC curve
Hold-out
Ensemble
7. Identify false statement(s) 01
1. Mean is impacted if too many data elements are having value closer to the far end
of the range.
2. Skewness/Shape of histogram depends on nature of data.
3. Height of bin keeps on decreasing as we move toward right, is called right-skew.
4. Bivariate relationship can be visualized using scatter-plot and box-plot.
5. Cross-tab allows operations like roll-up and drill-down [LJIET]
4 only
4,5 only
None of given
1,2,3 only
1,4 only
All of given
8. What is not correct about Eager Learners from following 01
1. Follows the steps abstraction and generalization.
2. Need to refer back to training data.
3. Support Vector Machine
4. Very little time in training [LJIET]
2,4 only
1,2 only
3,4 only
1,3,4 only
All of given
None of given
9. 1. Arises from simplifying assumptions made by model to make target function less 01
complex
2. Occurs from difference in training data set used to train the model
3. Due to under-fitting a model
4. Due to over-fitting a model
5. Poor performance on complex dataset
6. Small change in training data set magnified in model [LJIET]
Error due to bias- 1,3,5 Error due to Variance- 2,4,6
Error due to bias- 1,4,6 Error due to Variance- 2,3,5
Error due to bias- 2,4,6 Error due to Variance- 1,3,5
Error due to bias- 2,3,5 Error due to Variance- 1,4,6
Error due to bias- 1,4,5 Error due to Variance- 2,3,6
Error due to bias- 2,3,6 Error due to Variance- 1,4,5
10. In the Kappa coefficient, the chance agreement P(pr) is calculated as [LJIET] 01
((TP+FP)/(TP+FP+FN+TN)) * ((TP+FN)/(TP+FP+FN+TN)) + ((TN+FP)/(TP+FP+FN+TN)) * ((TN+FN)/(TP+FP+FN+TN))
((TP+FP)/(TP+FP+FN+TN)) * ((TP+FN)/(TP+FP+FN+TN)) * ((TN+FP)/(TP+FP+FN+TN)) * ((TN+FN)/(TP+FP+FN+TN))
((TP+FP)/(TP+FP+FN+TN)) + ((TP+FN)/(TP+FP+FN+TN)) * ((TN+FP)/(TP+FP+FN+TN)) + ((TN+FN)/(TP+FP+FN+TN))
((TP+FP)/(TP+FP+FN+TN)) - ((TP+FN)/(TP+FP+FN+TN)) * ((TN+FP)/(TP+FP+FN+TN)) + ((TN+FN)/(TP+FP+FN+TN))
((TP+FP)/(TP+FP+FN+TN)) * ((TP+FN)/(TP+FP+FN+TN)) + ((TN+FP)/(TP+FP+FN+TN)) - ((TN+FN)/(TP+FP+FN+TN))
((TP+FP)/(TP+FP+FN+TN)) * ((TP+FN)/(TP+FP+FN+TN)) - ((TN+FP)/(TP+FP+FN+TN)) * ((TN+FN)/(TP+FP+FN+TN))
11. a. R-squared Error 01
b. Silhouette Width
c. Pearson’s Correlation Coefficient
d. Cosine Similarity
Value lies between
1. 0 to +1
2. -1 to +1 [LJIET]
4. Consider the following confusion matrix of the win/loss prediction of cricket match. Calculate 07
model accuracy and error rate, sensitivity, precision, F-measure and kappa value for the same.
(Jun 2023) [LJIET]
Actual Win Actual Loss
Predicted Win 85 4
Predicted Loss 2 9
5. List the methods for Model evaluation. Explain each. How we can improve the 07
performance of model. (Jun 2022) [LJIET]
6. Can the performance of a learning model be improved? If yes, how? [LJIET] 07
7. Explain how we can evaluate the performance of a classification algorithm by measuring 07
Specificity, precision, recall and ROC curves [LJIET]
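The confusion-matrix question above (question 4, cricket win/loss prediction) can be worked through as follows. This is a sketch that takes "Win" as the positive class, which is an assumption because the question does not name the positive class, and uses the usual definitions of each measure.

```python
# Counts from the confusion matrix in question 4 above, with "Win" as positive.
tp, fp = 85, 4   # predicted Win:  actual Win, actual Loss
fn, tn = 2, 9    # predicted Loss: actual Win, actual Loss
total = tp + fp + fn + tn

accuracy = (tp + tn) / total
error_rate = (fp + fn) / total
sensitivity = tp / (tp + fn)   # also called recall
precision = tp / (tp + fp)
f_measure = 2 * precision * sensitivity / (precision + sensitivity)

# Kappa compares the observed agreement (accuracy) with the agreement
# expected by chance, P(pr).
p_chance = ((tp + fp) / total) * ((tp + fn) / total) + (
    (tn + fp) / total
) * ((tn + fn) / total)
kappa = (accuracy - p_chance) / (1 - p_chance)

for name, value in [
    ("accuracy", accuracy),
    ("error rate", error_rate),
    ("sensitivity", sensitivity),
    ("precision", precision),
    ("F-measure", f_measure),
    ("kappa", kappa),
]:
    print(f"{name}: {value:.4f}")
```

If "Loss" were taken as the positive class instead, accuracy, error rate and kappa would be unchanged, while sensitivity, precision and F-measure would differ.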
UNIT NO- 4 :
Basics of Feature Engineering
TOPIC:1 : Feature Construction, Feature Extraction, Feature
Selection
Sr. SHORT QUESTIONS/MCQ Marks
No
1. Meaningful attribute of data set ________. [LJIET] 01
Feature
Redundant feature
Relevant feature
Input variable
Output variable
Variable
2. 1. Feature construction 01
2. Feature extraction
3. Feature subset selection
1,2
1,2,4
4
All of given
None of given
2,3
5. If P1, P2 and P3 are three principal components, 01
which of the following are true:
1. P1, P2 and P3 are parallel
2. Variance of P1 is largest
3. Variance of P3 is smallest
4. P1, P2 and P3 are orthogonal
5. Variance of P1 is smallest
6. Variance of P3 is largest [LJIET]
2,3,4
1,2,3
4,5,6
1,5,6
1,2 only
4,5 only
6. a. PCA 01
b. SVD
c. LDA
0.5298
0.9852
0.2599
0.9985
-0.5298
-0.9852
9. a. (n11 + n00)/(n01 + n10 + n11 + n00) 01
b. Widely used in text classification
c. n11/(n01 + n10 + n11)
d. (Σ_{i=1..n} (F1i − F2i)^r)^(1/r) [LJIET]
Manhattan
Minkowski
Pearson
Euclidean
Jaccard
Cosine
15. a. Filter approach 01
b. Wrapper approach
c. Hybrid approach
d. Embedded approach
1. Induction algorithm
2. Statistical tests
3. Induction algorithm and statistical tests
4. Simultaneous selection and classification [LJIET]
a-2 b-1 c-3 d-4
a-1 b-2 c-3 d-4
a-1 b-2 c-4 d-3
a-2 b-1 c-4 d-3
a-1 b-3 c-2 d-4
a-2 b-3 c-1 d-4
16. What is true about feature selection process 01
1. No new feature is generated
2. Improving efficiency of a learning model
3. Better understanding of underlying model
4. Faster and cost effective model
5. Uses functional mapping [LJIET]
1,2,3,4 only
1,3,4,5 only
1,2,5 only
All of given
None of given
1,2 only
17. What is true if groups are well separated in LDA 01
1. Intra-group mean are far away from each other
2. Data points are close to intra-group mean
3. Intra-group mean from grand-mean is large. [LJIET]
All of given
1,2 only
2,3 only
1,3 only
None of given
1 only
18. What features are ALWAYS candidate for rejection in the process of feature subset selection 01
a. Redundant features
b. Irrelevant features
c. Weakly relevant features
d. Non-redundant features[LJIET]
a,b only
a,b,c only
all of given
c only
b,c only
b,c,d only
19. Which is not a search strategy? [LJIET] 01
Sequential forward selection
Sequential backward selection
Bi-directional selection
Bi-directional elimination
Recursive elimination
Recursive Selection
Sr. DESCRIPTIVE QUESTIONS Marks
No
1. Show various distance-based similarity measure with its example. (Jan 2023) 04
[LJIET]
2. What is the purpose of Singular value decomposition? How does it achieve? 04
(Jan 2023) [LJIET]
3. What is principal component analysis? How does it work? Explain. (Jan 2023) [LJIET] 07
4. Explain the need of feature engineering in ML. (Jun 2022) [LJIET] 03
5. Differentiate PCA and LDA. (Jun 2022) [LJIET] 04
6. Explain SVD as a feature extraction technique with suitable example. (Dec 2021) 07
[LJIET]
7. What is a feature? What is feature Engineering? What are the major elements of Feature 05
Engineering? Explain them. [LJIET]
8. Explain the process of feature engineering in context of a text categorization problem. 05
[LJIET]
9. Differentiate between SMC and Jaccard Coefficients. [LJIET] 05
10. Explain in short the three methods for Feature Extraction. [LJIET] 07
11. Explain with an example, main underlying concept of feature extraction. What are the most 07
popular algorithms of feature extraction, briefly explain any one. (Jun 2023) [LJIET]
12. State and explain the methods to find out the similarity or redundancy aspect of the 07
attributes in a dataset. [LJIET]
13. What is feature selection? Why is it needed? What are the different approaches of feature 07
selection, briefly explain any one. (Jun 2023) [LJIET]
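The similarity measures asked about above (SMC, Jaccard, Minkowski and its Manhattan and Euclidean special cases, and cosine) can be sketched directly from their usual definitions. The function names and the two binary vectors `u` and `v` are hypothetical. The Minkowski sketch uses the absolute difference, as in the standard definition, so that odd values of r behave sensibly.

```python
import math


def smc(a, b):
    """Simple matching coefficient: (n11 + n00) / (n01 + n10 + n11 + n00)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def jaccard(a, b):
    """Jaccard coefficient: n11 / (n01 + n10 + n11); 0-0 matches are ignored.

    Undefined (division by zero) when both vectors are all zeros.
    """
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return n11 / (n11 + mismatches)


def minkowski(f1, f2, r):
    """Minkowski distance; r=1 gives Manhattan, r=2 gives Euclidean."""
    return sum(abs(x - y) ** r for x, y in zip(f1, f2)) ** (1 / r)


def cosine_similarity(f1, f2):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(f1, f2))
    norm1 = math.sqrt(sum(x * x for x in f1))
    norm2 = math.sqrt(sum(y * y for y in f2))
    return dot / (norm1 * norm2)


# Hypothetical binary feature vectors: n11 = 2, n00 = 4, n01 + n10 = 2.
u = [1, 0, 0, 1, 1, 0, 0, 0]
v = [1, 0, 0, 0, 1, 0, 1, 0]

print("SMC:", smc(u, v))          # 0.75
print("Jaccard:", jaccard(u, v))  # 0.5
print("Manhattan:", minkowski(u, v, 1))
print("Euclidean:", minkowski(u, v, 2))
print("Cosine:", cosine_similarity(u, v))
```

The gap between SMC (0.75) and Jaccard (0.5) on the same vectors comes entirely from the four 0-0 matches, which SMC counts and Jaccard ignores.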
UNIT NO- 5 :
Brief overview of Probability
TOPIC:1 : Basic concept of Probability, Random Variables, Discrete
Distributions, Continuous Distribution, Central Limit Theorem, Monte Carlo
Approximations.
Sr. DESCRIPTIVE QUESTIONS Marks
No
1. Define the following terms. 03
(i) Variance
(ii) Covariance
(iii) Joint Probability (Jan 2023) [LJIET]
2. What is Bernoulli distribution? Explain briefly with its formula.(Jan 2023) [LJIET] 03
3. What is conditional probability? Define its importance. (Jan 2023) [LJIET] 03
4. If 3% of the electronic units manufactured by a company are defective, find the probability 03
that in a sample of 200 units, fewer than 2 units are defective. (Dec 2021) [LJIET]
5. In a communication system each data packet consists of 1000 bits. Due to the noise, each 03
bit may be received in error with probability 0.1. It is assumed bit errors occur
independently. Find the probability that there are more than 120 errors in a certain data
packet. (Dec 2021) [LJIET]
6. Explain Binomial Distribution with an example. (Jun 2022) [LJIET] 04
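Questions 4 and 5 above are both binomial tail probabilities, so they can be checked numerically. This is a sketch using the exact binomial probability mass function. The helper name `binomial_pmf` is hypothetical, and the Poisson approximation with lambda = n * p for question 4 is one common hand method, not necessarily the one expected in the exam.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Question 4: n = 200 units, p = 0.03, fewer than 2 defective, i.e. P(X <= 1).
p_q4_exact = sum(binomial_pmf(k, 200, 0.03) for k in range(2))

# Poisson approximation for the same event, with lambda = n * p = 6.
lam = 200 * 0.03
p_q4_poisson = math.exp(-lam) * (1 + lam)

# Question 5: n = 1000 bits, p = 0.1, more than 120 errors, i.e. P(X >= 121).
p_q5_exact = sum(binomial_pmf(k, 1000, 0.1) for k in range(121, 1001))

print(f"Q4 exact binomial: {p_q4_exact:.5f}")
print(f"Q4 Poisson approximation: {p_q4_poisson:.5f}")
print(f"Q5 exact binomial: {p_q5_exact:.5f}")
```

For question 5, a hand solution would typically use the normal approximation with mean n * p = 100 and variance n * p * (1 - p) = 90 instead of summing 880 terms.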
UNIT NO- 8 :
Unsupervised Learning
TOPIC:1 : Supervised vs. Unsupervised Learning, Applications,
Association rules
Sr. DESCRIPTIVE QUESTIONS Marks
No
1. How does the apriori principle help in reducing the calculation overhead for a market 07
basket analysis? Explain with an example. (Jan 2023) [LJIET]
2. Mention few applications areas of unsupervised learning in Engineering. (Jan 2023) 03
[LJIET]
3. Explain how the Market Basket Analysis uses the concepts of association analysis. (Dec 07
2021) [LJIET]
4. Explain the Apriori algorithm for association rule learning with an example. (Dec 2021, 07
Jun 2023) [LJIET]
5. How is unsupervised learning different from supervised learning? Explain with 07
example. [LJIET]
6. Mention a few application areas of unsupervised learning. [LJIET] 07
7. How does the apriori algorithm help in reducing the calculation overhead for market basket 07
analysis? Give an example. [LJIET]
8. What is an association rule? What are the applications of association rule mining? Define 07
support and confidence in association rule mining. [LJIET]
9. A database has 4 transactions, shown below. 07
Assuming a minimum level of support min_sup = 60% and a minimum level of confidence
min_conf = 80%.
a. Find all frequent item-sets using the Apriori algorithm.
b. List all of the strong association rules, along with their support and confidence
values, for rules of the form
buys(item1, item2) => buys(item3), where items can be A, B etc. [LJIET]
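Support, confidence and the Apriori pruning step asked about above can be sketched as follows. The four transactions are hypothetical: they are NOT the table referenced in question 9, which is not reproduced in this document. Only the thresholds (min_sup = 60%, min_conf = 80%) are taken from that question.

```python
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "D"},
    {"B", "C"},
]
MIN_SUP, MIN_CONF = 0.60, 0.80  # thresholds quoted in question 9


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


# Apriori principle: a k-itemset can be frequent only if all of its
# (k-1)-subsets are frequent, so candidates are built from frequent sets only.
items = sorted(set().union(*transactions))
frequent = [frozenset([i]) for i in items if support({i}) >= MIN_SUP]
all_frequent = list(frequent)
k = 2
while frequent:
    pool = sorted(set().union(*frequent))
    candidates = [
        frozenset(c)
        for c in combinations(pool, k)
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    ]
    frequent = [c for c in candidates if support(c) >= MIN_SUP]
    all_frequent.extend(frequent)
    k += 1

# Strong rules X => Y: confidence = support(X and Y together) / support(X).
rules = []
for itemset in (f for f in all_frequent if len(f) > 1):
    for r in range(1, len(itemset)):
        for lhs in map(frozenset, combinations(sorted(itemset), r)):
            conf = support(itemset) / support(lhs)
            if conf >= MIN_CONF:
                rules.append((set(lhs), set(itemset - lhs), support(itemset), conf))

print("frequent item-sets:", [sorted(f) for f in all_frequent])
print("strong rules (lhs, rhs, support, confidence):", rules)
```

On this toy data, items C and D fall below the support threshold in the first pass, so no item-set containing them is ever counted, which is the reduction in calculation overhead that the apriori principle provides.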
TOPIC:2 : Clustering
Sr. SHORT QUESTIONS/MCQ Marks
No
1. In which situation does k-means clustering fail to give good results? [LJIET] 01