AIML Question


1.What is the basic building block of a Neural Network?

a) Neuron

b) Synapse

c) Layer

d) Gradient Descent

ANS- Neuron

2.In a Feedforward Neural Network, information travels in which direction?

a) Forward

b) Backward

c) Both Forward and Backward

d) None of the above

ANS- Forward
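
As a rough illustration of the two answers above, the sketch below builds a single neuron (weighted sum plus bias, passed through an activation) and pushes an input forward through two small layers. The weights, the layer sizes, and the choice of sigmoid are arbitrary assumptions made for the example, not values taken from this question bank.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus a bias, then sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def feedforward(inputs, layers):
    """Pass the inputs through each layer in order (forward only).

    layers is a list of layers; each layer is a list of (weights, bias) pairs,
    one pair per neuron. These values are hypothetical.
    """
    activations = inputs
    for layer in layers:
        activations = [neuron(activations, w, b) for w, b in layer]
    return activations


layers = [
    [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)],  # hidden layer with 2 neurons
    [([1.0, -1.0], 0.0)],                      # output layer with 1 neuron
]
output = feedforward([1.0, 1.0], layers)
print(output)
```

Each layer only consumes the activations of the layer before it, which is what "forward" means here; nothing flows back toward the input during prediction.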

3.Which distance metric is commonly used in K-means clustering?

a) Manhattan distance

b) Hamming distance

c) Cosine similarity

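d) All of the above

ANS- Manhattan distance (closest of the listed options; standard K-means itself uses Euclidean distance, which is not listed)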

4.What does the "K" in K-means clustering represent?


a) Number of clusters

b) Number of data points


c) Number of features

d) None of the above

ANS- Number of clusters

5.What is the time complexity of the K-means algorithm?


a) O(n)

b) O(n^2)

c) O(n^3)

d) O(nkt) (n - number of data points, k - number of clusters, t - number of iterations)

ANS- O(nkt)

6.What is the main objective of Principal Component Analysis (PCA)?

a) Dimensionality reduction

b) Feature engineering

c) Model training

d) Hyperparameter tuning

ANS- Dimensionality reduction

7.What does the Area Under the Curve (AUC) represent in ROC analysis?
a) Ratio of true positives to false positives

b) Accuracy of the classifier

c) Probability that the classifier will rank a randomly chosen positive instance higher than a randomly
chosen negative one

d) None of the above

ANS- Probability that the classifier will rank a randomly chosen positive instance higher than a randomly
chosen negative one
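
The ranking interpretation in the answer above can be checked directly by counting pairs. The sketch below is a brute-force version written for clarity, not speed; the labels and scores are made-up example data, and counting a tie as half a win is a common convention assumed here.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
print(pairwise_auc(labels, scores))  # 5 of the 6 pairs are ordered correctly
```

Here one positive (score 0.35) is ranked below one negative (score 0.4), so the result is 5/6 rather than 1.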

8.Which metric can be calculated from the ROC curve?


a) AUC

b) F1 Score

c) Accuracy
d) Precision

ANS- AUC

9.What does Mean Absolute Error (MAE) measure?

a) Average of the absolute errors

b) Average of the squared errors

c) Percentage of predicted values within a specific range

d) Ratio of correctly predicted values to the total predicted values

ANS- Average of the absolute errors

10.What does Mean Squared Error (MSE) measure?


a) Average of the absolute errors

b) Average of the squared errors

c) Percentage of predicted values within a specific range

d) Ratio of correctly predicted values to the total predicted values

ANS- Average of the squared errors
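
A minimal sketch of both error measures, using made-up values, shows how the same residuals lead to different numbers. The function names are chosen for this example only.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted) squared."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 3.0]  # residuals: 1, 0, -2, 4

print(mean_absolute_error(y_true, y_pred))  # 1.75
print(mean_squared_error(y_true, y_pred))   # 5.25
```

The single residual of 4 contributes 16 of the 21 squared-error units but only 4 of the 7 absolute-error units, which is why MSE reacts more strongly to large errors.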

11.What does the Adjusted Rand Index (ARI) adjust for?


a) Adjusts for cluster separation

b) Average of the squared errors

c) Percentage of predicted values within a specific range

d) Ratio of correctly predicted values to the total predicted values

ANS- None of the listed options; the ARI corrects the Rand Index for the agreement expected by chance

12.Which index is sensitive to the number of clusters?

a) Davies-Bouldin Index

b) Silhouette Coefficient

c) Adjusted Rand Index (ARI)

d) Rand Index

ANS- Davies-Bouldin Index


13.What is the purpose of the Elbow Method in cluster validation?
a) Identifying the optimal number of clusters

b) Measuring cluster compactness

c) Calculating the Dunn Index

d) Adjusting the Davies-Bouldin Index

ANS- Identifying the optimal number of clusters

14.What does the Gap Statistic measure in cluster validation?


a) Measures the "gap" between SSE and the expected SSE

b) Measures cluster compactness

c) Measures the silhouette coefficient

d) Measures the Davies-Bouldin Index

ANS- Measures the "gap" between SSE and the expected SSE

15. What is the name of the parameter used to control the number of samples to draw from X_train to train each tree in Random Forest?
a) n_samples

b) max_features

c) bootstrap

d) None of the above

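ANS- None of the above (in scikit-learn the parameter is max_samples; bootstrap only switches sampling with replacement on or off)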

16.What is the name of the parameter used to control the strategy used to sample the data in Random Forest?
a) bootstrap

b) n_samples

c) max_features

d) None of the above

ANS- bootstrap

17. What is the name of the metric used to evaluate the clustering solution with respect to the uncertainty of the clusters?
a) Cluster Uncertainty
b) Prediction Interval

c) Confidence Interval

d) None of the above

ANS- Cluster Uncertainty

18. What is the name of the metric used to evaluate the clustering solution with respect to the diversity of the features?

a) Feature Diversity

b) Feature Importance

c) Feature Relevance

d) None of the above

ANS- Feature Diversity

19.What is the name of the metric used to evaluate the clustering solution with respect to the ground truth
labels?

a) Cluster Purity

b) Homogeneity Score

c) Completeness Score

d) None of the above

ANS- Homogeneity Score

20.What is the name of the metric used to evaluate the clustering solution with respect to the density of the
data?
a) Density-Based Clustering Validation (DBCV)

b) Local Outlier Factor (LOF)

c) DBSCAN

d) None of the above

ANS- Density-Based Clustering Validation (DBCV)


21.What is the name of the metric used to evaluate the performance of a binary classifier with respect to the
ranking of the instances?
a) Average Precision (AP)

b) Precision at k (P@k)

c) Mean Average Precision (MAP)

d) None of the above

ANS- Average Precision (AP) (MAP is the mean of AP over several queries or classes)

22. What is the name of the metric used to evaluate the performance of a regression model with respect to the distribution of the residuals?
a) Explained Variance Score

b) Max Error

c) Mean Absolute Error (MAE)

d) None of the above

ANS- Explained Variance Score

23.What is the name of the algorithm used to compute the principal components in PCA?
a) Power Iteration Method

b) Eigendecomposition

c) Singular Value Decomposition

d) None of the above

ANS- Singular Value Decomposition

24.What is the objective of PCA?


a) To reduce the dimensionality of the data while preserving the maximum amount of variance

b) To increase the dimensionality of the data while preserving the maximum amount of variance

c) To reduce the noise in the data while preserving the maximum amount of information

d) To increase the noise in the data while preserving the maximum amount of information

ANS- To reduce the dimensionality of the data while preserving the maximum amount of variance

25.What is the purpose of out-of-bag (OOB) samples in Random Forest?


a) To estimate the generalization error

b) To increase model complexity


c) To speed up training

d) To improve interpretability

ANS- To estimate the generalization error
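
To make the idea of out-of-bag samples concrete, the sketch below draws one bootstrap sample by hand and lists the rows that were never picked. It uses only the standard library rather than a Random Forest implementation; the function name and the sample size of 1000 are assumptions for the example.

```python
import random


def bootstrap_with_oob(n_samples, rng):
    """Draw n_samples row indices with replacement.

    Returns (in_bag, out_of_bag): the drawn indices (with repeats) and the
    indices that were never drawn.
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the run is repeatable
in_bag, oob = bootstrap_with_oob(1000, rng)
print(len(oob) / 1000)  # fraction of rows left out, typically near 1/e
```

A tree trained on the in-bag rows has never seen the out-of-bag rows, so scoring it on them gives an estimate of generalization error without a separate validation set.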

26.What is the recommended number of trees in a Random Forest?


a) As many as possible

b) 10

c) 100

d) 1000

ANS- As many as possible

27.Random Forest combines multiple decision trees to make predictions by using what method?
a) Bagging

b) Boosting

c) Pruning

d) Splitting

ANS- Bagging

28.What is a neural network?


a) A data visualization technique

b) A machine learning algorithm

c) A programming language

d) A database management system

ANS- A machine learning algorithm

29.Which library in Python can be used for implementing neural networks and multi-layered
perceptrons?
a) TensorFlow

b) NumPy

c) pandas
d) scikit-learn

ANS- scikit-learn

30.What is K-means clustering used for?


a) Dimensionality reduction

b) Data classification

c) Anomaly detection

d) Anomaly detection

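ANS- Data classification (in the sense of grouping unlabeled points into clusters; K-means is unsupervised and uses no class labels)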

31.How does K-means clustering algorithm work?


a) Hierarchically merging clusters

b) Assigning samples to predefined clusters

c) Calculating distances between data points

d) Fitting a regression line to the data points

ANS- Assigning samples to predefined clusters

32.What is the main goal of PCA?


a) Maximizing information loss

b) Minimizing feature relevance

c) Maximizing feature correlation

d) Minimizing information loss

ANS- Minimizing information loss

33.What does the term "principal component" represent in PCA?


a) The most important feature of the dataset

b) Linear combinations of the original features

c) The first feature in the dataset

d) The target variable


ANS- Linear combinations of the original features

34.Which evaluation metric is commonly used for classification problems?


a) Mean Absolute Error (MAE)

b) R-squared (R^2)

c) F1 Score

d) Root Mean Squared Error (RMSE)

ANS- F1 Score

35.Which evaluation metric is commonly used for regression problems?


a) Accuracy

b) Precision

c) Mean Squared Error (MSE)

d) Recall

ANS- Mean Squared Error (MSE)

1. Which evaluation metric penalizes false positives more in classification?

a) Precision

b) Recall

c) F1 Score

d) Accuracy

ANS- Precision
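
The sketch below computes precision, recall, and F1 from raw labels to show where false positives enter each formula. The labels are made-up example data and the function name is an assumption for this example.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against empty denominators by returning 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]  # 2 true positives, 2 false positives, 1 miss

print(precision_recall_f1(y_true, y_pred))
```

False positives appear only in the precision denominator, so the two false alarms pull precision down to 0.5 while recall stays at 2/3.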

2. Which activation function is commonly used for binary classification?

a) Sigmoid

b) ReLU
c) Tanh

d) Leaky ReLU

ANS- Sigmoid

3. What is the purpose of the Softmax function in a Neural Network?

a) Activation Function

b) Loss Function

c) Classification

d) Normalization

ANS- Classification
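
For reference, a minimal softmax is sketched below: it turns a list of raw scores into probabilities that sum to 1, which is what makes it suitable for the output layer of a multi-class classifier. Subtracting the maximum score before exponentiating is a standard numerical-stability step; the input scores are made up.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # shift so the largest exponent is exp(0)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)       # largest score gets the largest probability
print(sum(probs))  # sums to 1 up to rounding
```

The predicted class is then simply the index of the largest probability.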

4. How does K-means assign data points to clusters?

a) Nearest centroid

b) Furthest centroid

c) Randomly

d) Based on majority vote


ANS- Nearest centroid

5. What happens in the K-means algorithm during the "update" step?

a) Recalculates cluster centroids

b) Assigns new data points

c) Terminates the algorithm

d) Initializes centroids randomly

ANS- Recalculates cluster centroids
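
The assignment step (nearest centroid) and the update step (recompute centroids) from the two questions above can be sketched in plain Python as follows. This is a minimal version under stated assumptions: squared Euclidean distance, caller-supplied starting centroids, and empty clusters left where they are. The points are made-up example data.

```python
def assign_step(points, centroids):
    """For each point, return the index of its nearest centroid."""
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def update_step(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new_centroids.append(old)  # empty cluster: keep the old centroid
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members)
                  for d in range(len(old)))
        )
    return new_centroids


def kmeans(points, centroids, max_iter=100):
    """Alternate the two steps until the centroids stop moving."""
    for _ in range(max_iter):
        labels = assign_step(points, centroids)
        new_centroids = update_step(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [(1.0, 1.5), (8.5, 8.0)]
```

Each pass touches every point once per centroid, which is where the n, k, and t factors in the running time come from.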

6. In PCA, what is the role of principal components?

a) New features representing original data

b) Original features

c) Noise in the data

d) Outliers in the data

ANS- New features representing original data

7. How does PCA achieve dimensionality reduction?

a) By finding new orthogonal axes


b) By adding more dimensions

c) By removing data points

d) By increasing feature complexity

ANS- By finding new orthogonal axes

8. Which metric is NOT affected by class imbalance?

a) Accuracy

b) Precision

c) Recall

d) F1 Score

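ANS- Recall (it is computed from the actual positives only, so the class ratio does not change it; accuracy is the metric most distorted by imbalance)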
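
One way to see how class imbalance interacts with these metrics is to keep a classifier's per-class behaviour fixed and change only the class ratio. In the sketch below the classifier always catches 80% of positives and wrongly flags 10% of negatives; the confusion-matrix counts are invented for illustration.

```python
def metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# 100 positives and 100 negatives
balanced = metrics(tp=80, fn=20, fp=10, tn=90)
# Same 100 positives, but ten times as many negatives
imbalanced = metrics(tp=80, fn=20, fp=100, tn=900)

for name in ("accuracy", "precision", "recall", "f1"):
    print(name, round(balanced[name], 3), round(imbalanced[name], 3))
```

Recall is 0.8 in both cases because it only looks at the actual positives, while accuracy, precision, and F1 all shift when the number of negatives changes.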

9. What does the ROC curve plot?

a) Sensitivity vs. Specificity

b) Precision vs. Recall

c) False Positive Rate vs. True Positive Rate


d) True Positive Rate vs. False Positive Rate

ANS- False Positive Rate vs. True Positive Rate

10. Which metric penalizes larger errors more heavily?

a) Mean Absolute Error (MAE)

b) Mean Squared Error (MSE)

c) Root Mean Squared Error (RMSE)

d) Mean Absolute Percentage Error (MAPE)

ANS- Mean Squared Error (MSE)

11. What does the R-squared (R^2) value indicate?

a) Percentage of variance explained by the model

b) Average of the absolute errors

c) Percentage of predicted values within a specific range

d) Ratio of correctly predicted values to the total predicted values

ANS- Percentage of variance explained by the model
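
A short sketch of the usual R^2 formula, 1 minus the ratio of residual to total sum of squares, is given below. The data are made up and the function name is an assumption for this example.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(r_squared(y_true, y_pred))  # 0.8
```

Always predicting the mean of y_true gives R^2 = 0, and a model that does worse than that gives a negative value.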


12. Which metric is NOT used to evaluate clustering results?

a) Mean Absolute Error (MAE)

b) Davies-Bouldin Index

c) Adjusted Rand Index (ARI)

d) Silhouette Coefficient

ANS- Mean Absolute Error (MAE)

13. What is the purpose of the Dunn Index in clustering evaluation?

a) Measures cluster compactness

b) Measures cluster separation

c) Measures the ratio of the maximum inter-cluster distance to the minimum intra-cluster distance

d) Measures the ratio of the minimum inter-cluster distance to the maximum intra-cluster distance

ANS- Measures the ratio of the minimum inter-cluster distance to the maximum intra-cluster distance
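
The ratio in the answer above can be computed directly for a small example. Several variants of the Dunn Index exist; the sketch below assumes the simplest one, with inter-cluster distance taken as the closest pair of points across two clusters and intra-cluster distance as the cluster diameter. The points are made up, and `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations


def dunn_index(clusters):
    """clusters: a list of clusters, each a list of points (tuples)."""
    # Smallest distance between two points that lie in different clusters
    min_inter = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    # Largest distance between two points of the same cluster (diameter)
    max_intra = max(
        math.dist(p, q)
        for c in clusters
        for p, q in combinations(c, 2)
    )
    return min_inter / max_intra


clusters = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 2.0)]]
print(dunn_index(clusters))  # 5.0 / 2.0 = 2.5
```

Higher values mean the clusters are far apart relative to their own size.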

14. How is the Gap Statistic calculated?

a) Difference of the log of within-cluster dispersion and the log of expected dispersion

b) Difference of the within-cluster dispersion and the expected dispersion


c) Ratio of the within-cluster dispersion to the expected dispersion

d) Ratio of the log of within-cluster dispersion to the log of expected dispersion

ANS- Difference of the log of within-cluster dispersion and the log of expected dispersion

15. What is the Hartigan Index used for in cluster validation?

a) Measure of cluster compactness

b) Measure of cluster separation

c) Measure of the "tightness" of a cluster

d) Measure of the cluster's "spread"

ANS- Measure of cluster separation

16. What is the name of the parameter used to control the maximum number of features considered when splitting a node in Random Forest?

a) max_features

b) min_samples_split

c) min_samples_leaf

d) None of the above

ANS- max_features

17. What is the name of the parameter used to control the minimum impurity decrease required to split an internal node in Random Forest?
a) min_impurity_decrease

b) min_samples_split

c) min_samples_leaf

d) None of the above

ANS- min_impurity_decrease

18. What is the name of the metric used to evaluate the clustering solution with respect to the generalizability of the clusters?

a) Transferability Score

b) Domain Adaptation

c) Out-of-Distribution Detection

d) None of the above

ANS- Transferability Score

19. What is the name of the metric used to evaluate the clustering solution with respect to the explainability of the clusters?

a) Cluster Interpretability

b) Feature Importance
c) Model-agnostic Explanations

d) None of the above

ANS- Cluster Interpretability

20. What is the name of the metric used to evaluate the compactness of the clustering solution?

a) Between-Cluster Sum of Squares (BCSS)

b) Average Theta

c) Davies-Bouldin Index

d) None of the above

ANS- Davies-Bouldin Index

21. What is the name of the metric used to evaluate the separation of the clusters in the clustering solution?

22. What is the name of the metric used to evaluate the performance of a binary classifier with respect to the cost of false positives and false negatives?

a) F-beta Score

b) G-mean

c) AUC-ROC

d) None of the above

ANS- F-beta Score

23. What is the name of the metric used to evaluate the performance of a regression model with respect to the distribution of the errors?

a) Mean Absolute Percentage Error (MAPE)

b) Symmetric Mean Absolute Percentage Error (SMAPE)

c) Root Relative Squared Error (RRSE)

d) None of the above

ANS- Symmetric Mean Absolute Percentage Error (SMAPE)

24. What is the name of the criterion used to select the number of principal components in PCA?

a) Scree Plot

b) Kaiser's Rule

c) Proportion of Variance Explained

d) None of the above

ANS- Proportion of Variance Explained

25. What is the name of the variation of K-means Clustering that allows for clustering of categorical data?

a) K-modes

b) K-prototypes

c) K-medoids

d) None of the above

ANS- K-modes

26. What is the name of the variation of K-means Clustering that allows for clustering of text data?

a) BTM (Biterm Topic Model)

b) LDA (Latent Dirichlet Allocation)

c) NMF (Non-negative Matrix Factorization)

d) None of the above

ANS- LDA (Latent Dirichlet Allocation)

27. What is the name of the most commonly used activation function in the hidden layers of a neural network?

a) Sigmoid

b) Tanh

c) ReLU

d) Softmax

ANS- ReLU

28. What is the name of the algorithm used to train a neural network?

a) Gradient Descent

b) Genetic Algorithm

c) Simulated Annealing

d) None of the above

ANS- Gradient Descent

29. What is the purpose of an activation function in a neural network?

a) To initialize the weights of the network

b) To compute the loss function

c) To introduce non-linearity in the network

d) To optimize the network's hyperparameters

ANS- To introduce non-linearity in the network

30. Which term refers to the process of adjusting the weights and biases of a neural network during training?

a) Forward propagation

b) Backward propagation

c) Gradient descent

d) Regularization

ANS- Gradient descent

31. What is the objective of K-means clustering?

a) Maximizing intra-cluster similarity

b) Minimizing inter-cluster similarity

c) Maximizing inter-cluster similarity

d) Minimizing intra-cluster similarity

ANS- Maximizing intra-cluster similarity

32. What do the centroids represent in K-means clustering?

a) They represent the number of clusters

b) They are the initial cluster assignments

c) They are the mean points of each cluster

d) They define the boundaries between clusters

ANS- They are the mean points of each cluster

33. How are principal components determined in PCA?

a) Randomly assigned

b) Based on the mean of each feature

c) By maximizing the variance in the data

d) By minimizing the variance in the data

ANS- By maximizing the variance in the data
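
As a rough sketch of "maximizing the variance", the code below finds the first principal component of a small dataset: the unit direction along which the centred data has the largest variance. It uses power iteration on the covariance matrix (one of the methods named in an earlier question) only because that needs nothing beyond the standard library; libraries typically use SVD instead. The data, the starting vector, and the iteration count are assumptions for the example.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (direction, explained_variance_ratio) of the first component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d)
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; assumes the start vector is not orthogonal to the answer
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v (the top eigenvalue) divided by the total variance
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


rows = [(2.0, 1.9), (0.0, 0.1), (-2.0, -2.1), (1.0, 1.1), (-1.0, -1.0)]
direction, ratio = first_principal_component(rows)
print(direction)  # roughly along the diagonal, since x and y move together
print(ratio)      # share of total variance captured by this one component
```

Because the two features are almost perfectly correlated, a single component captures nearly all of the variance, which is why the data can be reduced from two dimensions to one with little loss.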

34. What does the explained variance ratio in PCA indicate?

a) The fraction of the total variance explained by each principal component

b) The correlation coefficient between features

c) The sum of squared errors in the data

d) The accuracy of the model

ANS- The fraction of the total variance explained by each principal component

35. What does accuracy measure in classification?

a) The proportion of true positives

b) The proportion of true negatives

c) The proportion of correct predictions

d) The proportion of incorrect predictions

ANS- The proportion of correct predictions

1. What does the term "backpropagation" refer to in Neural Networks?

a)Forward pass of data

b)Calculating gradients

c)Activating neurons
d)Initializing weights

2. Which type of Neural Network is designed to remember past information?

a)CNN (Convolutional NN)

b)RNN (Recurrent NN)

c)MLP (Multi-Layer Perceptron)

d)GAN (Generative Adversarial NN)

3. What is the best way to determine the optimal number of clusters (K) in K-means?

a)Elbow method

b)Silhouette score

c)Gap statistic

d)All of the above

4. How does the Elbow method help in choosing the number of clusters (K)?

a)Looks for the "elbow" point in the plot of within-cluster sum of squares

b)Identifies the "kink" in the curve

c)Considers the highest silhouette score

d)Finds the smallest gap statistic

5. What does the term "explained variance" mean in PCA?

a)Amount of variance captured by each principal component

b)Total variance in the dataset

c)Unexplained variance

d)Variance caused by noise

6. How does PCA handle multicollinearity in the data?

a)Reduces the impact of multicollinearity

b)Ignores multicollinearity
c)True Positive rate

d)PCA does not affect multicollinearity

7. What is Recall (Sensitivity) in classification?

a)Ratio of correctly predicted positive observations to the total predicted positives

b)Increases multicollinearity

c)Increases multicollinearity

d)Ratio of correctly predicted positive observations to the total actual positives

8. The F1 score is a combination of which two metrics?

a)Precision and Recall

b)Ability to find all positive samples

c)Increases multicollinearity

d)Minimum intra-cluster distance divided by the maximum inter-cluster distance

9. Which metric is a percentage and represents the relative difference between predicted and actual values?

a)MAE

b)Precision and Specificity

c)True Positive rate

d)Measures the silhouette of clusters

10. What does Root Mean Squared Error (RMSE) calculate?

a)Square root of the variance

b)MSE

c)R-squared (R^2)

d)Gap Statistic

11. How is the Dunn Index calculated?


a)Maximum intra-cluster distance divided by the minimum inter-cluster distance

b)Average of the absolute errors

c)Sensitivity and Specificity

d)None of above

12. What is the purpose of the Hopkins statistic in clustering?

a)Measures the randomness of the data

b)Minimum inter-cluster distance divided by the maximum intra-cluster distance

c)Average of the squared errors

d)None of above

13. Which metric is used to evaluate the "tightness" of a cluster?

a)Dunn Index

b)Measures the cluster compactness

c)Maximum inter-cluster distance divided by the minimum intra-cluster distance

14. What is the name of the parameter used to control the minimum number of samples required to split an internal node in Random Forest?

a)min_samples_split

b)Hopkins statistic

c)Measures the tendency of the data points to cluster

d)None of above

15. What is the name of the parameter used to control the minimum number of samples required to be a leaf node in Random Forest?

a)min_samples_leaf

b)max_samples_leaf

c)Hartigan Index

d)None of above

16. What is the name of the metric used to evaluate the clustering solution with respect to the fairness of the algorithm?

a)Demographic Parity

b)min_samples_split

c)max_depth

d)None of above

17. What is the name of the metric used to evaluate the clustering solution with respect to the reproducibility of the results?

a)Statistical Significance

b)Equal Opportunity

c)max_depth

d)None of above

18. What is the name of the metric used to evaluate the quality of the clustering solution?

a)Silhouette Score

b)Reproducibility Score

c)Equalized Odds

d)None of above

19. What is the name of the metric used to evaluate the stability of the clustering solution?

a)Adjusted Rand Index (ARI)

b)Davies-Bouldin Index

c)Calinski-Harabasz Index

d)None of above

20. What is the name of the metric used to evaluate the performance of a binary classifier in an imbalanced dataset?

a)Precision-Recall Curve
b)Jaccard Similarity Coefficient

c)Normalized Mutual Information (NMI)

d)None of above

21. What is the name of the metric used to evaluate the performance of a multi-class classifier in an imbalanced dataset?

a)Cohen's Kappa

b)ROC Curve

c)Confusion Matrix

d)None of above

22. What is the name of the variation of PCA that allows for non-linear dimensionality reduction?

a)Kernel PCA

b)Matthews Correlation Coefficient (MCC)

c)Fowlkes-Mallows Index

d)None of above

23. What is the name of the variation of K-means Clustering that allows for constraints on the clustering solution?

a)COP-Kmeans

b)t-SNE

c)Autoencoder

d)None of above

24. What is the name of the variation of K-means Clustering that allows for incremental updates to the clustering solution?

a)Mini-Batch K-means

b)Fuzzy C-Means

c)Constrained K-means

d)None of above

25. What is the name of the algorithm used to initialize the weights of a neural network?

a)Random Initialization

b)Incremental K-means

c)Streaming K-means

d)None of above

26. What is the name of the algorithm used to regularize a neural network?

a)L1 Regularization

b)Xavier Initialization

c)He Initialization

d)Glorot Initialization

27. Which metric is commonly used to assess the feature importance in Random Forest?

a)Accuracy

b)L2 Regularization

c)Dropout

d)Early Stopping

28. What is the purpose of the backpropagation algorithm in a neural network?

a)To compute the output of the network

b)Precision

c)F1-score

d)Gini importance

29. Which layer of a neural network is responsible for extracting high-level features from the input data?

a)Input layer

b)To adjust the weights and biases

c)To initialize the network's parameters

d)To regularize the network


30. How is the number of clusters determined in K-means clustering?

a)It is predefined by the user

b)Hidden layer(s)

c)It is based on the variability of the data

d)Activation layer

31. What is the drawback of K-means clustering when dealing with outliers?

a)It assigns outliers to their nearest cluster

b)It is determined by the algorithm

c)It separates outliers into a separate cluster

d)It is randomly assigned to each data point

32. What is the significance of eigenvalues in PCA?

a)They represent the number of principal components

b)It ignores outliers during clustering

c)They indicate the amount of variance

d)It assigns outliers to the farthest cluster

33. How does PCA handle multicollinearity in the data?

a)By assigning weights to correlated features

b)They determine the direction of the components

c)By transforming correlated features

d)By creating new features from correlated ones

34. What does precision measure in classification?

a)The proportion of true positives

b)By removing correlated features

c)The proportion of correct predictions


d)The proportion of correct positive predictions

35. Which evaluation metric is used to evaluate imbalanced datasets in classification?

a)F1 Score

b)Accuracy

c)Mean Squared Error (MSE)

d)R-squared (R^2)
