AIML Question
a) Neuron
b) Synapse
c) Layer
d) Gradient Descent
ANS- Neuron
a) Forward
b) Backward
ANS- Forward
a) Manhattan distance
b) Hamming distance
c) Cosine similarity
d) All of the above
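The three measures listed above can be sketched in a few lines of plain Python. This is only an illustration; the function names are assumptions made for this example and do not come from any particular library.

```python
import math


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(manhattan_distance([1, 2, 3], [4, 0, 3]))  # 5
print(hamming_distance("karolin", "kathrin"))    # 3
print(cosine_similarity([1, 0], [0, 1]))         # 0.0
```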
b) O(n^2)
c) O(n^3)
ANS- O(n)
a) Dimensionality reduction
b) Feature engineering
c) Model training
d) Hyperparameter tuning
7. What does the Area Under the Curve (AUC) represent in ROC analysis?
a) Ratio of true positives to false positives
c) Probability that the classifier will rank a randomly chosen positive instance higher than a randomly
chosen negative one
ANS- Probability that the classifier will rank a randomly chosen positive instance higher than a randomly
chosen negative one
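The ranking interpretation in the answer above can be checked directly by counting positive/negative pairs. The sketch below assumes binary labels coded as 1 (positive) and 0 (negative) and counts ties as half a win; `pairwise_auc` is a name invented for this example.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


print(pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Three of the four positive/negative pairs are ordered correctly here, which gives 0.75.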
b) F1 Score
c) Accuracy
d) Precision
ANS- AUC
a) Davies-Bouldin Index
b) Silhouette Coefficient
d) Rand Index
ANS- Measures the "gap" between SSE and the expected SSE
15. What is the name of the parameter used to control the number of samples to draw from X_train to train each
tree in Random Forest?
a) n_samples
b) max_features
c) bootstrap
ANS- max_samples
16. What is the name of the parameter used to control the strategy used to sample the data in Random Forest?
a) bootstrap
b) n_samples
c) max_features
ANS- bootstrap
17. What is the name of the metric used to evaluate the clustering solution with respect to the uncertainty of the
clusters?
a) Cluster Uncertainty
b) Prediction Interval
c) Confidence Interval
18. What is the name of the metric used to evaluate the clustering solution with respect to the diversity of the
features?
a) Feature Diversity
b) Feature Importance
c) Feature Relevance
19. What is the name of the metric used to evaluate the clustering solution with respect to the ground truth
labels?
a) Cluster Purity
b) Homogeneity Score
c) Completeness Score
20. What is the name of the metric used to evaluate the clustering solution with respect to the density of the
data?
a) Density-Based Clustering Validation (DBCV)
c) DBSCAN
b) Precision at k (P@k)
22. What is the name of the metric used to evaluate the performance of a regression model with respect to the
distribution of the residuals?
a) Explained Variance Score
b) Max Error
23. What is the name of the algorithm used to compute the principal components in PCA?
a) Power Iteration Method
b) Eigendecomposition
b) To increase the dimensionality of the data while preserving the maximum amount of variance
c) To reduce the noise in the data while preserving the maximum amount of information
d) To increase the noise in the data while preserving the maximum amount of information
ANS- To reduce the dimensionality of the data while preserving the maximum amount of variance
d) To improve interpretability
b) 10
c) 100
d) 1000
27. Random Forest combines multiple decision trees to make predictions by using what method?
a) Bagging
b) Boosting
c) Pruning
d) Splitting
ANS- Bagging
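Bagging (bootstrap aggregating) trains each base model on a sample drawn with replacement and combines the predictions by majority vote. The sketch below illustrates only that idea on 1-D data: it uses a toy "nearest class mean" learner instead of decision trees and leaves out the random feature selection that Random Forest adds. All function names are assumptions made for this example.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_mean_classifier(sample):
    """Toy base learner: store the mean feature value of each class."""
    sums, counts = {}, {}
    for x, y in sample:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_one(model, x):
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


def bagging_predict(data, x, n_estimators=25, seed=0):
    """Train n_estimators models on bootstrap samples and take a majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_estimators):
        model = fit_mean_classifier(bootstrap_sample(data, rng))
        votes[predict_one(model, x)] += 1
    return votes.most_common(1)[0][0]


train = [(1.0, "a"), (1.2, "a"), (0.8, "a"), (5.0, "b"), (5.3, "b"), (4.7, "b")]
print(bagging_predict(train, 1.1))
print(bagging_predict(train, 5.1))
```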
c) A programming language
29. Which library in Python can be used for implementing neural networks and multi-layered
perceptrons?
a) TensorFlow
b) NumPy
c) pandas
d) scikit-learn
ANS- scikit-learn
b) Data classification
c) Anomaly detection
d) Anomaly detection
b) R-squared (R^2)
c) F1 Score
ANS- F1 Score
b) Precision
d) Recall
a) Precision
b) Recall
c) F1 Score
d) Accuracy
ANS- Precision
a) Sigmoid
b) ReLU
c) Tanh
d) Leaky ReLU
ANS- Sigmoid
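For reference, the four activation functions listed above can be written as scalar functions in plain Python. This is a minimal sketch; the default slope `alpha=0.01` for Leaky ReLU is a common choice assumed here, not a value taken from the question.

```python
import math


def sigmoid(x):
    """Logistic function, squashes x into (0, 1); written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def tanh(x):
    """Hyperbolic tangent, squashes x into (-1, 1)."""
    return math.tanh(x)


def leaky_relu(x, alpha=0.01):
    """Like ReLU, but with a small slope alpha for negative inputs."""
    return x if x > 0 else alpha * x


for f in (sigmoid, relu, tanh, leaky_relu):
    print(f.__name__, [f(v) for v in (-2.0, 0.0, 2.0)])
```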
a) Activation Function
b) Loss Function
c) Classification
d) Normalization
ANS- Classification
a) Nearest centroid
b) Furthest centroid
c) Randomly
b) Original features
a) Accuracy
b) Precision
c) Recall
d) F1 Score
ANS- Accuracy
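The four metrics listed above all come from the same confusion-matrix counts. The sketch below assumes binary labels with 1 as the positive class and returns 0.0 where a ratio would otherwise divide by zero; `classification_metrics` is a name invented for this example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

With these labels there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all four metrics come out at 0.75.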
b) Davies-Bouldin Index
d) Silhouette Coefficient
c) Measures the ratio of the maximum inter-cluster distance to the minimum intra-cluster distance
d) Measures the ratio of the minimum inter-cluster distance to the maximum intra-cluster distance
ANS- Measures the ratio of the minimum inter-cluster distance to the maximum intra-cluster distance
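The ratio in the answer above can be computed directly for small datasets. This sketch assumes one common variant: inter-cluster distance is the closest pair of points from two different clusters, and intra-cluster distance is the cluster diameter (its farthest pair of points). Other variants exist; `dunn_index` is a name invented for this example.

```python
import math
from itertools import combinations


def dunn_index(clusters):
    """clusters: list of clusters, each a list of points given as tuples."""
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    # Smallest distance between points that belong to different clusters.
    min_inter = min(
        math.dist(p, q)
        for c1, c2 in combinations(clusters, 2)
        for p in c1
        for q in c2
    )
    # Largest distance between two points of the same cluster (diameter).
    diameters = [
        math.dist(p, q)
        for c in clusters
        for p, q in combinations(c, 2)
    ]
    if not diameters or max(diameters) == 0:
        raise ValueError("need a cluster with two distinct points")
    return min_inter / max(diameters)


clusters = [[(0, 0), (0, 1)], [(3, 0), (3, 1)]]
print(dunn_index(clusters))  # 3.0
```

Higher values indicate compact, well-separated clusters.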
a) Difference of the log of within-cluster dispersion and the log of expected dispersion
ANS- Difference of the log of within-cluster dispersion and the log of expected dispersion
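In the usual formulation of the gap statistic, the "expected" term is the expectation of the log dispersion under a null reference distribution, so the quantity in the answer above can be written as follows. The symbols are a notational assumption made here, not taken from the question bank: W_k is the within-cluster dispersion for k clusters, and E*_n is the expectation over reference datasets of size n.

```latex
\mathrm{Gap}_n(k) = \mathbb{E}^{*}_{n}\!\left[\log W_k\right] - \log W_k
```

A larger gap means the observed clustering is tighter than would be expected for data without cluster structure.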
16. What is the name of the parameter used to control the maximum number of features considered for
splitting at each node in Random Forest?
a) max_features
b) min_samples_split
c) min_samples_leaf
ANS- max_features
17. What is the name of the parameter used to control the minimum impurity decrease required to split
an internal node in Random Forest?
a) min_impurity_decrease
b) min_samples_split
c) min_samples_leaf
ANS- min_impurity_decrease
18. What is the name of the metric used to evaluate the clustering solution with respect to the
generalizability of the clusters?
a) Transferability Score
b) Domain Adaptation
c) Out-of-Distribution Detection
19. What is the name of the metric used to evaluate the clustering solution with respect to the
explainability of the clusters?
a) Cluster Interpretability
b) Feature Importance
c) Model-agnostic Explanations
20. What is the name of the metric used to evaluate the compactness of the clustering solution?
b) Average Theta
c) Davies-Bouldin Index
21. What is the name of the metric used to evaluate the separation of the clusters in the clustering
solution?
a) F-beta Score
b) G-mean
c) AUC-ROC
22. What is the name of the metric used to evaluate the performance of a binary classifier with respect to
the cost of false positives and false negatives?
23. What is the name of the metric used to evaluate the performance of a regression model with respect
to the distribution of the errors?
24. What is the name of the criterion used to select the number of principal components in PCA?
a) Scree Plot
b) Kaiser's Rule
25. What is the name of the variation of K-means Clustering that allows for clustering of categorical data?
a) K-modes
b) K-prototypes
c) K-medoids
ANS- K-modes
26. What is the name of the variation of K-means Clustering that allows for clustering of text data?
27. What is the name of the most commonly used activation function in the hidden layers of a neural
network?
a) Sigmoid
b) Tanh
c) ReLU
d) Softmax
ANS- ReLU
a) Gradient Descent
b) Genetic Algorithm
c) Simulated Annealing
a) Forward propagation
b) Backward propagation
c) Gradient descent
d) Regularization
30. Which term refers to the process of adjusting the weights and biases of a neural network during
training?
a) Randomly assigned
b) Calculating gradients
c) Activating neurons
d) Initializing weights
3. What is the best way to determine the optimal number of clusters (K) in K-means?
a) Elbow method
b) Silhouette score
c) Gap statistic
4. How does the Elbow method help in choosing the number of clusters (K)?
a) Looks for the "elbow" point in the plot of within-cluster sum of squares
c) Unexplained variance
b) Ignores multicollinearity
c) True Positive rate
b) Increases multicollinearity
c) Increases multicollinearity
c) Increases multicollinearity
9. Which metric is a percentage and represents the relative difference between predicted and actual
values?
a) MAE
b) MSE
c) R-squared (R^2)
d) Gap Statistic
d) None of the above
d) None of the above
a) Dunn Index
14. What is the name of the parameter used to control the minimum number of samples required to split
an internal node in Random Forest?
a) min_samples_split
b) Hopkins statistic
d) None of the above
15. What is the name of the parameter used to control the minimum number of samples required to be a
leaf node in Random Forest?
a) min_samples_leaf
b) max_samples_leaf
c) Hartigan Index
d) None of the above
16. What is the name of the metric used to evaluate the clustering solution with respect to the fairness of
the algorithm?
a) Demographic Parity
b) min_samples_split
c) max_depth
d) None of the above
17. What is the name of the metric used to evaluate the clustering solution with respect to the
reproducibility of the results?
a) Statistical Significance
b) Equal Opportunity
c) max_depth
d) None of the above
18. What is the name of the metric used to evaluate the quality of the clustering solution?
a) Silhouette Score
b) Reproducibility Score
c) Equalized Odds
d) None of the above
19. What is the name of the metric used to evaluate the stability of the clustering solution?
b) Davies-Bouldin Index
c) Calinski-Harabasz Index
d) None of the above
20. What is the name of the metric used to evaluate the performance of a binary classifier in an
imbalanced dataset?
a) Precision-Recall Curve
b) Jaccard Similarity Coefficient
d) None of the above
21. What is the name of the metric used to evaluate the performance of a multi-class classifier in an
imbalanced dataset?
a) Cohen's Kappa
b) ROC Curve
c) Confusion Matrix
d) None of the above
c) Fowlkes-Mallows Index
d) None of the above
22. What is the name of the variation of PCA that allows for non-linear dimensionality reduction?
a) Kernel PCA
b) t-SNE
c) Autoencoder
d) None of the above
23. What is the name of the variation of K-means Clustering that allows for constraints on the clustering
solution?
a) COP-Kmeans
b) Fuzzy C-Means
c) Constrained K-means
d) None of the above
24. What is the name of the variation of K-means Clustering that allows for incremental updates to the
clustering solution?
a) Mini-Batch K-means
b) Incremental K-means
c) Streaming K-means
d) None of the above
25. What is the name of the algorithm used to initialize the weights of a neural network?
a) Random Initialization
b) Xavier Initialization
c) He Initialization
d) Glorot Initialization
a) L1 Regularization
b) L2 Regularization
c) Dropout
d) Early Stopping
27. Which metric is commonly used to assess the feature importance in Random Forest?
a) Accuracy
b) Precision
c) F1-score
d) Gini importance
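Gini importance (also called mean decrease in impurity) is built from the impurity decrease achieved each time a feature is used to split a node; those decreases are accumulated over the trees and normalised. The sketch below shows only the building block for a single split. The function names are assumptions made for this example, and the decrease is weighted within the split only, not by the node's share of the whole training set.

```python
from collections import Counter


def gini_impurity(labels):
    """1 minus the sum of squared class proportions; 0.0 for a pure node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted_children


parent = [0, 0, 1, 1]
print(gini_impurity(parent))                      # 0.5
print(impurity_decrease(parent, [0, 0], [1, 1]))  # 0.5
```

A split that separates the two classes perfectly removes all of the parent's impurity, which is why the decrease equals the parent's Gini impurity here.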
29. Which layer of a neural network is responsible for extracting high-level features from the input data?
a) Input layer
b) Hidden layer(s)
d) Activation layer
a) F1 Score
b) Accuracy
d) R-squared (R^2)