ML QB


ML QUESTION BANK IA-1

1. Define the concept of classification.


Ans) Classification is a supervised learning technique that uses labeled training data
to assign observations to discrete classes. A classification model is trained on existing
observations, and new observations are then categorized into those classes or groups.
Formally, classification predictive modeling is the task of approximating a mapping
function (f) from input variables (x) to discrete output variables (y). Many classifiers
produce a probability score for each input and assign the class based on that score. For
example, email service providers use classification to score each incoming email and
decide whether it belongs to the spam class or not.
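
The step from a probability score to a class label can be sketched as below. This is only an illustration: the function name `classify_email`, the 0.5 threshold, and the scores are assumptions made for the example, not the output of a real spam filter.

```python
def classify_email(spam_probability, threshold=0.5):
    """Map a probability score to a discrete class label.

    The 0.5 threshold is an assumed default; real systems tune it.
    """
    return "spam" if spam_probability >= threshold else "not spam"


# Hypothetical scores, standing in for what a trained classifier might output.
scores = [0.92, 0.08, 0.51]
labels = [classify_email(p) for p in scores]
print(labels)
```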

2. How will you design a machine learning system? Steps of developing machine learning.
Ans)
3. What are real life applications of machine learning?(explain on own)(pg
1-16,1-17)
Ans) 1. Learning Association: (market-basket analysis)
2. Classification: loan given by bank
3. Regression: (Price/score prediction)
4. Unsupervised Learning: (clustering, to find outliers, create your own labeled
data, check patterns, etc.)
5. Reinforcement Learning: (Chess game)

4. List and explain issues in machine learning.


Ans)
5. Calculate eigen vector of a given matrix
A = 1 2 -3
2 4 -6
-1 -2 3
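
One possible way to check an answer to this question: every row of A is a multiple of (1, 2, -3), so A has rank 1. That means 0 is an eigenvalue with two independent eigenvectors, and the remaining eigenvalue equals the trace, 1 + 4 + 3 = 8. The candidate eigenvectors below were worked out by hand from that observation; the helper names are illustrative only.

```python
A = [[1, 2, -3],
     [2, 4, -6],
     [-1, -2, 3]]


def matvec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def is_eigenpair(M, lam, v):
    """Return True when M v equals lam * v exactly (integer arithmetic)."""
    return matvec(M, v) == [lam * x for x in v]


# Candidate (eigenvalue, eigenvector) pairs derived by hand.
candidates = [
    (8, [1, 2, -1]),   # eigenvalue equal to the trace
    (0, [-2, 1, 0]),   # orthogonal to (1, 2, -3)
    (0, [3, 0, 1]),    # orthogonal to (1, 2, -3)
]
checks = [is_eigenpair(A, lam, v) for lam, v in candidates]
print(checks)
```

Any nonzero scalar multiple of these vectors is also an eigenvector for the same eigenvalue.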

6. What are the performance measures to analyze the quality of the model?
Ans)The performance of a machine learning model can be evaluated using several
performance measures. The choice of performance measure depends on the problem
type and the goals of the model. Some common performance measures include:

● Accuracy: The percentage of correctly classified instances in the test set.

● Precision: The ratio of true positives to the total number of predicted positives.
Precision measures the model's ability to identify only the relevant instances.

● Recall: The ratio of true positives to the total number of actual positives. Recall
measures the model's ability to find all relevant instances.

● F1-score: The harmonic mean of precision and recall. F1-score is a balanced
measure that combines precision and recall.

● Mean Squared Error (MSE): The average squared difference between the
predicted and actual values. MSE is used for regression problems.
● Root Mean Squared Error (RMSE): The square root of the mean squared error.
RMSE is used for regression problems and is expressed in the same units as the target.
RMSE = √MSE
● Confusion Matrix: A matrix that summarizes the number of true positives, false
positives, true negatives, and false negatives. Confusion matrix is used to calculate
other performance measures such as accuracy, precision, recall, and F1-score.
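
The measures listed above can be computed directly from their definitions. The sketch below assumes binary labels with 1 as the positive class; the function names and the sample data are made up for illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def mse(y_true, y_pred):
    """Mean squared error for regression outputs."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Made-up classification labels and regression values.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
regression_error = rmse([3.0, 5.0, 2.5], [2.5, 5.0, 4.0])
print(metrics, regression_error)
```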

7. Explain overfitting and underfitting of models.


Ans)
8. Calculate SVD of given matrix
A = 1 0 1
-2 1 0
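
A rough check of the singular values, assuming the matrix in this question is the 2x3 matrix with rows (1, 0, 1) and (-2, 1, 0). The singular values of A are the square roots of the eigenvalues of A·Aᵀ, which is only 2x2 here, so the quadratic formula is enough. This sketch does not compute the U and V factors.

```python
import math

# Assumed reading of the matrix in the question.
A = [[1, 0, 1],
     [-2, 1, 0]]


def gram(M):
    """Return M multiplied by its transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in M] for r1 in M]


G = gram(A)  # 2x2 symmetric matrix A * A^T

# Eigenvalues of a 2x2 matrix from its trace and determinant.
trace = G[0][0] + G[1][1]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
disc = math.sqrt(trace ** 2 - 4 * det)
eigenvalues = [(trace + disc) / 2, (trace - disc) / 2]

singular_values = [math.sqrt(lam) for lam in eigenvalues]
print(G, singular_values)
```

Under that assumption A·Aᵀ has eigenvalues 6 and 1, so the singular values are √6 and 1.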

9. Diagonalize the given matrix A as A = XDX^-1
A = 1 1 1
1 1 1
1 1 1
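
One way to verify a candidate diagonalization of the 3x3 all-ones matrix: A = XDX^-1 is equivalent to AX = XD with X invertible. The X and D below are one hand-derived choice (eigenvalue 3 with eigenvector (1, 1, 1), and eigenvalue 0 with two independent eigenvectors); other valid choices exist.

```python
A = [[1, 1, 1],
     [1, 1, 1],
     [1, 1, 1]]

# Columns of X are the chosen eigenvectors; D holds the matching eigenvalues.
X = [[1, -1, -1],
     [1, 1, 0],
     [1, 0, 1]]
D = [[3, 0, 0],
     [0, 0, 0],
     [0, 0, 0]]


def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


AX = matmul(A, X)
XD = matmul(X, D)
x_is_invertible = det3(X) != 0
print(AX == XD, x_is_invertible)
```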

10. Explain support vector machines.


Ans) Support Vector Machine (SVM) is one of the most popular supervised learning
algorithms and can be used for both classification and regression problems. In practice,
it is used primarily for classification.
The goal of the SVM algorithm is to find the best decision boundary that separates
n-dimensional space into classes, so that a new data point can be placed in the correct
category in the future. This decision boundary is called a hyperplane, and the best one
is the hyperplane with the largest margin to the nearest training points.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These
extreme cases are called support vectors, and hence the algorithm is called the Support
Vector Machine.

Example: Suppose we see a strange cat that also has some features of dogs, and we
want a model that can accurately identify whether it is a cat or a dog. Such a model can
be created using the SVM algorithm. We first train the model with many images of cats
and dogs so that it learns the distinguishing features of each, and then we test it with
this strange creature. The SVM draws a decision boundary between the two classes (cat
and dog) using the extreme cases of each class (the support vectors). Depending on which
side of that boundary the new image falls, the model classifies it as a cat or a dog.
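
How a hyperplane is used once it is known can be sketched as follows. The weights `w`, the bias `b`, and the 2-D points are hypothetical values chosen by hand, not the result of training an SVM. The sketch only shows classification by the sign of w·x + b and that the training points closest to the boundary are the ones that would act as support vectors.

```python
import math

# Hypothetical hyperplane w.x + b = 0 in two dimensions (not a trained model).
w = (1.0, 1.0)
b = -5.0


def decision(x):
    """Signed score: positive on one side of the hyperplane, negative on the other."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(x):
    """Assign a class from the side of the hyperplane the point falls on."""
    return "dog" if decision(x) >= 0 else "cat"


def distance_to_hyperplane(x):
    """Geometric distance from point x to the hyperplane."""
    return abs(decision(x)) / math.hypot(*w)


# Made-up training points: the first three are cats, the last three are dogs.
training_points = [(1, 1), (2, 2), (1, 3), (3, 3), (4, 4), (5, 3)]
min_dist = min(distance_to_hyperplane(p) for p in training_points)
closest = [p for p in training_points
           if math.isclose(distance_to_hyperplane(p), min_dist)]
print(closest, predict((2.5, 3.5)))
```
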
11. What is regularized regression?
Ans)
12. Explain the norm of a vector.
Ans) The length of a vector is referred to as the vector norm or the vector's
magnitude. It is a nonnegative number that describes the extent of the vector in space.
For example, take the vector v1 = [-2, 1].
The (Euclidean, or L2) norm L of this vector is

L = √((-2)² + 1²) = √5 ≈ 2.236
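
The Euclidean norm of v1 = [-2, 1] can be checked with a short sketch. The function name `l2_norm` is illustrative.

```python
import math


def l2_norm(v):
    """Euclidean (L2) norm: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


v1 = [-2, 1]
length = l2_norm(v1)  # sqrt((-2)^2 + 1^2) = sqrt(5)
print(length)
```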

13. Explain supervised machine learning.


Ans) Supervised learning is the type of machine learning in which machines are trained
using well "labeled" training data, and on the basis of that data, machines predict the
output. Labeled data means that the input data is already tagged with the correct
output.
In supervised learning, the training data provided to the machine works as the
supervisor that teaches the machine to predict the output correctly. It applies the same
concept as a student learning under the supervision of a teacher.
Supervised learning is a process of providing input data as well as correct output data to
the machine learning model. The aim of a supervised learning algorithm is to find a
mapping function that maps the input variable (x) to the output variable (y).
Supervised learning can be further divided into two types of problems:
● Classification:
Classification algorithms are used when the output variable is categorical, which
means it takes one of a fixed set of classes such as Yes-No, Male-Female, True-False, etc.
● Regression:
Regression algorithms are used when the output variable is continuous and depends
on the input variables. They are used for the prediction of continuous
quantities, such as in weather forecasting, market trends, etc.
In the real-world, supervised learning can be used for Risk Assessment, Image
classification, Fraud Detection, spam filtering, etc.
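
A very small illustration of learning a mapping from labeled data is a 1-nearest-neighbour classifier: a new input gets the label of the closest training example. The data points and the Yes/No labels below are made up, and nearest-neighbour is just one convenient choice of supervised algorithm for the sketch.

```python
import math

# Labeled training data: (input features, correct output). Values are made up.
training_data = [
    ((1.0, 1.0), "No"),
    ((1.5, 2.0), "No"),
    ((5.0, 5.0), "Yes"),
    ((6.0, 5.5), "Yes"),
]


def predict_label(x):
    """Return the label of the training example closest to x (1-nearest-neighbour)."""
    _, nearest_label = min(training_data,
                           key=lambda pair: math.dist(pair[0], x))
    return nearest_label


print(predict_label((5.5, 5.0)), predict_label((1.2, 1.4)))
```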

14. Explain unsupervised machine learning.


Ans) Unsupervised learning is a machine learning technique in which models are not
supervised using a labeled training dataset. Instead, the models themselves find hidden
patterns and insights in the given data. It can be compared to the learning that takes
place in the human brain while learning new things. It can be defined as: Unsupervised
learning is a type of machine learning in which models are trained using an unlabeled
dataset and are allowed to act on that data without any supervision.

The unsupervised learning algorithm can be further categorized into two types of
problems:

● Clustering: Clustering is a method of grouping objects into clusters such that
objects with the most similarities remain in one group and have few or no similarities
with the objects of another group.

● Association: An association rule is an unsupervised learning method used for
finding relationships between variables in a large database. It
determines the sets of items that occur together in the dataset.

Suppose the unsupervised learning algorithm is given an input dataset containing
images of different types of cats and dogs. The algorithm is given no labels for this
dataset, so it has no prior knowledge of which images show cats and which show dogs.
The task of the unsupervised learning algorithm is to identify the image features on its
own. It performs this task by clustering the image dataset into groups according to the
similarities between images.
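
Clustering by similarity can be sketched with a tiny k-means loop on one-dimensional data. The points and the starting centroids are made up, and a single number per item stands in for the image features. No labels are used at any step.

```python
def k_means_1d(points, centroids, iterations=10):
    """Basic k-means on 1-D data: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Keep the old centroid if a cluster ended up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Unlabeled, made-up feature values forming two visible groups.
points = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```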

15. Find vectors that are orthogonal to [1,2,3]. Explain why we can have an
infinite number of such vectors.
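
A hedged sketch of one answer: a vector (x, y, z) is orthogonal to [1, 2, 3] when x + 2y + 3z = 0. That is one equation in three unknowns, so two components can be chosen freely, and the solutions form a plane through the origin. The two base vectors below were picked by hand; every combination s·u + t·w is again orthogonal, which is why there are infinitely many.

```python
def dot(p, q):
    """Dot product of two vectors of equal length."""
    return sum(pi * qi for pi, qi in zip(p, q))


a = [1, 2, 3]
u = [-2, 1, 0]   # 1*(-2) + 2*1 + 3*0 = 0
w = [-3, 0, 1]   # 1*(-3) + 2*0 + 3*1 = 0


def combine(s, t):
    """Linear combination s*u + t*w, which stays in the orthogonal plane."""
    return [s * ui + t * wi for ui, wi in zip(u, w)]


# Sample a grid of coefficients; each result is orthogonal to a.
samples = [combine(s, t) for s in range(-2, 3) for t in range(-2, 3)]
all_orthogonal = all(dot(a, x) == 0 for x in samples)
print(all_orthogonal)
```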

16. Explain least squares method for supervised machine learning technique.
Ans) The least squares method finds the curve that best fits a set of observations by
minimizing the sum of squared residuals (errors). Assume the given data points
are (x1, y1), (x2, y2), (x3, y3), …, (xn, yn), in which all x's are independent
variables and all y's are dependent ones. In the linear case, the method finds a straight
line of the form y = mx + b, where y and x are variables, m is the slope, and b is the
y-intercept. The formulas for the slope m and the intercept b are:
m = (n∑xy - ∑x∑y) / (n∑x² - (∑x)²)
b = (∑y - m∑x) / n
Here, n is the number of data points.
For example, consider the case of an investor deciding whether to invest in a gold
mining company. The investor might wish to know how sensitive the company’s stock
price is to changes in the market price of gold. To study this, the investor could use the
least squares method to trace the relationship between those two variables over time
onto a scatter plot. This analysis could help the investor predict the degree to which the
stock’s price would likely rise or fall for any given increase or decrease in the price of
gold.
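
The slope and intercept formulas can be applied directly, as in the sketch below. The x and y values are made-up numbers, not real gold or stock prices, and `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Least squares line y = m*x + b from the closed-form formulas:
    m = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2),  b = (Sy - m*Sx) / n."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


# Made-up observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = fit_line(xs, ys)
print(m, b)
```

For this data, ∑x = 15, ∑y = 20, ∑xy = 66 and ∑x² = 55, so m = 30/50 = 0.6 and b = (20 - 9)/5 = 2.2.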

17. Solve the linear system -x1+ x2+2x3 = 2, 3x1-x2+x3= 6, -x1+3x2+4x3= 4.
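
One way to check a hand solution of this system is Cramer's rule, where each unknown is the ratio of two determinants. The sketch uses exact fractions; the helper names are illustrative.

```python
from fractions import Fraction


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer_solve(A, rhs):
    """Solve a 3x3 system A x = rhs using Cramer's rule."""
    d = det3(A)
    if d == 0:
        raise ValueError("Determinant is zero; Cramer's rule does not apply.")
    solution = []
    for col in range(3):
        # Replace column `col` of A with the right-hand side.
        Ai = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det3(Ai), d))
    return solution


# Coefficients and right-hand side of the system in the question.
A = [[-1, 1, 2],
     [3, -1, 1],
     [-1, 3, 4]]
rhs = [2, 6, 4]
x1, x2, x3 = cramer_solve(A, rhs)
print(x1, x2, x3)
```

The determinant of the coefficient matrix is 10, and the result x1 = 1, x2 = -1, x3 = 2 satisfies all three equations.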

18. What are the applications of singular value decomposition (SVD)?


*NOTE: Matrix and linear equations may change.
Formulas:

Eigen Vector:

Use Cramer's rule,


Example,
