Assignment 1 AI
In machine learning, there are several key terms and components that are fundamental to
understanding how learning algorithms operate. Here are some of the main terms and their
explanations with examples:
1. Data : Data is the raw information that is used to train a machine learning model. It
can include structured data (e.g., tables, databases) or unstructured data (e.g., text,
images, audio). For example, a dataset of housing prices with features like square
footage, number of bedrooms, and location.
2. Features : Features, also known as variables or attributes, are the individual
measurable properties or characteristics of the data that are used as input for the
model. In the housing price example, features could include square footage, number
of bedrooms, and location.
3. Labels (Target) : Labels are the output or the dependent variable that the model is
trying to predict. In supervised learning, models are trained on labeled data where the
correct answer (label) is provided along with the input data. For example, in a dataset
for predicting housing prices, the label would be the actual sale price of the houses.
4. Model : A model is a mathematical representation of the relationship between the
features and the labels in the data. It is what makes predictions or classifications based
on the input data. For instance, a linear regression model can be used to predict
housing prices based on features like square footage and number of bedrooms.
5. Training : Training is the process of feeding the model with input data and adjusting
its internal parameters or weights to minimize the difference between the predicted
output and the actual output (labels). This is done through an optimization algorithm
like gradient descent. In the housing price example, the model is trained on a dataset
of houses with known prices, adjusting its parameters to better predict prices.
6. Validation : Validation is the process of evaluating the performance of a trained model
on a separate dataset that was not used during training. It helps to assess how well the
model generalizes to new, unseen data. For example, after training a model to predict
housing prices, it is validated on a separate dataset to see how accurately it predicts
prices for new houses.
7. Testing : Testing is similar to validation but is typically done after finalizing the
model. It involves evaluating the model's performance on a completely new dataset
that it has never seen before. This provides a final assessment of the model's
performance before deployment. In the housing price example, the model would be
tested on a dataset of houses that were not used for training or validation.
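To make these terms concrete, here is a minimal sketch (assuming scikit-learn and NumPy are available) that separates features from labels in a toy housing dataset and splits it into training, validation, and test sets; the feature and price values are invented purely for illustration.

import numpy as np
from sklearn.model_selection import train_test_split

# Features: square footage and number of bedrooms (invented values)
X = np.array([[1000, 2], [1500, 3], [2000, 3], [2500, 4],
              [1200, 2], [1800, 3], [2200, 4], [2700, 5]])
# Labels: the sale prices the model is trained to predict
y = np.array([200_000, 250_000, 300_000, 350_000,
              220_000, 280_000, 330_000, 370_000])

# Hold out 25% of the data for final testing ...
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
# ... and split a validation set off the remaining training data
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)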
These are some of the fundamental components and terms in machine learning, and understanding them is crucial for building, training, and evaluating models effectively. The following steps show how they come together when building a linear regression model:
1. Data Collection : Gather a dataset containing observations of the input features and
their corresponding target values. For example, consider a dataset containing
information about houses such as square footage, number of bedrooms, and their
corresponding sale prices.
2. Data Preprocessing : Clean and preprocess the dataset as needed. This may involve
handling missing values, removing outliers, and scaling the features if necessary.
3. Model Representation : In simple linear regression, there is only one input feature, while in multiple linear regression, there are multiple input features. The relationship between the input features x and the target variable y is represented as:
y = w0 + w1x1 + w2x2 + ... + wnxn
where w0 is the intercept and w1, ..., wn are the coefficients (weights) learned during training.
4. Model Training : The goal of training is to find the optimal values for the coefficients (weights) that minimize the difference between the predicted values and the actual values in the training dataset. This is typically done by minimizing a loss function such as the mean squared error (MSE) using an optimization technique like gradient descent (a minimal sketch follows this list).
5. Model Evaluation : Once the model is trained, evaluate its performance using a
separate validation or test dataset. Common evaluation metrics for regression models
include mean squared error (MSE), root mean squared error (RMSE), and R-squared.
6. Prediction : After evaluating the model, it can be used to make predictions on new
data. Given the input features of a new house, the trained linear regression model can
predict its sale price based on the learned relationship between the features and the
target variable.
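As a minimal sketch of steps 3 and 4, the following NumPy snippet fits a simple linear regression by batch gradient descent on the MSE loss; the toy data (in thousands of square feet and thousands of dollars), the learning rate, and the iteration count are assumptions chosen for illustration.

import numpy as np

# Toy data: square footage in thousands, sale price in thousands of dollars
x = np.array([1.0, 1.5, 2.0, 2.5])
y = np.array([200.0, 250.0, 300.0, 350.0])

w, b = 0.0, 0.0   # slope (weight) and intercept, initialized at zero
lr = 0.1          # learning rate (an assumed value)

for _ in range(5000):
    y_pred = w * x + b                # current predictions
    error = y_pred - y
    grad_w = 2 * np.mean(error * x)   # dMSE/dw
    grad_b = 2 * np.mean(error)       # dMSE/db
    w -= lr * grad_w                  # step against the gradient
    b -= lr * grad_b

print(w, b)  # converges toward w = 100, b = 100 for this exactly linear data

In these units, w = 100 and b = 100 correspond to $100 per square foot and a $100,000 intercept, which matches the worked example below.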
Example:
Let's consider a simple linear regression example using a dataset of house prices where the only feature is the square footage of the house (x) and the target variable is the sale price (y). We want to build a model to predict house prices based on their square footage.
Square footage (x) | Sale Price (y)
-------------------------------------
1000               | 200,000
1500               | 250,000
2000               | 300,000
2500               | 350,000
Using linear regression, we want to find the equation of the line that best fits these data points. After training, the fitted line is:
y = 100x + 100,000
This equation implies that for every additional square foot in the house, the price increases
by $100, and the base price of the house (intercept) is $100,000.
With this model, we can predict the price of a new house with 1800 square feet:
y = 100 × 1800 + 100,000 = 280,000
So, according to the linear regression model, a house with 1800 square feet would have a
predicted sale price of $280,000.
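The same fit can be reproduced with scikit-learn; this is a minimal sketch that trains LinearRegression on the four points from the table and predicts the price of an 1800-square-foot house.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# The four (square footage, sale price) points from the table above
X = np.array([[1000], [1500], [2000], [2500]])
y = np.array([200_000, 250_000, 300_000, 350_000])

model = LinearRegression().fit(X, y)

print(model.coef_[0], model.intercept_)   # ~100.0 and ~100,000: slope and intercept
print(model.predict([[1800]]))            # ~[280000.]: matches the hand calculation
print(r2_score(y, model.predict(X)))      # 1.0, since the data is exactly linear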
Beyond regression, dimensionality reduction is another common tool. Principal component analysis (PCA) projects data onto the directions of greatest variance; let's demonstrate how PCA helps in machine learning with a Matplotlib example using the Iris dataset:
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Load the Iris dataset (four features per sample)
iris = load_iris()
X, y = iris.data, iris.target

# Perform PCA to reduce the four features to two components
pca = PCA(n_components=2)
X_r = pca.fit_transform(X)

# Plot each iris class in a different color in the 2D PCA space
for label, name in enumerate(iris.target_names):
    plt.scatter(X_r[y == label, 0], X_r[y == label, 1], label=name)
plt.legend()
plt.show()
In this example, we load the Iris dataset, which has four features: sepal length, sepal width, petal
length, and petal width. We then perform PCA to reduce the dimensionality to 2 components and
visualize the data in a 2D plot. Each class of iris plants is represented by a different color, and the
points are plotted based on their projections onto the first two principal components.
This visualization helps us understand how well the different classes are separated in the reduced-dimensional space, providing insights into the structure of the data and potentially aiding in the design of machine learning models.
Another fundamental building block is the multilayer neural network, which is organized into the following components:
1. Input Layer : The input layer consists of neurons that represent the input features of the data.
Each neuron in the input layer corresponds to one feature, and the number of neurons in this
layer is determined by the dimensionality of the input data.
2. Hidden Layers : Between the input and output layers, there may be one or more hidden
layers. These hidden layers are where the network learns and extracts features from the input
data. Each neuron in a hidden layer receives inputs from all neurons in the previous layer
and produces an output based on a weighted sum of these inputs, passed through an
activation function.
3. Output Layer : The output layer consists of neurons that produce the final output of the network. The number of neurons in the output layer depends on the nature of the task: binary classification typically uses a single output neuron, multi-class classification uses one neuron per class, and regression problems have a single output neuron.
4. Connections and Weights : Each connection between neurons in adjacent layers is
associated with a weight parameter. These weights determine the strength of the connection
and are adjusted during the training process to minimize the error between the predicted and
actual outputs. Additionally, each neuron may have a bias term that is added to the weighted
sum before passing through the activation function.
5. Activation Function : Neurons in hidden and output layers typically apply an activation
function to the weighted sum of their inputs to introduce non-linearity into the network.
Common activation functions include sigmoid, tanh, ReLU (Rectified Linear Unit), and
softmax.
6. Training : The network is trained using an optimization algorithm such as gradient descent to
minimize a loss function that quantifies the difference between the predicted and actual
outputs. During training, the weights and biases of the network are updated iteratively based
on the gradients of the loss function with respect to these parameters.
7. Backpropagation : Backpropagation is the key algorithm used to train multilayer neural networks. It computes the gradient of the loss function with respect to the weights of the network using the chain rule of calculus and adjusts the weights in the opposite direction of the gradient to minimize the loss (a minimal sketch follows this list).
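To tie components 4 through 7 together, here is a minimal NumPy sketch of a network with one hidden layer: a forward pass using a ReLU hidden activation and a sigmoid output, followed by one backpropagation step and one gradient descent update. The layer sizes, learning rate, and random toy data are assumptions made purely for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Toy batch: 8 samples with 4 input features, binary labels (illustrative)
X = rng.normal(size=(8, 4))
y = rng.integers(0, 2, size=(8, 1)).astype(float)

# Weights and biases for one hidden layer of 5 neurons and one output neuron
W1, b1 = rng.normal(scale=0.5, size=(4, 5)), np.zeros((1, 5))
W2, b2 = rng.normal(scale=0.5, size=(5, 1)), np.zeros((1, 1))
lr = 0.1  # learning rate (an assumed value)

# Forward pass: weighted sums passed through activation functions
z1 = X @ W1 + b1
h = np.maximum(0, z1)             # ReLU hidden activation
z2 = h @ W2 + b2
y_pred = 1 / (1 + np.exp(-z2))    # sigmoid output in (0, 1)

# Backpropagation: apply the chain rule from the cross-entropy loss backward
dz2 = (y_pred - y) / len(X)       # gradient at the output pre-activation
dW2, db2 = h.T @ dz2, dz2.sum(axis=0, keepdims=True)
dh = dz2 @ W2.T
dz1 = dh * (z1 > 0)               # ReLU derivative zeroes inactive neurons
dW1, db1 = X.T @ dz1, dz1.sum(axis=0, keepdims=True)

# Gradient descent: move each parameter a small step against its gradient
for param, grad in [(W1, dW1), (b1, db1), (W2, dW2), (b2, db2)]:
    param -= lr * grad

In practice, a framework such as PyTorch or TensorFlow automates these gradient computations, but the update rule is the same: each weight moves a small step against its gradient.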
Overall, a multilayer neural network can learn complex mappings between inputs and outputs,
making it suitable for a wide range of tasks such as classification, regression, and pattern
recognition. However, designing an optimal architecture and training the network effectively
require careful consideration of factors such as the number of layers, the number of neurons per
layer, choice of activation functions, and regularization techniques.