Experiment 1


Experiment - 1

Name: Ansari Mohammed Shanouf Valijan
Class: B.E. Computer Engineering, Semester - VII
UID: 2021300004
Batch: M

Aim:
To implement a deep feed-forward neural network for a given problem.

Pre-Lab Questions:
1. What is a deep feed-forward neural network?

A deep feed-forward neural network (DFNN) is a type of artificial neural network where
connections between the nodes do not form cycles. It's called "deep" because it consists of
multiple layers of neurons, including an input layer, several hidden layers, and an output layer.
Each neuron in a layer is connected to every neuron in the subsequent layer, and information
flows in one direction—from input to output—without looping back. The network's depth
and complexity enable it to model and learn complex patterns and relationships in data,
making it a powerful tool for tasks like classification, regression, and more.
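The one-directional flow described above can be sketched in plain Python. This is a minimal illustration, not the implementation used later in this report; the helper names (`dense`, `forward`) and the toy weights are assumptions chosen for the example.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """Fully connected layer: every input feeds every output neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the input through each layer in order; nothing loops back."""
    out = inputs
    for weights, biases, activation in layers:
        out = dense(out, weights, biases, activation)
    return out

# Hypothetical toy network: 2 inputs -> 2 hidden neurons (ReLU) -> 1 output (sigmoid)
toy_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 1.0]], [0.0], sigmoid),
]
toy_output = forward([1.0, 2.0], toy_layers)
print(toy_output)
```

Adding more entries to `toy_layers` makes the network deeper without changing the `forward` loop.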

2. What are activation functions and why are they important?

Activation functions are mathematical functions applied to the output of each neuron in a
neural network, determining whether the neuron should be activated or not. They introduce
non-linearity into the model, enabling the network to learn complex patterns and
relationships in data that linear functions alone cannot capture. Without activation functions,
a stack of layers would collapse into a single linear transformation, however deep the
network, limiting its ability to solve
intricate tasks. Common activation functions include ReLU, sigmoid, and tanh, each offering
different advantages for learning efficiency and model performance. They are crucial for
enabling neural networks to approximate complex functions and perform well on diverse
tasks.
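The following sketch defines the three activation functions named above and illustrates, for scalar layers, why stacking layers without an activation adds no expressive power. The weights `w1`, `b1`, `w2`, `b2` are arbitrary values assumed for the example.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

# Arbitrary weights for two scalar "layers" (assumed for illustration)
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5

def two_linear_layers(x):
    # Layer 2 applied to layer 1, with no activation in between
    return w2 * (w1 * x + b1) + b2

def single_linear_layer(x):
    # The same mapping written as one linear layer
    return (w2 * w1) * x + (w2 * b1 + b2)

def two_layers_with_relu(x):
    # With ReLU in between, the mapping is no longer a single straight line
    return w2 * relu(w1 * x + b1) + b2

for x in (-2.0, 0.0, 3.0):
    print(x, two_linear_layers(x), single_linear_layer(x), two_layers_with_relu(x))
```

The first two columns of output always agree, while the ReLU variant differs wherever the first layer's output is negative.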

3. What is backpropagation and how does it work?

Backpropagation is a supervised learning algorithm used to train artificial neural networks by
minimizing the error between predicted and actual outputs. It works through a process of
updating the weights of the network to reduce this error. The process involves two main
steps:
1. Forward Pass: Input data is passed through the network layer by layer, and predictions
are made using the current weights. The loss (or error) is computed by comparing the
predicted output to the actual target values.
2. Backward Pass: The error is propagated back through the network using the chain rule
of calculus. Gradients of the loss function with respect to each weight are calculated,
indicating how much a weight contributes to the error. These gradients are then used
to update the weights in the opposite direction of the gradient, reducing the error. This
process is repeated iteratively through multiple epochs until the network learns to
make accurate predictions.
Through this iterative process of adjusting weights based on the computed gradients,
backpropagation optimizes the neural network’s performance.
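The two passes described above can be sketched for the simplest possible case: a single sigmoid neuron trained on one example with binary cross-entropy. This is a hand-written illustration under those assumptions; the training example (`x=2.0`, `y=1.0`) and the learning rate are made up, and a real network repeats the same chain-rule step layer by layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y):
    """Binary cross-entropy for one prediction p and target y."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def train_step(w, b, x, y, lr):
    # Forward pass: compute the prediction and the loss
    z = w * x + b
    p = sigmoid(z)
    loss = bce(p, y)

    # Backward pass: chain rule, one factor per step of the forward pass
    dloss_dp = -(y / p) + (1 - y) / (1 - p)
    dp_dz = p * (1 - p)
    dloss_dz = dloss_dp * dp_dz      # algebraically equal to p - y
    grad_w = dloss_dz * x            # dz/dw = x
    grad_b = dloss_dz                # dz/db = 1

    # Update: move each parameter against its gradient
    return w - lr * grad_w, b - lr * grad_b, loss

w, b = 0.0, 0.0
losses = []
for _ in range(50):
    w, b, loss = train_step(w, b, x=2.0, y=1.0, lr=0.1)
    losses.append(loss)

print(losses[0], losses[-1])
```

The recorded loss falls from one iteration to the next, which is the behaviour the iterative weight update is meant to produce.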

4. What is the role of a loss function in training a neural network?

A loss function quantifies the difference between the neural network's predicted outputs and
the actual target values, serving as a measure of model performance. During training, it
provides a numerical value representing how well or poorly the network is performing on a
given task. The goal of training is to minimize this loss by adjusting the network's weights
through optimization algorithms like gradient descent. By minimizing the loss function, the
network learns to make predictions that are closer to the actual values, thereby improving its
accuracy and effectiveness in solving the task at hand.
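As a concrete example of a loss function, here is a minimal sketch of mean binary cross-entropy, the loss used later in this experiment. The function name, the clipping constant `eps`, and the sample predictions are assumptions for illustration; this is not the TensorFlow implementation.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch of predictions."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip so that log(0) can never occur
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical targets and two sets of predicted probabilities
targets = [1, 0, 1, 0]
good_predictions = [0.9, 0.1, 0.8, 0.2]
poor_predictions = [0.4, 0.6, 0.3, 0.7]

good_loss = binary_cross_entropy(targets, good_predictions)
poor_loss = binary_cross_entropy(targets, poor_predictions)
print(good_loss, poor_loss)
```

Predictions closer to the targets yield the smaller loss, which is the signal the optimizer follows during training.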

5. Why is normalization or standardization important before training a neural network?

Normalization or standardization is crucial before training a neural network because it
ensures that input features have similar scales and distributions, which helps in speeding up
convergence and improving the stability of the training process. Normalization typically scales
features to a range, such as [0, 1], while standardization transforms features to have a mean
of zero and a standard deviation of one. Without these preprocessing steps, features with
larger scales could disproportionately influence the model, leading to slower training and less
effective learning. Consistent feature scaling helps the network learn more efficiently by
enabling more uniform gradient updates across all features.
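The two scaling schemes can be compared on a single feature with a short sketch. The helper names and the sample values are assumptions for illustration; both helpers assume the feature is not constant, since a constant feature would cause a division by zero.

```python
import statistics

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale values to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]

# Hypothetical feature column
feature = [10.0, 20.0, 30.0, 40.0, 50.0]

normalized = min_max_normalize(feature)
standardized = standardize(feature)
print(normalized)
print(standardized)
```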

Problem Statement:
Title – Predicting the presence or absence of objects in the Earth's ionosphere based on the
signals received.

Objective – To build a deep feed-forward neural network that takes the 33 signal inputs
obtained from devices monitoring the Earth's ionosphere and outputs the
probability that no object is present in that layer.

Dataset – Ionosphere (https://archive.ics.uci.edu/dataset/52/ionosphere)


Network Architecture –
• Input Layer: 33 neurons.
• Hidden Layer-1: 4 neurons, ReLU activation function.
• Hidden Layer-2: 4 neurons, ReLU activation function.
• Output Layer: 1 neuron, Sigmoid activation function.
Loss Function – Binary Cross Entropy.
Optimizer – Adam.
Performance Metric – Loss, Accuracy.

Tools/Libraries – TensorFlow (for building and training the model), ucimlrepo (for fetching the dataset).
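The size of the architecture listed above can be checked with a little arithmetic: a fully connected layer has one weight per input-output pair plus one bias per output. The helper `dense_params` below is a hypothetical name used only for this sketch.

```python
def dense_params(n_in, n_out):
    """Trainable parameters in a dense layer: weights plus biases."""
    return n_in * n_out + n_out

# Input, hidden layer 1, hidden layer 2, output (from the architecture above)
layer_sizes = [33, 4, 4, 1]

params_per_layer = [
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)  # [136, 20, 5]
print(total_params)      # 161
```

With so few trainable parameters the network is very small, which suits a dataset of only a few hundred rows.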

Implementation:
The following is the step-by-step implementation of the above problem statement as carried out
on Google Colab (Notebook - FeedForwardNetwork).

Importing necessary libraries-


from ucimlrepo import fetch_ucirepo
import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers

Loading the Data-


# fetch dataset
ionosphere = fetch_ucirepo(id=52)

# data (as pandas dataframes)
X = ionosphere.data.features
y = ionosphere.data.targets

Preprocessing the Data-


a. Concatenating the features and output dataframes to preprocess simultaneously-
df = pd.concat([X,y], axis=1)

b. Encoding the output class ‘g’ as 0 (if object is present) and ‘b’ as 1 (if no detection of object)-
df['Class'] = df['Class'].map({'g': 0, 'b': 1})

c. Normalizing the data and splitting it into train and validation sets (7:3 split)-
# 70% of the rows for training, the remaining 30% for validation
df_train = df.sample(frac=0.7, random_state=0)
df_valid = df.drop(df_train.index)

# Min-max scale both sets using statistics from the training set only
max_ = df_train.max(axis=0)
min_ = df_train.min(axis=0)
df_train = (df_train - min_) / (max_ - min_)
df_valid = (df_valid - min_) / (max_ - min_)

# A column whose training max equals its min divides by zero; drop columns containing NaN
df_train.dropna(axis=1, inplace=True)
df_valid.dropna(axis=1, inplace=True)

X_train = df_train.drop('Class', axis=1)
X_valid = df_valid.drop('Class', axis=1)
y_train = df_train['Class']
y_valid = df_valid['Class']

Building the feed-forward neural network-


model = keras.Sequential([
    layers.Dense(4, activation='relu', input_shape=[33]),
    layers.Dense(4, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['binary_accuracy'],
)

Training the model (using early stopping to avoid over-fitting)-


early_stopping = keras.callbacks.EarlyStopping(
    patience=10,
    min_delta=0.001,
    restore_best_weights=True,
)

history = model.fit(
    X_train, y_train,
    validation_data=(X_valid, y_valid),
    batch_size=512,
    epochs=1000,
    callbacks=[early_stopping],
    verbose=0,
)
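The idea behind the `patience` and `min_delta` settings used above can be sketched in plain Python. This is a simplified illustration of the stopping rule, not the Keras implementation; the function name and the sequence of validation losses are assumptions made up for the example.

```python
def early_stopping_epoch(val_losses, patience, min_delta):
    """Return (stop_epoch, best_epoch) as 0-based epoch indices.

    An epoch counts as an improvement only if its loss beats the best
    loss so far by more than min_delta. Training stops once `patience`
    consecutive epochs pass without such an improvement.
    """
    best = float('inf')
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses, one per epoch
val_losses = [0.70, 0.60, 0.55, 0.5495, 0.56, 0.57, 0.58]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3, min_delta=0.001)
print(stop_epoch, best_epoch)  # 5 2
```

In this sketch, restoring the best weights would correspond to rolling the model back to its state at `best_epoch` after stopping.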

Evaluating the model on validation set-


history_df = pd.DataFrame(history.history)
# Start the plot at epoch 5
history_df.loc[5:, ['loss', 'val_loss']].plot()

print(("Best Validation Loss: {:0.4f}" +
       "\nBest Validation Accuracy: {:0.4f}")
      .format(history_df['val_loss'].min(),
              history_df['val_binary_accuracy'].max()))

Best Validation Loss: 0.3532
Best Validation Accuracy: 0.8476

Evolution of training loss (blue line) and validation loss (orange line) as epochs progress

Making predictions on trained model-


input_data = pd.DataFrame(X_valid.loc[23])

predicted_output = model.predict(input_data.T)

print(f'True Output ----> {y_valid.loc[23]}, Predicted Output ----> {round(predicted_output[0][0])}')
print('0 ----> Presence of Object in Ionosphere, 1 ----> Absence of Object in Ionosphere')

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 31ms/step
True Output ----> 1.0, Predicted Output ----> 1
0 ----> Presence of Object in Ionosphere, 1 ----> Absence of Object in Ionosphere

input_data = pd.DataFrame(X_valid.loc[0])

predicted_output = model.predict(input_data.T)

print(f'True Output ----> {y_valid.loc[0]}, Predicted Output ----> {round(predicted_output[0][0])}')
print('0 ----> Presence of Object in Ionosphere, 1 ----> Absence of Object in Ionosphere')

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 31ms/step
True Output ----> 0.0, Predicted Output ----> 0
0 ----> Presence of Object in Ionosphere, 1 ----> Absence of Object in Ionosphere
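The single-sample checks above turn a sigmoid probability into a class label. The same step can be sketched for a whole batch, together with an accuracy calculation. The probabilities and labels below are made-up values, not outputs of the trained model, and the helper names are assumptions for this example.

```python
def to_labels(probabilities, threshold=0.5):
    """Map each probability to class 1 if it exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in probabilities]

def binary_accuracy(y_true, y_pred_labels):
    """Fraction of predicted labels that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred_labels) if t == p)
    return matches / len(y_true)

# Hypothetical sigmoid outputs and true labels for four samples
probabilities = [0.91, 0.12, 0.50, 0.67]
true_labels = [1, 0, 1, 1]

predicted_labels = to_labels(probabilities)
accuracy = binary_accuracy(true_labels, predicted_labels)
print(predicted_labels, accuracy)
```

Evaluating the whole validation set this way gives a fuller picture than checking individual rows.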

Inferences:
• The trained model reached a validation accuracy of 84.76% at around 600
epochs, after which no significant change in accuracy was observed for the
architecture with 2 hidden layers of 4 neurons each.
• Similarly, the validation loss stabilized at 0.3532.
• The new architecture (as described above) showed an improvement in accuracy from
75% (as implemented during lab) to 84.76% (with an additional hidden layer).
• When the trained model was used to predict the presence of objects in the ionosphere, its
predictions matched the true output for both validation samples shown in the last section of
the implementation.
