Building Sigmoid Activation Function from Scratch in Python

Without relying on high-level libraries

ANSHUMAN JHA

Table of Contents
1. Introduction to Sigmoid Activation Function
a. What are Activation Functions?
b. Why are Activation Functions Used?
c. Types of Activation Functions:
d. Choosing the Right Activation Function:
2. What is Sigmoid Activation Function?
3. Comparison of ReLU, Leaky ReLU, and Sigmoid Activation Functions
4. The Structure of a Sigmoid Activation Function
5. Implementing the Sigmoid Activation Function in Python
a. Step 1: Import Necessary Libraries
b. Step 2: Define the Sigmoid Function
c. Step 3: Derivative of the Sigmoid Function
d. Step 4: Plot the Sigmoid Function and Its Derivative
6. Conclusion

1. Introduction to Sigmoid Activation Function
Activation functions are a crucial component of artificial neural networks, playing a key role in enabling
neural networks to learn complex patterns and make intelligent decisions. The sigmoid function is a type of
activation function that maps any real-valued number to a value between 0 and 1. This is particularly useful in
binary classification problems, where the output can be interpreted as a probability.

In this post, we'll explore the fundamental concepts behind the sigmoid function and implement it from scratch in
Python, without relying on high-level libraries. We'll also provide sample code with detailed comments and
visualizations to help you understand each step.

What are Activation Functions?

In simple terms, an activation function in a neural network decides whether a neuron should be "activated" or
not, based on the input it receives. Imagine a neuron as a light bulb – the activation function determines if the
bulb should light up or stay off.

Why are Activation Functions Used?

1. Introducing Non-linearity: Without activation functions, neural networks would essentially be performing
just linear transformations. Real-world data often exhibits non-linear relationships, meaning a straight line can't
accurately represent the patterns. Activation functions introduce non-linearity, allowing the network to model
and learn these complex relationships.

2. Decision Boundary Creation: Activation functions help neural networks create decision boundaries. For
example, in image classification, an activation function can help the network decide whether a picture is of a cat
or a dog by drawing a boundary between the two categories based on learned features.

3. Controlling Neuron Output: Activation functions control the output of a neuron, keeping it within a desired
range. This is important for stability and efficiency during training.

Types of Activation Functions:

There are various types of activation functions, each with its own characteristics and applications, including:

* Sigmoid: Outputs a value between 0 and 1, historically used for binary classification.
* ReLU (Rectified Linear Unit): Outputs the input directly if positive, otherwise 0. Very popular due
to its computational efficiency.
* Tanh (Hyperbolic Tangent): Similar to sigmoid but outputs values between -1 and 1.
* Softmax: Used in the output layer for multi-class classification, providing probabilities for each class.

Choosing the Right Activation Function:

The choice of activation function depends on the specific task, network architecture, and other factors.
Experimentation and research are often needed to find the most suitable one for a particular problem.

In essence, activation functions are the "brain" behind a neural network's decision-making process, enabling it
to learn intricate patterns and make accurate predictions.


2. What is the Sigmoid Activation Function?

The sigmoid function, also called the logistic function, is a mathematical function that takes any real number as
input and outputs a value between 0 and 1.
Here's a breakdown:

Formula: The sigmoid function is defined by the following formula:

sigmoid(x) = 1 / (1 + exp(-x))

where 'x' is the input value and 'exp(-x)' represents the exponential function e raised to the power of -x.

Graph: The graph of the sigmoid function is an "S"-shaped curve that smoothly transitions from 0 to 1 as x
increases.

Key Properties:

Output Range: The output of the sigmoid function is always between 0 and 1. This makes it useful for
representing probabilities or values that need to be bounded within a specific range.

Smoothness: The sigmoid function is differentiable, meaning its derivative exists at all points. This smoothness
is important for optimization algorithms used in machine learning.

Non-linearity: The sigmoid function is non-linear. This property is crucial in neural networks, allowing them
to learn complex patterns and relationships in data.
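
To make these properties concrete, here is a minimal sketch using only Python's standard library (the full
numpy implementation follows in Section 5):

import math

def sigmoid(x):
    # Logistic function: maps any real number into (0, 1)
    return 1 / (1 + math.exp(-x))

# Bounded output that saturates toward 0 and 1 at the extremes
for x in [-10, -1, 0, 1, 10]:
    print(f"sigmoid({x:>3}) = {sigmoid(x):.6f}")
# sigmoid(-10) = 0.000045
# sigmoid( -1) = 0.268941
# sigmoid(  0) = 0.500000
# sigmoid(  1) = 0.731059
# sigmoid( 10) = 0.999955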

Common Uses of the Sigmoid Function:

* Logistic Regression: Used to model the probability of a binary outcome (e.g., yes/no, true/false).
* Neural Networks: Commonly used as an activation function in the neurons of artificial neural networks. It
introduces non-linearity into the model, allowing it to learn complex relationships.
* Image Classification: In image classification, sigmoid can be used to output the probability of an image
belonging to a particular class.


3. Comparison of ReLU, Leaky ReLU, and Sigmoid Activation Functions

ReLU
  Formula: f(x) = max(0, x)
  Range: [0, ∞)
  Derivative: 1 (x > 0), 0 (x <= 0)
  Advantages: Simple and computationally efficient; solves the vanishing gradient problem for positive values
  Disadvantages: "Dying ReLU" problem: neurons can get stuck for negative inputs
  Common use cases: Widely used in various deep learning tasks; often preferred as the default activation function

Leaky ReLU
  Formula: f(x) = 0.01x for x < 0; x for x >= 0
  Range: (-∞, ∞)
  Derivative: 1 (x > 0), 0.01 (x <= 0)
  Advantages: Solves the vanishing gradient problem for all values; often performs better than ReLU in practice
  Disadvantages: Performance can be sensitive to the leak coefficient (usually 0.01)
  Common use cases: Used when dealing with sparse gradients or when the vanishing gradient problem is a concern

Sigmoid
  Formula: f(x) = 1 / (1 + exp(-x))
  Range: (0, 1)
  Derivative: f(x) * (1 - f(x))
  Advantages: Smooth and differentiable; outputs can be interpreted as probabilities
  Disadvantages: Vanishing gradient problem for very large/small inputs; computationally more expensive than ReLU and Leaky ReLU
  Common use cases: Binary classification (output layer); cases where probability outputs are desired

Key Considerations:
o Vanishing Gradient Problem: Refers to the issue where gradients become very small during
backpropagation, hindering learning. ReLU can suffer from this for negative values, while Sigmoid
suffers for very large/small values. Leaky ReLU attempts to mitigate this.
o Computational Cost: ReLU and Leaky ReLU are computationally cheaper than Sigmoid due to their
simpler formulas.
o Output Interpretation: Sigmoid outputs probabilities, making it suitable for binary classification.
ReLU and Leaky ReLU are not bound to a probability range.
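
To make the comparison concrete, here is a minimal numpy sketch of all three functions from the comparison
above (the 0.01 leak coefficient is the usual default noted earlier):

import numpy as np

def relu(x):
    # max(0, x), applied elementwise
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    # x for x >= 0, alpha * x for x < 0
    return np.where(x >= 0, x, alpha * x)

def sigmoid(x):
    # 1 / (1 + exp(-x)), bounded in (0, 1)
    return 1 / (1 + np.exp(-x))

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(relu(x))        # [0. 0. 0. 1. 5.]
print(leaky_relu(x))  # [-0.05 -0.01  0.    1.    5.  ]
print(sigmoid(x))     # [0.00669285 0.26894142 0.5 0.73105858 0.99330715]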

4. The Structure of a Sigmoid Activation Function
This structure includes the steps and sub-steps, with appropriate labels and connections, of the implementation
that follows: importing the necessary libraries, defining the sigmoid function, implementing its derivative, and
plotting both. Each step corresponds to a function or a key part of the process described below.

5. Implementing the Sigmoid Activation Function in Python
Let's implement a simple sigmoid activation function in Python, step by step.

Step 1: Import Necessary Libraries


We'll start by importing the necessary libraries. For this implementation, we'll only need numpy for numerical
computations and matplotlib for visualizations.

import numpy as np
import matplotlib.pyplot as plt

Step 2: Define the Sigmoid Function


Next, we'll define the sigmoid function using the mathematical formula mentioned above.

def sigmoid(x):
    """
    Compute the sigmoid of x.

    Parameters:
    x (float or numpy array): Input value or array of values.

    Returns:
    float or numpy array: Sigmoid of x.
    """
    return 1 / (1 + np.exp(-x))

Explanation:
• Parameters: The function takes a single parameter x, which can be a float or a numpy array.
• Returns: The function returns the sigmoid of x.
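
Because the implementation uses numpy, it works on both scalars and whole arrays at once. As a quick sanity
check (assuming the function above is in scope):

print(sigmoid(0))                          # 0.5, exactly halfway
print(sigmoid(np.array([-2.0, 0.0, 2.0]))) # [0.11920292 0.5 0.88079708]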
Step 3: Derivative of the Sigmoid Function
The derivative of the sigmoid function is useful for backpropagation in neural networks. The derivative is given
by:
σ′(x) = σ(x) ⋅ (1 − σ(x))
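
This identity follows directly from the chain rule applied to σ(x) = 1 / (1 + exp(-x)):

σ′(x) = exp(-x) / (1 + exp(-x))²
      = σ(x) ⋅ (exp(-x) / (1 + exp(-x)))
      = σ(x) ⋅ (1 − σ(x)),

since exp(-x) / (1 + exp(-x)) = 1 − 1 / (1 + exp(-x)) = 1 − σ(x). A convenient consequence is that the
derivative can be computed from the sigmoid's own output, with no extra exponentials.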

Implementing the Derivative

def sigmoid_derivative(x):
    """
    Compute the derivative of the sigmoid of x.

    Parameters:
    x (float or numpy array): Input value or array of values.

    Returns:
    float or numpy array: Derivative of the sigmoid of x.
    """
    sig = sigmoid(x)
    return sig * (1 - sig)

Explanation:
• Parameters: The function takes a single parameter x, which can be a float or a numpy array.
• Returns: The function returns the derivative of the sigmoid of x.
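
As a quick check (assuming both functions above are defined), the analytical derivative can be compared
against a central finite-difference approximation; note that it peaks at 0.25 at x = 0:

h = 1e-5
for x in [-2.0, 0.0, 2.0]:
    # Central finite difference approximates the true derivative
    numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
    print(f"x = {x:>4}: analytical = {sigmoid_derivative(x):.6f}, numeric = {numeric:.6f}")
# x = -2.0: analytical = 0.104994, numeric = 0.104994
# x =  0.0: analytical = 0.250000, numeric = 0.250000
# x =  2.0: analytical = 0.104994, numeric = 0.104994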

Step 4: Plot the Sigmoid Function and Its Derivative
To understand the behavior of the sigmoid function and its derivative, we can visualize them by first defining
an input range using `numpy.linspace` to create an array of values. Next, we compute the sigmoid and its
derivative for each input value. Finally, using `matplotlib`, we plot both the sigmoid function and its derivative
on the same graph, allowing us to observe their relationship visually.

• Define the Input Range
o We'll define a range of input values for which we want to plot the sigmoid function and its derivative.
• Compute the Sigmoid and Its Derivative
o We'll compute the sigmoid and its derivative for the defined range of input values.
• Plot the Sigmoid Function and Its Derivative
o We'll plot the sigmoid function and its derivative on the same graph.

# Define input range
x_values = np.linspace(-10, 10, 100)

# Compute sigmoid and its derivative
sigmoid_values = sigmoid(x_values)
derivative_values = sigmoid_derivative(x_values)

# Plot sigmoid function and its derivative
plt.figure(figsize=(10, 5))
plt.plot(x_values, sigmoid_values, label='Sigmoid Function')
plt.plot(x_values, derivative_values, label='Derivative of Sigmoid', linestyle='--')
plt.title('Sigmoid Function and Its Derivative')
plt.xlabel('x')
plt.ylabel('Value')
plt.legend()
plt.grid(True)
plt.show()

Explanation:
• Input Range: We use numpy.linspace to create an array of 100 values ranging from -10 to 10.
• Compute Values: We compute the sigmoid and its derivative for these input values.
• Plot: We use matplotlib to plot the sigmoid function and its derivative on the same graph.
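
Notice in the plot that the derivative peaks at 0.25 at x = 0 and approaches 0 as |x| grows. This is the
vanishing gradient behavior for very large or very small inputs discussed in Section 3.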


6. Conclusion
In this post, we implemented the sigmoid activation function and its derivative from scratch in Python. We also
visualized the function and its derivative to better understand their behavior. The sigmoid function is
fundamental in machine learning, especially in neural networks, due to its ability to map inputs to a (0, 1)
range, making it suitable for binary classification problems.
Understanding and implementing activation functions like the sigmoid from scratch helps in building a strong
foundation in machine learning and deep learning.

Constructive comments and feedback are welcome.
