
Logistic Regression in Machine Learning

Logistic regression is a foundational algorithm in machine learning, primarily used for classification tasks rather than regression, despite its name. It is a type of supervised learning algorithm that predicts the probability of a categorical outcome, most commonly a binary outcome (e.g., yes/no, spam/not spam) [1] [2] [3] [4].

How Logistic Regression Works


Model Structure: Logistic regression models the relationship between one or more
independent variables (features) and a dependent variable (target) using a logistic
(sigmoid) function [1] [2] [3] .
Logistic (Sigmoid) Function: The core of logistic regression is the sigmoid function, which maps any real-valued input into the range (0, 1):

$ \sigma(z) = \frac{1}{1 + e^{-z}} $

where $ z = w_0 + w_1x_1 + w_2x_2 + ... + w_nx_n $, with $ w $ as weights and $ x $ as features [2] [3].
Probability Output: The output of the sigmoid function is interpreted as the probability that
a given input belongs to the positive class. A threshold (commonly 0.5) is used to assign a
class label [1] [3] [5] .
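As a minimal sketch (assuming NumPy; the weights, bias, and feature values below are illustrative), the sigmoid mapping and thresholding look like this:

import numpy as np

def sigmoid(z):
    # Map any real value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights, bias, and one feature vector
w = np.array([0.8, -0.4])
b = 0.1
x = np.array([2.0, 1.0])

p = sigmoid(np.dot(w, x) + b)  # probability of the positive class
label = int(p >= 0.5)          # apply the common 0.5 threshold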

Training the Model


Optimization: The model is trained by finding the optimal weights that minimize the
difference between predicted probabilities and actual class labels. This is typically done
using maximum likelihood estimation or gradient descent, optimizing a loss function such as
cross-entropy (log loss) [1] [2] [5] .
Interpretability: Coefficients in logistic regression indicate the direction and strength of the
relationship between each feature and the outcome, making the model highly
interpretable [3] .

Types of Logistic Regression


Binary Logistic Regression: Predicts one of two possible outcomes (e.g., disease/no
disease) [1] [3] .
Multinomial Logistic Regression: Handles cases where the outcome has more than two
categories.
Ordinal Logistic Regression: Used when the output variable is ordinal (ordered categories).

Key Properties and Assumptions
The dependent variable is categorical (binary or multi-class).
The observations are independent.
There is little or no multicollinearity among the independent variables.
The relationship between the independent variables and the log odds of the dependent
variable is linear [1] .

Practical Example
Suppose you want to predict whether a tumor is cancerous based on its size. You can train a
logistic regression model with tumor size as the feature and cancer status (0 or 1) as the target.
The model will output the probability that a tumor of a given size is cancerous, and you can
classify it based on a chosen threshold [6] .
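As a hedged illustration of this example (the tumor sizes and labels below are made up), scikit-learn's LogisticRegression can be used:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: tumor size in cm, and cancer status (1 = cancerous)
X = np.array([[1.2], [2.1], [2.9], [3.5], [4.4], [5.0], [5.8], [6.3]])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

# Probability that a 4.0 cm tumor is cancerous, then classify at 0.5
prob = model.predict_proba([[4.0]])[0, 1]
print(prob, int(prob >= 0.5))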

Applications
Email spam detection
Credit scoring (default prediction)
Medical diagnosis (disease prediction)
Customer churn prediction [3] [4]

Advantages
Simple and fast to train and implement
Highly interpretable results
Provides probabilistic outputs
Works well for linearly separable data [3]

Limitations
Struggles with non-linear relationships unless features are engineered or transformed
Can be less accurate than more complex models on complex datasets
Sensitive to outliers and irrelevant features [3]

Logistic regression remains a go-to method for many real-world classification problems due to its
speed, interpretability, and solid performance on suitable datasets [3] [4] .

Support Vector Machine (SVM) in Machine Learning
Support Vector Machine (SVM) is a powerful and versatile supervised learning algorithm widely
used for classification, regression, and outlier detection tasks in machine learning [7] [8] [9] .

Core Concept
Objective: SVM aims to find the optimal hyperplane that best separates data points of different classes. The optimal hyperplane is defined as the one with the largest margin: the maximum distance between the hyperplane and the nearest data points from each class, called support vectors [10] [8] [9] [11].
Support Vectors: These are the data points closest to the hyperplane and are critical in
defining the position and orientation of the hyperplane [10] [12] .

How SVM Works


Linear SVM: For data that is linearly separable, SVM finds a straight line (in 2D) or a hyperplane (in higher dimensions) that divides the classes with the largest possible margin [10] [8] [11].
Nonlinear SVM (Kernel Trick): When data is not linearly separable, SVM uses kernel
functions to transform the input data into a higher-dimensional space, where a linear
separator can be found. Common kernels include polynomial, radial basis function (RBF),
and sigmoid [7] [10] [11] .
Soft Margin: In real-world scenarios, perfect separation may not be possible. SVM allows
for some misclassifications (soft margin) to improve generalization and handle noisy data [8]
[13] .
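A short sketch (parameter values are illustrative): in scikit-learn, the soft-margin trade-off is controlled by the C parameter of SVC.

from sklearn.svm import SVC

# Smaller C tolerates more misclassifications (softer margin);
# larger C penalizes them more heavily (harder margin).
soft_clf = SVC(kernel='rbf', C=0.1)
hard_clf = SVC(kernel='rbf', C=100.0)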

Types of SVM
Linear SVM: Uses a linear kernel; suitable for linearly separable data [11].
Nonlinear SVM: Uses kernel functions to handle non-linearly separable data [7] [11].
Support Vector Regression (SVR): Adapts SVM for regression tasks, predicting continuous values [7] [11].
One-class SVM: Used for outlier or anomaly detection [11].

Advantages
Effective in high-dimensional spaces and when the number of features exceeds the number
of samples [14] [13] .
Memory efficient, as only support vectors are used in the decision function [14] .
Versatile, with customizable kernels for different data types and structures [14] [13] .
Robust to overfitting, especially in high-dimensional space due to margin maximization [13].

Limitations
Choosing the right kernel and tuning parameters can be complex [14] [13] .
Computationally intensive for large datasets [14] .
Less interpretable than simpler models like logistic regression [7] .

Applications
Text classification
Image and speech recognition
Medical diagnosis
Bioinformatics (e.g., gene classification) [8] [13] [9] [12]

Example (Python Syntax)

from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy data (assumed here) so the snippet runs end to end
X, y = make_classification(n_samples=100, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = svm.SVC(kernel='linear')  # linear-kernel SVM classifier
clf.fit(X_train, y_train)

This code builds a small toy dataset and fits a linear SVM classifier using the scikit-learn library [10].

SVM remains a go-to algorithm for many real-world problems, especially where high accuracy
and the ability to handle complex, high-dimensional data are required [13] [12] .

Kernel Functions and Kernel SVM in Machine Learning


Kernel functions are mathematical tools that enable machine learning algorithms to handle non-
linear data patterns efficiently by implicitly mapping input data into higher-dimensional spaces.
In Support Vector Machines (SVMs), kernels are pivotal for solving complex classification tasks
that linear models cannot address directly [15] [16] [17] .

What Is a Kernel Function?


A kernel function $ K(x, y) $ computes the similarity between two data points $ x $ and $ y $ in a transformed high-dimensional space, avoiding the need to explicitly calculate the coordinates in that space. It is defined as:

$ K(x, y) = \phi(x) \cdot \phi(y) $

where $ \phi(x) $ maps the input data to a higher-dimensional feature space [18] [16]. This allows algorithms to capture non-linear relationships without the computational overhead of an explicit mapping.

The Kernel Trick in SVMs
The kernel trick is a computational shortcut that enables SVMs to operate in high-dimensional
spaces by replacing direct transformations with kernel-based similarity calculations. Key aspects
include:
Implicit Mapping: Instead of computing $ \phi(x) $, the kernel directly calculates the dot
product in the transformed space [15] [18] .
Efficiency: Avoids the "curse of dimensionality," where high-dimensional computations
become infeasible [16] [17] .
Applications: Enables linear classifiers like SVMs to solve non-linear problems by finding
optimal hyperplanes in the transformed space [15] [17] .
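A small sketch of the trick (NumPy assumed; degree-2 polynomial kernel): the kernel value equals the dot product in the expanded feature space, without ever constructing that space.

import numpy as np

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

# Kernel trick: compute (x . y)^2 directly in the input space
k_value = (x @ y) ** 2

# Explicit degree-2 feature map phi(v) = (v1^2, sqrt(2)*v1*v2, v2^2)
def phi(v):
    return np.array([v[0]**2, np.sqrt(2) * v[0] * v[1], v[1]**2])

explicit = phi(x) @ phi(y)
print(k_value, explicit)  # both print 16.0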

Common Kernel Functions in SVMs


Linear: $ K(x, y) = x \cdot y $. Use case: linearly separable data [17] [19].
Polynomial: $ K(x, y) = (x \cdot y + c)^d $. Use case: captures polynomial relationships (e.g., $ d = 3 $) [17] [19].
Radial Basis Function (RBF): $ K(x, y) = \exp\left(-\gamma |x - y|^2\right) $. Use case: non-linear, high-dimensional data [16] [17].
Sigmoid: $ K(x, y) = \tanh(\alpha x \cdot y + c) $. Use case: neural network-inspired models [17] [20].

Key Properties:
Kernels must satisfy Mercer's theorem (positive semi-definite) [18] .
The choice of kernel and parameters (e.g., $ d $, $ \gamma $) significantly impacts SVM
performance [17] [20] .

Why Use Kernel SVM?


1. Non-Linear Separation: Transforms inseparable data into a linearly separable format (e.g.,
using RBF for complex boundaries) [16] [17] .
2. Computational Efficiency: Reduces complexity from $ O(n^3) $ to $ O(n^2) $ by avoiding
explicit high-dimensional computations [15] [16] .
3. Flexibility: Adaptable to diverse data types (text, images, sequences) via custom
kernels [17] [20].

Applications of Kernel SVM
Image Recognition: Polynomial kernels detect patterns in pixel data [17] [19] .
Text Classification: Linear/RBF kernels categorize documents [17] .
Bioinformatics: Identifying gene expressions with non-linear kernels [16] .

Advantages and Limitations


Advantages:
Handles high-dimensional data efficiently.
Robust to overfitting with proper margins.
Works with sparse or structured data.

Limitations:
Kernel selection and tuning can be complex [17] [20].
Computationally intensive for large datasets [17].
Less interpretable than linear models [16] [17].

Kernel SVMs are a cornerstone of modern machine learning, enabling models to tackle intricate
patterns while balancing accuracy and computational feasibility. By leveraging the kernel trick,
they extend linear methods to non-linear domains, making them indispensable for tasks like
image classification and medical diagnosis [16] [17] [19] .

Neural Network in Machine Learning


A neural network (also known as an artificial neural network or ANN) is a computational model
inspired by the structure and function of the human brain. It consists of interconnected units
called artificial neurons or nodes, which are organized into layers: an input layer, one or more
hidden layers, and an output layer [21] [22] [23] [24] .

How Neural Networks Work


Structure: Each neuron receives input signals, applies weights to them, sums the results,
and passes the sum through an activation function to produce an output. If the output
exceeds a certain threshold, the neuron activates and transmits its signal to the next
layer [21] [22] [23] [24] .
Learning: Neural networks learn from data by adjusting the weights of connections
between neurons. This process is typically done through training, where the network is
exposed to large datasets and iteratively updates its weights to minimize the error between
its predictions and the actual outcomes. The backpropagation algorithm is commonly used
for this purpose, where errors are propagated backward through the network to refine the
weights [24] .
Layers:
Input Layer: Receives raw data (e.g., images, text, numbers).
Hidden Layers: Perform complex transformations and feature extraction.
Output Layer: Produces the final prediction or classification [21] [22] [23] [24] .
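As a minimal sketch (NumPy assumed; the inputs, weights, and bias are illustrative), a single neuron computes a weighted sum plus bias and passes it through an activation function:

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Hypothetical inputs, weights, and bias for one neuron
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.1, -0.6])
b = 0.2

output = relu(np.dot(w, x) + b)  # activation applied to the weighted sum
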
Key Features
Nonlinear Processing: Activation functions (like sigmoid, ReLU, or tanh) allow neural
networks to model complex, nonlinear relationships in data [22] [24] .
Adaptability: Neural networks can adapt to new data and changing patterns, making them
suitable for dynamic environments [25] .
Deep Learning: When a neural network has multiple hidden layers, it is called a deep neural
network (DNN), which is the foundation of deep learning [22] [26] .

Applications
Neural networks are widely used in:
Image and speech recognition
Natural language processing
Predictive analytics
Decision-making systems
Medical diagnosis
Financial forecasting [21] [23] [24] [27]

Advantages
Can learn and model complex, nonlinear relationships
Capable of handling large and high-dimensional datasets
Self-improving through training and exposure to more data

Limitations
Require large amounts of data and computational resources
Can be seen as "black boxes" with limited interpretability
Prone to overfitting if not properly regularized

Neural networks are at the heart of modern machine learning and artificial intelligence, enabling
breakthroughs in fields ranging from computer vision to natural language understanding [21] [22]
[24] .

Perceptron in Machine Learning


A perceptron is the simplest type of artificial neural network and serves as a fundamental
building block for more complex neural networks. It is a supervised learning algorithm primarily
used for binary classification tasks, meaning it decides whether an input belongs to one of two
possible classes [28] [29] [30].

How a Perceptron Works
Inputs and Weights: The perceptron receives several input values, each associated with a
weight. These weights are learned during training [31] [32] .
Weighted Sum: It computes the weighted sum of the inputs and adds a bias term. Mathematically, this can be represented as:

$ z = w_1x_1 + w_2x_2 + ... + w_nx_n + b $

where $ w_i $ are the weights, $ x_i $ are the input features, and $ b $ is the bias [28] [30].
Activation Function: The result is passed through an activation function, typically the
Heaviside step function (also called a threshold function). If the output exceeds a certain
threshold (commonly zero), the perceptron outputs 1; otherwise, it outputs 0 [28] [33] [32] .
Output: The binary output classifies the input as either a positive or negative instance [28]
[31] .

Perceptron Learning Algorithm


1. Initialize the weights and bias, often with small random values [31] [32] .
2. For each training example:
Compute the output using the current weights and bias.
Compare the output to the actual label.
Update the weights and bias if the prediction is incorrect, moving them in the direction
that reduces the error [31] [32] .
3. Repeat until the algorithm converges or a maximum number of iterations is reached.
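A compact sketch of this procedure (NumPy assumed; the learning rate, epoch count, and data are illustrative):

import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    w = np.zeros(X.shape[1])  # initialize weights and bias
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = int(np.dot(w, xi) + b > 0)  # step activation
            error = target - pred
            w += lr * error * xi  # update only when the prediction is wrong
            b += lr * error
    return w, b

# Linearly separable toy data: logical AND
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)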

Key Characteristics
Linear Classifier: The perceptron can only solve problems where the classes are linearly
separable, meaning a straight line (or hyperplane) can separate the two classes [28] [34] [33] .
Single-Layer: It consists of a single layer of computation, distinguishing it from more
complex, multi-layer neural networks [28] [29] .
Foundation for Neural Networks: The perceptron laid the groundwork for the development
of multi-layer perceptrons and deep learning models [30] [35] .

Limitations
Cannot solve non-linearly separable problems (e.g., XOR problem).
Only suitable for binary classification tasks [28] [34] .

Applications
Simple binary classification tasks, such as spam detection or basic image recognition [34]
[30] .
The perceptron remains a cornerstone concept in machine learning, illustrating the principles of
neural computation and supervised learning, and providing the basis for more advanced neural
network architectures [28] [29] [30] .

Multilayer Network in Machine Learning


A multilayer network, most commonly referred to as a multilayer perceptron (MLP) or
multilayer neural network, is a foundational architecture in modern machine learning and deep
learning. Unlike single-layer networks, multilayer networks can model complex, non-linear
relationships in data, making them suitable for a wide range of real-world tasks.

Architecture
A multilayer network is composed of three main types of layers:
Input Layer: Receives the raw input data. Each neuron in this layer corresponds to a feature
or dimension of the input data. The input layer simply passes the data to the next layer
without computation [36] [37] [38] .
Hidden Layers: One or more layers between the input and output layers. Each neuron in a
hidden layer receives inputs from all neurons in the previous layer, computes a weighted
sum plus a bias, and passes the result through a non-linear activation function (e.g., ReLU,
sigmoid, tanh). These layers enable the network to learn hierarchical and abstract
representations of the data, capturing complex patterns that single-layer networks
cannot [36] [37] [38] [39] [40] .
Output Layer: Produces the final predictions. The number of neurons in this layer depends
on the task (e.g., one neuron for binary classification, multiple for multi-class classification or
regression). The output is also passed through an activation function appropriate for the
task (e.g., softmax for classification) [36] [38] [39] .

How Multilayer Networks Work


1. Forward Propagation: Input data is passed through the network layer by layer. Each neuron
in a layer computes a weighted sum of its inputs, adds a bias, and applies an activation
function. The output of one layer becomes the input to the next [39] [41] .
2. Activation Functions: Non-linear functions (such as sigmoid, ReLU, or tanh) introduce non-
linearity, allowing the network to learn complex relationships [36] [39] [41] .
3. Training (Backpropagation): The network is trained using labeled data. It predicts outputs,
calculates the error (loss), and uses backpropagation to update weights and biases in all
layers to minimize this error, typically using an optimization algorithm like stochastic gradient
descent (SGD) [36] [38] [41] [40] .
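As a minimal forward-propagation sketch (NumPy assumed; layer sizes and random weights are illustrative):

import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = rng.normal(size=4)  # input layer: 4 features

W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)  # hidden layer: 8 neurons
h = relu(W1 @ x + b1)  # weighted sum + bias, then activation

W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)  # output layer: 1 neuron
y_hat = sigmoid(W2 @ h + b2)  # probability for binary classification
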
Key Features and Importance
Non-linear Modeling: By stacking multiple layers with non-linear activation functions,
multilayer networks can approximate any continuous function, making them universal
function approximators [36] [40] .
Hierarchical Feature Learning: Each hidden layer extracts increasingly abstract features
from the data, enabling the network to solve complex tasks such as image recognition,
natural language processing, and speech recognition [36] [38] .
Deep Learning Foundation: Multilayer networks are the basis of deep learning, where
networks with many hidden layers (deep neural networks) are used to tackle highly complex
problems [38] [40] .

Applications
Image and speech recognition
Natural language processing
Predictive analytics
Game-playing agents
Financial forecasting [38]

Summary Table
Input Layer: Receives and passes input data to the network
Hidden Layer: Learns intermediate, abstract representations
Output Layer: Produces final prediction or classification

Multilayer networks have revolutionized machine learning by enabling the modeling of complex,
non-linear patterns in data, forming the backbone of modern artificial intelligence and deep
learning applications [36] [38] [40] .

Backpropagation in Machine Learning


Backpropagation is a fundamental algorithm used to train artificial neural networks by
minimizing the error between predicted and actual outputs. It works by efficiently calculating
how the weights and biases in a network should be adjusted to reduce prediction error, making it
essential for deep learning and complex neural network models [42] [43] [44].

How Backpropagation Works
1. Forward Pass:
Input data is fed through the network, layer by layer, to produce an output.
The output is compared to the actual target value, and the error (loss) is calculated
using a cost function [45] [44] .
2. Backward Pass (Error Propagation):
The error is propagated backward from the output layer through the hidden layers to
the input layer.
Using the chain rule from calculus, the algorithm computes the gradient (partial
derivatives) of the loss function with respect to each weight and bias in the network [43]
[46] [44] .

These gradients indicate how much each parameter contributed to the error.
3. Weight Update:
The gradients are used in an optimization algorithm, typically gradient descent, to
update the weights and biases in a direction that reduces the error [42] [43] [44] .
This process is repeated for many iterations (epochs), allowing the network to learn and
improve its predictions over time.
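A minimal sketch of one training step (NumPy assumed; a tiny one-hidden-layer network with sigmoid activations, squared-error loss, and illustrative values):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)  # 2 inputs -> 2 hidden units
W2, b2 = rng.normal(size=(1, 2)), np.zeros(1)  # 2 hidden units -> 1 output
x, target, lr = np.array([0.5, -0.3]), np.array([1.0]), 0.1

# Forward pass
h = sigmoid(W1 @ x + b1)
y_hat = sigmoid(W2 @ h + b2)
loss = 0.5 * np.sum((y_hat - target) ** 2)

# Backward pass: chain rule, starting at the output layer
delta2 = (y_hat - target) * y_hat * (1 - y_hat)  # dLoss/dz at the output
grad_W2 = np.outer(delta2, h)
delta1 = (W2.T @ delta2) * h * (1 - h)  # error propagated to the hidden layer
grad_W1 = np.outer(delta1, x)

# Gradient descent update
W2 -= lr * grad_W2; b2 -= lr * delta2
W1 -= lr * grad_W1; b1 -= lr * delta1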

Key Features
Gradient Calculation: Backpropagation efficiently computes the gradients for all
parameters using the chain rule, making it feasible to train large, multi-layer networks [43]
[46] [44] .

Error Minimization: The algorithm’s goal is to adjust the network’s parameters to minimize
the cost function, thereby improving accuracy [42] [43] [44] .
Supervised Learning: Backpropagation requires labeled data, as it needs the correct
output for each input to compute the error [44] .

Importance and Applications


Backpropagation is the backbone of training deep neural networks and is widely used in
applications like image recognition, natural language processing, and time series
forecasting [44] .
Its efficiency and scalability have enabled the development of modern AI systems capable
of handling complex, non-linear relationships in data [43] [44] .

In summary, backpropagation is a supervised learning algorithm that enables neural networks to learn from data by propagating errors backward and updating weights to minimize prediction errors, making it a cornerstone of modern machine learning [42] [43] [44].

Introduction to Deep Neural Networks in Machine Learning
A deep neural network (DNN) is a type of artificial neural network distinguished by having
multiple layers between the input and output, enabling it to learn and represent highly complex
patterns in data. DNNs are the core architecture behind deep learning, a powerful subset of
machine learning that has driven major advances in artificial intelligence.

Key Concepts of Deep Neural Networks


Structure:
A DNN consists of an input layer (receiving raw data), multiple hidden layers (performing
transformations and feature extraction), and an output layer (producing the final prediction).
The "deep" in DNN refers to the presence of many hidden layers, sometimes dozens or even hundreds, which allows the network to model intricate relationships and hierarchies in the data [47] [48].
Neurons and Layers:
Each layer contains nodes (neurons) that process inputs by applying weights, biases, and
an activation function. These activation functions (like ReLU, sigmoid, or tanh) introduce
non-linearity, enabling the network to learn complex functions [49] [47] .
Learning Process:
DNNs learn by adjusting the weights and biases of their neurons to minimize the difference
between predicted and actual outputs. This is achieved through a process called
backpropagation combined with optimization techniques like gradient descent [49] [47] .
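As a hedged sketch, scikit-learn's MLPClassifier can stand in for a small deep network (the dataset and layer sizes below are illustrative):

from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Three hidden layers make this a (small) deep network; training uses
# backpropagation with a gradient-based optimizer under the hood.
dnn = MLPClassifier(hidden_layer_sizes=(64, 32, 16), activation='relu',
                    max_iter=500, random_state=0)
dnn.fit(X, y)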

Advantages Over Traditional Machine Learning


Automatic Feature Learning:
Unlike traditional machine learning, which often requires manual feature engineering, deep
neural networks automatically learn relevant features from raw data, making them especially
effective for unstructured data like images, audio, and text [49] [50] .
Hierarchical Representation:
Each hidden layer in a DNN learns increasingly abstract representations of the data. For
example, in image recognition, early layers might detect edges, while deeper layers identify
shapes, objects, or faces [48] .

Applications of Deep Neural Networks


DNNs have enabled breakthroughs in areas such as:
Image and speech recognition
Natural language processing (NLP)
Autonomous vehicles
Medical diagnosis
Recommendation systems [47] [48]

Summary Table: Deep Neural Network Architecture


Input Layer: Receives raw data (e.g., pixels, text, audio)
Hidden Layers: Extract features, build complex representations
Output Layer: Produces final prediction or classification

Deep neural networks have transformed machine learning by enabling systems to learn directly from vast, complex datasets, often surpassing human-level performance in tasks like image and speech recognition. Their ability to automatically extract and combine features from raw data is what sets them apart from traditional machine learning approaches [49] [47] [50].

Comparison Between Logistic Regression and Linear Regression


Feature | Linear Regression | Logistic Regression
Purpose | Predicts continuous outcomes (regression tasks) | Predicts categorical outcomes (classification tasks)
Output | Continuous values (can range from $ -\infty $ to $ +\infty $) | Probabilities between 0 and 1; final output is categorical
Dependent Variable Type | Continuous | Categorical (often binary: 0/1, yes/no)
Equation | $ y = w_0 + w_1x_1 + ... + w_nx_n $ | $ p = \frac{1}{1 + e^{-z}} $, where $ z = w_0 + w_1x_1 + ... + w_nx_n $
Model Shape | Straight line (best-fit line) | S-shaped curve (sigmoid/logistic curve)
Relationship Assumption | Assumes a linear relationship between variables | Models the log-odds (logit) of the outcome
Estimation Method | Least squares (minimizes sum of squared errors) | Maximum likelihood estimation (maximizes likelihood)
Evaluation Metrics | R-squared, RMSE, MAE | Accuracy, precision, recall, confusion matrix
Common Use Cases | House price prediction, stock forecasting | Spam detection, disease diagnosis, fraud detection
Interpretability | Direct interpretation of variable impact | Interprets effect on log-odds or probability
Overfitting Risk | Prone to overfitting with irrelevant variables | Handles irrelevant variables better
Cut-off/Threshold | Not applicable | Requires a threshold (e.g., 0.5) for classification

Summary of Key Differences
Nature of Prediction:
Linear regression is used for predicting continuous numeric values, such as prices or
temperatures [51] [52] [53] [54] [55] [56] .
Logistic regression is used for predicting the probability of a categorical event, typically
binary (e.g., yes/no, spam/not spam) [51] [52] [53] [54] [55] [56] .
Mathematical Approach:
Linear regression fits a straight line to the data using a linear equation [57] [56] .
Logistic regression uses the logistic (sigmoid) function to map predictions to
probabilities between 0 and 1 [54] [57] [55] .
Output Interpretation:
Linear regression outputs a numeric value directly [52] [58] [55] .
Logistic regression outputs a probability, which is then mapped to a class label using a
threshold (commonly 0.5) [53] [54] [55] .
Evaluation:
Linear regression is evaluated using metrics like mean squared error (MSE), R-squared,
and root mean squared error (RMSE) [59] [58] [57] .
Logistic regression is evaluated using classification metrics such as accuracy, precision,
recall, and confusion matrix [59] [58] [57] .
Assumptions:
Linear regression assumes a linear relationship between dependent and independent
variables [53] [54] .
Logistic regression does not require a strictly linear relationship and models the log odds
instead [53] [54] .

In summary:
Use linear regression when your target variable is continuous and you need to predict a
numeric value. Use logistic regression when your target variable is categorical (especially
binary) and you need to estimate the probability of class membership [51] [52] [53] [54] [55] [56] .
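A brief sketch contrasting the two in scikit-learn (toy data assumed):

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.arange(10).reshape(-1, 1)

# Linear regression: continuous target -> direct numeric prediction
reg = LinearRegression().fit(X, 2.0 * X.ravel() + 1.0)
print(reg.predict([[12]]))  # a numeric value (about 25)

# Logistic regression: binary target -> probability, then a class label
clf = LogisticRegression().fit(X, (X.ravel() > 4).astype(int))
print(clf.predict_proba([[12]])[0, 1], clf.predict([[12]]))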

1. https://www.spiceworks.com/tech/artificial-intelligence/articles/what-is-logistic-regression/
2. https://www.linkedin.com/pulse/understanding-logistic-regression-machine-learning-aritra-pain
3. https://www.grammarly.com/blog/ai/what-is-logistic-regression/
4. https://www.keboola.com/blog/logistic-regression-machine-learning
5. https://www.ibm.com/think/topics/logistic-regression
6. https://www.w3schools.com/python/python_ml_logistic_regression.asp
7. https://en.wikipedia.org/wiki/Support_vector_machine
8. https://uk.mathworks.com/discovery/support-vector-machine.html
9. https://www.spiceworks.com/tech/big-data/articles/what-is-support-vector-machine/
10. https://www.tutorialspoint.com/introduction-to-support-vector-machines-svm
11. https://www.techtarget.com/whatis/definition/support-vector-machine-SVM
12. https://serokell.io/blog/support-vector-machine-algorithm
13. https://www.analytixlabs.co.in/blog/introduction-support-vector-machine-algorithm/
14. https://scikit-learn.org/stable/modules/svm.html
15. https://dida.do/blog/what-is-kernel-in-machine-learning
16. https://www.appliedaicourse.com/blog/kernel-methods-in-machine-learning/
17. https://data-flair.training/blogs/svm-kernel-functions/
18. https://wikidoc.org/index.php/Kernel_trick
19. https://blog.devgenius.io/machine-learning-algorithm-series-polynomial-kernel-svm-understanding-the-basics-and-applications-89b4b42df137?gi=ad51f19f389d
20. https://techvidvan.com/tutorials/svm-kernel-functions/
21. https://www.ibm.com/think/topics/neural-networks
22. https://en.wikipedia.org/wiki/Neural_network_(machine_learning)
23. https://nordvpn.com/blog/what-is-neural-network/
24. https://cloud.google.com/discover/what-is-a-neural-network
25. https://builtin.com/machine-learning/nn-models
26. https://www.omdena.com/blog/types-of-neural-network-algorithms-in-machine-learning
27. https://www.techtarget.com/searchenterpriseai/definition/neural-network
28. https://en.wikipedia.org/wiki/Perceptron
29. https://www.analytixlabs.co.in/blog/what-is-perceptron/
30. https://www.pickl.ai/blog/perceptron-a-comprehensive-overview/
31. https://www.scaler.com/topics/machine-learning/perceptron-learning-algorithm/
32. https://klu.ai/glossary/perceptron
33. https://www.simplilearn.com/tutorials/deep-learning-tutorial/perceptron
34. https://futurense.com/uni-blog/what-is-perceptron-in-machine-learning
35. https://brilliant.org/wiki/perceptron/
36. https://www.datacamp.com/tutorial/multilayer-perceptrons-in-machine-learning
37. https://alan-turing-institute.github.io/Intro-to-transparent-ML-course/10-deep-cnn-rnn/multilayer-nn.html
38. https://www.devx.com/terms/multi-layer-neural-network/
39. https://web.engr.oregonstate.edu/~huanlian/teaching/ML/2024fall/unit4/multilayer.html
40. https://en.wikipedia.org/wiki/Multilayer_perceptron
41. https://www.youtube.com/watch?v=pzjmmiK1uKg
42. https://www.techtarget.com/searchenterpriseai/definition/backpropagation-algorithm
43. https://www.ibm.com/think/topics/backpropagation
44. https://www.appliedaicourse.com/blog/backpropagation-algorithm-in-machine-learning/
45. https://intellipaat.com/blog/tutorial/artificial-intelligence-tutorial/back-propagation-algorithm/
46. https://www.globaltechcouncil.org/machine-learning/propagation-algorithm/
47. https://botpress.com/blog/deep-neural-network
48. https://data-flair.training/blogs/deep-learning-tutorial/
49. https://www.linkedin.com/pulse/introduction-deep-learning-basics-neural-networks-ibrahim-chaudhry-7my7c
50. https://aws.amazon.com/what-is/neural-network/
51. https://www.spiceworks.com/tech/artificial-intelligence/articles/linear-regression-vs-logistic-regression/
52. https://www.simplilearn.com/tutorials/machine-learning-tutorial/linear-regression-vs-logistic-regression
53. https://aws.amazon.com/compare/the-difference-between-linear-regression-and-logistic-regression/
54. https://www.wallstreetmojo.com/logistic-regression-vs-linear-regression/
55. https://www.upgrad.com/blog/linear-regression-vs-logistic-regression/
56. https://www.linkedin.com/pulse/logistic-regression-vs-linear-understanding-key-erin
57. https://www.coursera.org/articles/linear-regression-vs-logistic-regression
58. https://www.freecodecamp.org/news/linear-regression-vs-logistic-regression/
59. https://enjoymachinelearning.com/blog/linear-vs-logistic-regression/
