Commit a23ddf3

Merge pull request animator#677 from Antiquely3059/main
Added Logistic Regression
2 parents ed0ea4b + 7054742 commit a23ddf3

2 files changed: +116 -0 lines changed

contrib/machine-learning/index.md

@@ -9,3 +9,4 @@
 - [TensorFlow.md](tensorFlow.md)
 - [PyTorch.md](pytorch.md)
 - [Types of optimizers](Types_of_optimizers.md)
+- [Logistic Regression](logistic-regression.md)

@@ -0,0 +1,115 @@
# Logistic Regression

Logistic regression is a statistical method for binary classification. It is a type of regression analysis in which the dependent variable is categorical. This README gives an overview of logistic regression: its fundamental concepts, its assumptions, and how to implement it in Python.

## Table of Contents

1. [Introduction](#introduction)
2. [Concepts](#concepts)
3. [Assumptions](#assumptions)
4. [Implementation](#implementation)
   - [Using Scikit-learn](#using-scikit-learn)
   - [Code Example](#code-example)
5. [Evaluation Metrics](#evaluation-metrics)
6. [Conclusion](#conclusion)

## Introduction

Logistic regression models the probability of a binary outcome from one or more predictor variables (features). It is widely used in medical research, the social sciences, and machine learning, for tasks such as spam detection, fraud detection, and predicting user behavior.

## Concepts

### Sigmoid Function

The logistic regression model uses the sigmoid function to map predicted values to probabilities. The sigmoid function is defined as:

$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$

Where $z = \beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n$ is a linear combination of the input features, and $\sigma(z)$ always lies between 0 and 1.

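As a minimal sketch of the sigmoid function, the snippet below implements it in plain Python. The sign-based branching is an implementation choice made here to avoid overflow for large negative inputs; it is not part of the definition.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number z to a probability between 0 and 1."""
    # Branch on the sign of z so math.exp never receives a large positive
    # argument, which would raise OverflowError.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


for z in (-5.0, 0.0, 5.0):
    print(f"sigmoid({z:+.1f}) = {sigmoid(z):.4f}")
```

At `z = 0` the function returns exactly 0.5, and the outputs for `z` and `-z` always sum to 1.
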
### Odds and Log-Odds

- **Odds**: The odds represent the ratio of the probability of an event occurring to the probability of it not occurring.

  $$\text{Odds} = \frac{P(Y=1)}{P(Y=0)}$$

- **Log-Odds**: The log-odds is the natural logarithm of the odds.

  $$\text{Log-Odds} = \log \left( \frac{P(Y=1)}{P(Y=0)} \right)$$

Logistic regression models the log-odds as a linear combination of the input features.

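The two definitions above can be sketched in a few lines of Python. The function names are illustrative, and the example probability of 0.8 is made up. The last function shows that converting log-odds back to a probability is the sigmoid function.

```python
import math


def odds(p: float) -> float:
    """Odds of an event with probability p, for 0 <= p < 1."""
    return p / (1.0 - p)


def log_odds(p: float) -> float:
    """Natural logarithm of the odds (the logit), for 0 < p < 1."""
    return math.log(odds(p))


def probability_from_log_odds(z: float) -> float:
    """Invert the log-odds; this is the sigmoid function."""
    return 1.0 / (1.0 + math.exp(-z))


p = 0.8  # made-up probability
print(odds(p))                                  # about 4: four times as likely as not
print(log_odds(p))                              # about 1.386, i.e. ln(4)
print(probability_from_log_odds(log_odds(p)))   # recovers about 0.8
```
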
### Model Equation

The logistic regression model equation is:

$$
\log \left( \frac{P(Y=1)}{P(Y=0)} \right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n
$$

Where:
- $\beta_0$ is the intercept.
- $\beta_i$ are the coefficients for the predictor variables $X_i$.

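To make the model equation concrete, the sketch below computes the log-odds for one observation and converts it to a probability. The intercept, coefficients, and feature values are hypothetical numbers chosen only so the arithmetic is easy to follow.

```python
import math


def predict_probability(features, intercept, coefficients):
    """Return P(Y=1) for one observation under a logistic regression model."""
    # Linear predictor (the log-odds): beta_0 + sum over i of beta_i * x_i
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values, chosen only for illustration.
intercept = -1.0
coefficients = [0.5, 2.0]
features = [2.0, 0.0]  # z = -1.0 + 0.5 * 2.0 + 2.0 * 0.0 = 0.0

print(predict_probability(features, intercept, coefficients))  # 0.5, since z = 0
```

Because the model is linear in the log-odds, raising $X_i$ by one unit while holding the other features fixed multiplies the odds by $e^{\beta_i}$.
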
## Assumptions

1. **Linearity**: The log-odds of the response variable are a linear combination of the predictor variables.
2. **Independence**: Observations should be independent of each other.
3. **No Multicollinearity**: Predictor variables should not be highly correlated with each other.
4. **Large Sample Size**: The coefficients are estimated by maximum likelihood, which needs a sufficiently large sample relative to the number of predictors to give reliable estimates.

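One rough way to screen for the multicollinearity assumption is to look at pairwise Pearson correlations between features. The sketch below does this in plain Python. The helper names, the 0.9 threshold, and the data are all assumptions made for illustration, and a pairwise check can miss collinearity that involves three or more features.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def highly_correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for feature pairs with |r| >= threshold."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_correlation(columns[a], columns[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Made-up data: feature2 is exactly 2 * feature1, feature3 is only weakly related.
columns = {
    "feature1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "feature2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "feature3": [3.0, 1.0, 4.0, 1.0, 5.0],
}
for a, b, r in highly_correlated_pairs(columns):
    print(f"{a} and {b}: r = {r:.3f}")
```

Here only the `feature1`/`feature2` pair is flagged; in practice one of the two would typically be dropped or the pair combined before fitting.
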
## Implementation

### Using Scikit-learn

Scikit-learn, a popular Python machine learning library, provides logistic regression through the `LogisticRegression` class in `sklearn.linear_model`.

### Code Example

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load dataset (replace the placeholder path with your own CSV file)
data = pd.read_csv('path/to/your/dataset.csv')

# Define features and target variable (placeholder column names)
X = data[['feature1', 'feature2', 'feature3']]
y = data['target']

# Split data into training (70%) and testing (30%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize and train logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)

print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)
print("Classification Report:\n", class_report)
```

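To illustrate what fitting a logistic regression model involves, here is a from-scratch sketch that uses plain batch gradient descent on the log-loss. This is a swapped-in technique for illustration only: scikit-learn's `LogisticRegression` relies on other solvers and applies L2 regularization by default, so its coefficients will generally differ. The function names, learning rate, iteration count, and data are all assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def fit_logistic_regression(X, y, learning_rate=0.1, n_iterations=2000):
    """Fit intercept and coefficients by batch gradient descent on the log-loss."""
    n_samples, n_features = len(X), len(X[0])
    intercept = 0.0
    coefficients = [0.0] * n_features
    for _ in range(n_iterations):
        grad_intercept = 0.0
        grad_coefficients = [0.0] * n_features
        for row, target in zip(X, y):
            z = intercept + sum(c * x for c, x in zip(coefficients, row))
            error = sigmoid(z) - target  # gradient of the log-loss w.r.t. z
            grad_intercept += error
            for j, x in enumerate(row):
                grad_coefficients[j] += error * x
        # Step against the average gradient over all samples.
        intercept -= learning_rate * grad_intercept / n_samples
        coefficients = [
            c - learning_rate * g / n_samples
            for c, g in zip(coefficients, grad_coefficients)
        ]
    return intercept, coefficients


def predict(X, intercept, coefficients, threshold=0.5):
    """Label a row 1 when its predicted probability reaches the threshold."""
    labels = []
    for row in X:
        z = intercept + sum(c * x for c, x in zip(coefficients, row))
        labels.append(int(sigmoid(z) >= threshold))
    return labels


# Made-up one-feature data with overlapping classes.
X = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

intercept, coefficients = fit_logistic_regression(X, y)
print("Intercept:", intercept)
print("Coefficients:", coefficients)
print("Predictions:", predict(X, intercept, coefficients))
```

The classes overlap on purpose: with perfectly separable data the unregularized coefficients would keep growing without bound, which is one reason library implementations regularize by default.
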
## Evaluation Metrics

- **Accuracy**: The proportion of correctly classified instances among all instances.
- **Confusion Matrix**: A table showing the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
- **Precision, Recall, and F1-Score**: Precision is TP / (TP + FP), the share of predicted positives that are correct. Recall is TP / (TP + FN), the share of actual positives that are found. The F1-score is the harmonic mean of precision and recall.

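The metrics above can be computed by hand from the four confusion-matrix counts. The sketch below does so in plain Python for binary labels 0 and 1. The helper names and the labels are made up for illustration, and returning 0.0 when a denominator is zero is a convention chosen here.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 as a dictionary."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Fall back to 0.0 when a denominator is zero (a convention, not a rule).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 3 TP, 3 TN, 1 FP, 1 FN.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(y_true, y_pred))        # (3, 3, 1, 1)
print(classification_metrics(y_true, y_pred))  # every metric is 0.75 here
```
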
## Conclusion

Logistic regression is a fundamental classification technique that is easy to implement and interpret. It suits binary classification problems and gives a probability for each prediction rather than only a class label.
