Python Endsem PPT

Uploaded by

dhanrajkrbksc

Understanding Chi-Square, Normal Distribution, and Z-Score Standardization in Python

@Sumitra Pun
@Dhanraj Kumar
Introduction

Agenda

• Chi-Square for Tables
• Understanding the Normal Distribution
• Z-Score Standardization
• Comparison and Key Takeaways

Why These Topics?

These statistical concepts are essential for data-driven decision making across industries. Mastering them will elevate your analytical capabilities.
Chi-Square for Tables
What is Chi-Square?
Chi-square is a statistical test used to determine whether there is a significant difference between observed and expected frequencies in one or more categories.

When to Use It
Chi-square is commonly used to analyze contingency tables, which show the relationship between two categorical variables.

Key Assumptions
Chi-square requires that expected frequencies are at least 5 in at least 80% of cells, and that no cell has an expected frequency less than 1.

Interpretation
The chi-square statistic and p-value indicate whether the observed differences between groups are statistically significant.
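The expected-frequency rule of thumb under Key Assumptions can be checked before running the test. The sketch below is one possible way to do it, not part of the original slides: the helper names `expected_counts` and `meets_rule_of_thumb` are hypothetical, and the second table is made-up data chosen to fail the check.

```python
import numpy as np


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    observed = np.asarray(table, dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    return row_totals * col_totals / observed.sum()


def meets_rule_of_thumb(table):
    """True if at least 80% of expected counts are >= 5 and none is below 1."""
    expected = expected_counts(table)
    share_at_least_5 = np.mean(expected >= 5)
    return bool(share_at_least_5 >= 0.8 and expected.min() >= 1)


print(meets_rule_of_thumb([[10, 20], [30, 40]]))  # expected counts are 12, 18, 28, 42
print(meets_rule_of_thumb([[1, 2], [3, 40]]))     # hypothetical sparse table
```

When the check fails, the chi-square approximation may be unreliable for that table.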
Calculating Chi-Square in Python

1 Load Data
Import a contingency table as a Pandas DataFrame, or build it directly as a NumPy array as in the code on this slide.

2 Calculate Chi-Square
Use the chi2_contingency() function from the scipy.stats module.

    from scipy.stats import chi2_contingency
    import numpy as np

    # 2x2 contingency table of observed counts
    table = np.array([[10, 20], [30, 40]])

    # Returns the statistic, p-value, degrees of freedom, and expected counts
    stat, p, dof, expected = chi2_contingency(table)
    print(f"Chi-Square Stat: {stat}, p-value: {p}")

3 Interpret Results
Examine the chi-square statistic and p-value to determine statistical significance.
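The Interpret Results step usually comes down to comparing the p-value with a chosen significance level. A minimal sketch follows, assuming a level of 0.05 (the constant `ALPHA` is an assumption, not something the slides specify) and reusing the slide's 2x2 table.

```python
import numpy as np
from scipy.stats import chi2_contingency

ALPHA = 0.05  # assumed significance level; choose one suited to your study

table = np.array([[10, 20], [30, 40]])
stat, p, dof, expected = chi2_contingency(table)

# For 2x2 tables, chi2_contingency applies Yates' continuity correction
# by default (correction=True), which lowers the statistic slightly.
if p < ALPHA:
    conclusion = "reject the null hypothesis of independence"
else:
    conclusion = "fail to reject the null hypothesis of independence"

print(f"chi2={stat:.3f}, dof={dof}, p={p:.3f}: {conclusion}")
```

For this table the p-value is well above 0.05, so the data give no evidence of an association between the two variables.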
Understanding the Normal Distribution

1 What is the Normal Distribution?
A symmetrical, bell-shaped probability distribution that is commonly used in statistics.

2 Properties
It is defined by the mean (μ) and standard deviation (σ), and has a characteristic bell curve shape.

3 Importance
Many natural phenomena and statistical data approximately follow the normal distribution, making it a foundational concept.

4 Applications
Widely used in hypothesis testing, regression analysis, and other statistical modeling techniques.
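The symmetry and the role of μ and σ can be seen numerically. The sketch below uses `statistics.NormalDist` from the Python standard library rather than the NumPy and SciPy tools the slides use; the standard normal (μ = 0, σ = 1) is chosen only for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard = NormalDist(mu=0, sigma=1)

peak = standard.pdf(0)                                  # density is highest at the mean
left, right = standard.pdf(-1), standard.pdf(1)         # equal by symmetry
within_one_sigma = standard.cdf(1) - standard.cdf(-1)   # probability within 1 std of the mean

print(f"density at the mean: {peak:.4f}")
print(f"density at -1 and +1: {left:.4f}, {right:.4f}")
print(f"P(-1 < X < 1): {within_one_sigma:.4f}")
```

Roughly 68% of the probability lies within one standard deviation of the mean, whatever values μ and σ take.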
Visualizing Normal Distribution in Python

1 Generate Random Data
Use NumPy to create a sample from a normal distribution.

2 Plot the Distribution
Visualize the normal distribution using Matplotlib or Seaborn.

    import numpy as np
    import matplotlib.pyplot as plt

    # Draw 1000 samples from a normal distribution
    data = np.random.normal(0, 1, 1000)  # Mean=0, Std=1

    # density=True scales the histogram so its total area is 1
    plt.hist(data, bins=30, density=True)
    plt.title("Normal Distribution")
    plt.show()

3 Annotate the Plot
Add labels, title, and highlight key features like the mean and standard deviation.
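The Annotate the Plot step is described on the slide but not shown in its code. One possible sketch follows; the helper name `plot_annotated_hist`, the colors, the axis labels, and the fixed random seed are all illustrative choices, not taken from the slides.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_annotated_hist(data, bins=30):
    """Histogram with axis labels and vertical lines at the mean and mean +/- 1 std."""
    mean, std = np.mean(data), np.std(data)
    fig, ax = plt.subplots()
    ax.hist(data, bins=bins, density=True, alpha=0.6)
    ax.axvline(mean, color="black", linestyle="-", label=f"mean = {mean:.2f}")
    ax.axvline(mean - std, color="gray", linestyle="--",
               label=f"mean ± 1 std (std = {std:.2f})")
    ax.axvline(mean + std, color="gray", linestyle="--")
    ax.set_xlabel("Value")
    ax.set_ylabel("Density")
    ax.set_title("Normal Distribution")
    ax.legend()
    return fig, ax


rng = np.random.default_rng(0)    # fixed seed, chosen only for repeatability
sample = rng.normal(0, 1, 1000)   # Mean=0, Std=1
fig, ax = plot_annotated_hist(sample)
```

Call `plt.show()` afterwards to display the figure in an interactive session.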
Z-Score Standardization
What is a Z-Score?
A z-score represents how many standard deviations a data point is from the mean, allowing for direct comparisons across different data sets.

Why Use Z-Scores?
Z-scores are essential for normalization, outlier detection, and comparing variables with different scales or units.
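To illustrate comparison across different scales, the sketch below standardizes one score from each of two made-up data sets. Both score lists and the helper `z_score` are hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed.

```python
from statistics import mean, pstdev


def z_score(x, values):
    """Number of population standard deviations that x lies from the mean of values."""
    return (x - mean(values)) / pstdev(values)


math_scores = [55, 60, 65, 70, 75]  # hypothetical scores out of 100
essay_scores = [3, 4, 5, 6, 7]      # hypothetical scores out of 10

z_math = z_score(75, math_scores)
z_essay = z_score(6, essay_scores)

print(f"math z-score:  {z_math:.2f}")
print(f"essay z-score: {z_essay:.2f}")
```

The raw scores 75 and 6 cannot be compared directly, but their z-scores can: here the math score sits further above its group mean than the essay score does.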
Z-Score Calculation in Python
Calculate Mean
Determine the mean of the data set using NumPy.

Calculate Standard Deviation
Calculate the standard deviation of the data set using NumPy.

Apply Z-Score Formula
For each data point, subtract the mean and divide by the standard deviation.

    import numpy as np

    data = [10, 20, 30, 40, 50]
    mean = np.mean(data)
    std = np.std(data)  # population standard deviation (ddof=0 by default)
    z_scores = [(x - mean) / std for x in data]
    print(z_scores)

Interpret Z-Scores
Examine the z-scores to identify outliers and compare data points across different variables.
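Outlier identification with z-scores can be sketched as follows. The cutoff of 3 is a common rule of thumb rather than something the slides specify, and the function `flag_outliers` and the `readings` data are hypothetical.

```python
import numpy as np


def flag_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    arr = np.asarray(values, dtype=float)
    z = (arr - arr.mean()) / arr.std()  # population std, matching np.std's default
    return arr[np.abs(z) > threshold].tolist()


# Hypothetical measurements with one unusually large value
readings = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
            12, 10, 11, 13, 12, 11, 12, 10, 13, 50]
print(flag_outliers(readings))
```

In very small samples, such as the five-point example on the slide, no point can reach |z| = 3, so a lower threshold or a different method may be more appropriate there.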
Comparison and Key Takeaways
Chi-Square vs. Normal Distribution
Chi-square is used for categorical data, while the normal distribution is used for continuous data. Both are essential for statistical inference.

Importance of Z-Scores
Z-scores enable standardization, outlier detection, and meaningful comparisons across different variables and data sets.

Python for Statistical Analysis
Python's powerful data analysis libraries like NumPy, SciPy, and Pandas make it an excellent tool for implementing these statistical concepts.

Key Takeaways
Mastering chi-square, normal distribution, and z-scores will strengthen your data analysis skills and empower data-driven decision making.
Thank You!
