Practice Questions Answers IA

The document discusses exploratory data analysis (EDA), which involves examining a dataset to uncover patterns and insights. EDA consists of three stages - summarization using descriptive statistics like mean and median, visualization using charts and plots, and normalization to adjust data scales. An example of calculating mean, median and mode for customer age, height and weight data is provided.

Uploaded by

studyacc.tc

Practice Questions IA-1

Q1. What do you mean by EDA?

Q2. Explain the three stages of Exploratory Data Analysis.

Q3. With the help of an example, explain any one of the techniques of EDA.

Ans.

Exploratory Data Analysis (EDA) is a crucial initial step in the data analysis process that
involves examining and understanding a dataset to uncover insights, patterns, and anomalies.
EDA helps analysts gain a better understanding of the data's structure, distribution, and
characteristics before applying more advanced statistical or machine learning techniques.
EDA typically consists of three main stages: summarization, visualization, and normalization.

1. Summarization:

• In this stage, the aim is to summarize the main characteristics of the dataset using
descriptive statistics and metrics. This provides a high-level overview of the data.
• Common summarization techniques include:
  ◦ Mean: The sum of all values divided by the total count.
  ◦ Median: The middle value in a sorted dataset.
  ◦ Mode: The most frequently occurring value in a dataset.
  ◦ Variance: The average of the squared deviations from the mean; it measures how
    far the values spread around the mean.
  ◦ Standard Deviation: The square root of the variance, expressed in the same units
    as the data.
  ◦ Quartiles: The three values (Q1, Q2, Q3) that divide a sorted dataset into four
    equal parts; Q2 is the median.
• Example: Suppose we have a dataset of customer age, height, and weight. We can
calculate the mean, median, and mode of each column to summarize it, and use one of
these values to fill in any missing/null values.
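
The summarization measures listed above can be sketched in a few lines of Python. This is only an illustration: the `ages` list is made-up sample data, `None` is assumed to mark a missing value, and the population formulas (`pvariance`, `pstdev`) are used; `statistics.variance` and `statistics.stdev` would give the sample versions instead.

```python
import statistics

# Hypothetical customer ages; None marks a missing value (an assumption).
ages = [25, 30, 30, None, 35, 40, 30, 45]

# Descriptive statistics are computed on the known values only.
known = [a for a in ages if a is not None]

mean_age = statistics.mean(known)
median_age = statistics.median(known)
mode_age = statistics.mode(known)
variance = statistics.pvariance(known)   # population variance
std_dev = statistics.pstdev(known)       # square root of the variance
q1, q2, q3 = statistics.quantiles(known, n=4)  # three quartile cut points

# One common way to handle a missing value: replace it with the median.
filled = [median_age if a is None else a for a in ages]

print(mean_age, median_age, mode_age, variance, std_dev, (q1, q2, q3))
print(filled)
```

The median is used for filling here because it is less affected by extreme values than the mean; the mean or mode could be substituted depending on the column.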

2. Visualization:
• Visualization is a powerful tool in EDA that helps to explore data patterns,
relationships, and outliers through charts, graphs, and plots.
• Common visualization techniques include histograms, box plots, scatter plots, and
bar charts. These visualizations provide insights into data distributions, correlations,
and potential anomalies.
• Example: We can create a histogram to visualize the weight distribution in the
customer dataset. A histogram helps to understand the range and spread of
the data.
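
A histogram of the weight column could be sketched as follows. The `weights` values and the 10 kg bin width are assumptions chosen for illustration, and the bars are printed as text so that the sketch needs only the standard library.

```python
from collections import Counter

# Hypothetical customer weights in kilograms.
weights = [52, 58, 61, 64, 66, 67, 70, 71, 73, 75, 78, 82, 85, 91]

bin_width = 10  # assumed bin width in kg

# Assign each weight to the lower edge of its bin, e.g. 67 -> 60.
bin_counts = Counter((w // bin_width) * bin_width for w in weights)

# Print one row per bin; the bar length is the number of customers in it.
for lower in range(min(bin_counts), max(bin_counts) + bin_width, bin_width):
    count = bin_counts.get(lower, 0)
    print(f"{lower:3d}-{lower + bin_width - 1:3d} kg | {'#' * count}")
```

The rows show the range of the data (lowest and highest occupied bins) and its spread (how the counts are distributed across bins). If matplotlib is installed, `plt.hist(weights, bins=5)` would draw a comparable chart graphically.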

3. Normalization:
• Normalization is a method used to adjust data so that it follows a standard scale or
distribution. This adjustment makes it simpler to compare and analyze various
features or datasets.
• Common normalization techniques include min-max scaling (scaling data to a
specific range, e.g., [0, 1]) and z-score standardization (scaling data to have a mean
of 0 and a standard deviation of 1).
• Example:
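
As a minimal sketch of the two normalization techniques named above, the snippet below scales a made-up `heights` list with min-max scaling and with z-score standardization. The data values are assumptions, and the population standard deviation is used for the z-scores.

```python
import statistics

# Hypothetical customer heights in centimetres.
heights = [150, 160, 170, 180, 190]

# Min-max scaling: map the smallest value to 0 and the largest to 1.
# Assumes the values are not all identical (otherwise hi - lo is 0).
lo, hi = min(heights), max(heights)
min_max = [(h - lo) / (hi - lo) for h in heights]

# Z-score standardization: subtract the mean, divide by the standard deviation.
# Assumes the standard deviation is not 0.
mean = statistics.mean(heights)
std = statistics.pstdev(heights)  # population standard deviation
z_scores = [(h - mean) / std for h in heights]

print(min_max)
print(z_scores)
```

After min-max scaling every value lies in [0, 1]; after z-score standardization the values have a mean of 0 and a standard deviation of 1, so features measured in different units (for example height and weight) can be compared on the same scale.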

Q4. Explain central tendency with the help of an example.
