ALOJIPAN Assessment_Task_1_Sampling_Data_Visualization

The document outlines an assessment task for students in a Quantitative Methods course at Emilio Aguinaldo College, focusing on exploratory data analysis and visualization of an iPhone purchase dataset. Students will utilize Python libraries like Matplotlib and Seaborn to analyze purchasing patterns based on demographics and visualize trends over time. The task includes specific instructions for data manipulation and visualization techniques, with expected outputs and interpretations for various graphical representations.

EMILIO AGUINALDO COLLEGE

SCHOOL OF ENGINEERING AND TECHNOLOGY

QUANTITATIVE METHODS
(INCLUDING MODELING AND SIMULATION)

ASSESSMENT TASK 1
(Exploratory Data Analysis and Visualization)

Name (Student):
SHIEL DAWN AMON ALOJIPAN

Instructor / Professor:
ALEX HERNANDEZ

Submission Date:
12/02/2025
Objective:
The objective of this data visualization practice exercise is to analyze and interpret the iPhone
purchase dataset using various graphical representations. Through this exercise, students will:

● Gain hands-on experience with data visualization techniques in Python.

● Learn to use libraries such as Matplotlib and Seaborn for effective data representation.

● Identify patterns and trends in the dataset, including purchase behavior by age, gender, education
level, and location.
● Understand how different types of visualizations help convey meaningful insights from data.

Instructions:

● Download the iPhone purchase dataset
(https://classroom.google.com/c/NzQ4ODU3NDAzNDMx/m/NzM3NjYwNjM2ODg5/details).

● Upload the dataset to the Files panel of Google Colab. The code below reads it as
iphone_purchases.csv.

● Use Google Colab to run the Python code.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the dataset
df = pd.read_csv("iphone_purchases.csv")
# Convert Date of Purchase to datetime format
df["Date of Purchase"] = pd.to_datetime(df["Date of Purchase"])
# 1. Distribution of Age
df["Age"].plot(kind='hist', bins=20, edgecolor='black', title='Age Distribution')
plt.xlabel("Age")
plt.show()

Output

After observing the generated histogram, you might describe the findings like this:
"The histogram shows that the customer ages are mostly concentrated between [age range], with
a peak around [age]. The distribution appears to be slightly skewed to the [left/right], indicating a
higher proportion of [younger/older] customers. There are a few outliers above [age], representing
a small segment of older customers."
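
One way to fill in the bracketed values is to compute them instead of reading them off the chart. The sketch below uses a handful of made-up ages in place of df["Age"]. The 1.5 x IQR cutoff for outliers is a common convention assumed here, not something the exercise specifies.

```python
import pandas as pd

# Hypothetical ages standing in for df["Age"]; replace with the real column.
ages = pd.Series([19, 22, 24, 25, 27, 28, 30, 31, 35, 62], name="Age")

# The middle 50% of customers fall between the first and third quartiles.
q1, q3 = ages.quantile([0.25, 0.75])

# Positive skewness means a longer tail toward older ages (right skew).
skewness = ages.skew()

# Flag unusually high ages with the 1.5 * IQR rule (an assumed convention).
upper_fence = q3 + 1.5 * (q3 - q1)
outliers = ages[ages > upper_fence]

print(f"Middle 50% of ages: {q1:.1f} to {q3:.1f}")
print(f"Skewness: {skewness:.2f}")
print(f"Outliers above {upper_fence:.1f}: {outliers.tolist()}")
```

With the real data, swapping `ages` for `df["Age"]` should be enough.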
# 2. Count of Purchases by Gender
sns.countplot(x="Gender", data=df, palette="coolwarm")
plt.title("Purchases by Gender")
plt.show()

Output

After observing the generated countplot, you might describe the findings like this:
"The countplot reveals that [Gender] made significantly more purchases compared to [Gender],
accounting for [Percentage]% of total purchases. This suggests a potential preference for
[product/brand] among [Gender] customers. The difference in purchase counts is substantial,
indicating a clear distinction in purchasing behavior between the two genders."
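
The [Percentage] placeholder can be computed directly from the column. The following is a minimal sketch on made-up rows; only the "Gender" column name comes from the exercise, and it assumes each row is one purchase record.

```python
import pandas as pd

# Made-up rows; replace with the DataFrame loaded from the CSV.
df = pd.DataFrame({"Gender": ["Female", "Male", "Female", "Female", "Male"]})

# Raw counts per gender (what the countplot draws).
counts = df["Gender"].value_counts()

# The same counts expressed as a percentage of all rows.
shares = (df["Gender"].value_counts(normalize=True) * 100).round(1)

summary = pd.DataFrame({"Purchases": counts, "Share (%)": shares})
print(summary)
```

The same pattern should work for the "Education" and "Item Purchased" columns used in the next two plots.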

# 3. Count of Purchases by Education Level


sns.countplot(y="Education", data=df, palette="viridis")
plt.title("Purchases by Education Level")
plt.show()

Output
"The countplot reveals that customers with [Education Level] have the highest number of
purchases, followed by [Education Level] and [Education Level]. This suggests a potential
correlation between education level and purchasing behavior for this product/service. There is a
noticeable trend of [increasing/decreasing] purchases with higher education levels."

# 4. Most Purchased iPhone Models


sns.countplot(y="Item Purchased", data=df, order=df["Item Purchased"].value_counts().index,
palette="pastel")
plt.title("Most Purchased iPhone Models")
plt.show()

Output

"The iPhone 15 is the most purchased iPhone model, followed by [second most purchased model]
and [third most purchased model]. There is a noticeable difference in purchase frequency
between the iPhone 15 and other models, indicating its strong popularity among customers. The
remaining iPhone models have relatively similar purchase counts, suggesting a more balanced
distribution for those models."

# 5. Total Sales by Location


df.groupby("Purchase Location")["Total Amount"].sum().sort_values().plot(kind='barh', color='skyblue',
title='Total Sales by Location')
plt.xlabel("Total Sales ($)")
plt.show()

Output
"The bar chart highlights that [Location 1] and [Location 2] are the top-performing locations,
generating the highest total sales and contributing [Percentage 1]% and [Percentage 2]% of overall
revenue, respectively. [Location 3] follows with a significantly lower sales figure, indicating a
potential need for further investigation or marketing efforts in that region. The remaining locations
show a relatively even distribution of sales, suggesting a balanced market presence across those
areas."
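
To back the percentages in this description with numbers, each location's share of total revenue can be derived from the same grouped sum that feeds the bar chart. The sketch below uses invented locations and amounts; only the column names "Purchase Location" and "Total Amount" come from the exercise.

```python
import pandas as pd

# Invented sample rows; the location names are placeholders.
df = pd.DataFrame({
    "Purchase Location": ["Manila", "Cebu", "Manila", "Davao", "Cebu"],
    "Total Amount": [1000.0, 800.0, 1200.0, 500.0, 500.0],
})

# Total sales per location, largest first.
sales = (
    df.groupby("Purchase Location")["Total Amount"]
    .sum()
    .sort_values(ascending=False)
)

# Each location's contribution to overall revenue, in percent.
share = (sales / sales.sum() * 100).round(1)

print(pd.DataFrame({"Total Sales": sales, "Share (%)": share}))
```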

# 6. Purchase Trends Over Time


df.groupby("Date of Purchase")["Quantity"].sum().plot(figsize=(12,5), title='Purchase Trends Over Time',
color='purple')
plt.ylabel("Quantity Sold")
plt.show()
Output

"The plot reveals an overall [increasing/decreasing/stable] trend in purchase quantity over time.
There appears to be a seasonal pattern with sales peaking around [month/quarter/year] and
declining during [month/quarter/year]. A significant increase in sales occurred in [month/year], as
indicated by a sharp rise in the plotted line. This could be attributed to [potential
reason, e.g., new product launch, marketing campaign]. Overall, the purchase trends show a
[positive/negative/mixed] outlook for sales performance."
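
Daily totals can be noisy, so the peak month and the largest jump may be easier to identify after aggregating by month. This is a sketch on made-up dates and quantities; monthly grouping is an assumption here, since the exercise itself plots per-date totals.

```python
import pandas as pd

# Made-up purchases; column names match the exercise.
df = pd.DataFrame({
    "Date of Purchase": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-10",
         "2024-03-02", "2024-03-15", "2024-03-28"]
    ),
    "Quantity": [1, 1, 4, 2, 1, 3],
})

# Sum the quantity sold in each calendar month.
monthly = df.groupby(df["Date of Purchase"].dt.to_period("M"))["Quantity"].sum()

# Month-over-month change in percent (the first month has no predecessor).
pct_change = (monthly.pct_change() * 100).round(1)

print(monthly)
print("Peak month:", monthly.idxmax())
print("Largest month-over-month increase:", pct_change.idxmax())
```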

# 7. Boxplot of Total Amount by Gender


sns.boxplot(x="Gender", y="Total Amount", data=df, palette="coolwarm")
plt.title("Total Amount Spent by Gender")
plt.show()
Output

"The boxplot reveals that [Gender] tend to spend more, as indicated by a higher median
spending amount. The spread of spending is wider for [Gender], suggesting greater variability in
their purchase amounts. There are a few outliers for [Gender], representing unusually high
spending instances. Overall, [Gender] contribute a larger percentage ([Percentage]%) to the total
spending compared to [Gender] ([Percentage]%)."
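
A boxplot shows medians and spread but not each group's share of total spending, so that part of the description needs a separate calculation. The sketch below uses made-up amounts, chosen so that the group with the lower median still has the higher mean and total because of one large purchase.

```python
import pandas as pd

# Made-up amounts; column names match the exercise.
df = pd.DataFrame({
    "Gender": ["Female", "Female", "Female", "Male", "Male", "Male"],
    "Total Amount": [900.0, 1000.0, 1100.0, 700.0, 800.0, 2100.0],
})

# Median (the line inside each box), mean, and total spending per gender.
stats = df.groupby("Gender")["Total Amount"].agg(["median", "mean", "sum"])

# Each gender's share of overall spending, in percent.
stats["share_pct"] = (stats["sum"] / stats["sum"].sum() * 100).round(1)

print(stats)
```

Reporting the median and the mean side by side helps avoid reading a single outlier as a general spending difference.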

# 8. Correlation Heatmap
plt.figure(figsize=(8,6))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap="coolwarm", linewidths=0.5)
plt.title("Correlation Heatmap")
plt.show()

Output
"The correlation heatmap reveals a strong positive correlation between [Variable 1] and [Variable
2], suggesting that they tend to move together in the same direction. There is a moderate negative
correlation between [Variable 3] and [Variable 4], indicating an inverse relationship. [Variable 5]
shows weak or no correlation with other variables. These findings provide insights into the
relationships between different features in the dataset and can guide further analysis or
modeling."
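
Picking the strongest pairs by eye from a heatmap is error-prone once there are many columns. One possible approach, sketched below on made-up numbers, is to list each pair of numeric columns once and sort by absolute correlation. The column names come from the exercise; the values do not.

```python
import numpy as np
import pandas as pd

# Made-up numeric columns; Total Amount is a fixed price times Quantity.
df = pd.DataFrame({
    "Age": [22, 35, 28, 41, 30],
    "Quantity": [1, 2, 1, 3, 2],
    "Total Amount": [999.0, 1998.0, 999.0, 2997.0, 1998.0],
})

corr = df.corr(numeric_only=True)

# Keep only entries above the diagonal so each pair is listed once.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = (
    corr.where(mask)
    .stack()
    .dropna()
    .sort_values(key=abs, ascending=False)
)

print(pairs)
```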

# 9. Please provide the mean, median, and mode of the purchase data.
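
A minimal sketch of one way to approach this item, assuming "the purchase data" means the numeric columns. The rows are made up, and which columns to report is left to the student.

```python
import pandas as pd

# Made-up rows; replace with the DataFrame loaded from the CSV.
df = pd.DataFrame({
    "Age": [22, 35, 28, 35, 30],
    "Quantity": [1, 2, 1, 1, 2],
    "Total Amount": [999.0, 1998.0, 999.0, 999.0, 1998.0],
})

# Restrict to numeric columns so mean and median are well defined.
numeric = df.select_dtypes("number")

summary = pd.DataFrame({
    "mean": numeric.mean(),
    "median": numeric.median(),
    # mode() can return several values per column; keep the smallest one.
    "mode": numeric.mode().iloc[0],
})

print(summary)
```

For categorical columns such as "Gender" or "Item Purchased", only the mode is meaningful, and `df["Gender"].mode()` returns it.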

Output
