ALOJIPAN Assessment_Task_1_Sampling_Data_Visualization
QUANTITATIVE METHODS
(INCLUDING MODELING AND SIMULATION)
ASSESSMENT TASK 1
(Exploratory Data Analysis and Visualization)
Name (Student):
SHIEL DAWN AMON ALOJIPAN
Instructor / Professor:
ALEX HERNANDEZ
Submission Date:
12/02/2025
Objective:
The objective of this data visualization practice exercise is to analyze and interpret the iPhone
purchase dataset using various graphical representations. Through this exercise, students will:
● Learn to use libraries such as Matplotlib and Seaborn for effective data representation.
● Identify patterns and trends in the dataset, including purchase behavior by age, gender, education
level, and location.
● Understand how different types of visualizations help convey meaningful insights from data.
Instructions:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the dataset
df = pd.read_csv("iphone_purchases.csv")
# Convert Date of Purchase to datetime format
df["Date of Purchase"] = pd.to_datetime(df["Date of Purchase"])
# 1. Distribution of Age
df["Age"].plot(kind='hist', bins=20, edgecolor='black', title='Age Distribution')
plt.xlabel("Age")
plt.show()
Output
After observing the generated histogram, you might describe the findings like this:
"The histogram shows that the customer ages are mostly concentrated between [age range], with
a peak around [age]. The distribution appears to be slightly skewed to the [left/right], indicating a
higher proportion of [younger/older] customers. There are a few outliers above [age], representing
a small segment of older customers."
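The bracketed values in the template above can be computed rather than read off the plot by eye. The following is a minimal sketch, assuming the 1.5 × IQR rule is an acceptable definition of an outlier; the ages are made up for illustration and stand in for df["Age"].

```python
import pandas as pd

# Made-up ages for illustration only; in the exercise, use df["Age"] instead.
ages = pd.Series([19, 22, 23, 25, 25, 26, 28, 30, 31, 33, 35, 41, 48, 67], name="Age")

# Middle 50% of customers: a defensible choice for "[age range]".
q1, q3 = ages.quantile([0.25, 0.75])
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr  # assumption: 1.5 * IQR outlier rule

age_summary = {
    "central_range": (q1, q3),
    "most_common_age": ages.mode().iloc[0],
    # Positive skewness means a longer tail toward older ages (right skew).
    "skew_direction": "right" if ages.skew() > 0 else "left",
    "high_outliers": ages[ages > upper_fence].tolist(),
}
print(age_summary)
```

Note that the most common single age is only a rough stand-in for the histogram peak, which depends on the bin width chosen in the plot.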
# 2. Count of Purchases by Gender
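# hue="Gender" with legend=False keeps per-bar colors; seaborn 0.13+ deprecates palette without hue
sns.countplot(x="Gender", data=df, hue="Gender", palette="coolwarm", legend=False)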
plt.title("Purchases by Gender")
plt.show()
Output
After observing the generated countplot, you might describe the findings like this:
"The countplot reveals that [Gender] made significantly more purchases than [Gender],
accounting for [Percentage]% of total purchases. This suggests a potential preference for
[product/brand] among [Gender] customers. The difference in purchase counts is substantial,
indicating a clear distinction in purchasing behavior between the two genders."
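The [Percentage] placeholder in the template above is the share of rows per gender. A minimal sketch of that calculation follows; the small DataFrame is made up for illustration and stands in for the real df.

```python
import pandas as pd

# Made-up rows for illustration only; in the exercise, use the loaded df.
df_demo = pd.DataFrame(
    {"Gender": ["Female", "Male", "Female", "Female", "Male", "Female", "Male", "Female"]}
)

# normalize=True returns proportions instead of raw counts.
gender_share = df_demo["Gender"].value_counts(normalize=True).mul(100).round(1)
top_gender = gender_share.idxmax()
print(gender_share)
print(f"{top_gender} accounts for {gender_share[top_gender]}% of purchases")
```

Whether a gap counts as "significant" or "substantial" is a judgment about the actual numbers, so the wording of the description should follow the computed shares.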
Output
"The countplot reveals that customers with [Education Level] have the highest number of
purchases, followed by [Education Level] and [Education Level]. This suggests a potential
correlation between education level and purchasing behavior for this product/service. There is a
noticeable trend of [increasing/decreasing] purchases with higher education levels."
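The claim about an increasing or decreasing trend only makes sense if the education levels are put in a meaningful order first. The sketch below shows one way to do that with an ordered categorical. The column name "Education Level" and the level labels are assumptions, since the dataset's actual names are not shown here, and the rows are made up.

```python
import pandas as pd

# Hypothetical level labels, listed from lowest to highest.
levels = ["High School", "Bachelor's", "Master's", "PhD"]

# Made-up rows for illustration only; the column name is an assumption.
df_demo = pd.DataFrame(
    {"Education Level": ["Bachelor's"] * 5 + ["Master's"] * 3 + ["High School"] * 2 + ["PhD"]}
)

edu = pd.Categorical(df_demo["Education Level"], categories=levels, ordered=True)
counts = pd.Series(edu).value_counts().sort_index()  # counts in level order

ranked = counts.sort_values(ascending=False).index.tolist()  # most to fewest purchases
if counts.is_monotonic_increasing:
    trend = "increasing"
elif counts.is_monotonic_decreasing:
    trend = "decreasing"
else:
    trend = "no consistent trend"
print(counts, ranked, trend, sep="\n")
```

The same levels list can be passed as the order argument of sns.countplot so the bars appear from lowest to highest level.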
Output
"The iPhone 15 is the most purchased iPhone model, followed by [second most purchased model]
and [third most purchased model]. There is a noticeable difference in purchase frequency
between the iPhone 15 and other models, indicating its strong popularity among customers. The
remaining iPhone models have relatively similar purchase counts, suggesting a more balanced
distribution for those models."
Output
"The bar chart highlights that [Location 1] and [Location 2] are the top-performing locations,
generating the highest total sales and contributing [Percentage 1]% and [Percentage 2]% of overall
revenue, respectively. [Location 3] follows with a significantly lower sales figure, indicating a
potential need for further investigation or marketing efforts in that region. The remaining locations
show a relatively even distribution of sales, suggesting a balanced market presence across those
areas."
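The totals and percentage contributions named in the template above come from a group-and-sum over locations. The following is a minimal sketch; the column names "Location" and "Amount" and the location values are assumptions made for illustration, not taken from the dataset.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
sales = pd.DataFrame({
    "Location": ["Manila", "Cebu", "Manila", "Davao", "Cebu", "Manila"],
    "Amount": [1000.0, 800.0, 1200.0, 500.0, 700.0, 800.0],
})

# Total sales per location, largest first.
totals = sales.groupby("Location")["Amount"].sum().sort_values(ascending=False)

# Each location's contribution to overall revenue, in percent.
share = (totals / totals.sum() * 100).round(1)
print(pd.DataFrame({"total": totals, "share_pct": share}))
```

The sorted totals series can then be drawn with totals.plot(kind="bar") to produce the bar chart the description refers to.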
"The boxplot reveals that [Gender] tend to spend more, as indicated by a higher median
spending amount. The spread of spending is wider for [Gender], suggesting greater variability in
their purchase amounts. There are a few outliers for [Gender], representing unusually high
spending instances. Overall, [Gender] contribute a larger percentage ([Percentage]%) to the total
spending compared to [Gender] ([Percentage]%)."
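Each statement in the boxplot template maps to a number: the median, the interquartile range (spread), the count of points beyond the whiskers, and the share of total spending. The sketch below computes these per gender, assuming the default 1.5 × IQR whisker rule. The column name "Amount" and the values are assumptions made for illustration.

```python
import pandas as pd

# Made-up rows for illustration only; "Amount" is an assumed column name.
spend = pd.DataFrame({
    "Gender": ["Female"] * 5 + ["Male"] * 5,
    "Amount": [900, 1000, 1100, 1200, 3000, 700, 800, 850, 900, 950],
})

def box_stats(s):
    """Return the numbers a boxplot encodes for one group."""
    q1, med, q3 = s.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    # Points outside 1.5 * IQR from the quartiles are drawn as outliers.
    outliers = s[(s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)]
    return {"median": med, "iqr": iqr, "n_outliers": len(outliers)}

stats = pd.DataFrame({g: box_stats(s) for g, s in spend.groupby("Gender")["Amount"]}).T

# Share of total spending per gender, in percent.
spend_share = (spend.groupby("Gender")["Amount"].sum() / spend["Amount"].sum() * 100).round(1)
print(stats, spend_share, sep="\n")
```

A higher median and a larger share of total spending do not always belong to the same group, because a few very large purchases can raise a group's total without moving its median.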
# 8. Correlation Heatmap
plt.figure(figsize=(8,6))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap="coolwarm", linewidths=0.5)
plt.title("Correlation Heatmap")
plt.show()
Output
"The correlation heatmap reveals a strong positive correlation between [Variable 1] and [Variable
2], suggesting that they tend to move together in the same direction. There is a moderate negative
correlation between [Variable 3] and [Variable 4], indicating an inverse relationship. [Variable 5]
shows weak or no correlation with other variables. These findings provide insights into the
relationships between different features in the dataset and can guide further analysis or
modeling."
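To fill in the variable placeholders without scanning the heatmap cell by cell, the correlation matrix can be flattened into a ranked list of variable pairs. The following is a minimal sketch; the "Income" and "Discount" columns and all values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up numeric columns for illustration only.
demo = pd.DataFrame({
    "Age": [20, 25, 30, 35, 40, 45],
    "Income": [21, 24, 31, 36, 39, 46],
    "Discount": [30, 26, 19, 16, 9, 6],
})

corr = demo.corr(numeric_only=True)

# Keep only the cells above the diagonal so each pair appears once
# and the trivial self-correlations of 1.0 are excluded.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = (
    corr.where(mask)
    .stack()
    .dropna()
    .sort_values(key=lambda s: s.abs(), ascending=False)
)
print(pairs)
```

The sign of each value gives the direction of the relationship and its absolute size gives the strength. Correlation alone does not show that one variable causes the other.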
Output