Real Estate Analysis
In this presentation, we analyze key trends in the real estate market using a dataset that includes property
characteristics across different U.S. cities, such as Phoenix, New York, Los Angeles, Chicago, and Houston. The data
consists of attributes like the number of rooms, square footage, price, number of bathrooms, garage size, lot size, and
neighborhood ratings.
The goal of this analysis is to uncover the relationships between these factors and housing prices, while also identifying
broader market trends. We aim to provide insights that can help stakeholders—whether real estate investors,
developers, or policymakers—make informed decisions in the dynamic property market.
Tools of Central Tendency
Central tendency measures summarize data by identifying a central point or typical value in a dataset. The common tools
are:
1. Mean: The average of all data points, calculated by summing the values and dividing by the number of observations.
2. Median: The middle value of the ordered dataset. If the dataset has an even number of observations, the median is the
average of the two middle numbers.
- Example: Median salary package of an institute’s student placements.
3. Mode: The most frequent value(s) in a dataset.
- Example: Mode of shoe sizes sold in a store.
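The three measures above can be sketched with Python's standard `statistics` module. The room counts below are hypothetical values invented for illustration; they are not taken from the dataset.

```python
import statistics

# Hypothetical room counts, invented for illustration (not from the dataset).
rooms = [3, 4, 4, 5, 6, 4, 8, 5]

# Mean: sum of the values divided by the number of observations.
mean_rooms = statistics.mean(rooms)

# Median: middle of the sorted values; with an even count,
# the average of the two middle values.
median_rooms = statistics.median(rooms)

# Mode: the most frequent value(s); multimode returns every tied value.
mode_rooms = statistics.multimode(rooms)

print(mean_rooms, median_rooms, mode_rooms)
```

`multimode` is used rather than `mode` so that ties are reported instead of silently reduced to a single value.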
Tools of Dispersion
Dispersion measures describe the spread or variability of data. The common tools are:
2. Variance: The average of the squared deviations from the mean, representing overall data variability.
3. Standard Deviation: The square root of the variance, indicating how far individual data points typically deviate from the mean, in the same units as the data.
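A minimal sketch of both measures, assuming the population formulas (dividing by the number of observations). The presentation does not state whether population or sample formulas were used, and the prices below are hypothetical.

```python
import math
import statistics

# Hypothetical prices in thousands of dollars, invented for illustration.
prices = [48.0, 50.0, 52.0, 49.0, 51.0]

# Population variance: average of the squared deviations from the mean.
variance = statistics.pvariance(prices)

# Population standard deviation: square root of the variance.
std_dev = statistics.pstdev(prices)

# The two are linked: taking the root of the variance gives the same number.
root_of_variance = math.sqrt(variance)

print(variance, std_dev)
```

For a sample rather than a full population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.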
Correlation
Correlation measures the strength and direction of a linear relationship between two variables. It is expressed as a correlation
coefficient, typically denoted as r, which ranges from -1 to +1.
- Positive Correlation (r > 0, up to +1): As one variable increases, the other also tends to increase.
- Example: Height and weight often show a positive correlation.
- Negative Correlation (r < 0, down to -1): As one variable increases, the other tends to decrease.
- Example: As temperature increases, heating costs often decrease.
- No Correlation (r close to 0): No linear relationship exists between the variables.
- Example: Shoe size and intelligence are not correlated.
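The coefficient r can be computed directly from its definition: the co-movement of the two variables around their means, scaled by the spread of each. The sketch below uses hypothetical values chosen to mirror the height/weight and temperature/heating examples; `pearson_r` is an illustrative helper, not a function from the original analysis.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of x and y around their means (covariance numerator).
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations of each variable.
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical values, invented for illustration.
temperature = [0, 5, 10, 15, 20]
heating_cost = [200, 160, 120, 80, 40]   # falls steadily as temperature rises
height = [150, 160, 170, 180, 190]
weight = [52, 58, 67, 73, 85]            # rises with height, not perfectly

r_negative = pearson_r(temperature, heating_cost)
r_positive = pearson_r(height, weight)
print(r_negative, r_positive)
```

The heating data lie exactly on a falling straight line, so r reaches its lower limit; the height and weight data are only roughly linear, so r is strongly positive but below +1.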
Chicago
Known for its stunning
architecture, rich cultural history,
and diverse food scene, Chicago is
a bustling metropolis on the
shores of Lake Michigan. It’s a
central hub for finance, industry,
and the arts, offering a unique
blend of affordability and big-city
amenities.
Chicago Major Takeaways
● The variance in price is just 49.88, with a standard deviation of 7.06, meaning housing prices are tightly clustered around
the mean price of $49,318.56.
● This suggests price stability in the dataset. A stable price market could indicate less speculation and a more balanced market
where prices are not fluctuating wildly. For buyers, this offers predictability in home value.
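One way to put "tightly clustered" on a scale is the coefficient of variation, the standard deviation expressed as a percentage of the mean. The sketch below uses the two figures reported above and assumes both are in the same units (dollars), which the presentation does not state explicitly.

```python
# Figures reported above for price.
mean_price = 49_318.56
std_price = 7.06

# Coefficient of variation: spread relative to the mean, as a percentage.
# Assumes the standard deviation is expressed in dollars, like the mean.
cv_percent = std_price / mean_price * 100

print(f"{cv_percent:.4f}%")
```

Under that assumption the spread is on the order of a hundredth of one percent of the mean, which is far tighter than prices in a typical housing market.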
3. Low Correlation Between Variables and Price
● Correlation with Price for most variables (e.g., number of rooms, square footage, number of bathrooms, garage size, lot size) is relatively low, all close to zero.
● None of these factors strongly influence the price in this dataset. In real estate, one would generally expect variables like square footage and number of rooms to have a higher positive correlation with price. This suggests that other factors, such as location or market conditions, might play a more crucial role in
determining housing prices in this hypothetical data.
4. Lot Size and Price Relationship
● Lot size has a negative correlation of -0.041 with price, and a negative covariance (-8.22).
● A correlation this close to zero means the association is negligible, but its negative sign suggests that larger lots in this dataset are, if anything, associated with slightly lower prices, possibly because the larger lots are
located in less desirable areas. In real estate, a larger lot often contributes positively to price, so this anomaly could be
explained by specific local factors like zoning or distance from amenities.
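The covariance and correlation quoted for lot size are linked through the two standard deviations. As a rough cross-check, assuming both figures were computed with the same convention and taking s_y as the price standard deviation of 7.06 reported earlier, the implied lot-size standard deviation s_x (a value the slides do not report for Chicago) works out as follows:

```latex
r = \frac{\operatorname{cov}(x, y)}{s_x \, s_y}
\quad\Longrightarrow\quad
s_x = \frac{\operatorname{cov}(x, y)}{r \, s_y}
    = \frac{-8.22}{(-0.041)(7.06)} \approx 28.4
```

A spread of that size is in line with the feature standard deviations of roughly 28 reported for the other cities in this presentation, so the two figures appear mutually consistent.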
● The neighborhood rating has a very small positive correlation (0.006) with price.
● Neighborhood ratings often have a significant impact on property value. The near-zero correlation here might indicate that
buyers are not prioritizing neighborhood quality in this particular dataset, which contrasts with real-world behavior where
location and neighborhood desirability are key drivers of home prices.
● The mode (most frequent value) for variables like number of bathrooms (85) and garage size (88) is unusually high and may
not align with typical real estate trends.
● This could suggest the presence of a few properties with very high values skewing the data. In real estate, such outliers can
distort summary statistics, and it might be important to remove extreme values when analyzing the typical market.
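A common way to flag extreme values before summarizing is the interquartile range (IQR) rule. This is a sketch of that general technique, not the method used in this analysis; the bathroom counts are hypothetical and `iqr_outliers` is an illustrative helper.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical bathroom counts, invented for illustration;
# 85 is a deliberately planted extreme value.
bathrooms = [1, 2, 2, 2, 3, 3, 3, 4, 85]

flagged = iqr_outliers(bathrooms)
print(flagged)
```

Flagged values are candidates for review rather than automatic deletion, since an extreme entry may be a genuine property or a data-entry error.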
Conclusion:
Importance of External Factors: The low correlation between typical home attributes (rooms, square footage) and price could suggest
that external factors like location, market demand, or proximity to amenities may be more crucial in this market. In Chicago, for
example, neighborhoods with better transportation links or schools might fetch higher prices despite smaller homes.
Price Stability is a Positive Sign: In many real estate markets, particularly in established areas, price stability is a sign of a mature,
balanced market. Buyers and investors in such a market might feel more confident, knowing that property values won’t fluctuate
dramatically in the short term.
Data Considerations: Some of the unusual mode values (e.g., 85 for bathrooms or 88 for garage size) could be due to data issues or
extreme outliers. In real-world real estate analysis, cleaning the data by removing outliers would be essential to producing more reliable
insights.
Phoenix
Phoenix, Arizona, is known for its warm
weather, desert landscapes, and growing
economy. It attracts residents with its
relatively affordable housing market
compared to other large cities. Phoenix
offers outdoor recreational activities, like
hiking in the Sonoran Desert and exploring
mountain parks.
PHOENIX
MAJOR INSIGHTS
The analysis reveals unexpected relationships between common real estate features and property prices. Typically, larger lot
sizes and more bathrooms are expected to correlate positively with higher prices; however, this analysis shows a negative or
negligible correlation instead. This deviation may result from factors such as market saturation, regional preferences for smaller
spaces, or the influence of other amenities and location attributes that overshadow these features. This finding highlights the
need for a deeper understanding of real estate dynamics, as conventional assumptions may not apply across all market
segments.
Significant variability in features like the number of rooms, bathrooms, and garage sizes indicates potential outliers in the
dataset. These outliers can skew market perceptions and lead to inaccurate conclusions if not properly addressed. Therefore,
thorough data cleaning and exploration are crucial to ensure that summary statistics accurately reflect general market
conditions. Identifying and analyzing these outliers is essential for determining their impact on overall trends.
1. Neighborhood Quality Underemphasized:
The low correlation between neighborhood ratings and property prices suggests that buyers in this dataset may not
prioritize neighborhood quality as typically expected in real estate. This could indicate unique market conditions or specific
buyer preferences within the analyzed area, warranting a closer examination of local dynamics.
Given the weak correlations and unconventional findings, stakeholders—such as real estate agents, investors, and
analysts—should reassess which features genuinely influence property values in this dataset. Conventional wisdom may
not apply, necessitating a shift in focus toward more relevant characteristics that could better reflect buyer priorities and
market trends.
The observed anomalies and low correlations emphasize the importance of contextual analysis when interpreting real
estate data. Local factors, market conditions, and buyer behaviors can significantly influence these relationships, indicating
that deeper qualitative insights are essential for a comprehensive understanding of market dynamics.
4. Implications for Pricing Strategy:
Real estate professionals may need to adapt their pricing strategies and valuation models based on these
findings. Relying solely on standard metrics may lead to inaccurate appraisals, while more nuanced
approaches that account for local characteristics and buyer motivations could yield better results.
Lastly, the neighborhood rating presents a very weak positive correlation of 0.0119, suggesting a slight tendency for better-rated areas
to command higher prices, though this effect is minimal. Overall, the data indicate that factors such as economic conditions or market
dynamics may play a more critical role in determining property prices than the specific attributes analyzed.
Los Angeles
Los Angeles is synonymous with
the entertainment industry, sunny
weather, and a lifestyle focused
on outdoor activities. Home to
Hollywood, LA attracts residents
and visitors alike with its beaches,
cultural diversity, and booming
tech scene.
LOS ANGELES
Number of Rooms
Distribution
Mean vs. Median: The mean (50.38) and median (51) being close suggests a fairly symmetrical distribution. However, the mode of 5 rooms indicates a significant number of smaller
homes in the dataset, highlighting a potential skew towards lower room counts.
Variability
Standard Deviation & Variance: The standard deviation (29.13) and variance (848.47) indicate a wide range of room counts, suggesting a diverse set of properties. This variability
means that while many homes may have around 50 rooms, there are also many outliers with either very few or very many rooms.
Relationship with Price
Correlation & Covariance: The weak positive correlation (0.016) and low covariance (3.25) with price imply that the number of rooms does not have a strong impact on pricing.
While homes with more rooms may be slightly more expensive, the effect is minimal and indicates that other factors likely play a more significant role in determining home prices.
Overall, the findings suggest that while room count does influence home pricing to some extent, it is not a primary driver, and there is considerable diversity in the types of homes
present in the dataset.
Square Footage
Distribution
Mean vs. Median: The mean (50.53) and median (51) are close, indicating an overall even distribution of square footage. However, the mode of 12 sq ft suggests there is a notable number of smaller homes, pointing to a concentration of properties on the lower end of the size spectrum.
Variability
Standard Deviation & Variance: The high standard deviation (28.17) and variance (793.79) indicate a significant range in square footage, reflecting a diverse array of home sizes in the dataset. This suggests that while many homes hover around the mean, there are numerous properties that are much larger or smaller.
Relationship with Price
Correlation & Covariance: The negative correlation (-0.006) and covariance (-1.24) imply a very weak inverse relationship between square footage and price. This indicates that, contrary to what one might expect, larger homes by square footage do not necessarily command higher prices. Other factors likely influence pricing more significantly than square footage alone.
Price
Distribution
Mean, Median, and Mode: The close alignment of the mean ($49,318.81), median ($49,321.49), and mode ($49,321.49) suggests that home prices are tightly clustered around the $49,318 mark. This indicates a relatively uniform pricing structure within the dataset, with most homes priced similarly.
Variability
Standard Deviation & Variance: The low standard deviation (6.77) and variance (45.77) reinforce the idea of price stability, indicating minimal fluctuation in home prices across the dataset. This suggests that there is not a wide range of pricing, which can be indicative of a homogeneous market or similar property features.
In summary, the data indicate a stable and tightly grouped pricing structure for homes, suggesting that price differences among properties in this dataset are relatively small and consistent.
Number of Bathrooms
Distribution
Mean, Median, and Mode: The mean (49.96) and median (51) are close, indicating that most homes tend to have around 50 bathrooms.
However, the mode of 73 suggests the presence of an outlier or a small number of homes with significantly more bathrooms, which skews
the mode upward.
Variability
Standard Deviation & Variance: The high standard deviation (28.42) and variance (807.44) indicate a wide range in the number of
bathrooms across homes, reflecting diverse property types. This variability suggests that while many homes have a similar number of
bathrooms, there are also properties with exceptionally high counts.
Relationship with Price
Correlation & Covariance: The weak negative correlation (-0.010) and covariance (-1.88) suggest that there is virtually no relationship
between the number of bathrooms and home prices in this dataset. This implies that having more bathrooms does not necessarily lead to
higher prices, indicating that other factors may be more influential in determining property value.
Garage Size
Distribution
• Mean vs. Median: The mean (49.61) and median (49) being close indicates a fairly symmetrical distribution of garage sizes.
This suggests that most homes have garage sizes around this average, without a significant skew in one direction.
• Mode: The mode of 26 indicates that there is a notable number of homes with smaller garages, which may reflect a common
preference or availability in the market.
Variability
• Standard Deviation & Variance: The high standard deviation (28.51) and variance (812.58) indicate a significant range of
garage sizes among the homes in the dataset. This suggests that while many properties may have a size close to the mean, there
are also numerous homes with much larger or smaller garages, highlighting diversity in property types.
Relationship with Price
• Correlation & Covariance: The positive correlation (0.041) and covariance (7.89) suggest a weak positive relationship between
garage size and home prices. This means that, in general, homes with larger garages tend to be associated with higher prices, but
the relationship is not strong enough to be a reliable predictor of price. Other factors are likely more influential in determining the
overall market value.
Lot Size
Distribution
Mean vs. Median: The mean (49.76) and median (50) being closely aligned indicates that lot sizes are fairly symmetrically distributed around the average. This suggests that most
properties have lot sizes near this value, reflecting a balanced distribution without significant skew.
Mode: The mode of 36 indicates that there is a notable concentration of homes with smaller lots. This suggests that smaller lot sizes are relatively common in this dataset, which
may reflect market trends or preferences among buyers.
Variability
Standard Deviation & Variance: The high standard deviation (28.07) and variance (788.05) indicate a wide range of lot sizes across the dataset. This variability means that while
many homes are around the average lot size, there are also numerous properties with significantly larger or smaller lots, highlighting the diversity of the market.
Neighborhood Rating
Distribution
• Mean vs. Median: The mean (5.56) and median (5.62) are closely aligned, suggesting that neighborhood ratings are fairly consistent across the
dataset. This indicates that most properties are located in neighborhoods with ratings near this average value, reflecting a stable perception of
neighborhood quality.
Variability
• Standard Deviation & Variance: The low standard deviation (2.61) and variance (6.80) indicate that there is minimal variability in neighborhood
ratings. This suggests that most homes fall within a narrow range of ratings, implying a general uniformity in how neighborhoods are assessed.
Number of Rooms:
The average number of rooms is 50.75, with a median of 52, showing that most homes are medium to large in size. However, the
mode of 62 suggests a higher concentration of larger homes in certain areas. Despite this, the correlation with price is slightly
negative (-0.0115), meaning having more rooms doesn’t strongly affect price.
Key Takeaway: In real estate, other factors like neighborhood or design may matter more than just the number of rooms. Larger
homes might not automatically command higher prices unless they offer additional attractive features.
Square Footage:
The mean square footage is 49.46 sq ft, with a balanced median of 50 sq ft, but the mode of 18 sq ft highlights a few much smaller
properties in the dataset. The correlation with price is very weak (-0.0076), indicating that square footage alone does not
significantly impact property prices.
Key Takeaway: Buyers often consider factors beyond just the size of the house, such as location or amenities. This is why we
see only a small relationship between the home’s size and its price. Larger square footage may only translate into higher prices if
the home is located in a high-demand area or has desirable characteristics.
Price:
The average property price is $49,318, with a median and mode very close to that number, indicating a stable and predictable price range in
the market. The low standard deviation of 7.06 suggests that most homes are priced similarly.
Key Takeaway: This stability reflects a mature and competitive market where prices are well-established, likely due to consistent demand
and supply dynamics. Buyers in such markets expect relatively predictable pricing, with less volatility or unexpected jumps in property
values.
Number of Bathrooms:
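The average number of bathrooms is 50.37, with a median of 50. Interestingly, the mode is just 10, indicating that a substantial number of
homes have fewer bathrooms. However, the reported correlation with price is perfectly positive (1), which would mean that bathroom count and
price rise together in exact proportion. Every other feature in this section correlates with price at close to zero, so this figure is worth re-checking before drawing conclusions from it.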
Key Takeaway: Bathrooms are a critical feature in real estate, and the data clearly shows that homes with more bathrooms tend to be
priced higher. This could be due to buyer preferences for convenience and comfort, especially in larger homes designed for families.
Garage Size:
The average garage size is 50.35, with a mode of 60, indicating that larger garages are fairly common. Despite this, the correlation with price is very
weak (-0.0083), meaning garage size doesn’t significantly affect home prices.
Key Takeaway: While a larger garage may be appealing to certain buyers, it’s not a key driver of property value. In real estate, factors like proximity
to amenities, schools, or scenic views may be more important to buyers than extra storage or parking space.
Lot Size:
The average lot size is 49.92, with a median of 50, but the mode of 92 suggests that a few properties have large lots. Still, the weak negative correlation
(-0.0259) indicates that larger lot sizes don’t necessarily increase property prices.
Key Takeaway: The size of the land itself isn’t always the deciding factor in a property’s value. Homes on larger lots may not command higher
prices unless the land is in a desirable area, or there is potential for further development or recreational use.
Neighborhood Rating:
The average neighborhood rating is 5.53, with a median of 5.57, indicating that most properties are located in well-rated areas. The correlation with
price is weakly positive (0.0110), suggesting slightly higher prices in better-rated neighborhoods.
Key Takeaway: While the neighborhood’s rating does affect price, it’s not a dominant factor. Other variables like proximity to city centers, public
transport, or local schools may drive prices higher than the general perception of the neighborhood alone.
Overall Takeaways
• Property size and number of rooms are not the main price drivers: While larger homes or more rooms may
be appealing, they don’t automatically command higher prices. Other elements like location, amenities, and
neighborhood quality can play a bigger role.
• Bathrooms are a key selling feature: There’s a clear trend that homes with more bathrooms are priced higher,
showing that buyers highly value this convenience, particularly in family-oriented or luxury homes.
• Stable pricing suggests a mature market: With low price variability, this real estate market is likely
established and competitive, which may benefit both buyers (predictability) and sellers (steady demand).
• Lot size and garage space offer limited price influence: While larger lots and garages can be appealing to
certain buyers, they do not significantly impact property values. Other aspects of the home and its
surroundings seem to hold more weight in determining price.
• Neighborhood rating slightly impacts price: Living in a better-rated neighborhood does help boost property
value slightly, though other factors like accessibility and development potential may play a larger role.
New York
New York City is famous for its fast-paced
urban life, iconic landmarks like the Statue
of Liberty, and its status as a global
business hub. The real estate market in New
York is highly competitive and expensive,
with demand driven by the city’s cultural,
financial, and entertainment sectors.
New York
Major Statistical Insights
Higher Variance and Standard Deviation in Key Features
- Number of Rooms:
- Standard Deviation: 28.63
- Variance: 819.96
- Insight: These values indicate a large spread in the number of rooms, suggesting that homes range widely in size.
- Square Footage:
- Standard Deviation: 28.46
- Variance: 809.71
- Insight: Significant variation in home sizes reflects the diverse housing options available in this market.
- Number of Bathrooms:
- Standard Deviation: 28.76
- Variance: 826.98
- Insight: The diversity in bathroom count further highlights the market's range, from small homes to larger, high-end properties.
- Number of Rooms:
- Correlation: 0.000998
- Insight: Virtually no relationship between the number of rooms and price, suggesting room count isn't a major pricing factor
in this market.
- Number of Bathrooms:
- Correlation: 0.0059
- Insight: Like rooms, bathroom count has minimal impact on price.
- Lot Size:
- Correlation: -0.0276
- Insight: Surprisingly, larger lot sizes have a negative relationship with price, which contradicts traditional real estate trends.
Key Insight: The weak correlations suggest that factors other than home characteristics—such as location, neighborhood
amenities, or proximity to city centers—are likely more influential in determining home prices.
Negative Slope for Lot Size and Garage Size
- Lot Size:
- Slope: -0.1074
- Correlation: -0.0276
- Insight: Larger lot sizes slightly reduce home prices in this dataset, which is unusual and suggests that lot size may not be highly
valued in this market.
- Garage Size:
- Slope: -0.0386
- Correlation: -0.0097
- Insight: Garage size also shows a slight negative impact on price, indicating that garage space may not be a key selling point.
Key Insight: In urban areas, factors like proximity to amenities or public transit often outweigh the value of larger lots or garages.
This could explain the negative relationships in this dataset.
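A slope and a correlation always share the same sign, but the size of the slope depends on the units of the two variables and on which one is treated as the outcome. The sketch below illustrates this with hypothetical lot sizes and prices; `summarize` is an illustrative helper, and the slides do not state which variable was regressed on which.

```python
import math

def summarize(xs, ys):
    """Return (r, slope of y regressed on x, slope of x regressed on y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    r = s_xy / math.sqrt(s_xx * s_yy)
    # Least-squares slopes: the covariance term divided by the spread
    # of whichever variable is used as the predictor.
    return r, s_xy / s_xx, s_xy / s_yy

# Hypothetical lot sizes and prices, invented for illustration.
lot_size = [10, 30, 50, 70, 90]
price = [49_320, 49_316, 49_322, 49_314, 49_318]

r, slope_price_on_lot, slope_lot_on_price = summarize(lot_size, price)
print(r, slope_price_on_lot, slope_lot_on_price)
```

Both slopes are negative, matching the sign of r, yet they differ by a factor of one hundred because the two variables have very different spreads. The product of the two slopes equals r squared, which offers a quick consistency check on any reported slope.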
Price Stability
Key Insight: Price stability often indicates a balanced market with consistent demand, attracting buyers and investors who seek
steady returns over time.
- Correlation: 0.0091
- Slope: 0.0033
- Insight: Neighborhood ratings have a slight positive impact on price, but the effect is minimal in this dataset.
Key Insight: While neighborhood desirability is typically a key driver of property values, other factors—such as local amenities or
infrastructure—might be more important in determining home prices in this market.
1. External Factors Likely Driving Prices: The weak correlation between home features (e.g., rooms, square footage) and price
suggests that location or market conditions are more influential. In cities like Chicago, smaller homes in prime areas can command
higher prices than larger homes in less desirable neighborhoods.
2. Focus on Home Flexibility and Variability: The high variance in features such as room count and square footage suggests a broad
range of housing options, from entry-level homes to luxury properties. This diversity offers opportunities to target different buyer
preferences.
3. Garage and Lot Size May Not Be Key Selling Points: The negative correlation and slope for garage and lot size suggest that these
features are not highly valued in this market. In densely populated urban areas, buyers may prioritize other factors like location or
modern amenities over additional space.
4. Stable Price Environment: The low price variance signals a stable real estate market, which is attractive to both buyers and
investors looking for predictable property values and returns.
Conclusion
In conclusion, the analysis reveals that physical attributes—such as the number of rooms, bathrooms, or garage size—
have limited influence on property prices across the cities analyzed. Despite high variability in home features like
square footage or room count, these factors show weak correlations with price. Instead, broader influences like location,
market demand, and economic conditions appear to play a much larger role in determining property values.
The relatively stable price patterns observed in the data suggest that the market is mature and competitive, with limited
fluctuation based on physical features alone. This highlights the importance of considering qualitative factors—such as
neighborhood desirability, proximity to key amenities, and local market trends—when making real estate investment
decisions.
Future research should expand on these findings by exploring other influential variables, including interior quality,
market timing, buyer preferences, and interest rates, to gain a more comprehensive understanding of what drives
housing prices. A combination of these insights can empower real estate investors and market analysts to make more
informed and strategic decisions in navigating the complexities of the real estate market.