Excel
Excel
Before going into depth of Data Analytics, let's know Data a bit and it's importance,
Think of data as raw ingredients in a kitchen. Just like vegetables, spices, and grains, data can be
numbers, text, or even clicks on a website! By itself, it’s just a jumble of information, but with the
right recipe, you can turn it into something delicious—meaningful insights!
Imagine running a lemonade stand without knowing which flavors your customers like best. Data
is like a super-smart friend who tells you what’s working, what’s not, and how to improve your
business! It helps companies make smarter decisions, keep customers happy, and be more
successful overall. 🍋📈
1. Descriptive Analysis:
🎉 What Happened? This analysis acts like a storyteller, summarizing historical data to provide a
clear picture of past events. It focuses on metrics and key performance indicators (KPIs) to
understand trends over time.
Key Features:
Uses measures like averages, totals, and percentages.
Helps in creating reports and dashboards that visualize data.
2. Diagnostic Analysis:
🔍 Why Did It Happen? This is where the detective work begins! Diagnostic analysis digs deeper into
the data to identify causes behind certain trends or anomalies. It often involves comparing
different datasets to find correlations.
Key Features:
Utilizes techniques like data mining and correlation analysis.
Helps in identifying patterns that explain why certain events occurred.
3. Predictive Analysis:
🔮 What Will Happen Next? This analysis is like peering into a crystal ball! Predictive analysis uses
statistical models and machine learning algorithms to forecast future outcomes based on historical
data. It helps businesses anticipate trends and prepare accordingly.
Key Features:
Involves techniques such as regression analysis and time series forecasting.
Provides insights that can lead to proactive decision-making.
4. Prescriptive Analysis:
What Should We Do? This is the action-oriented analysis! Prescriptive analysis recommends
specific actions based on insights derived from descriptive, diagnostic, and predictive analyses. It
helps organizations make informed decisions that optimize outcomes.
Key Features:
Utilizes optimization algorithms and simulation models.
Offers actionable recommendations tailored to business objectives.
1. Netflix: When you binge-watch Stranger Things, Netflix uses data analytics to recommend other
shows you might like. It knows your watching habits better than your best friend. They look at
everything—what shows you click on, how long you watch, and even when you hit pause. This is
how they keep you hooked.
2. Spotify: Ever wonder how your Discover Weekly playlist is so on point? Spotify’s algorithm digs
through data from millions of users to suggest songs that match your music taste. It's basically a
data DJ!
3. Pizza Shop (Yes, your favourite pizzaaaaaaa!): Let’s say you run a promotion on Sundays for half-
price pizza. Your data shows that more people are ordering pizza at 5 p.m. than any other time.
What can you do with this info? Maybe extend the promotion to Saturday or start an afternoon
pizza deal. That’s data-driven decision-making.
Curious Detective: They ask questions to understand what the business needs, like why
sales dropped last month.
Data Treasure Hunter: They collect data from various sources, like customer feedback and
sales reports, finding the golden nuggets of information.
Data Cleaner: They ensure the data is accurate and tidy, just like making sure your room is
clean before throwing a party!
Data Explorer: They analyze trends and patterns, spotting opportunities for growth (e.g.,
noticing more pizzas sell on weekends).
Storyteller: They present findings through visuals, making complex data easy to
understand—like turning a boring lecture into a fun story!
Decision-Maker: They provide actionable insights that help businesses decide their next
steps, like launching a new promotion.
Continuous Learner: They keep updating their skills to stay ahead in the ever-evolving
world of data.
So, the next time you hear "data analyst," think of them as your go-to friend for insights that can
make a big difference! 🌟
Data analysts have their secret weapons: Excel, SQL, Python, and many more . They’re like the
Avengers of data!
In the upcoming weeks, you’ll become proficient in these powerful tools, preparing you to take on
the world of data analysis with confidence! 🌟
Excel Foundations
Q. How to select particular column upto certain row or upto a certain table row?
1. Introduction to MS-Excel
Microsoft Excel is a spreadsheet program used for storing, organizing, and manipulating data. It
provides tools for performing calculations, creating charts, and analyzing large datasets.
Examples:
Ribbon Overview: The Ribbon in Excel is divided into several tabs (Home, Insert, Data, etc.), each
containing a set of tools. For example, the Data Tab is used for importing and analyzing external data.
Worksheets and Cells: Excel organizes data into worksheets, where each worksheet is made up of
rows and columns. The intersection of a row and a column is called a cell, which is identified by a cell
address (e.g., A1, B2). You can enter data into these cells.
Quick Access Toolbar: This toolbar is located at the top of the Excel window and contains commonly
used commands such as Save, Undo, and Redo.
Formula Bar: Displays the contents or formulas of the active cell. For instance, if you
input =SUM(A1:A10), the formula bar will display it, showing the calculation of the sum of values
from A1 to A10.
Excel offers several basic features that make it easy to organize and analyze data, including sorting,
filtering, and formatting options.
Examples:
1. Sorting Data: Sort the "products.csv" file by BuyPrice to list products from most expensive to
least expensive.
Go to Data → Sort → Choose the column to sort by (BuyPrice) and select "Descending".
2. Filtering Data: Filter the "customers.csv" file to show only customers from the USA.
3. AutoSum: Use the SUM() function to calculate the total sales from the "orders.csv" file.
OrderNumber OrderDate TotalAmount
Formula
4. Text Formatting: Format the text in the "products.csv" file by making the product names
bold and setting a currency format for prices.
1948 Porsche
S18_2325 $43.40
Type 356
Use the Home Tab → Font section to bold the ProductName and format BuyPrice as currency.
Data validation allows you to set rules that restrict what type of data can be entered into cells.
Examples:
1. Restricting Buy Prices: Set a data validation rule in the "products.csv" file so that
the BuyPrice must be between 20and20and100.
1969 Harley
S18_1749 $51.30
Davidson
1952 Alpine
S18_2248 $34.30
Renault
Go to Data → Data Validation → Set a rule that only allows decimal values between 20 and 100.
2. Order Quantity Limits: Restrict the QuantityOrdered in "order details.csv" to values between
1 and 100.
ProductCode QuantityOrdered
S18_1749 25
ProductCode QuantityOrdered
S18_2248 40
Go to Data Validation → Set the criteria as a whole number between 1 and 100.
3. Date Range: Ensure that the OrderDate in "orders.csv" falls between January 1, 2022, and
December 31, 2023.
2023-07-05 S18_1749 25
2022-09-15 S18_2325 45
4. Restricting Product Codes: Prevent duplicate entries in the "products.csv" file by ensuring
each ProductCode is unique.
1969 Harley
S18_1749 $51.30
Davidson
1948 Porsche
S18_2325 $43.40
Type 356
Use a custom data validation rule with the formula =COUNTIF(A:A, A1) = 1 to restrict duplicate
entries.
4. Conditional Formatting in Excel
Conditional formatting is a feature that allows you to format cells based on their values.
Examples:
1. Highlight High-Value Payments: In the "payments.csv" file, highlight payments greater than
$5000.
CustomerNumber PaymentAmount
112 $5,750
121 $3,300
2. Highlight Low Stock Products: In the "products.csv" file, highlight products with
a BuyPrice lower than $30.
ProductCode ProductName BuyPrice
1952 Alpine
S18_2248 $34.30
Renault
1948 Porsche
S18_2325 $43.40
Type 356
ProductCode QuantityOrdered
S18_1749 50
S18_2248 40
4. Color Scale for Sales: Apply a color scale to the TotalAmount column in "orders.csv" to
visualize sales distribution.
OrderNumber TotalAmount
10101 $1,250
10102 $3,000
Go to Conditional Formatting → Color Scales → Select a gradient to represent different sales values.
Sorting helps you organize data in ascending or descending order based on a specific column.
Examples:
1. Sort by Product Price: Sort the "products.csv" file by BuyPrice in descending order to see the
most expensive products first.
15CV
1948 Porsche
S18_2325 $43.40
Type 356
3. Sort by Payment Date: In the "payments.csv" file, sort by PaymentDate to organize payments
chronologically.
Select the PaymentDate column and apply Sort from oldest to newest.
Filtering allows you to display only the data that meets specific criteria.
Examples:
1. Filter by Country: In the "customers.csv" file, filter the data to show only customers from
France.
128 Cafeter
7. Dealing with Duplicate Values in Excel
Duplicate values in your dataset can lead to misleading insights. Here’s how you can handle them
using Excel’s built-in features.
Examples:
1. Remove Duplicate Product Records: In the products.csv file, each ProductCode should be
unique. We will identify and remove any duplicates to maintain data accuracy.
1969 Harley
S18_1749 $51.30 $95.70
Davidson
1969 Harley
S18_1749 $51.30 $95.70
Davidson
Steps:
Steps:
Steps:
2023-05-
10101 103 Shipped
10
2023-05-
10102 112 Cancelled
12
2023-05-
10101 103 Shipped
10
Steps:
Null values can disrupt data analysis and lead to errors. It’s crucial to handle these values
appropriately to maintain data integrity.
Examples:
1. Identify Missing Product Prices: In the products.csv file, check for missing BuyPrice values
and replace them with a default or average price.
1969 Harley
S18_1749 $51.30 $95.70
Davidson
Steps:
Replace Missing Payment Amounts: In the payments.csv file, any missing Amount values can
be replaced with 0 or another default value.
Steps:
Type 0 and press Ctrl + Enter to fill all blank cells with 0.
Steps:
Fill missing values manually or use logical formulas (e.g., =IF(ISBLANK(B2), "Unknown", B2)).
Handle Null Order Dates: In the orders.csv file, flag any missing OrderDate values and
replace them with a default or estimated date.
Steps:
Use a formula like =IF(ISBLANK(B2), "2023-01-01", B2) to fill missing dates with a default
value.
Data Cleaning & Functions in Excel
Whitespaces, especially leading or trailing spaces, can cause issues when analyzing data. Excel
provides tools to remove these unnecessary spaces, ensuring that data is clean and consistent.
Examples:
1. Removing Leading and Trailing Spaces: In the "customers.csv" file, some customer names
may have extra spaces that affect sorting or searching.
CustomerNumber CustomerName
Steps:
This removes any leading and trailing spaces from the customer names.
2. Removing Spaces from Product Names: In the "products.csv" file, remove unwanted spaces
from ProductName for cleaner data.
ProductCode ProductName
=TRIM(B2)
3. Cleaning Payment Data: In the "payments.csv" file, ensure there are no extra spaces in
the CustomerNumber or Amount fields to prevent issues in analysis.
Apply the TRIM() function to the Amount column to remove any leading or trailing spaces:
=TRIM(C2
4. Combining and Cleaning Names: If FirstName and LastName are stored separately in a
dataset, like in the "employees.csv" file, use TRIM() when combining them to avoid extra
spaces.
FirstName LastName
Column formats are crucial when working with numerical, date, or text data. Incorrect formats can
cause calculation errors or make the data unreadable.
Examples:
1. Fixing Date Formats: In the "orders.csv" file, ensure that all dates are correctly formatted
as Date types instead of text.
Steps:
2. Fixing Currency Formats: In the "payments.csv" file, ensure that payment amounts are
formatted as currency.
Steps:
Go to Home → Number Format → Currency to format all payment amounts with the
appropriate currency symbols.
3. Standardizing Product Prices: In the "products.csv" file, ensure the BuyPrice is formatted
with two decimal places for better consistency.
1969 Harley
S18_1749 51.3
Davidson
1952 Alpine
S18_2248 34
Renault
Steps:
Steps:
Text functions in Excel help manipulate and clean up text-based data, making it easier to manage
large datasets.
Examples:
FirstName LastName
John Doe
Jane Smith
Steps:
2. Extracting Product Codes: In the "order details.csv" file, extract the first 3 characters of
the ProductCode to group similar products.
ProductCode QuantityOrdered
S18_1749 25
S18_2325 45
Steps:
Use the LEFT() function to extract the first 3 characters:
=LEFT(A2, 3)
3. Finding Text in Product Names: In the "products.csv" file, find all products containing the
word "Harley".
ProductCode ProductName
Steps:
=SEARCH("Harley", B2)
4. Changing Case: In the "customers.csv" file, convert customer names to uppercase for
consistency.
CustomerNumber CustomerName
Steps:
=UPPER(B2)
Logical functions allow you to perform conditional operations based on specific criteria,
returning TRUE or FALSE.
Examples:
S18_174
60 $25
9
S18_224
40 $35
8
Steps:
Use the IF() function to apply a 10% discount for large orders:
2. Checking Late Payments: In the "payments.csv" file, check if the payment is late by
comparing PaymentDate with the due date.
2023-08-
103 2023-08-10
05
2023-08-
112 2023-08-12
15
Steps:
3. Checking Stock Levels: In the "products.csv" file, check if the StockLevel is below the reorder
level of 20.
1969 Harley
S18_1749 18
Davidson
1948 Porsche
S18_2325 25
Type 356
Steps:
Steps:
=IF