Data Visualization in Data Science

Data visualization is the practice of translating data into visual formats to enhance understanding and insights, essential for various careers and big data projects. It includes various techniques and tools, such as charts and graphs, to identify patterns and trends in data. The document outlines the benefits of data visualization, its applications, and popular tools like Tableau and Infogram.

Uploaded by

Manjeet Kaur
Copyright
© All Rights Reserved
Agenda

• Data visualization
  o What? Why?
  o Benefits
  o Techniques
  o Who uses it?
• Types of Graphs
• Tools
• Techniques in programming
• Best resources
Data Visualization

• Data visualization is the practice of translating information into a visual context, such as a map or graph, to make data easier for the human brain to understand and pull insights from.
• The main goal of data visualization is to make it easier to identify patterns, trends and outliers in large data sets.
• The term is often used interchangeably with others, including information graphics, information visualization and statistical graphics.
• Data visualization is one of the steps of the data science process, which states that after data has been collected, processed and modeled, it must be visualized for conclusions to be made.
• Data visualization is also an element of the broader data presentation architecture (DPA) discipline, which aims to identify, locate, manipulate, format and deliver data in the most efficient way possible.
• Data visualization is important for almost every career.
• It can be used by teachers to display student test results, by computer scientists exploring advancements in artificial intelligence (AI) or by executives looking to share information with stakeholders.
• It also plays an important role in big data projects. As businesses accumulated massive collections of data during the early years of the big data trend, they needed a way to get an overview of their data quickly and easily.
• Visualization tools were a natural fit.
Benefits of Data Visualization

• The ability to absorb information quickly, improve insights and make faster decisions;
• An increased understanding of the next steps that must be taken to improve the organization;
• An improved ability to maintain the audience's interest with information they can understand;
• An easy distribution of information that increases the opportunity to share insights with everyone involved;
• A reduced reliance on data scientists for routine questions, since data is more accessible and understandable; and
• An increased ability to act on findings quickly and, therefore, achieve success with greater speed and fewer mistakes.
Data Visualization Roles

• Showing change over time
• Showing a part-to-whole composition
• Depicting flows and processes
• Looking at how data is distributed
• Comparing values between groups
• Observing relationships between variables
• Looking at geographical data
Change over time

Sparkline
Used for: Small, simple trend overview.
A sparkline is a tiny line chart with no labels and no axes, made to fit inside text or tables. It gives a quick visual idea of how data moves (up, down, or steady) and a mini trend summary without taking up much space.
Good for: Adding trend info in Excel tables. Dashboards or reports where space is limited.
Example: You're making a table of your weekly steps:

Week   Steps   Trend
1      4,000   ▄▅▃▆▂
2      6,200   ▃▆▅▄▇

The last column uses sparklines to show movement in steps without writing numbers.

Line Chart
A line chart shows data points connected by lines. It's like joining dots to show how something grows or shrinks over time. Each dot shows a value at a specific time (like sales in January, February, March, etc.). These dots are connected with lines to show if the values are going up or down.
Good for: Comparing trends (like monthly sales, temperature, stock prices). Seeing how values increase or decrease over time.
Example: Imagine you track your monthly mobile bill:
o Jan: ₹300
o Feb: ₹280
o Mar: ₹350
You can draw a line chart to show these ups and downs month by month.
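As a rough illustration of the sparkline idea, the sketch below maps each value onto one of eight Unicode block characters, so that the lowest value gets the shortest block and the highest gets the tallest. The function name `sparkline` and the choice of characters are assumptions made for this example, not part of any particular tool; the input is the monthly mobile bill from the line chart example.

```python
# Eight block characters, from shortest to tallest (an illustrative choice).
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Return a one-line text sparkline for a sequence of numbers."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no trend, so draw every point at mid height.
        return BLOCKS[3] * len(values)
    top = len(BLOCKS) - 1
    # Rescale each value to an index between 0 (lowest) and 7 (highest).
    return "".join(BLOCKS[round((v - lo) / (hi - lo) * top)] for v in values)


# Monthly mobile bill in rupees: Jan, Feb, Mar.
bills = [300, 280, 350]
print(sparkline(bills))
```

Because the scale runs from the series minimum to its maximum, a sparkline like this shows only the shape of the trend, not absolute size, which matches the "no labels, no axes" description.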

Connected Scatter Plot
Used for: Showing the relationship between two numeric values that change over time.
A connected scatter plot looks like a scatter plot (dots showing points), but it connects the points with lines in the order of time. It shows how two numbers (like temperature and time, or height and weight) change together. The points are connected, but the line may zigzag or loop; it does not always run left to right.
Good for: Showing patterns or relationships between two changing values. Comparing how one variable reacts when the other changes over time.
Example: Let's say X = Study Hours and Y = Test Scores. Each dot is a test day. The plot connects those dots, showing how your score changed based on how much you studied.

Bar Chart
Used for: Showing comparison between values or time periods.
A bar chart uses bars to show values. The taller the bar, the higher the value. Each bar shows a value for one item (like months, people, or products). The height of the bar shows "how much" or "how many."
Good for: Comparing items easily (like sales in Jan, Feb, Mar). Easy to read when the number of items is not too large.
Example: Suppose you track monthly sales:

Month   Sales (₹)
Jan     1,000
Feb     1,500
Mar     900

A bar chart would show three vertical bars, and Feb would be the tallest.
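The monthly sales example can be sketched without any plotting library by drawing each bar as a run of characters whose length is proportional to the value. This is only a minimal text-based sketch: the bars run horizontally rather than vertically, and the function name `text_bar_chart` and its `width` and `symbol` parameters are hypothetical names chosen for this example.

```python
def text_bar_chart(data, width=30, symbol="█"):
    """Return one text line per item; bar length is proportional to the value."""
    peak = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # The largest value fills the full width; the rest scale down from it.
        bar = symbol * round(value / peak * width)
        lines.append(f"{label:<{label_width}} {bar} {value:,}")
    return lines


# Monthly sales in rupees, taken from the table above.
sales = {"Jan": 1000, "Feb": 1500, "Mar": 900}
chart = text_bar_chart(sales)
print("\n".join(chart))
```

As in the slide, the Feb bar comes out longest, so the comparison between months can be read at a glance.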

Box Plot (Box and Whisker Plot)
A box plot shows how a set of numbers is spread out (its distribution). The box spans the middle half of the data (from the first to the third quartile), with a line inside it marking the median. The "whiskers" show the smallest and largest values (excluding outliers).
Good for: Situations where you have many values recorded over time (like test scores every month). Comparing different groups (e.g., scores of boys vs. girls).
Example: Imagine you record test scores of 5 students every month: some scored low, some high, some average. A box plot will show the middle score, the lowest, the highest, and how spread out the scores are, all in one graphic.

Candlestick Chart
Looks similar to a box plot but is used for tracking price changes (like in the stock market). Each candle shows where the price opened, where it closed, and how high and low the price went during the period. Color tells you if the price went up or down.
Good for: Finance, mainly to track stock prices daily, weekly, or monthly.
Example: Say a stock opened at ₹100, went up to ₹120, went down to ₹90, and closed at ₹110. A candlestick will show this as a tall line from 90 to 120 (the whiskers) and a colored box from 100 to 110. If the price went up, the box is white/green; if down, it's black/red.
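The numbers a box plot draws can be computed with Python's standard `statistics` module. The sketch below is one possible way to do it, under two assumptions that the slide does not state: quartiles use the "inclusive" method, and outliers are values more than 1.5 times the interquartile range beyond the box (a common convention). The five test scores are made-up sample data and `box_plot_summary` is a hypothetical helper name.

```python
import statistics


def box_plot_summary(values):
    """Compute the values a box plot displays for one group of numbers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "whisker_low": inside[0],    # smallest value that is not an outlier
        "q1": q1,                    # bottom edge of the box
        "median": median,            # line inside the box
        "q3": q3,                    # top edge of the box
        "whisker_high": inside[-1],  # largest value that is not an outlier
        "outliers": outliers,
    }


# Hypothetical test scores of 5 students for one month.
scores = [50, 55, 60, 65, 100]
summary = box_plot_summary(scores)
print(summary)
```

With these sample scores the value 100 falls outside the upper fence, so it would be drawn as a separate outlier point and the upper whisker would stop at 65.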

Part-to-whole

Pie Chart
A pie chart is a circle divided into slices. Each slice represents a part (or percentage) of the total. The whole circle = 100%, and slices show how much each category contributes.
Good for: Cases where you have 5 or fewer categories and want to compare parts to the whole.
Example: Imagine you have ₹100 and you spend it like this:
o ₹40 on food,
o ₹30 on rent,
o ₹20 on travel,
o ₹10 on others.
A pie chart will show 4 slices representing how your money is divided.

Doughnut Chart
A doughnut chart is like a pie chart with a hole in the middle. It shows the same information as a pie chart (parts of a whole), but it leaves space in the center. You can also display a number or text in the middle (e.g., total amount).
Good for: Cases where you want to show proportions with a cleaner or modern look, or want to add text or a value in the center (like "Total Sales = ₹5000").
Example: Same example as above, but in the center of the doughnut, you could write "Total ₹100".

Waffle Chart / Grid Plot
A waffle chart uses small squares (often 10×10 = 100 squares). Each square represents 1% of the total. The squares are colored based on category size.
Good for: Showing how much each category contributes to a whole (like a pie chart but in square form). Visualizing percentages clearly.
Example: Let's say you surveyed 100 people: 40 like chocolate, 30 like vanilla, 30 like strawberry. A waffle chart will show 40 brown squares (chocolate), 30 white squares (vanilla), and 30 pink squares (strawberry).

Stacked Bar Chart
A bar chart where each bar is divided into smaller colored sections (sub-bars). It shows parts of a whole within a single bar.
Good for: Comparing total values across categories and seeing how each part contributes. An alternative to pie or doughnut charts, with sizes that are easier to compare.
Example: Imagine showing monthly expenses. Bar 1 (January) shows rent, food, travel, and savings. Each part is stacked on top of the other to make one tall bar for January. Bar 2 (February) does the same, so you can compare both the total and the individual parts.
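The waffle chart survey example can be sketched as a 10×10 text grid in which each cell stands for 1% of the responses. This is a minimal sketch that assumes the counts add up to exactly 100; the helper name `waffle_rows` and the one-letter symbols standing in for colors are assumptions made for this example.

```python
def waffle_rows(counts, symbols, per_row=10):
    """Lay out one symbol per unit, filling the grid row by row."""
    total = sum(counts.values())
    if total != per_row * per_row:
        # This sketch only handles the simple case of one cell per 1%.
        raise ValueError("counts must add up to a full grid")
    cells = "".join(symbols[name] * n for name, n in counts.items())
    return [cells[i:i + per_row] for i in range(0, len(cells), per_row)]


# Survey of 100 people, from the example above.
survey = {"chocolate": 40, "vanilla": 30, "strawberry": 30}
# Letters stand in for the brown, white and pink squares.
symbols = {"chocolate": "C", "vanilla": "V", "strawberry": "S"}

grid = waffle_rows(survey, symbols)
print("\n".join(grid))
```

The first four rows come out as chocolate, the next three as vanilla, and the last three as strawberry, which makes the 40/30/30 split easy to count by eye.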

Stacked Area Chart
A line chart with shaded areas below each line. These shaded parts stack on top of each other, showing how smaller parts add up to a total over time.
Good for: Showing how different parts (like departments or product sales) contribute to a total. Trends over time.
Example: Imagine your company earns money from 3 products: A, B, and C. In January, A earns ₹100, B ₹150, and C ₹250. The chart will show the total as ₹500, with each colored layer stacked on top of the previous one.

Stream Graph
A curvy version of the stacked area chart. The layers are stacked around a center line (instead of all starting from the bottom). It shows relative changes, not exact totals.
Good for: Showing how each category increases or decreases over time without focusing on precise numbers. Music genres, social media trends, etc.
Example: If 3 music genres (pop, rock, jazz) change in popularity over years, the stream graph will flow smoothly, showing how much each grows or shrinks, like a river with colored waves.

Waterfall Chart
A chart made with bars that shows how a starting value changes step by step to reach an end value. Some bars are floating to show gains or losses in between.
Good for: Explaining profit/loss or step-by-step changes in data. Showing how you moved from Point A to Point B.
Example: Let's say your company's profit starts at ₹1000, gains ₹200 from sales, loses ₹150 in expenses, and ends at ₹1050. A waterfall chart shows each step as a bar going up or down, like stairs.

Mosaic Plot / Marimekko Chart
Think of it as a stacked bar chart split both vertically and horizontally. It shows two categorical variables at the same time: one divides the width of the boxes, and the other divides the height inside each box.
Good for: Showing relationships between two categories (e.g., gender and education level). Visualizing hierarchical or comparative data.
Example: Suppose you are studying Gender (Male, Female) and Education Level (High School, Graduate, Postgraduate). Each gender gets its own vertical space (e.g., wider for more people), and inside each, the education levels are shown as horizontal sections.

Treemap
A big box made up of many smaller boxes. Each smaller box represents a subcategory and its size reflects the value. Unlike mosaic plots, the cuts don't have to follow a fixed direction.
Good for: Visualizing many parts of a whole and their sizes. Showing hierarchical data in a compact way.
Example: Let's say you run an e-commerce store with different product categories: Electronics, Clothing, Home Appliances, etc. Each box shows how much sales each category made: bigger box = more sales. Inside "Electronics", you can further have sub-boxes like "Mobiles", "Laptops", etc.
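The waterfall example (start at ₹1000, gain ₹200, lose ₹150, end at ₹1050) comes down to a running total, where each intermediate bar "floats" between the total before and after its step. The sketch below computes where each bar would sit; `waterfall_bars` and the tuple layout `(label, bottom, height)` are hypothetical choices for this illustration, not a library API.

```python
def waterfall_bars(start, changes):
    """Return (label, bottom, height) for every bar of a waterfall chart."""
    bars = [("Start", 0, start)]  # the first bar rises from zero
    running = start
    for label, delta in changes:
        # A floating bar spans the running total before and after the step.
        bottom = min(running, running + delta)
        bars.append((label, bottom, abs(delta)))
        running += delta
    bars.append(("End", 0, running))  # the last bar also rises from zero
    return bars


# Profit example from the text: +200 from sales, -150 in expenses.
bars = waterfall_bars(1000, [("Sales", 200), ("Expenses", -150)])
for label, bottom, height in bars:
    print(f"{label:<9} from {bottom:>5} to {bottom + height:>5}")
```

The "Sales" bar spans 1000 to 1200 and the "Expenses" bar spans 1050 to 1200, so the top of each step lines up with the start of the next, giving the staircase look described above.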
[Chart slides whose images were not preserved, by category: Part-to-whole; Flows and processes; How data is distributed; Comparing values between groups; Relationships between variables; Geographical data; Raw]
Data Visualization Tools

• Tableau
• Infogram
• ChartBlocks
• D3.js
• Google Charts
• FusionCharts
• Chart.js
Tableau

• Tableau has a variety of options available, including a desktop app, server and hosted online versions, and a free public option.
• There are hundreds of data import options available, from CSV files to Google Ads and Analytics data to Salesforce data.
• Output options include multiple chart formats as well as mapping capability. That means designers can create color-coded maps that showcase geographically important data in a format that's much easier to digest than a table or chart could ever be.
Infogram

• Infogram is a fully-featured drag-and-drop visualization tool that allows even non-designers to create effective visualizations of data for marketing reports, infographics, social media posts, maps, dashboards, and more.
• Finished visualizations can be exported into a number of formats: .PNG, .JPG, .GIF, .PDF, and .HTML. Interactive visualizations are also possible, perfect for embedding into websites or apps.
• Infogram also offers a WordPress plugin that makes embedding visualizations even easier for WordPress users.
ChartBlocks

• ChartBlocks claims that data can be imported from "anywhere" using their API, including from live feeds. While they say that importing data from any source can be done in "just a few clicks," it's bound to be more complex than other apps that have automated modules or extensions for specific data sources.
• The app allows for extensive customization of the final visualization created, and the chart building wizard helps users pick exactly the right data for their charts before importing the data.
• Designers can create virtually any kind of chart, and the output is responsive, a big advantage for data visualization designers who want to embed charts into websites that are likely to be viewed on a variety of devices.
D3.js

• D3.js is a JavaScript library for manipulating documents using data.
• D3.js requires at least some JS knowledge, though there are apps out there that allow non-programming users to utilize the library.
• Those apps include NVD3, which offers reusable charts for D3.js; Plotly's Chart Studio, which also allows designers to create WebGL and other charts; and Ember Charts, which also uses the Ember.js framework.
Google Charts

• Google Charts is a powerful, free data visualization tool that is specifically for creating interactive charts for embedding online.
• It works with dynamic data and the outputs are based purely on HTML5 and SVG, so they work in browsers without the use of additional plugins. Data sources include Google Spreadsheets, Google Fusion Tables, Salesforce, and other SQL databases.
• There are a variety of chart types, including maps, scatter charts, column and bar charts, histograms, area charts, pie charts, treemaps, timelines, gauges, and many others. These charts can be customized completely, via simple CSS editing.
FusionCharts

• FusionCharts is another JavaScript-based option for creating web and mobile dashboards. It includes over 150 chart types and 1,000 map types.
• It can integrate with popular JS frameworks (including React, jQuery, Ember, and Angular) as well as with server-side languages and frameworks (including PHP, Java, Django, and Ruby on Rails).
• FusionCharts gives ready-to-use code for all of the chart and map variations, making it easier to embed in websites even for those designers with limited programming knowledge.
Chart.js

• Chart.js is a simple but flexible JavaScript charting library. It's open source, provides a good variety of chart types (eight total), and allows for animation and interaction.
• Chart.js uses HTML5 Canvas for output, so it renders charts well across all modern browsers. Charts created are also responsive, so it's great for creating visualizations that are mobile-friendly.
Visualization using Programming

• Python
  o matplotlib
  o seaborn
  o plotly
  o pylab
• R
  o graphics
  o ggplot2
Best Resources to Learn

• https://python-graph-gallery.com
• https://www.r-graph-gallery.com
• https://chartio.com
Thank you
This presentation was created using LibreOffice Impress 7.4.1.2 and can be used freely as per the GNU General Public License.

Web Resources
https://mitu.co.in
http://tusharkute.com

contact@mitu.co.in
tushar@tusharkute.com

@mITuSkillologies @mitu_group @mitu-skillologies @MITUSkillologies @mituskillologies
