
MCA_S3_Data Visualisation_U2

Uploaded by

Ramu Atmuri
Copyright
© All Rights Reserved
Data Visualisation

Unit-02
Introduction to Charts and Plots

Semester-03
Master of Computer Applications

Names of Sub-Units

Charts/Plots Used For Visualisation, Types Of Charts/Plots Used For Data Visualisation, Line Chart,
Area Chart, Bar Chart, Scatter Chart, Pie Chart, Surface Chart, Bubble Chart, Doughnut Chart,
Histogram, Box Plot, Hexbin Plots, Violin Plot, Heat Maps, Gantt Charts, Word Clouds (Text
Visualisation), Effectiveness Of Visualisation Across Data Types, Quantitative Data, Qualitative
(Categorical) Data

Overview

This unit begins by discussing the concept of charts or plots. Next, the unit explains the different types
of charts or plots used in data visualisation. Towards the end, the unit covers the effectiveness of
visualisation across data types.

Learning Objectives

In this unit, you will learn to:

• Explain the concept of a chart or plot
• Describe the uses of the line chart, scatter chart, bubble chart, histogram, bar chart, pie chart and doughnut chart
• Discuss the use of box plots, hexbin plots, violin plots and heat maps
• Outline the concept of the Gantt chart and the word cloud
• Interpret the effectiveness of data visualisation across data types

Learning Outcomes

At the end of this unit, you will be able to:

• List the types of charts and plots
• Examine the significance of the line chart, scatter chart, bubble chart, histogram, bar chart, pie chart and doughnut chart in data visualisation
• Define quantitative data and qualitative (categorical) data
• Analyse the concept of the Gantt chart and the word cloud
• Evaluate the effectiveness of data visualisation across data types

Pre-Unit Preparatory Material

• https://sites.tufts.edu/gis/files/2016/02/Introduction_to_Data_Visualisation.pdf

2.1 INTRODUCTION
A chart is a graphical representation of data in which the data is represented by indicators or symbols, such as bars in a bar chart, lines in a line chart or slices in a pie chart. A chart can thus show tabular numeric data, functions or some kinds of qualitative structure, and different chart types convey different information.
A data chart is a type of diagram or graph that organises and represents a set of numerical or qualitative data. The term chart also covers maps that are embellished with additional information for a specific purpose, such as a nautical chart or an aeronautical chart, which are often spread across several map sheets. Other domain-specific constructs that are commonly referred to as charts include the chord chart, which is used in music notation, and the record chart, which is used to track album popularity.
Charts are often used to make large quantities of data, and the relationships between parts of the data, easier to understand. Charts can usually be read more quickly than the raw data. They are used in a wide variety of fields and can be created by hand or on a computer using a charting application. Certain types of charts are more useful than others for presenting a given data set.

2.2 CHARTS/PLOTS USED FOR VISUALISATION


When data is collected, it needs to be interpreted and analysed to produce insight. This insight usually concerns patterns, trends or relationships between variables. Data interpretation is the process of reviewing data in well-defined ways that help to assign meaning to the information and reach a relevant conclusion. Analysis is the process of ordering, categorising and summarising data to answer research questions. It should be done quickly and effectively, and the results need to stand out and be immediately apparent. Choosing the right type of plot for the visualisation is a vital part of achieving this. As the volume of data grows, so does this need, which is why data plots have become important in today’s world. However, many types of plots are used in data visualisation, and it is often difficult to decide which type is best for your business or data. Each of these plots has its own strengths and weaknesses, which make it better than others in some situations.

2.3 TYPES OF CHARTS/PLOTS USED FOR DATA VISUALISATION
Data can be presented in various visual forms, which include simple line diagrams, bar graphs, tables, matrices, etc. Some techniques used for the visual presentation of data are as follows:
• Line chart
• Area chart
• Bar chart
• Scatter chart
• Pie chart
• Surface chart
• Bubble chart
• Doughnut chart
• Histogram
• Box plot
• Hexbin plot
• Violin plot
• Heat maps
• Gantt chart
• Word clouds (text visualisation)

Let us discuss these types one by one in the subsequent sections.

2.3.1 Line Chart


Line charts are used to plot continuous data in the form of lines. Each point on a line chart corresponds to a data value. A line chart can show any number of data series (that is, columns of continuous, related data), and you can distinguish the lines by using different colours or line styles. For example, plotting the budget and expenses of an organisation as a line chart may enable you to identify cost fluctuations. Figure 1 shows a line chart:
Figure 1: Line Chart (vertical axis: number of houses sold)


To represent data, a line chart uses a horizontal axis (x-axis) and a vertical axis (y-axis). The x-axis usually represents the time period, while the y-axis represents the item being measured. A line chart clearly depicts the rising or declining trend of a specific item. You can use a line chart in the following situations:
• To track the evolution of a dependent variable over time
• To identify trends and spot peaks and troughs
• To compare multiple data series

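The idea of spotting peaks and troughs in a line chart can be sketched in a few lines of Python. This is only an illustration: the function name find_turning_points and the monthly sales figures are assumptions made for the example, not part of the unit. A point counts as a peak when it is higher than both neighbours, and as a trough when it is lower than both.

```python
def find_turning_points(values):
    """Return (peaks, troughs): indices of strict local maxima and minima."""
    peaks, troughs = [], []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if cur > prev and cur > nxt:
            peaks.append(i)
        elif cur < prev and cur < nxt:
            troughs.append(i)
    return peaks, troughs


# Hypothetical monthly house sales (illustrative values only).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
houses_sold = [20, 35, 30, 50, 40, 60]

peaks, troughs = find_turning_points(houses_sold)
print("Peaks:", [months[i] for i in peaks])
print("Troughs:", [months[i] for i in troughs])
```

On a plotted line, these indices are the points where the line changes direction, which is what the eye picks out when reading the chart.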
2.3.2 Area Chart
In an area chart, areas are used to represent values. It is similar to a line chart in that it displays a series as a set of points connected by a line. The difference is that in an area chart, the area below the line is filled with the colour of the line. Area charts help to draw attention to the total value across a data series. For example, data showing the turnover of a business over time can be plotted in an area chart to focus on the total turnover of the business.
Figure 2 shows the area chart:

Figure 2: Area Chart (series: Sales and Expenses; horizontal axis: Year)


You can use an area chart in the following situations:
• To give your data a sense of volume
• To see how parts of groups relate to the total
• To determine the magnitude of a quantitative data trend
• To highlight the magnitude of a change
• To show large differences between values

2.3.3 Bar Chart


A bar chart is a visual presentation of categorical data. The data is represented by a number of bars, each representing a different category. The height of each bar corresponds to a specific aggregate (for example, the sum of the values in the category it represents). A geographical area or an age group could be used as categories. You may also colour or split each bar according to another categorical column in the data, allowing you to observe how different categories contribute to each bar or set of bars in the bar chart.
The bars can be plotted either horizontally or vertically. A column chart is another name for a vertical bar chart.

Figure 3 shows the bar chart:

Figure 3: Bar Chart (vertical axis: Records; categories: age groups from <18 to 70-79)


A bar chart is a good choice when you want to compare values across categories. It is one of the most widely used chart types because it is simple to understand. Bar charts can be used to illustrate data that has been classified into nominal or ordinal categories.
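The aggregation behind a bar chart (one total per category, which becomes the bar height) can be sketched as follows. The records, the age groups and the function name aggregate_by_category are hypothetical and chosen only for illustration; the bars are drawn as rows of '#' characters instead of with a charting library.

```python
from collections import defaultdict

# Hypothetical records: (age group, number of purchases). Illustrative only.
records = [
    ("<18", 2), ("18-29", 5), ("18-29", 3),
    ("30-39", 4), ("<18", 1), ("30-39", 6),
]


def aggregate_by_category(rows):
    """Sum the values of each category; each sum is the height of one bar."""
    totals = defaultdict(int)
    for category, value in rows:
        totals[category] += value
    return dict(totals)


totals = aggregate_by_category(records)

# Draw one horizontal bar per category, with length equal to the total.
for category, total in totals.items():
    print(f"{category:>6} | {'#' * total} {total}")
```

Any other aggregate (a count, a mean, a maximum) could replace the sum without changing the structure of the chart.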

2.3.4 Scatter Chart


Scatter charts are used to show the relationship between the numeric values in two data series. This type of chart plots one data series on the x-axis (that is, horizontally) and the other data series on the y-axis (that is, vertically). Scatter charts are usually used to analyse and compare scientific, statistical and engineering values.
Figure 4 shows the scatter chart:

Figure 4: Scatter Chart (horizontal axis: Year; vertical axis: Data)

The fundamental purpose of a scatter chart is to analyse and display the relationship between two different numeric variables. A scatter chart can also help you spot additional patterns in your data. This chart is also used to identify whether the two variables are correlated.
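The correlation that a scatter chart makes visible can also be quantified. The sketch below computes the Pearson correlation coefficient, a standard measure that the unit itself does not name, so treat its use here as an assumption. The function name pearson_r and the sample values are hypothetical. A result close to +1 or -1 corresponds to points lying close to a straight line, and a result near 0 to a shapeless cloud.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: hours studied (x-axis) and test score (y-axis).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]

r = pearson_r(hours, score)
print(f"r = {r:.3f}")
```

The function assumes that neither series is constant; a constant series has zero variance and would cause a division by zero.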

2.3.5 Pie Chart


A pie chart is used to show the relative proportion that each value in a single data series contributes to a whole. Pie charts are most effective when representing a small number of categories. The chart presents information and statistics as pie slices. It represents numbers as percentages, and the total of all slices should equal 100%. Figure 5 shows the daily activity of an individual using a pie chart:

Figure 5: Pie Chart (one labelled slice: Work)
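The percentages shown on a pie chart follow directly from the raw values: each slice is its value divided by the total. A minimal sketch, assuming a hypothetical breakdown of a 24-hour day (the activity names and hours are invented for illustration, as is the function name slice_percentages):

```python
def slice_percentages(amounts):
    """Convert raw amounts into the percentage share of each pie slice."""
    total = sum(amounts.values())
    return {label: 100 * value / total for label, value in amounts.items()}


# Hypothetical hours spent on daily activities (illustrative only).
daily_hours = {"Work": 8, "Sleep": 8, "Leisure": 4, "Meals": 2, "Travel": 2}

shares = slice_percentages(daily_hours)
for label, share in shares.items():
    # A full circle is 360 degrees, so 1% of the pie spans 3.6 degrees.
    print(f"{label:<8} {share:5.1f}%  {share * 3.6:6.1f} degrees")
```

Because every share is computed against the same total, the slices always add up to the whole circle.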


You can use a pie chart in the following situations:
• To illustrate part-to-whole comparisons, in contexts ranging from business graphs to classroom charts
• To identify the smallest and largest items within a data set
• To compare differences between multiple data points

2.3.6 Surface Chart


A surface chart shows a three-dimensional surface that connects a set of data points. A surface chart is
useful when you want to find optimum combinations between two sets of data. Similar to a topographic
map, the colours and patterns in a surface chart indicate areas that contain the same range of values.
Unlike other chart types, colours in a surface chart are not used to distinguish each data series. Instead,
colours are used to distinguish the values. Figure 6 shows the surface chart:

Figure 6: Surface Chart (legend ranges: $0-$20,000, $20,000-$40,000, $40,000-$60,000)

2.3.7 Bubble Chart


A bubble chart is similar to a scatter chart, with the only difference being that it displays bubbles instead of data points. You can use a bubble chart in place of a scatter chart if your data has three or more data series, each of which contains a set of values. The size of the bubbles in a bubble chart is determined by the values of the third data series. Figure 7 shows the bubble chart:

Figure 7: Bubble Chart (legend: Series 1 to Series 4)


Bubble charts are commonly used to compare and display correlations between categorised circles using their position and size. Bubble charts can be used to look for patterns and correlations by looking at the full picture.

2.3.8 Doughnut Chart


A doughnut chart, similar to a pie chart, is used to show the relationship of parts to a whole. However, unlike a pie chart, a doughnut chart can contain more than one data series. Figure 8 shows the doughnut chart:

Figure 8: Doughnut Chart

2.3.9 Histogram
A histogram is used to show the frequency of values within a distribution. Each column in a histogram is known as a bin. Continuous data can be represented using a histogram, which makes it easy to analyse how the data is spread across various ranges.
Figure 9 shows the histogram chart:

Figure 9: Histogram Chart (vertical axis: Frequency)


Histograms are effective for displaying the variation of a single scale variable. The data is binned and summarised using a count or percentage statistic. A frequency polygon is similar to a regular histogram but uses an area graphic element instead of bars.
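Binning, the step that turns raw values into histogram columns, can be sketched as below. The equal-width bins, the function name histogram_counts and the sample ages are assumptions made for illustration. Each bin covers a half-open range, except the last, which also includes the upper limit.

```python
def histogram_counts(values, low, high, bin_count):
    """Count how many values fall into each of bin_count equal-width bins."""
    width = (high - low) / bin_count
    counts = [0] * bin_count
    for v in values:
        if not low <= v <= high:
            continue  # ignore values outside the plotted range
        # The upper limit itself belongs to the last bin.
        index = min(int((v - low) / width), bin_count - 1)
        counts[index] += 1
    return counts


# Hypothetical ages (illustrative only), binned into five bins of width 5.
ages = [15, 17, 21, 22, 22, 26, 28, 31, 33, 34, 38, 40]
counts = histogram_counts(ages, low=15, high=40, bin_count=5)

for i, count in enumerate(counts):
    start = 15 + i * 5
    print(f"{start}-{start + 5}: {'#' * count}")
```

Changing bin_count changes the shape of the histogram, which is why the choice of bin width matters when reading one.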

2.3.10 Box Plot


A box plot, often known as a box-and-whisker plot, is a visual representation of the spread and centre of a data set. It is suitable for comparing related statistical data sets side by side and summarises the raw data directly, without requiring the reader to apply any formula. The data is divided into quartiles, with the median (and, in some variants, the mean) and any outliers highlighted.
Figure 10 shows the box plot:

Figure 10: Box Plot (groups: Experiment 20, Experiment 22, Experiment 24; series: Data Set 1 and Data Set 2)
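The quartiles and outliers drawn in a box plot can be computed with Python's standard statistics module. The sketch below makes two assumptions that the unit does not state: quartiles are computed with the "inclusive" method of statistics.quantiles, and outliers are flagged with the common 1.5 x IQR rule. The function name box_plot_summary and the readings are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Return the quartiles and the outliers that a box plot would show."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range: the height of the box
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Hypothetical experiment readings (illustrative only).
readings = [200, 300, 320, 350, 400, 420, 450, 500, 900]
summary = box_plot_summary(readings)
print(summary)
```

The box spans q1 to q3, the line inside the box marks the median, and values beyond the fences are drawn as individual outlier points.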

2.3.11 Hexbin Plots


When you have a lot of data points, a hexbin plot is handy for representing the relationship between two numerical variables. The plotting window is divided into numerous hexagonal bins (hexbins) to avoid overlapping points. The number of points in each hexbin is indicated by its colour. Figure 11 shows the hexbin plot:

Figure 11: Hexbin Plot


A hexbin plot is made by using a regular array of hexagons to span the data range and colouring each hexagon according to the number of observations it covers. Hex-binned plots, like all bin plots, are useful for viewing large data sets that a scatter plot would overplot.

2.3.12 Violin Plot
A violin plot is a cross between a box plot and a density plot that shows peaks in the data. It is used to show how numerical data is distributed. Unlike a box plot, which shows only summary statistics, a violin plot also shows the density of each variable. A violin plot uses density curves to represent numeric data distributions for one or more groups. Figure 12 shows the violin plot:

Figure 12: Violin Plot
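The density curve that gives a violin plot its outline is typically obtained by kernel density estimation. The sketch below uses a Gaussian kernel with a fixed bandwidth; both choices, along with the function name gaussian_kde and the sample values, are assumptions made for illustration, and plotting libraries usually pick the bandwidth automatically. The printed rows of '*' show half of the violin turned on its side.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of samples at a point x."""
    n = len(samples)
    norm = 1 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical measurements (illustrative only).
samples = [-0.4, -0.3, -0.1, 0.0, 0.0, 0.1, 0.1, 0.2, 0.4]
density = gaussian_kde(samples, bandwidth=0.15)

# Evaluate the curve on a grid; a wider row means more data near that value.
for step in range(-5, 6):
    x = step / 10
    print(f"{x:+.1f} {'*' * round(density(x) * 10)}")
```

A violin plot mirrors this curve on both sides of a central axis, so the widest part of the violin marks the peak of the distribution.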

2.3.13 Heat Maps


A heat map is a type of data visualisation in which the individual values in a matrix are represented by variations in colour. Heat maps are effective for displaying patterns in correlations by depicting variation among several variables.
Heat maps may be used to cross-examine multivariate data in tabular form by placing variables in the rows and columns and colouring the cells inside the table. Heat maps are useful for expressing variance across numerous variables, identifying patterns, showing whether or not variables are similar to one another and determining whether or not there are any correlations between them.
Figure 13 shows the heat map:

Figure 13: Heat Map (row labels: Town A to Town K)
Typically, all of the rows belong to one category, while all of the columns belong to another. The subcategories are placed in separate rows and columns, which are matched against each other in a matrix.
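The mapping from matrix values to colours can be sketched with text shades standing in for colours. The five-level scale, the function name to_shades and the rainfall figures are hypothetical. Each value is scaled linearly between the smallest and largest value in the matrix and then replaced by the matching shade.

```python
SHADES = ".:+*#"  # lightest to darkest; stands in for a colour scale


def to_shades(matrix):
    """Replace every value in the matrix with a shade character."""
    flat = [value for row in matrix for value in row]
    low, high = min(flat), max(flat)
    span = (high - low) or 1  # avoid dividing by zero for a constant matrix
    rows = []
    for row in matrix:
        cells = []
        for value in row:
            level = int((value - low) / span * (len(SHADES) - 1))
            cells.append(SHADES[level])
        rows.append("".join(cells))
    return rows


# Hypothetical rainfall in mm: one row per town, one column per month.
rainfall = [
    [10, 20, 30, 40],
    [50, 40, 30, 20],
    [10, 10, 50, 50],
]

for town, shades in zip(["Town A", "Town B", "Town C"], to_shades(rainfall)):
    print(town, shades)
```

Reading across a row shows how one town changes from month to month, and reading down a column compares the towns within a single month.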

2.3.14 Gantt Charts


The Gantt chart illustrates the project schedule by summarising the tasks to be performed, their start and end dates, their sequence or order of occurrence and overall duration. It also indicates important milestones in the project. The Gantt chart, developed by Henry Gantt in 1917, is a type of bar chart that illustrates the entire project schedule. A Gantt chart consists of a horizontal axis representing the total time span of the project, divided into increments (for example, days, weeks or months), and a vertical axis representing the tasks that make up the project. Horizontal bars of varying length represent the sequence, timing and duration of each task. Bars may overlap for tasks that are carried out during the same time span. Arrows may be added to show the sequencing. A vertical line is used to represent the report date.
Figure 14 shows the Gantt chart:

Figure 14: Gantt Chart (tasks: Task One to Task Eleven)


Gantt charts are a useful technique for representing the phases and activities that make up a project and for illustrating the project status, but they do not reflect the dependencies between tasks. In other words, Gantt charts cannot clearly tell how one task is falling behind the schedule and what effect it has on other tasks. They mainly focus on schedule management. In addition, Gantt charts neither represent the size of the project nor provide any idea regarding the workload of a project. Therefore, these charts are more helpful for small projects in comparison to large projects.
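The layout of a Gantt chart (time on the horizontal axis, one row per task, a bar from the start of the task to its end) can be sketched as plain text. The task list, the day offsets and the function name render_gantt are hypothetical. Each task is given as a name, a start day counted from the beginning of the project and a duration in days.

```python
def render_gantt(tasks, total_days):
    """Return one text row per task, with '=' marking the days it is active."""
    lines = []
    for name, start, duration in tasks:
        bar = " " * start + "=" * duration
        lines.append(f"{name:<12}|{bar:<{total_days}}|")
    return lines


# Hypothetical schedule: (task name, start day, duration in days).
tasks = [
    ("Task One", 0, 3),
    ("Task Two", 2, 4),
    ("Task Three", 6, 2),
]

chart = render_gantt(tasks, total_days=8)
for line in chart:
    print(line)
```

Task One and Task Two overlap on the third day, which appears as bars sharing a column. As noted above, nothing in this layout records whether Task Two depends on Task One.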

2.3.15 Word Clouds (Text Visualisation)


The size of each word in a word cloud is related to the frequency with which it appears in a given piece of text. The words are then grouped in a cloud. The text can be arranged in any manner, including diagonal lines, columns and within a shape.
Word clouds are also used to represent words with assigned metadata. The colour in a word cloud is normally irrelevant and purely decorative, although it can be used to categorise words or to represent other data variables. Figure 15 shows the word cloud:

Figure 15: Word Cloud
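The link between word frequency and word size can be sketched as follows. The linear scaling between a minimum and a maximum font size is an assumption made for illustration, as are the function name word_sizes and the sample sentence. Real word cloud tools usually also drop common words such as "and", which this sketch does not do.

```python
import re
from collections import Counter


def word_sizes(text, min_size=10, max_size=40):
    """Map each word to a font size that grows with its frequency."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    top = max(counts.values())  # assumes the text contains at least one word
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


# Hypothetical text (illustrative only).
text = "data charts turn data into insight and charts make data clear"
sizes = word_sizes(text)

for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word:<8} {size:.0f} pt")
```

The most frequent word is given the maximum size, and every other word is scaled in proportion to how often it appears relative to that word.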

2.4 EFFECTIVENESS OF VISUALISATION ACROSS DATA TYPES


Data may be shared in a number of ways, and data visualisation is one of the most important. It is not just a useful tool for expressing data accurately, but is also widely used in business.

It is crucial to understand the different categories of data types before choosing the optimal data visualisation for a given situation. Data are broadly divided into two types, which are as follows:
• Quantitative
• Qualitative (also known as categorical)

2.4.1 Quantitative Data


Quantitative data is information that is based on a numerical value. Quantitative data is defined as
information that is recorded on a quantitative scale and for which the distance between categories (if
any) is significant. Quantitative data can be manipulated statistically and visualised in a number of
graphs and charts, including bar charts, histograms, scatter plots, boxplots, pie charts, line graphs, etc.

There are two types of quantitative data, which are:

• Discrete data: A count involving integers is referred to as discrete data. It includes only a finite number of separate values, none of which can be subdivided. The values can only be counted in whole numbers, which means that the data cannot be divided into fractions or decimals.
  Examples of discrete data are:
  • The total number of consumers that purchased various items
  • Each department’s total number of computers
  • The number of goods you buy each week at the grocery store
• Continuous data: Continuous data can take any value within a range and can change continuously. This form of data can be meaningfully broken down into smaller and smaller units.
  Examples of continuous data are:
  • Website traffic
  • Water temperature
  • Wind speed

2.4.2 Qualitative (Categorical) Data


Data that cannot be quantified or tallied in numeric form is referred to as qualitative or categorical data. This kind of data is organised by category rather than by number, which is why it is called categorical data. Audio, images, symbols and text are all examples of qualitative data. A person’s gender, whether male, female or other, is qualitative data.

People’s perceptions are revealed through qualitative data. This information aids market researchers
in gaining a better understanding of their clients’ preferences, allowing them to tailor their ideas and
plans accordingly.

There are two types of qualitative data, which are:

• Ordinal data: Ordinal data has a natural ordering, which means that the values appear in some form of order based on their position on a scale. These data are useful for observational purposes, such as customer reviews, but we are unable to perform any arithmetic operations on them. Examples of ordinal data are:
  • Letter grades received in an exam (A, B, C, D, etc.)
  • Rankings of people in a competition (First, Second, Third, etc.)
  • Financial situation (High, Medium and Low)
• Nominal data: Nominal data is used to label variables that do not have a numerical value. We cannot perform any arithmetic operations on nominal data, and we cannot arrange the data in any order. These data have no discernible order, and their values are divided among various categories. Examples of nominal data are gender, nationality, colour of items, etc.
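The difference between ordered and unordered categories can be illustrated with a short sketch. The grade scale, the function name sort_ordinal and the sample grades are hypothetical. Because the grades have a rank, they can be sorted and a middle value can be picked, even though the letters themselves cannot be added or averaged. Nominal values such as nationality have no such rank and can only be counted.

```python
GRADE_ORDER = ["D", "C", "B", "A"]  # lowest to highest (assumed scale)


def sort_ordinal(values, order):
    """Sort ordinal values by their position on the given scale."""
    rank = {level: position for position, level in enumerate(order)}
    return sorted(values, key=rank.__getitem__)


# Hypothetical exam grades (illustrative only).
grades = ["B", "A", "D", "C", "B", "A", "B"]

ranked = sort_ordinal(grades, GRADE_ORDER)
median_grade = ranked[len(ranked) // 2]  # middle value of an odd-length list

print("Sorted grades:", ranked)
print("Median grade:", median_grade)
```

This is why a bar chart of ordinal data should keep its categories in scale order, while the bars of a nominal variable can be arranged in any order.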

You can apply the different types of data visualisation techniques to both quantitative and qualitative
data.

2.5 CONCLUSION

• A chart is a graphical representation of data in which the data is represented by indicators or symbols.
• A data chart is a type of diagram or graph that organises and represents a set of numerical or qualitative data.
• Data can be presented in various visual forms, which include simple line diagrams, bar graphs, tables, matrices, etc.
• Line charts are used to plot continuous data in the form of lines.
• In an area chart, areas are used to represent values. It is similar to a line chart in that it displays a series as a set of points connected by a line.
• A bar chart is a visual presentation of categorical data.
• Scatter charts are used to show the relationship between the numeric values in two data series.
• A pie chart is used to show the relative proportion that each value in a single data series contributes to a whole.
• A doughnut chart is used to show the relationship of parts to a whole.
• A box plot, often known as a box-and-whisker plot, is a visual representation of the spread and centre of a data set.
• A violin plot is a cross between a box plot and a density plot that shows peaks in the data.
• The Gantt chart illustrates the project schedule by summarising the tasks to be performed, their start and end dates, sequence or order of occurrence and overall duration.
• Data are broadly divided into two types: quantitative and qualitative (also known as categorical).

2.6 GLOSSARY

• Chart: A graphical representation of data in which the data is represented by indicators or symbols
• Scatter chart: A chart used to show the relationship between the numeric values in two data series
• Surface chart: A chart that shows a three-dimensional surface connecting a set of data points
• Histogram: A chart used to show the frequency of values within a distribution
• Heat map: A type of data visualisation in which the individual values in a matrix are represented by variations in colour

2.7 SELF-ASSESSMENT QUESTIONS

A. Essay Type Questions


1. Explain the concept of a chart.
2. Define a bar chart and describe its uses.
3. Explain the significance of the Gantt chart in data visualisation.
4. Outline the term “Categorical Data” and its types.
5. Write a short note on word clouds.

2.8 ANSWERS AND HINTS FOR SELF-ASSESSMENT QUESTIONS

A. Hints for Essay Type Questions


1. A chart is a graphical representation of data in which the data is represented by indicators or symbols, such as bars in a bar chart, lines in a line chart or slices in a pie chart. A chart can thus show tabular numeric data, functions or some kinds of qualitative structure, and different chart types convey different information. Refer to Section 2.1, Introduction
2. A bar chart is a visual presentation of categorical data. The data is represented by a number of bars, each representing a different category. The height of each bar corresponds to a specific aggregate (for example, the sum of the values in the category it represents). Refer to Section 2.3, Types of Charts/Plots Used for Data Visualisation
3. The Gantt chart illustrates the project schedule by summarising the tasks to be performed, their start and end dates, sequence or order of occurrence and overall duration. It also indicates important milestones in the project. The Gantt chart, developed by Henry Gantt in 1917, is a type of bar chart that illustrates the entire project schedule. Refer to Section 2.3, Types of Charts/Plots Used for Data Visualisation
4. Data that cannot be quantified or tallied in numeric form is referred to as qualitative or categorical data. This kind of data is organised by category rather than by number, which is why it is called categorical data. Audio, images, symbols and text are all examples of qualitative data. A person’s gender, whether male, female or other, is qualitative data. Refer to Section 2.4, Effectiveness of Visualisation Across Data Types
5. The size of each word in a word cloud is related to the frequency with which it appears in a given piece of text. The words are then grouped in a cloud. The text can be arranged in any manner, including diagonal lines, columns and within a shape. Refer to Section 2.3, Types of Charts/Plots Used for Data Visualisation

2.9 POST-UNIT READING MATERIAL

• https://john.cs.olemiss.edu/~nhassan/teaching/csci444/fall2017/file/Handbook%20of%20Data%20Visualisation.pdf
• https://boostlabs.com/blog/10-types-of-data-visualisation-tools/

2.10 TOPICS FOR DISCUSSION FORUMS

• Discuss with your friends the different types of charts and plots used for data visualisation and their purposes.
