Unit-5: Data Visualization using DataFrame
Data Visualization:
Using visualization elements like graphs, charts, maps, etc., it becomes easier for
clients to understand the underlying structure, trends, patterns and relationships among
variables within the dataset.
Installing matplotlib
The easiest way to install matplotlib is to use pip. Type the following command in
the terminal:
pip install matplotlib
Importing matplotlib
matplotlib.pyplot is a collection of command-style functions that make
matplotlib work like MATLAB.
import matplotlib.pyplot as plt
Plotting a Line
Plotting Single-Line Data (One Dimension)
Example: (plot4.py)
BY: Heta S. Desai Shri S.V.Patel College of CS & BM Page 1
plot():
The plot() function is used to draw points in a diagram. By default, plot()
draws a line from point to point.
Syntax:
plt.plot(parameter1, parameter2)
Here, parameter1 is an array containing the points on the x-axis and
parameter2 is an array containing the points on the y-axis.
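A minimal sketch of both ways of calling plot(), using made-up sample points (the data and variable names are assumptions for illustration). When only one array is given, matplotlib treats it as the y values and uses 0, 1, 2, ... for x.

```python
import matplotlib.pyplot as plt

# Made-up sample points (assumed data, for illustration only)
x = [1, 2, 3, 4]
y = [2, 4, 1, 5]

# One array: the values are used as y, and x defaults to 0, 1, 2, ...
(line_y_only,) = plt.plot(y)

# Two arrays: the first is x, the second is y
(line_xy,) = plt.plot(x, y)

plt.show()
```

plot() returns a list of line objects; unpacking the single line into a variable is optional and is done here only so the plotted data can be inspected.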
title():
The plt.title() function is used to set the title of a graph.
Syntax:
plt.title("Title of a Graph")
xlabel() and ylabel():
The xlabel() and ylabel() functions are used to label the x-axis and the
y-axis respectively.
Syntax:
plt.xlabel("x-axis name")
plt.ylabel("y-axis name")
show():
The plt.show() function is used to display the plotted points as a graph.
Syntax:
plt.show()
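The four functions described so far are usually used together. The sketch below assumes some made-up marks data; the labels and title text are illustrative only.

```python
import matplotlib.pyplot as plt

# Made-up data (assumption): marks scored in five tests
tests = [1, 2, 3, 4, 5]
marks = [55, 62, 70, 68, 81]

plt.plot(tests, marks)
plt.title("Marks per Test")   # title shown above the graph
plt.xlabel("Test number")     # label for the x-axis
plt.ylabel("Marks")           # label for the y-axis

ax = plt.gca()                # current axes, kept only so the labels can be inspected
plt.show()                    # display the finished graph
```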
Plotting Single-Line Data (Two Dimensions)
Example: (plot1.py)
Multiple Points:
Example: (plot2.py)
Markers:
You can use the keyword argument marker to emphasize each point with a
specified marker:
Example: (plot3.py)
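As a minimal sketch of the marker argument (made-up values, not necessarily the same data as plot3.py):

```python
import matplotlib.pyplot as plt

y = [3, 8, 1, 10]   # made-up values (assumption)

# marker='o' draws a circle at every data point on the line
(line,) = plt.plot(y, marker='o')

plt.show()
```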
You can select any marker from the following list:
Marker    Description
'o'       Circle
'*'       Star
'.'       Point
','       Pixel
'+'       Plus
's'       Square
You can specify a color for the line. You can select a color from the following list:
Color     Description
'r'       Red
'g'       Green
'b'       Blue
'c'       Cyan
'm'       Magenta
'y'       Yellow
'k'       Black
You can select the line style from the following list:
Line      Description
'-'       Solid line
':'       Dotted line
'--'      Dashed line
Example: (plot5.py)
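The marker, line and color codes from the lists above can be combined into one short format string passed as an extra argument to plot(). A minimal sketch with made-up values:

```python
import matplotlib.pyplot as plt

y = [3, 8, 1, 10]   # made-up values (assumption)

# 'o:r' = circle markers ('o'), dotted line (':'), red color ('r')
(line,) = plt.plot(y, 'o:r')

plt.show()
```

The same settings can also be written out as keyword arguments: marker='o', linestyle=':', color='r'.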
Plotting Different Lines with Different Formats
Example: (plot6.py)
What are markeredgecolor, markersize and markerfacecolor?
You can use the keyword argument markersize, or the shorter version ms,
to set the size of the markers:
o Syntax:
plt.plot(x, y, ms=size)
You can use the keyword argument markeredgecolor, or the shorter mec, to
set the color of the edge of the markers:
o Syntax:
plt.plot(x, y, mec='color')
You can use the keyword argument markerfacecolor, or the shorter mfc, to
set the color inside the edge of the markers:
o Syntax:
plt.plot(x, y, mfc='color')
Example: (plot7.py)
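A short sketch that sets all three marker properties at once. The values are made up, and the size and colors are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt

y = [3, 8, 1, 10]   # made-up values (assumption)

# Large circle markers with a red edge and a yellow face
(line,) = plt.plot(y, marker='o', ms=20, mec='r', mfc='y')

plt.show()
```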
Functions:
1) range():
2) legend():
A legend is an area describing the elements of the graph.
Syntax:
plt.legend(['string'], loc='location', ncol=number)
Attributes:
loc:
The attribute loc in legend() is used to specify the
location of the legend. The default value is loc="best",
which places the legend where it overlaps the plotted data
least. The strings 'upper left', 'upper right', 'lower
left' and 'lower right' place the legend at the corresponding
corner of the axes.
ncol:
Represents the number of columns that the legend has. Its
default value is 1.
Example: (plot8.py)
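A minimal sketch of legend() with both attributes, assuming two made-up lines. The label strings are matched to the lines in the order in which they were plotted.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]                 # made-up data (assumption)
plt.plot(x, [1, 4, 9, 16])
plt.plot(x, [1, 8, 27, 64])

# Two labels, placed in the upper-left corner and laid out in two columns
legend = plt.legend(['squares', 'cubes'], loc='upper left', ncol=2)

plt.show()
```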
3) subplot():
With the subplot() function you can draw multiple plots in one figure. The
subplot() function takes three arguments that describe the layout of the figure.
The layout is organized in rows and columns, which are represented by the first
and second arguments. The third argument represents the index of the current
plot.
Syntax:
plt.subplot(rows, columns, index_current_plot)
Example:
plt.subplot(1, 2, 1)
#the figure has 1 row, 2 columns, and this plot is the first plot.
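Extending the call above into a complete sketch: two plots side by side, each drawn right after its own subplot() call. The data is made up for illustration.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]   # made-up data (assumption)

# 1 row, 2 columns, first plot (left)
ax_left = plt.subplot(1, 2, 1)
plt.plot(x, [3, 8, 1, 10])

# 1 row, 2 columns, second plot (right)
ax_right = plt.subplot(1, 2, 2)
plt.plot(x, [10, 20, 30, 40])

plt.show()
```

Each plot() call draws into the most recently selected subplot, so the order of the calls matters.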
Example: (plot9.py)
Example: (plot10.py)
You can specify a title for each plot using the title() function and a super title for the whole figure using the suptitle() function.
Example: (plot11.py)
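A small sketch of the difference between the two functions, with made-up data and title strings: title() applies to the current subplot only, while suptitle() applies to the figure as a whole.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]   # made-up data (assumption)

ax_sales = plt.subplot(1, 2, 1)
plt.plot(x, [3, 8, 1, 10])
plt.title("Sales")                      # title of the left plot only

ax_income = plt.subplot(1, 2, 2)
plt.plot(x, [10, 20, 30, 40])
plt.title("Income")                     # title of the right plot only

super_title = plt.suptitle("My Shop")   # one title for the whole figure

plt.show()
```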
How to create a bar chart?
With Pyplot, you can use the bar() function to draw bar graphs.
The bar() function takes arguments that describe the layout of the bars.
The categories and their values are given as arrays in the first and second
arguments.
Syntax:
plt.bar(x, y)
Example: (plot12.py)
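A minimal bar chart sketch, assuming four made-up categories and values:

```python
import matplotlib.pyplot as plt

# Made-up categories and values (assumption)
categories = ['A', 'B', 'C', 'D']
values = [3, 8, 1, 10]

# One vertical bar per category; the bar height is the matching value
bars = plt.bar(categories, values)

plt.show()
```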
Horizontal Bar:
If you want the bars to be displayed horizontally instead of vertically, use the barh()
function.
Syntax:
plt.barh(x, y)
Note:
o You can set the color of the bars using the color parameter.
o You can adjust the width of the bars using the width parameter. The default
width is 0.8.
plt.bar(x, y, color='r', width=0.15)
o For horizontal bars, use the height parameter instead of width. The default
height is 0.8.
plt.barh(x, y, color='k', height=0.2)
Example: (plot13.py)
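A sketch of a horizontal bar chart with the color and height parameters from the note above. The categories and values are made up.

```python
import matplotlib.pyplot as plt

# Made-up categories and values (assumption)
categories = ['A', 'B', 'C', 'D']
values = [3, 8, 1, 10]

# Horizontal bars: the value sets the bar length, height sets its thickness
bars = plt.barh(categories, values, color='k', height=0.2)

plt.show()
```

For vertical bars the equivalent call would be plt.bar(categories, values, color='k', width=0.2).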
Example: (plot14.py)
How to create a histogram chart?
A histogram is a graph showing frequency distributions. It is a graph showing the
number of observations within each given interval.
Example:
You can read from the histogram that there are approximately:
2 people from 140 to 145cm
5 people from 145 to 150cm
15 people from 151 to 156cm
31 people from 157 to 162cm
46 people from 163 to 168cm
53 people from 168 to 173cm
45 people from 173 to 178cm
28 people from 179 to 184cm
21 people from 185 to 190cm
4 people from 190 to 195cm
In Matplotlib, we use the hist() function to create histograms.
The hist() function uses an array of numbers to create a histogram; the array is
passed to the function as an argument.
Syntax:
plt.hist(data)
Example: (plot15.py)
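A minimal histogram sketch over a small made-up list of heights in cm (assumed data, much smaller than the example described above). Besides drawing the chart, hist() returns the count in each interval and the interval edges.

```python
import matplotlib.pyplot as plt

# Made-up heights in cm (assumption, for illustration only)
heights = [142, 148, 151, 155, 158, 160, 161, 163, 165,
           166, 168, 170, 171, 173, 176, 179, 183, 188]

# With no other arguments, hist() groups the data into 10 intervals
counts, bin_edges, patches = plt.hist(heights)

plt.show()
```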
rwidth attribute with the hist() function and the ylim() function (rwidth sets the width of each bar as a fraction of the bin width)
xlim() and ylim() functions
xlim():
The xlim() function in pyplot module of matplotlib library is used to get or set the
x-limits of the current axes.
Syntax:
plt.xlim(left,right)
Parameter:
left: The x value at the left edge of the axes.
right: The x value at the right edge of the axes.
ylim():
The ylim() function in pyplot module of matplotlib library is used to get or set the
y-limits of the current axes.
Syntax:
plt.ylim(bottom, top)
Parameter:
bottom: The y value at the bottom edge of the axes.
top: The y value at the top edge of the axes.
Example: (plot16.py)
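A sketch that combines rwidth with xlim() and ylim(), using made-up heights; the limit values are arbitrary choices. Calling xlim() or ylim() without arguments returns the current limits, which covers the "get" use mentioned above.

```python
import matplotlib.pyplot as plt

# Made-up heights in cm (assumption, for illustration only)
heights = [142, 148, 151, 155, 158, 160, 161, 163, 165,
           166, 168, 170, 171, 173, 176, 179, 183, 188]

# rwidth=0.8 makes each bar 80% as wide as its bin, leaving gaps between bars
counts, bin_edges, patches = plt.hist(heights, rwidth=0.8)

plt.xlim(130, 200)        # set the x-limits
plt.ylim(0, 10)           # set the y-limits

left, right = plt.xlim()  # no arguments: get the current x-limits
bottom, top = plt.ylim()  # no arguments: get the current y-limits

plt.show()
```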
Functions: range(), grid():
plt.grid():
With Pyplot, you can use the grid() function to add grid lines to the plot. You can
use the axis parameter in the grid() function to specify which grid lines to display.
Legal values are: 'x', 'y', and 'both'. Default value is 'both'.
You can also set the line properties of the grid, such as its color, line style
and line width.
Syntax:
plt.grid(color = 'color', linestyle = 'linestyle', linewidth = number)
bins:
A histogram displays numerical data by grouping data into "bins" of equal width.
Each bin is plotted as a bar whose height corresponds to how many data points are
in that bin. Bins are also sometimes called "intervals", "classes", or "buckets".
Example: (plot17.py)
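A sketch showing explicit bins together with grid(), using made-up heights. Here bins is given as a list of edges 10 cm apart (an assumed choice); it can also be a single number giving how many equal-width bins to use.

```python
import matplotlib.pyplot as plt

# Made-up heights in cm (assumption, for illustration only)
heights = [142, 148, 151, 155, 158, 160, 161, 163, 165,
           166, 168, 170, 171, 173, 176, 179, 183, 188]

# Five bins of equal width: 140-150, 150-160, 160-170, 170-180, 180-190
counts, bin_edges, patches = plt.hist(heights, bins=[140, 150, 160, 170, 180, 190])
# counts is [2, 3, 6, 5, 2]: the number of heights that fall in each bin

# Horizontal grid lines only, drawn as thin green dashed lines
plt.grid(axis='y', color='green', linestyle='--', linewidth=0.5)

plt.show()
```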
How to create a scatter plot?
With Pyplot, you can use the scatter() function to draw a scatter plot.
The scatter() function plots one dot for each observation. It needs two arrays of the
same length, one for the values of the x-axis, and one for values on the y-axis.
You can set your own color for each scatter plot with the color or the c argument.
colormap:
The Matplotlib module has a number of available colormaps. A colormap is like a
list of colors, where each color has a value that ranges from 0 to 100.
This example uses 'viridis', which is one of the built-in colormaps available in
Matplotlib.
You can specify the colormap with the keyword argument cmap, giving the name of
the colormap as its value.
You can include the colormap in the drawing by adding the plt.colorbar()
statement.
size:
You can change the size of the dots with the s argument.
alpha:
You can adjust the transparency of the dots with the alpha argument.
Syntax:
plt.scatter(x, y, c="color", s=size, alpha=transparency)
Example: (plot18.py)
Example: (plot19.py)
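A sketch that puts the scatter() arguments described above together. All data is made up; the list passed to c holds one number per dot, which the 'viridis' colormap turns into a color.

```python
import matplotlib.pyplot as plt

# Made-up observations (assumption): ten (x, y) pairs
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78]

color_values = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]   # one number per dot
sizes = [20, 50, 100, 200, 500, 1000, 60, 90, 10, 300]   # one size per dot

# cmap maps each number in c to a color; alpha=0.5 makes the dots half transparent
points = plt.scatter(x, y, c=color_values, cmap='viridis', s=sizes, alpha=0.5)

plt.colorbar()   # show the colormap scale next to the plot
plt.show()
```

To give every dot the same color instead, pass a single color such as c='r' and leave out cmap and colorbar().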
Note:
The two plots are drawn in two different colors, blue and orange by default;
you can change the colors.
Example: (plot20.py)
Example: (plot21.py)
Example: (plot22.py)
Example: (plot23.py)
How to plot a bar graph in Python using a CSV file?
Steps:
1) Import module
2) Read file using read_csv() function
3) Plot bar graph
4) Display graph
Example: (plot25.py)
Student_details.csv
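The four steps can be sketched as follows. The contents of Student_details.csv are not shown here, so this sketch reads a small stand-in CSV from a string; the column names Name and Marks and all rows are hypothetical. It assumes pandas is installed.

```python
import io

import matplotlib.pyplot as plt   # step 1: import the modules
import pandas as pd

# Hypothetical stand-in for Student_details.csv (columns and rows are assumptions)
csv_text = """Name,Marks
Asha,78
Ravi,64
Meena,91
"""

# Step 2: read the data into a DataFrame.
# With a real file this would be: df = pd.read_csv("Student_details.csv")
df = pd.read_csv(io.StringIO(csv_text))

# Step 3: plot one bar per row, using one column for x and one for the heights
bars = plt.bar(df["Name"], df["Marks"])
plt.xlabel("Name")
plt.ylabel("Marks")

# Step 4: display the graph
plt.show()
```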