21css303t-Data Science Unit-3 Visualization

Matplotlib

Uploaded by

Arthy J
21CSS303T-DATA SCIENCE

UNIT-3 VISUALIZATION

by
J. Arthy,
AP\CSE,
SRMIST,
Ramapuram
AGENDA
1. Introduction to Matplotlib

2. Customizing Plots

3. Seaborn Library

4. 3D Plot of Surface
INTRODUCTION TO MATPLOTLIB

1. Matplotlib is a multiplatform data visualization library built on NumPy arrays.

2. Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.

3. Matplotlib makes easy things easy and hard things possible.

4. Create publication quality plots.

5. Make interactive figures that can zoom, pan, update.

6. Customize visual style and layout.

7. Export to many file formats.

8. Embed in JupyterLab and Graphical User Interfaces.

9. Use a rich array of third-party packages built on Matplotlib.


IMPORTING MATPLOTLIB
In[1]: import matplotlib as mpl

import matplotlib.pyplot as plt

The plt interface is what we will use most often, as we’ll see throughout this presentation.
SETTING STYLES

We will use the plt.style directive to choose appropriate aesthetic styles for our figures. Here we will set the classic style,
which ensures that the plots we create use the classic Matplotlib style:

In[2]: plt.style.use('classic')

from matplotlib import style


print(plt.style.available)

Output:
['Solarize_Light2', '_classic_test_patch', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight',
'ggplot', 'grayscale', 'seaborn', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark', 'seaborn-dark-palette',
'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel',
'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'tableau-colorblind10']

The exact list depends on the installed Matplotlib version. From Matplotlib 3.6 onward the seaborn styles carry a
'seaborn-v0_8-' prefix (for example 'seaborn-v0_8-whitegrid').
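Because the available style names vary between Matplotlib versions, a script can check plt.style.available before choosing one. The following is a minimal sketch of that idea; the helper name pick_style and the candidate list are assumptions made for this example and are not part of Matplotlib.

```python
import matplotlib.pyplot as plt


def pick_style(candidates, fallback="default"):
    """Return the first candidate style this Matplotlib install provides.

    pick_style is a hypothetical helper written for this example.
    """
    for name in candidates:
        if name in plt.style.available:
            return name
    return fallback


# Newer releases prefix the seaborn styles with 'seaborn-v0_8-'.
style_name = pick_style(["seaborn-v0_8-whitegrid", "seaborn-whitegrid"])
plt.style.use(style_name)
print("Using style:", style_name)
```

If neither candidate is installed, the sketch falls back to 'default', which plt.style.use() always accepts.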
PLOTTING FROM A SCRIPT

# ------- file: myplot.py ------


import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 100)
plt.plot(x, np.sin(x))
plt.plot(x, np.cos(x))

plt.show()
$ python myplot.py

The plt.show() command does a lot under the hood, as it must interact with your system’s interactive
graphical backend. The details of this operation can vary greatly from system to system and even installation
to installation, but Matplotlib does its best to hide all these details from you.

One thing to be aware of: the plt.show() command should be used only once per Python session, and is
most often seen at the very end of the script. Multiple show() commands can lead to unpredictable
backend-dependent behavior, and should mostly be avoided.
PLOTTING FROM AN IPYTHON SHELL

It can be very convenient to use Matplotlib interactively within an IPython shell. IPython is built to work well with
Matplotlib if you specify Matplotlib mode. To enable this mode, you can use the %matplotlib magic command
after starting ipython:
In [1]: %matplotlib
Using matplotlib backend: TkAgg

In [2]: import matplotlib.pyplot as plt

At this point, any plt plot command will cause a figure window to open, and further commands can be run to
update the plot. Some changes (such as modifying properties of lines that are already drawn) will not draw
automatically; to force an update, use plt.draw(). Using plt.show() in Matplotlib mode is not required.
PLOTTING FROM AN IPYTHON NOTEBOOK

The IPython notebook is a browser-based interactive data analysis tool that can combine narrative, code, graphics, HTML elements, and
much more into a single executable document. Plotting interactively within an IPython notebook can be done with the %matplotlib
command, and works in a similar way to the IPython shell. In the IPython notebook, you also have the option of embedding graphics
directly in the notebook, with two possible options:

● %matplotlib notebook will lead to interactive plots embedded within the notebook
● %matplotlib inline will lead to static images of your plot embedded in the notebook

In[3]: %matplotlib inline
After you run this command (it needs to be done only once per kernel/session), any cell within the notebook that creates a plot
will embed a PNG image of the resulting graphic:
In[4]: import matplotlib.pyplot as plt
import numpy as np
# Create an array of values for x from 0 to 10
x = np.linspace(0, 10, 100)
# Create a new figure object
fig = plt.figure()
# Plot sin(x) with a solid line
plt.plot(x, np.sin(x), '-', label='sin(x)')
# Plot cos(x) with a dashed line
plt.plot(x, np.cos(x), '--', label='cos(x)')
# Add labels and a title
plt.xlabel('x')
plt.ylabel('y')
plt.title('Plot of sin(x) and cos(x)')
# Display a legend
plt.legend()
# Show the plot
plt.show()
SAVING FIGURES TO FILE

One nice feature of Matplotlib is the ability to save figures in a wide variety of formats. You can save a figure using the savefig()
command. For example, to save the previous figure as a PNG file, you can run this:

In[5]: fig.savefig('my_figure.png')

We now have a file called my_figure.png in the current working directory:

In[6]: !ls -lh my_figure.png

-rw-r--r-- 1 jakevdp staff 16K Aug 11 10:59 my_figure.png

To confirm that it contains what we think it contains, let’s use the IPython Image object to display the contents of this file
(Figure 4-2):
In[7]: from IPython.display import Image

Image('my_figure.png')
In savefig(), the file format is inferred from the extension of the given filename. Depending on what backends you have
installed, many different file formats are available. You can find the list of supported file types for your system by using the
following method of the figure canvas object:

In[8]: fig.canvas.get_supported_filetypes()
Out[8]: {'eps': 'Encapsulated Postscript',
'jpeg': 'Joint Photographic Experts Group',
'jpg': 'Joint Photographic Experts Group',
'pdf': 'Portable Document Format',
'pgf': 'PGF code for LaTeX',
'png': 'Portable Network Graphics',
'ps': 'Postscript',
'raw': 'Raw RGBA bitmap',
'rgba': 'Raw RGBA bitmap',
'svg': 'Scalable Vector Graphics',
'svgz': 'Scalable Vector Graphics',
'tif': 'Tagged Image File Format',

'tiff': 'Tagged Image File Format'}
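The relationship between the filename extension and the output format can be sketched as follows. This is an illustrative example only: the temporary output directory, the chosen list of extensions, and the dpi value are assumptions, and which formats are actually written depends on the backend in use.

```python
import os
import tempfile

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Ask the canvas which file types this installation can write.
supported = fig.canvas.get_supported_filetypes()

out_dir = tempfile.mkdtemp()  # assumed scratch location for this example
saved = []
for ext in ["png", "pdf", "svg"]:
    if ext in supported:
        path = os.path.join(out_dir, "my_figure." + ext)
        # The format is inferred from the extension of the filename.
        fig.savefig(path, dpi=150)
        saved.append(path)

print(saved)
plt.close(fig)
```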


In[9]: plt.figure() # create a plot figure

# create the first of two panels and set current axis

plt.subplot(2, 1, 1) # (rows, columns, panel number)

plt.plot(x, np.sin(x))

# create the second panel and set current axis

plt.subplot(2, 1, 2)

plt.plot(x, np.cos(x));
INTERFACES
A potentially confusing feature of Matplotlib is its dual interfaces: a convenient MATLAB-style state-based interface, and a more
powerful object-oriented interface.

1. Matplotlib was originally written as a Python alternative for MATLAB users, and much of its syntax reflects that fact.
The MATLAB-style tools are contained in the pyplot (plt) interface. For example, the two-panel code in In[9] above will
probably look quite familiar to MATLAB users.
2. The object-oriented interface is available for more complicated situations, and for when you want more control over
your figure. Rather than depending on some notion of an “active” figure or axes, in the object-oriented interface the
plotting functions are methods of explicit Figure and Axes objects. To re-create the previous plot using this style of
plotting, you might do the following.
In[10]: # First create a grid of plots

# ax will be an array of two Axes objects

fig, ax = plt.subplots(2)

# Call plot() method on the appropriate object

ax[0].plot(x, np.sin(x))

ax[1].plot(x, np.cos(x));
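The notion of an “active” axes in the state-based interface can be made visible with plt.sca() and plt.gca(), which set and return the current axes. The sketch below is only an illustration of how the two interfaces relate; the variable name drew_on_first is an assumption for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots(2)

# State-based interface: select the first panel, then draw on "the current axes".
plt.sca(ax[0])
plt.plot(x, np.sin(x))
drew_on_first = plt.gca() is ax[0]

# Object-oriented interface: name the target axes explicitly.
ax[1].plot(x, np.cos(x))

print("pyplot drew on ax[0]:", drew_on_first)
print("lines per panel:", [len(a.get_lines()) for a in ax])
```

The object-oriented call never depends on which axes happens to be current, which is why it scales better to figures with many panels.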
SIMPLE PLOT GRAPH

Perhaps the simplest of all plots is the visualization of a single function y=f(x). Here we will take a first look at creating a simple
plot of this type. As with all the following sections, we’ll start by setting up the notebook for plotting and importing the functions
we will use:
In[1]: %matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')  # named 'seaborn-v0_8-whitegrid' from Matplotlib 3.6 onward
import numpy as np
For all Matplotlib plots, we start by creating a figure and an axes. In their simplest form, a figure and axes can be created as
follows:
In[2]: fig = plt.figure()
ax = plt.axes()
In Matplotlib, the figure (an instance of the class plt.Figure) can be thought of as a single container that
contains all the objects representing axes, graphics, text, and labels. The axes (an instance of the class
plt.Axes) is a bounding box with ticks and labels, which will eventually contain the plot elements that
make up our visualization.
Once we have created an axes, we can use the ax.plot function to plot some data. Let’s start with a simple
sinusoid:

In[3]: fig = plt.figure()

ax = plt.axes()

x = np.linspace(0, 10, 1000)

ax.plot(x, np.sin(x));
If we want to create a single figure with multiple lines, we can simply call the plot function multiple times:
In[5]: plt.plot(x, np.sin(x))
plt.plot(x, np.cos(x));
Adjusting the Plot: Line Colors and Styles

In[6]:

plt.plot(x, np.sin(x - 0), color='blue') # specify color by name

plt.plot(x, np.sin(x - 1), color='g') # short color code (rgbcmyk)

plt.plot(x, np.sin(x - 2), color='0.75') # Grayscale between 0 and 1

plt.plot(x, np.sin(x - 3), color='#FFDD44') # Hex code (RRGGBB from 00 to FF)

plt.plot(x, np.sin(x - 4), color=(1.0,0.2,0.3)) # RGB tuple, values between 0 and 1

plt.plot(x, np.sin(x - 5), color='chartreuse'); # all HTML color names supported


In[7]: plt.plot(x, x + 0, linestyle='solid')

plt.plot(x, x + 1, linestyle='dashed')

plt.plot(x, x + 2, linestyle='dashdot')

plt.plot(x, x + 3, linestyle='dotted');

# For short, you can use the following codes:

plt.plot(x, x + 4, linestyle='-') # solid

plt.plot(x, x + 5, linestyle='--') # dashed

plt.plot(x, x + 6, linestyle='-.') # dashdot

plt.plot(x, x + 7, linestyle=':'); # dotted


If you would like to be extremely terse, these linestyle and color codes can be combined into a single
non-keyword argument to the plt.plot() function:

In[8]: plt.plot(x, x + 0, '-g') # solid green

plt.plot(x, x + 1, '--c') # dashed cyan

plt.plot(x, x + 2, '-.k') # dashdot black

plt.plot(x, x + 3, ':r'); # dotted red


ADJUSTING THE PLOT: AXES LIMITS
The most basic way to adjust axis limits is to use the plt.xlim() and plt.ylim() functions:

In[9]: plt.plot(x, np.sin(x))

plt.xlim(-1, 11)

plt.ylim(-1.5, 1.5);
A useful related method is plt.axis() (note here the potential confusion between axes with an e, and
axis with an i). The plt.axis() method allows you to set the x and y limits with a single call, by passing
a list that specifies [xmin, xmax, ymin, ymax]:

In[11]: plt.plot(x, np.sin(x))

plt.axis([-1, 11, -1.5, 1.5]);
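In the object-oriented interface the same limits are set through methods of the Axes object. The following sketch assumes the sine data used above and shows ax.set(), ax.get_xlim(), and the effect of passing limits in reverse order; the variable names are chosen for this example only.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 1000)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# ax.set() accepts several properties at once, including both limits.
ax.set(xlim=(-1, 11), ylim=(-1.5, 1.5))
xlim = ax.get_xlim()
ylim = ax.get_ylim()

# Passing the limits in reverse order flips the direction of the axis.
ax.set_xlim(11, -1)
x_reversed = ax.get_xlim()

print(xlim, ylim, x_reversed)
```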


LABELLING PLOTS

In[14]: import matplotlib.pyplot as plt

import numpy as np

x = np.linspace(0, 10, 100)

# Plot sine function


plt.plot(x, np.sin(x))

# Add title and labels


plt.title("A Sine Curve")
plt.xlabel("x")
plt.ylabel("sin(x)")

# Display the plot


plt.show()
When multiple lines are being shown within a single axes, it can be useful to create a plot legend that labels
each line type. Again, Matplotlib has a built-in way of quickly creating such a legend. It is done via the
plt.legend() method.

In[15]: plt.plot(x, np.sin(x), '-g', label='sin(x)')

plt.plot(x, np.cos(x), ':b', label='cos(x)')

plt.axis('equal')

plt.legend();
SIMPLE SCATTER PLOTS

In[1]: %matplotlib inline

import matplotlib.pyplot as plt

plt.style.use('seaborn-whitegrid')  # named 'seaborn-v0_8-whitegrid' from Matplotlib 3.6 onward

import numpy as np
In the previous section, we looked at plt.plot/ax.plot to produce line plots. It turns out that this same
function can produce scatter plots as well:

In[2]: import matplotlib.pyplot as plt

import numpy as np

# Define x values
x = np.linspace(0, 10, 30)
y = np.sin(x)
plt.plot(x, y, 'o', color='black');
plt.show()
The third argument in the function call is a character that represents the type of symbol used for the plotting. Just as
you can specify options such as '-' and '--' to control the line style, the marker style has its own set of short string
codes. The full list of available symbols can be seen in the documentation of plt.plot, or in Matplotlib’s online
documentation. Most of the possibilities are fairly intuitive, and we’ll show a number of the more common ones here:
In[3]: rng = np.random.RandomState(0)
for marker in ['o', '.', ',', 'x', '+', 'v', '^', '<', '>', 's', 'd']:
    plt.plot(rng.rand(5), rng.rand(5), marker,
             label="marker='{0}'".format(marker))
plt.legend(numpoints=1)

plt.xlim(0, 1.8);
A second, more powerful method of creating scatter plots is the plt.scatter function, which can be used
very similarly to the plt.plot function:

In[6]: plt.scatter(x, y, marker='o');


VISUALIZING THE 3D IMAGE

import matplotlib.pyplot as plt


import numpy as np

def f(x, y):
    # Height of the surface at each (x, y) point
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)

# Define x and y ranges


x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)

# Compute Z values
Z = f(X, Y)

# Create contour plot


plt.contour(X, Y, Z, colors='black')
plt.show()
The primary difference of plt.scatter from plt.plot is that it can be used to create scatter plots where the properties of each
individual point (size, face color, edge color, etc.) can be individually controlled or mapped to data.

Let’s show this by creating a random scatter plot with points of many colors and sizes. In order to better see the overlapping results,
we’ll also use the alpha keyword to adjust the transparency level:
In[7]: rng = np.random.RandomState(0)
x = rng.randn(100)
y = rng.randn(100)
colors = rng.rand(100)
sizes = 1000 * rng.rand(100)
plt.scatter(x, y, c=colors, s=sizes, alpha=0.3,
cmap='viridis')

plt.colorbar(); # show color scale


In[5]: plt.contour(X, Y, Z, 20, cmap='RdGy');
In[6]: plt.contourf(X, Y, Z, 20, cmap='RdGy')

plt.colorbar();
HISTOGRAMS,BINNING AND DENSITY

A simple histogram can be a great first step in understanding a dataset.

import numpy as np
import matplotlib.pyplot as plt

# Use a different style


plt.style.use('ggplot') # You can try 'bmh', 'seaborn-v0_8-white' ('seaborn-white' before Matplotlib 3.6), or others

# Generate random data


data = np.random.randn(1000)

# Plot a histogram
plt.hist(data)

# Display the plot
plt.show()
The hist() function has many options to tune both the calculation and the display; here’s an example of a
more customized histogram:

In[3]: # density=True replaces the normed=True keyword, which was removed in Matplotlib 3.1
plt.hist(data, bins=30, density=True, alpha=0.5,
         histtype='stepfilled', color='steelblue',
         edgecolor='none');
The plt.hist docstring has more information on other customization options available. I find this combination of
histtype='stepfilled' along with some transparency alpha to be very useful when comparing histograms of several
distributions:

In[4]: x1 = np.random.normal(0, 0.8, 1000)

x2 = np.random.normal(-2, 1, 1000)

x3 = np.random.normal(3, 2, 1000)

kwargs = dict(histtype='stepfilled', alpha=0.3, density=True, bins=40)

plt.hist(x1, **kwargs)

plt.hist(x2, **kwargs)

plt.hist(x3, **kwargs);
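The binning that plt.hist performs can also be computed without drawing anything, using np.histogram. The sketch below assumes normally distributed sample data with a fixed seed and checks what density=True means: the bar heights times the bin widths sum to 1.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, assumed here for reproducibility
data = rng.randn(1000)

# Raw counts per bin, plus the bin edges (one more edge than bins).
counts, bin_edges = np.histogram(data, bins=30)

# With density=True the heights are scaled so the total area equals 1.
density, _ = np.histogram(data, bins=30, density=True)
bin_widths = np.diff(bin_edges)
area = np.sum(density * bin_widths)

print(counts.sum(), len(bin_edges), area)
```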
Customizing Colorbars
In[3]: x = np.linspace(0, 10, 1000)
I = np.sin(x) * np.cos(x[:, np.newaxis])

plt.imshow(I)
plt.colorbar();
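A few common colorbar customizations can be sketched with the object-oriented interface. The choice of the 'RdBu' colormap, the color limits of -1 and 1, and the label text are assumptions made for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 1000)
I = np.sin(x) * np.cos(x[:, np.newaxis])

fig, ax = plt.subplots()
# A diverging colormap suits data centered on zero; vmin/vmax fix the color limits.
im = ax.imshow(I, cmap='RdBu', vmin=-1, vmax=1)

# extend='both' draws pointed ends to indicate values beyond the limits.
cbar = fig.colorbar(im, ax=ax, extend='both')
cbar.set_label('value')
```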
In[12]: # load images of the digits 0 through 5 and visualize several of them

from sklearn.datasets import load_digits

digits = load_digits(n_class=6)

fig, ax = plt.subplots(8, 8, figsize=(6, 6))

for i, axi in enumerate(ax.flat):
    axi.imshow(digits.images[i], cmap='binary')
    axi.set(xticks=[], yticks=[])
plt.axes: Subplots

The most basic method of creating an axes is to use the plt.axes function. As we’ve seen previously, by default this creates a standard axes object that fills the entire figure. plt.axes also takes an optional argument that is a list of four numbers in the figure coordinate system. These numbers represent [left, bottom, width, height] in the figure coordinate system, which ranges from 0 at the bottom left of the figure to 1 at the top right of the figure.

For example, we might create an inset axes at the top-right corner of another axes by setting the x and y position to 0.65 (that is, starting at 65% of the width and 65% of the height of the figure) and the x and y extents to 0.2 (that is, the size of the axes is 20% of the width and 20% of the height of the figure).
In[2]: ax1 = plt.axes() # standard axes

ax2 = plt.axes([0.65, 0.65, 0.2, 0.2])
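The order of the four numbers can be checked by reading the position back from the axes. This sketch uses fig.add_axes, the object-oriented counterpart of plt.axes, with inset values chosen (as an assumption for this example) so that all four numbers differ.

```python
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_axes([0.1, 0.1, 0.8, 0.8])     # main axes
ax2 = fig.add_axes([0.65, 0.6, 0.2, 0.25])   # inset: left, bottom, width, height

# get_position() returns a bounding box in figure coordinates.
left, bottom, width, height = ax2.get_position().bounds
print(left, bottom, width, height)
```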


plt.subplot: Simple Grids of Subplots

Aligned columns or rows of subplots are a common enough need that Matplotlib has several convenience
routines that make them easy to create. The lowest level of these is plt.subplot(), which creates a
single subplot within a grid. As you can see, this command takes three integer arguments—the number of
rows, the number of columns, and the index of the plot to be created in this scheme, which runs from the
upper left to the bottom right:
In[4]: for i in range(1, 7):
    plt.subplot(2, 3, i)
    plt.text(0.5, 0.5, str((2, 3, i)),
             fontsize=18, ha='center')
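For larger grids, plt.subplots() creates the whole grid in one call and returns the axes in a NumPy array. The following sketch assumes the same 2 x 3 layout; the sharex and sharey settings are an illustrative choice.

```python
import matplotlib.pyplot as plt

# One call builds the full grid; axes in a column share x, axes in a row share y.
fig, ax = plt.subplots(2, 3, sharex='col', sharey='row')

# The array is indexed [row, col] starting from zero,
# unlike plt.subplot(), whose panel index starts at 1.
for i in range(2):
    for j in range(3):
        ax[i, j].text(0.5, 0.5, str((i, j)), fontsize=18, ha='center')
```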
Text and Annotation

Creating a good visualization involves guiding the reader so that the figure tells a story. In some cases, this
story can be told in an entirely visual manner, without the need for added text, but in others, small textual
cues and labels are necessary. Perhaps the most basic types of annotations you will use are axes labels and
titles, but the options go beyond this.
In[1]: %matplotlib inline

import matplotlib.pyplot as plt

import matplotlib as mpl

plt.style.use('seaborn-whitegrid')  # named 'seaborn-v0_8-whitegrid' from Matplotlib 3.6 onward

import numpy as np

import pandas as pd
Arrows and Annotation

In[7]: %matplotlib inline


fig, ax = plt.subplots()
x = np.linspace(0, 20, 1000)
ax.plot(x, np.cos(x))
ax.axis('equal')
ax.annotate('local maximum', xy=(6.28, 1), xytext=(10, 4),
arrowprops=dict(facecolor='black', shrink=0.05))
ax.annotate('local minimum', xy=(5 * np.pi, -1), xytext=(2, -6),
arrowprops=dict(arrowstyle="->",
connectionstyle="angle3,angleA=0,angleB=-90"));
Major and Minor Ticks

Within each axis, there is the concept of a major tick mark and a minor tick mark. As the names would imply,
major ticks are usually bigger or more pronounced, while minor ticks are usually smaller. By default,
Matplotlib rarely makes use of minor ticks, but one place you can see them is within logarithmic plots:
In[1]: %matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')  # named 'seaborn-v0_8-whitegrid' from Matplotlib 3.6 onward

import numpy as np

In[2]: ax = plt.axes(xscale='log', yscale='log')


import matplotlib.pyplot as plt

import numpy as np

import math

x = np.arange(0, math.pi*2, 0.05)

fig = plt.figure()

ax = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # main axes

y = np.sin(x)

ax.plot(x, y)

ax.set_xlabel('angle')

ax.set_title('sine')

ax.set_xticks([0,2,4,6])

ax.set_xticklabels(['zero','two','four','six'])

ax.set_yticks([-1,0,1])
Locator class      Description

NullLocator        No ticks
FixedLocator       Tick locations are fixed
IndexLocator       Locator for index plots (e.g., where x = range(len(y)))
LinearLocator      Evenly spaced ticks from min to max
LogLocator         Logarithmically spaced ticks from min to max
MultipleLocator    Ticks and range are a multiple of base
MaxNLocator        Finds up to a max number of ticks at nice locations
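A locator from the table is attached to an axis with set_major_locator() or set_minor_locator(). The sketch below places major ticks at multiples of pi/2 and minor ticks at multiples of pi/4 on a sine curve; the data range and spacing are assumptions chosen for this example.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

x = np.linspace(0, 2 * np.pi, 200)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))
ax.set_xlim(0, 2 * np.pi)

# Major ticks at every multiple of pi/2, minor ticks at every multiple of pi/4.
ax.xaxis.set_major_locator(MultipleLocator(np.pi / 2))
ax.xaxis.set_minor_locator(MultipleLocator(np.pi / 4))

major_ticks = ax.get_xticks()
spacing = np.diff(major_ticks)
print(major_ticks)
```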


Three-Dimensional Plotting in Matplotlib

Matplotlib was initially designed with only two-dimensional plotting in mind. Around the time of the 1.0 release, some three-
dimensional plotting utilities were built on top of Matplotlib’s two-dimensional display, and the result is a convenient (if
somewhat limited) set of tools for three-dimensional data visualization. We enable three-dimensional plots by importing the
mplot3d toolkit, included with the main Matplotlib installation:
In[1]: from mpl_toolkits import mplot3d
Once this submodule is imported, we can create a three-dimensional axes by passing the
keyword projection='3d' to any of the normal axes creation routines:
In[2]: %matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
In[3]: fig = plt.figure()
ax = plt.axes(projection='3d')
Three-Dimensional Points and Lines

The most basic three-dimensional plot is a line or scatter plot created from sets of (x, y, z) triples. In analogy
with the more common two-dimensional plots discussed earlier, we can create these using the ax.plot3D
and ax.scatter3D functions. The call signature for these is nearly identical to that of their two-
dimensional counterparts, so you can refer to “Simple Line Plots” and “Simple Scatter Plots” for more
information on controlling the output. Here we’ll plot a trigonometric spiral, along with some points drawn
randomly near the line (Figure 4-93):
In[4]: ax = plt.axes(projection='3d')
# Data for a three-dimensional line
zline = np.linspace(0, 15, 1000)
xline = np.sin(zline)
yline = np.cos(zline)
ax.plot3D(xline, yline, zline, 'gray')
# Data for three-dimensional scattered points
zdata = 15 * np.random.random(100)
xdata = np.sin(zdata) + 0.1 * np.random.randn(100)
ydata = np.cos(zdata) + 0.1 * np.random.randn(100)
ax.scatter3D(xdata, ydata, zdata, c=zdata, cmap='Greens');
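The agenda also lists a 3D plot of a surface. A minimal sketch of one is shown here, using the Axes3D method plot_surface on gridded data. The function f, the grid size, and the colormap are assumptions chosen for this example, not taken from the slides.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d  # noqa: F401  (registers the '3d' projection on older versions)


def f(x, y):
    # Example surface: a radially symmetric ripple
    return np.sin(np.sqrt(x ** 2 + y ** 2))


x = np.linspace(-6, 6, 30)
y = np.linspace(-6, 6, 30)
X, Y = np.meshgrid(x, y)   # two-dimensional grids of x and y values
Z = f(X, Y)

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
surf = ax.plot_surface(X, Y, Z, cmap='viridis', edgecolor='none')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
ax.set_title('surface')
```

Unlike plot3D and scatter3D, plot_surface expects X, Y, and Z as two-dimensional arrays, which is why np.meshgrid is used to build the grid.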
SEABORN
Seaborn has many of its own high-level plotting routines, but it can also overwrite Matplotlib’s default
parameters and in turn get even simple Matplotlib scripts to produce vastly superior output. We can set the
style by calling Seaborn’s set() method. By convention, Seaborn is imported as sns:
In[4]: import seaborn as sns

sns.set()

In[5]: # same plotting code as above!

plt.plot(x, y)

plt.legend('ABCDEF', ncol=2, loc='upper left');


In[6]: data = np.random.multivariate_normal([0, 0], [[5, 2], [2, 2]], size=2000)

data = pd.DataFrame(data, columns=['x', 'y'])

for col in 'xy':
    plt.hist(data[col], density=True, alpha=0.5)


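In[8]: # sns.distplot is deprecated since seaborn 0.11; histplot with kde=True draws a comparable histogram and density curve
sns.histplot(data['x'], kde=True, stat='density')
sns.histplot(data['y'], kde=True, stat='density');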
Thank you
