Python Dashboards
Contents
How to Use this Document:
Lectures:
Plotly Basics
Plotly Basics Overview
Scatter Plots
Line Charts
Bar Charts
Bubble Charts
Box Plots
Histograms
Histograms - BONUS Example
Distplots
Heatmaps
Exercises: Plotly Basics
Ex1-Scatterplot.py
Ex2-Linechart.py
Ex3-Barchart.py
Ex4-Bubblechart.py
Ex5-Boxplot.py
Ex6-Histogram.py
Ex7-Distplot.py
Ex8-Heatmap.py
Plotly Basics Exercise Solutions
Dash Basics - Layout
Introduction to Dash Basics
Dash Layout
Converting Simple Plotly Plot to Dashboard with Dash
Exercise: Create a Simple Dashboard
Simple Dashboard Exercise Solution
Dash Components
HTML Components
Core Components
Markdown
Using Help() with Dash
Writing Help() to HTML:
Dash - Interactive Components
Interactive Components Overview
Connecting Components with Callbacks
Adding a callback to one component
Connecting two components with callbacks
Concerning style:
Concerning connectivity:
Multiple Inputs
Multiple Outputs
Exercise: Interactive Components
Interactive Components Exercise Solution
Controlling Callbacks with Dash State
Interacting with Visualizations
Introduction to Interacting with Visualizations
Hover Over Data
Click Data
Selected Data
Updating Graphs on Interactions
Code Along Milestone Project
Introduction to Live Updating
Simple Live Updating Example
Deployment
Introduction to Deploying Apps
App Authorization
Deploying App to Heroku
STEP 1 - Install Heroku and Git
STEP 2 - Install virtualenv
STEP 3 - Create a Development Folder
STEP 4 - Initialize Git
STEP 5 (WINDOWS) - Create, Activate and Populate a virtualenv
STEP 5 (macOS/Linux) - Create, Activate and Populate a virtualenv
STEP 6 - Add Files to the Development Folder
app1.py
.gitignore
Procfile
requirements.txt
STEP 6 - Log onto your Heroku Account
STEP 7 - Initialize Heroku, add files to Git, and Deploy
STEP 8 - Visit Your App on the Web!
STEP 9 - Update Your App
TROUBLESHOOTING
APPENDIX I - EXAMPLES CODE:
Plotly Basics
Plotly Basics Overview
basic1.py
basic2.py
Scatter Plots
scatter1.py
scatter2.py
scatter3.py
Line Charts
line1.py
line2.py
line3.py
Bar Charts
bar1.py
bar2.py
bar3.py
Bubble Charts
bubble1.py
bubble2.py
Box Plots
box1.py
box2.py
box3.py
Histograms
hist1.py
hist2.py
hist3.py
hist4.py
histBONUS.py
Distplots
dist1.py
dist2.py
dist3.py
Heatmaps
heat1.py
heat2.py
heat3.py
heat4.py
Plotly Basics Exercise Solutions
Sol1-Scatterplot.py
A Note About the Line Chart Exercise:
Sol2a-Linechart.py
Sol2b-Linechart.py
Sol3a-Barchart.py
Sol3b-Barchart.py
Sol4-Bubblechart.py
Sol5-Boxplot.py
Sol6-Histogram.py
Sol7-Distplot.py
Sol8-Heatmap.py
How to Use this Document:
Underlined text usually indicates a hyperlink, either to an external website or to a location within this document.
Click once on the text to see the link, then click on the link to jump there. External links should open in a separate
browser tab. For example, click here to jump to the heading above.
The Table of Contents at the top of this document offers similar navigation.
Lectures:
Plotly Basics
Plotly Basics Overview
This section compares Plotly to matplotlib using the same data to show the interactivity of plotly in the browser.
The first example provides a static matplotlib plot of four lines (called traces) drawn from random samples.
Create a file named basic1.py and add the following code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
At the terminal run python basic1.py. A separate plot window should appear:
Next we’ll build a Plotly plot with similar data. Create a new file called basic2.py and add the following code:
import numpy as np
import pandas as pd
import plotly.offline as pyo
● We’ve assigned the alias pyo to the plotly.offline import, to distinguish it from
import plotly.plotly as py as shown in most online examples.
Plotly offers online hosting from their website for those who set up an account with them.
Throughout this course we will create offline plotly graphs and run them locally.
● basic2.py uses a list comprehension to build a trace for each column in the DataFrame.
This technique is covered in more detail later.
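The list-comprehension pattern can be previewed without Plotly installed. The sketch below is only a stand-in: the column names A to D and their values are hypothetical, a plain dictionary takes the place of the DataFrame, and plain dictionaries take the place of trace objects. Only the shape of the comprehension carries over.

```python
# Hypothetical data: four columns (A-D) of five values each,
# standing in for the DataFrame used by basic2.py.
columns = {
    'A': [1, 3, 2, 4, 5],
    'B': [2, 2, 3, 1, 4],
    'C': [5, 4, 3, 2, 1],
    'D': [0, 1, 0, 1, 0],
}
x_values = list(range(5))  # shared x-axis positions

# One trace per column, built in a single list comprehension.
# Each dict holds the kind of keyword arguments a scatter trace takes.
traces = [
    dict(x=x_values, y=values, mode='lines', name=name)
    for name, values in columns.items()
]

for trace in traces:
    print(trace['name'], trace['y'])
```

Each pass through the comprehension produces one trace, so adding a fifth column to the data would add a fifth line to the plot with no other change.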
Run this script at the terminal. A browser should open automatically and you should see something like this:
Plots vs. Charts - we seem to use these terms interchangeably. We’ll say things like “a bubble chart is a particular kind
of scatter plot”. The only real difference is that charts use some kind of symbol to represent the data.
From https://en.wikipedia.org/wiki/Chart:
“A chart is a graphical representation of data, in which the data is represented by symbols, such as bars in a bar chart, lines in a line
chart, or slices in a pie chart. A chart can represent tabular numeric data, functions or some kinds of qualitative structure and provides
different info.”
Scatter Plots
A basic scatter plot maps a distribution of data points along an x- and y-axis. To illustrate, we’ll take a random
sample of 100 coordinate pairs, but we’ll seed NumPy’s random number generator so that everyone receives the
same “random” sample.
Create a file named scatter1.py and add the following code:
import plotly.offline as pyo
import plotly.graph_objs as go
import numpy as np
np.random.seed(42)
random_x = np.random.randint(1,101,100)
random_y = np.random.randint(1,101,100)
data = [go.Scatter(
x = random_x,
y = random_y,
mode = 'markers',
)]
pyo.plot(data, filename='scatter1.html')
● scatter1.py plots 100 random coordinate pairs. By seeding the random number generator, we can reproduce
the same plot each time the script is run.
● Now is a good time to mention that NumPy’s random number generator is pseudo-random: it follows a deterministic
algorithm, which is why setting a seed value reproduces the same results. For the same reason, it should never be
used for security-sensitive work such as generating passwords or tokens!
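To make the seeding idea concrete, here is a small sketch that uses Python's standard-library random module rather than NumPy (an assumption made so the example runs anywhere; the principle is the same). The helper name sample is hypothetical. The secrets module is shown as the standard-library option for values that must not be predictable.

```python
import random
import secrets

def sample(seed, n=5):
    """Return n pseudo-random integers in 1..100 from a generator seeded with `seed`."""
    rng = random.Random(seed)   # independent generator with a fixed seed
    return [rng.randint(1, 100) for _ in range(n)]

first = sample(42)
second = sample(42)
other = sample(43)

print(first == second)   # same seed, same sequence
print(first == other)    # a different seed is expected to give a different sequence

# For security-sensitive values, draw from the operating system's
# entropy source instead of a seeded generator:
token = secrets.token_hex(16)   # 16 random bytes as 32 hexadecimal characters
print(len(token))
```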
Run the script and you should see:
● You’ll notice that the plot has no title and no axis labels. To add them we’ll use the graph_objs Layout class to
add features to our graph.
● You may also notice that when you move the cursor across the graph, information is displayed about points on
the graph. However, if more than one point occurs on the same vertical, you’ll see that only one of the points
has data displayed! Fortunately, this can be fixed by adding another parameter inside the layout.
Make a duplicate of scatter1.py and name it scatter2.py. Add the layout and fig lines shown below, and pass fig to pyo.plot:
import plotly.offline as pyo
import plotly.graph_objs as go
import numpy as np
np.random.seed(42)
random_x = np.random.randint(1,101,100)
random_y = np.random.randint(1,101,100)
data = [go.Scatter(
x = random_x,
y = random_y,
mode = 'markers',
)]
layout = go.Layout(
title = 'Random Data Scatterplot', # Graph title
xaxis = dict(title = 'Some random x-values'), # x-axis label
yaxis = dict(title = 'Some random y-values'), # y-axis label
hovermode ='closest' # handles multiple points landing on the same vertical
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='scatter2.html')
● scatter2.py plots the same points as scatter1, but adds a Layout layer which includes a title, axis labels, and fixes
the hover issue. Notice how we bundled both the data and the layout inside a Figure, and had plotly graph the
figure as HTML.
There’s a lot you can do in Plotly to customize the appearance of the graph. scatter3.py is the same as scatter2,
except we’ve added some style to the marker. We changed its color, size, shape, and added a line around it:
import plotly.offline as pyo
import plotly.graph_objs as go
import numpy as np
np.random.seed(42)
random_x = np.random.randint(1,101,100)
random_y = np.random.randint(1,101,100)
data = [go.Scatter(
x = random_x,
y = random_y,
mode = 'markers',
marker = dict( # change the marker style
size = 12,
color = 'rgb(51,204,153)',
symbol = 'pentagon',
line = dict(
width = 2,
)
)
)]
layout = go.Layout(
title = 'Random Data Scatterplot', # Graph title
xaxis = dict(title = 'Some random x-values'), # x-axis label
yaxis = dict(title = 'Some random y-values'), # y-axis label
hovermode ='closest' # handles multiple points landing on the same vertical
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='scatter3.html')
For more information on how you can customize your graphs, visit https://plot.ly/python/reference/#scatter
Resources: https://plot.ly/python/line-and-scatter/ and https://plot.ly/python/reference/#scatter
Line Charts
Line charts are little more than scatter plots that have only one data point per x-value, and (optionally) a line
connecting the markers. To illustrate this, we’ll take another random sample of data that is evenly distributed along
the x-axis.
line1.py makes three copies of the same random dataset. Each set becomes a trace, that is, an independent set of
data that appears on our graph.
import plotly.offline as pyo
import plotly.graph_objs as go
import numpy as np
np.random.seed(56)
x_values = np.linspace(0, 1, 100) # 100 evenly spaced values
y_values = np.random.randn(100) # 100 random values
# Create traces
trace0 = go.Scatter(
x = x_values,
y = y_values+5,
mode = 'markers',
name = 'markers'
)
trace1 = go.Scatter(
x = x_values,
y = y_values,
mode = 'lines+markers',
name = 'lines+markers'
)
trace2 = go.Scatter(
x = x_values,
y = y_values-5,
mode = 'lines',
name = 'lines'
)
data = [trace0, trace1, trace2] # assign traces to data
layout = go.Layout(
title = 'Line chart showing three different modes'
)
fig = go.Figure(data=data,layout=layout)
pyo.plot(fig, filename='line1.html')
● Note that each trace is assigned a name (markers, lines+markers, lines). Names appear in the legend
to the upper right (similar to the A B C D names we saw in our first plotly example) and as hover text.
line2.py takes some online data and makes a stacked series of line charts. For this exercise we imported a dataset
from the U.S. Census Bureau and whittled it down to a small file named population.csv. This file is stored in a
neighboring folder called /data:
import plotly.offline as pyo
import plotly.graph_objs as go
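import pandas as pd
df = pd.read_csv('../data/population.csv', index_col=0) # assumed: state names are in the first column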
# create traces
traces = [go.Scatter(
x = df.columns,
y = df.loc[name],
mode = 'markers+lines',
name = name
) for name in df.index]
layout = go.Layout(
title = 'Population Estimates of the Six New England States'
)
fig = go.Figure(data=traces,layout=layout)
pyo.plot(fig, filename='line2.html')
Resources: https://plot.ly/python/line-charts/
Data source: https://www.census.gov/data/datasets/2017/demo/popest/nation-total.html#ds
https://www2.census.gov/programs-surveys/popest/datasets/2010-2017/national/totals/nst-est2017-alldata.csv
Bar Charts
Bar Charts plot different categories along the x-axis, and numerical values along the y-axis. Categories are compared
by looking at the height of each bar. For this reason, it’s important that the y-axis always start at zero, to avoid any
visual misrepresentations.
This section starts with a very simple, monochromatic bar chart showing the number of medals won by countries in
the 2018 Winter Olympics in PyeongChang, South Korea.
We added a .csv file to the ../data folder called 2018WinterOlympics.csv, and we plot the data with bar1.py:
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/2018WinterOlympics.csv')
data = [go.Bar(
x=df['NOC'], # NOC stands for National Olympic Committee
y=df['Total']
)]
layout = go.Layout(
title='2018 Winter Olympic Medals by Country'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='bar1.html')
● Note that Country names fall under the NOC column - NOC stands for National Olympic Committee.
● We should mention that OAR stands for “Olympic Athletes from Russia”. Russia was banned from these Olympic
games, but some athletes were invited to compete.
● Countries are ranked in scoring order from left to right, and yet some countries like South Korea earned more
medals than countries that scored higher, like Sweden. We find out why on the next two plots.
Let’s take a look at the types of medals earned by each country, Gold, Silver and Bronze with bar2.py:
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/2018WinterOlympics.csv')
trace1 = go.Bar(
x=df['NOC'], # NOC stands for National Olympic Committee
y=df['Gold'],
name = 'Gold',
marker=dict(color='#FFD700') # set the marker color to gold
)
trace2 = go.Bar(
x=df['NOC'],
y=df['Silver'],
name='Silver',
marker=dict(color='#9EA0A1') # set the marker color to silver
)
trace3 = go.Bar(
x=df['NOC'],
y=df['Bronze'],
name='Bronze',
marker=dict(color='#CD7F32') # set the marker color to bronze
)
data = [trace1, trace2, trace3]
layout = go.Layout(
title='2018 Winter Olympic Medals by Country'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='bar2.html')
bar3.py does a Stacked Bar Chart. Note the addition of barmode='stack' in the layout section.
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/2018WinterOlympics.csv')
trace1 = go.Bar(
x=df['NOC'], # NOC stands for National Olympic Committee
y=df['Gold'],
name = 'Gold',
marker=dict(color='#FFD700') # set the marker color to gold
)
trace2 = go.Bar(
x=df['NOC'],
y=df['Silver'],
name='Silver',
marker=dict(color='#9EA0A1') # set the marker color to silver
)
trace3 = go.Bar(
x=df['NOC'],
y=df['Bronze'],
name='Bronze',
marker=dict(color='#CD7F32') # set the marker color to bronze
)
data = [trace1, trace2, trace3]
layout = go.Layout(
title='2018 Winter Olympic Medals by Country',
barmode='stack'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='bar3.html')
● Because Gold is placed at the bottom, you can see now why Sweden outscored South Korea!
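The difference between the two orderings can be sketched in a few lines. This assumes the table is ranked by gold medals first, then silver, then bronze, as Olympic medal tables commonly are; the country names and counts below are hypothetical and are not read from 2018WinterOlympics.csv.

```python
# Hypothetical medal counts: (country, gold, silver, bronze)
medals = [
    ('Country A', 7, 6, 1),    # fewer medals overall, more gold
    ('Country B', 5, 8, 4),    # more medals overall, less gold
]

# Rank by gold first, then silver, then bronze (tuples compare element by element)
by_rank = sorted(medals, key=lambda row: (row[1], row[2], row[3]), reverse=True)

# Rank by the total number of medals instead
by_total = sorted(medals, key=lambda row: sum(row[1:]), reverse=True)

print([row[0] for row in by_rank])
print([row[0] for row in by_total])
```

The two lists come out in opposite orders, which is the effect the stacked chart makes visible: the bottom (gold) segment decides the ranking, not the full height of the bar.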
Bubble Charts
Bubble charts are simply scatter plots with the added feature that the size of the marker can be set by the data.
For this exercise we look at the mpg.csv dataset, a collection of 399 vehicles manufactured from 1970 to 1982.
When brought into a DataFrame, the first ten records look like this:
bubble1.py compares mpg to horsepower. The size of the marker is set by the number of cylinders.
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
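df = pd.read_csv('../data/mpg.csv')
data = [go.Scatter( # start with a normal scatter plot
x=df['horsepower'],
y=df['mpg'],
text=df['name'], # vehicle name shown on hover
mode='markers',
marker=dict(size=df['cylinders']) # marker size set by the number of cylinders
)]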
layout = go.Layout(
title='Vehicle mpg vs. horsepower',
xaxis = dict(title = 'horsepower'), # x-axis label
yaxis = dict(title = 'mpg'), # y-axis label
hovermode='closest'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='bubble1.html')
● The graph shows a definite relationship between high horsepower and low mpg, and also displays a trend
toward higher horsepower with greater numbers of cylinders (note that displacement is not factored here).
● We added text to each marker to show the name of the vehicle on hover.
● We added hovermode='closest' to the Layout - otherwise, only the bottom-most marker is described if several
markers appear on the same vertical x-value.
● It is worth noting that bubble charts and scatter plots suffer a potential limitation, should more than one data
point land on the same spot. A bubble may be a shade darker, but it’s hard to tell that multiple data points
are being obscured. This limitation is addressed in the Dash section Selected Data, select2.py file, showing the
“density” of similar looking scatter plots.
● bubble2.py is just like bubble1, except we show how to add multiple fields to the hover text. Since one of the
fields was numeric (model_year), we first added a column to the DataFrame converting it to text, then another
column that combined it with Name. This last column is used for the hover text.
Box Plots
At times it’s important to determine if two samples of data belong to the same population. Box plots are great for
this! The shape of a box plot (also called a box-and-whisker-plot) doesn’t depend on aggregations like sample mean.
Rather, the plot represents the true shape of the data. Also, depending on how the whiskers are constructed,
box plots are useful for identifying true outliers of a data set. While some visualizations might arbitrarily discard the
“top and bottom 5%” as outliers, a box plot identifies those points that lie far from the median compared to the rest
of the data.
To construct the plot:
● First mark the median value (usually with a line segment). This sets the location of the distribution.
● Construct a box to contain all the interquartile values (the middle half of the data, from the first quartile to the third).
● Next draw the whiskers. There are several ways this is done, but a common convention (and Plotly’s default) is to
start from a distance 1.5 box-lengths out from each side of the box, and then come inward until you reach the first data point.
● Finally, plot any remaining points outside the whiskers as outliers.
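The construction steps above can be sketched with the standard library. This is a minimal illustration, not Plotly's own routine: the function name box_plot_stats and the sample values are hypothetical, the whiskers assume the common 1.5 box-length convention, and quartile conventions vary between tools, so the numbers may differ slightly from what a plotting library draws.

```python
import statistics

def box_plot_stats(values, whisker_factor=1.5):
    """Return the pieces a box plot draws, plus the outliers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method='inclusive')
    box_length = q3 - q1                          # interquartile range
    low_fence = q1 - whisker_factor * box_length
    high_fence = q3 + whisker_factor * box_length
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        'median': median,
        'q1': q1,
        'q3': q3,
        'whisker_low': inside[0],                 # first data point at or inside the low fence
        'whisker_high': inside[-1],               # last data point at or inside the high fence
        'outliers': [v for v in data if v < low_fence or v > high_fence],
    }

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]          # hypothetical data with one extreme value
stats = box_plot_stats(sample)
print(stats)
```

Here the value 30 lies beyond the upper fence, so it would be drawn as a separate point, while the upper whisker stops at 9.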
box1.py takes a set of twenty points and plots them, showing one outlier:
import plotly.offline as pyo
import plotly.graph_objs as go
data = [
go.Box(
y=y,
boxpoints='all', # display the original data points
jitter=0.3, # spread them out so they all appear
pointpos=-1.8 # offset them to the left of the box
)
]
pyo.plot(data, filename='box1.html')
● Because we offset the data points to the left, the outlier doesn’t appear over the box plot itself.
box2.py shows what a box plot would look like with displayed outliers:
import plotly.offline as pyo
import plotly.graph_objs as go
data = [
go.Box(
y=y,
boxpoints='outliers' # display only outlying data points
)
]
pyo.plot(data, filename='box2.html')
The Quintus Curtius Snodgrass Letters
As a forensic example of applied statistics, there was a famous case where Mark Twain was accused of being a
Confederate deserter during the Civil War, and the evidence given was ten essays published in the New Orleans
Daily Crescent under the name Quintus Curtius Snodgrass. In 1963 Claude Brinegar published an article in the
Journal of the American Statistical Association in which he used word-length frequencies and a chi-squared test to
show that the essays were almost certainly not Twain’s.
Brinegar’s Abstract:
“Mark Twain is widely credited with the authorship of 10 letters published in 1861 in the New Orleans
Daily Crescent. The adventures described in these letters, which are signed “Quintus Curtius Snodgrass,”
provide the historical basis of a main part of Twain’s presumed role in the Civil War. This study applies
an old, though little used statistical test of authorship - a word-length frequency test - to show that
Twain almost certainly did not write these 10 letters. The statistical analysis includes a visual comparison
of several word-length frequency distributions and applications of the χ² and two-sample t tests.”
The following table shows relative frequencies of three-letter-words from the Snodgrass letters, and from samples
of Twain’s known works. Rather than run them through complex calculations, let’s make box plots!
Citation: Brinegar, C., "Mark Twain and the Quintus Curtius Snodgrass Letters: A Statistical Test of Authorship",
Journal of the American Statistical Association, 1963, 58 (301): 85-96.
box3.py compares the two datasets side-by-side:
import plotly.offline as pyo
import plotly.graph_objs as go
snodgrass = [.209,.205,.196,.210,.202,.207,.224,.223,.220,.201]
twain = [.225,.262,.217,.240,.230,.229,.235,.217]
data = [
go.Box(
y=snodgrass,
name='QCS'
),
go.Box(
y=twain,
name='MT'
)
]
layout = go.Layout(
title = 'Comparison of three-letter-word frequencies<br>\
between Quintus Curtius Snodgrass and Mark Twain'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='box3.html')
● As you can see from the plots, there’s barely any overlap!
The 10 Quintus Curtius Snodgrass letters were very likely not written by Mark Twain.
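For readers who want numbers behind the picture, the sketch below computes the box edges for both samples with the standard library. The helper name box_edges is hypothetical, and the quartile method used here is one of several conventions, so the edges may differ slightly from those Plotly draws.

```python
import statistics

snodgrass = [.209, .205, .196, .210, .202, .207, .224, .223, .220, .201]
twain = [.225, .262, .217, .240, .230, .229, .235, .217]

def box_edges(values):
    """Return (first quartile, median, third quartile) for a sample."""
    q1, median, q3 = statistics.quantiles(values, n=4, method='inclusive')
    return q1, median, q3

qcs_q1, qcs_median, qcs_q3 = box_edges(snodgrass)
mt_q1, mt_median, mt_q3 = box_edges(twain)

print('QCS box:', qcs_q1, 'to', qcs_q3)
print('MT box: ', mt_q1, 'to', mt_q3)
print('Boxes overlap:', qcs_q3 >= mt_q1)
```

With this convention the top of the Snodgrass box sits below the bottom of the Twain box, which matches the visual impression from box3.py.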
Histograms
Histograms are one of the most frequently used (and abused) visualizations. While they’re great for showing which
range of values has a greater frequency, it’s hard to tell how much greater. And when converted to 3D, as seen in
many flashy magazine articles, perspective can be completely distorted.
Still, if you’re just starting your analysis and you want a quick peek at the data, histograms are a handy tool.
We should point out that while they look similar, histograms differ from bar charts in two important ways. First,
histograms plot a numerical value along the x-axis - something that can be measured. Bar charts put categories
along the x-axis, like the countries competing in the Olympics in our previous example. Second, unlike bar charts, the
height of a histogram bar does not by itself indicate frequency - rather, it’s the area of the bar (height × width) that does.
The width of a histogram bar is determined by binning; since the x-axis usually displays a continuous range of values,
like time or temperature, each vertical bar represents a range of values.
Also, while bar charts usually have space between bars, histograms generally have no space between adjacent bars.
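Binning itself is simple to sketch. The function below is a hypothetical helper, not part of Plotly; its start, end and size arguments echo the xbins setting used later in this section, and the readings are made-up values rather than rows from mpg.csv.

```python
def bin_counts(values, start, end, size):
    """Count how many values fall in each bin [edge, edge + size)."""
    n_bins = int((end - start) / size)        # assumes the range divides evenly
    counts = [0] * n_bins
    for v in values:
        if start <= v < end:
            counts[int((v - start) // size)] += 1
    return counts

# Hypothetical mpg-like readings, not taken from mpg.csv
readings = [9.0, 13.5, 14.0, 18.2, 19.9, 20.0, 26.4, 31.0, 44.6, 49.9]

narrow = bin_counts(readings, start=8, end=50, size=2)   # 21 bins of width 2
wide = bin_counts(readings, start=8, end=50, size=6)     # 7 bins of width 6

print(narrow)
print(wide)
```

The same ten readings produce very different bar heights depending on the bin size, which is why the choice of bin width matters when reading a histogram.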
For this section we’ll revisit the mpg dataset. Let’s take a look at a frequency distribution of mpg values from our set
of 1970’s era vehicles.
hist1.py takes plotly’s default settings:
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/mpg.csv')
data = [go.Histogram(
x=df['mpg']
)]
layout = go.Layout(
title="Miles per Gallon Frequencies of<br>\
1970's Era Vehicles"
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='basic_histogram.html')
● Note that each bin has a width of 2. The first bin spans 8 to 9.9, the last one from 48 to 49.9.
hist2.py sets bins wider, to 6 (since 50-8=42, seven equally spaced bins makes sense):
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/mpg.csv')
data = [go.Histogram(
x=df['mpg'],
xbins=dict(start=8,end=50,size=6),
)]
layout = go.Layout(
title="Miles per Gallon Frequencies of<br>\
1970's Era Vehicles"
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='wide_histogram.html')
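hist3.py sets bins narrower, to 1:
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/mpg.csv')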
data = [go.Histogram(
x=df['mpg'],
xbins=dict(start=8,end=50,size=1),
)]
layout = go.Layout(
title="Miles per Gallon Frequencies of<br>\
1970's Era Vehicles"
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='narrow_histogram.html')
● After comparing all three plots, it looks like plotly’s default settings were a good choice for this dataset!
Our next example shows how to overlay two histograms, assign an opacity value, and compare two sets of data.
The data we’ll use comes from a Cardiac Arrhythmia Database at https://archive.ics.uci.edu/ml/datasets/arrhythmia
We’ve stripped all but three columns and selected 420 records. The columns are ‘Age’, ‘Sex’ and ‘Height’.
For ‘Sex’, 0=male and 1=female, and height is measured in centimeters.
Create a file called hist4.py and add the following code:
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/arrhythmia.csv')
data = [go.Histogram(
x=df[df['Sex']==0]['Height'],
opacity=0.75,
name='Male'
),
go.Histogram(
x=df[df['Sex']==1]['Height'],
opacity=0.75,
name='Female'
)]
layout = go.Layout(
barmode='overlay',
title="Height comparison by gender"
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='basic_histogram2.html')
Now each trace has its own color, and opacity allows us to see each one independently.
Histograms - BONUS Example
What if the dataset itself contains frequency data? Histograms count the number of occurrences within one column.
If you want to base your x-values on one column, but sum the values from another column, you need
to use a bar chart. Let’s try an example!
Fremont Bridge in Seattle, Washington has a pedestrian/bicycle path on either side. Cyclists on the eastern side
generally travel north over the bridge, and south on the western side. The city installed sensors to record the
number of bicycles that cross the bridge each day.
Images: http://sdotblog.seattle.gov/2016/02/25/how-does-that-bike-counter-work-at-the-fremont-bridge-and-who-named-fremont/
histBONUS.py performs these operations and plots the result:
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
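df = pd.read_csv('../data/FremontBridgeBicycles.csv')
# Assumption: the file has a 'Date' column holding one timestamp per hourly reading
df['Hour'] = pd.to_datetime(df['Date']).dt.hour # hour of day, 0-23
df2 = df.groupby('Hour')[['Fremont Bridge West Sidewalk',
'Fremont Bridge East Sidewalk']].sum() # total crossings per hour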
trace1 = go.Bar(
x=df2.index,
y=df2['Fremont Bridge West Sidewalk'],
name="Southbound",
width=1 # eliminates space between adjacent bars
)
trace2 = go.Bar(
x=df2.index,
y=df2['Fremont Bridge East Sidewalk'],
name="Northbound",
width=1
)
data = [trace1, trace2]
layout = go.Layout(
title='Fremont Bridge Bicycle Traffic by Hour',
barmode='stack'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='fremont_bridge.html')
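The key step behind a chart like this is summing one column while grouping on another, instead of counting rows. A minimal sketch of that aggregation with the standard library follows; the records are hypothetical and the timestamp format is an assumption, not the layout of FremontBridgeBicycles.csv.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sensor records: (timestamp, west sidewalk count, east sidewalk count)
records = [
    ('2018-01-01 08:00', 12, 30),
    ('2018-01-01 17:00', 40, 15),
    ('2018-01-02 08:00', 10, 28),
    ('2018-01-02 17:00', 35, 20),
]

west_by_hour = defaultdict(int)
east_by_hour = defaultdict(int)
for stamp, west, east in records:
    hour = datetime.strptime(stamp, '%Y-%m-%d %H:%M').hour
    west_by_hour[hour] += west     # sum the counts, rather than counting rows
    east_by_hour[hour] += east

hours = sorted(west_by_hour)
print(hours)
print([west_by_hour[h] for h in hours])
print([east_by_hour[h] for h in hours])
```

A histogram of the hour column would report two readings for each hour; the summed totals are what a bar chart needs for its y-values.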
Distplots
Distribution Plots, or Distplots, typically layer three plots on top of one another. The first is a histogram, where each
data point is placed inside a bin of similar values. The second is a rug plot - marks are placed along the x-axis for
every data point, which lets you see the distribution of values inside each bin. Lastly, Distplots often include a “kernel
density estimate”, or KDE line, that tries to describe the shape of the distribution.
The shape of a KDE line is computed, and it depends on a chosen bandwidth: too large a bandwidth gives a line without
enough detail, and too small a bandwidth can yield an unhelpful, jagged line. For this reason we say that we plot a
histogram, but we fit a KDE line to the plot.
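The effect of bandwidth can be sketched with a hand-rolled Gaussian kernel estimator. This is an illustration only and is not how Plotly computes its curve; the function name gaussian_kde, the sample values and the two bandwidths are all hypothetical choices.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x using Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return density

samples = [1.0, 1.2, 1.9, 2.0, 2.1, 3.5, 3.6]    # hypothetical data points

smooth = gaussian_kde(samples, bandwidth=2.0)    # large bandwidth: little detail
jagged = gaussian_kde(samples, bandwidth=0.05)   # small bandwidth: a spike near each point

# Approximate the area under each curve on a fine grid from -15 to 20
step = 0.01
grid = [i * step for i in range(-1500, 2001)]
area_smooth = sum(smooth(x) * step for x in grid)
area_jagged = sum(jagged(x) * step for x in grid)

print(round(area_smooth, 3), round(area_jagged, 3))
print(smooth(2.0), jagged(2.0))      # height at a cluster of points
print(smooth(2.75), jagged(2.75))    # height in a gap between points
```

Both curves enclose an area of about 1, but the narrow bandwidth piles that area into tall spikes and drops to nearly zero in the gaps, while the wide bandwidth spreads it into a single low hump.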
We obtain distplots from Plotly’s Figure Factory module in place of Graph Objects.
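import plotly.figure_factory as ff
import numpy as np
x = np.random.randn(1000)
hist_data = [x]
group_labels = ['distplot']
fig = ff.create_distplot(hist_data, group_labels) # histogram, KDE line and rug plot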
● Note that distplots display relative frequencies, not actual ones. The total area under the plot is equal to 1.
● By convention we use the label hist_data in place of data, as a reminder that this forms the histogram
portion of the plot.
A random number generator will never show a perfectly normal (Gaussian) distribution - but the higher the number
of data points, the closer you’ll get. To demonstrate this we’ll plot four relatively small samples side-by-side.
dist2.py compares four similar plots, each drawn from a different set of 200 random numbers:
import plotly.offline as pyo
import plotly.figure_factory as ff
import numpy as np
x1 = np.random.randn(200)-2
x2 = np.random.randn(200)
x3 = np.random.randn(200)+2
x4 = np.random.randn(200)+4
hist_data = [x1,x2,x3,x4]
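group_labels = ['Group1','Group2','Group3','Group4']
fig = ff.create_distplot(hist_data, group_labels) # histogram, KDE line and rug plot per group
pyo.plot(fig, filename='dist2.html') # output file name is an assumption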
● A normal distribution would show an even, symmetric bell curve. These generally do not.
Distplots are not very informative on small datasets.
dist3.py goes back to our Mark Twain example, and plots two groups of only 10 and 8 points, respectively.
import plotly.offline as pyo
import plotly.figure_factory as ff
snodgrass = [.209,.205,.196,.210,.202,.207,.224,.223,.220,.201]
twain = [.225,.262,.217,.240,.230,.229,.235,.217]
hist_data = [snodgrass,twain]
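group_labels = ['Snodgrass','Twain']
fig = ff.create_distplot(hist_data, group_labels, bin_size=[.005,.005]) # one bin size per group
pyo.plot(fig, filename='dist3.html') # output file name is an assumption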
● We set the bin_size to .005, and the results are confusing at best.
● Box plots were clearly a better choice here!
Heatmaps
In their simplest forms, Bar Charts, Box Plots, Histograms and Distplots help visualize “univariate distributions” -
that is, the frequency of only one variable across a range of values or categories.
Heatmaps provide a “multivariate” plot by adding a third dimension - color - to the marker. This is somewhat similar
to changing the size of the marker in our bubble plots.
For these examples we take temperature data for the same one-week period in 2010 from three US weather
stations: Santa Barbara, California, Yuma, Arizona, and Sitka, Alaska. The raw data was obtained from the U.S.
Climate Reference Network (USCRN) website, specifically
https://www1.ncdc.noaa.gov/pub/data/uscrn/products/hourly02/2010/ .
From this we whittled down the data to three columns (date, time, avg temp), added a column for “day”, and
removed all but one week’s worth of recordings (June 1st - 7th). The resulting files are 2010SantaBarbaraCA.csv,
2010YumaAZ.csv and 2010SitkaAK.csv.
For starters, let’s create basic heatmaps for each dataset (these are heat1.py, heat2.py and heat3.py) and accept plotly’s
default parameters. Note that temperatures are given in degrees Celsius.
heat1.py creates a heatmap from 2010SantaBarbaraCA.csv:
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/2010SantaBarbaraCA.csv')
data = [go.Heatmap(
x=df['DAY'],
y=df['LST_TIME'],
z=df['T_HR_AVG'].values.tolist(),
colorscale='Jet'
)]
layout = go.Layout(
title='Hourly Temperatures, June 1-7, 2010 in<br>\
Santa Barbara, CA USA'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='Santa_Barbara.html')
heat2.py and heat3.py create similar heatmaps from 2010YumaAZ.csv and 2010SitkaAK.csv respectively:
● Although all three heatmaps appear fairly similar (warm during the day, cold at night), the temperature ranges are
quite different for each one.
heat4.py
import plotly.offline as pyo
import plotly.graph_objs as go
from plotly import tools
import pandas as pd
df1 = pd.read_csv('../data/2010SitkaAK.csv')
df2 = pd.read_csv('../data/2010SantaBarbaraCA.csv')
df3 = pd.read_csv('../data/2010YumaAZ.csv')
trace1 = go.Heatmap(
x=df1['DAY'],
y=df1['LST_TIME'],
z=df1['T_HR_AVG'].values.tolist(),
colorscale='Jet',
zmin = 5, zmax = 40 # add max/min color values to make each plot consistent
)
trace2 = go.Heatmap(
x=df2['DAY'],
y=df2['LST_TIME'],
z=df2['T_HR_AVG'].values.tolist(),
colorscale='Jet',
zmin = 5, zmax = 40
)
trace3 = go.Heatmap(
x=df3['DAY'],
y=df3['LST_TIME'],
z=df3['T_HR_AVG'].values.tolist(),
colorscale='Jet',
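zmin = 5, zmax = 40
)
# Assumed completion: place the three traces side by side in one figure
fig = tools.make_subplots(rows=1, cols=3,
subplot_titles=('Sitka, AK', 'Santa Barbara, CA', 'Yuma, AZ'),
shared_yaxes=True) # one shared time-of-day axis
fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 1, 2)
fig.append_trace(trace3, 1, 3)
pyo.plot(fig, filename='heat4.html') # output file name is an assumption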
With this final plot we see data from three different regions, using the same scale, side-by-side for comparison.
Not bad!
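The zmin and zmax values in heat4.py are fixed by hand. One way to derive a shared range from the data is sketched below; the station readings are hypothetical values in degrees Celsius, not rows from the USCRN files.

```python
# Hypothetical hourly temperatures in degrees Celsius for three stations
stations = {
    'Sitka': [7.1, 8.4, 12.9, 10.2],
    'Santa Barbara': [13.0, 15.6, 21.3, 17.8],
    'Yuma': [22.5, 27.9, 38.4, 33.0],
}

# Pool every reading, then take the overall extremes
all_temps = [t for temps in stations.values() for t in temps]
zmin = min(all_temps)
zmax = max(all_temps)

# Every heatmap would receive the same pair, so a given color
# means the same temperature in each panel.
shared = {name: dict(zmin=zmin, zmax=zmax) for name in stations}
print(zmin, zmax)
```

Rounding the pair outward to convenient values, such as the 5 and 40 used above, keeps the color bar easy to read.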
Resources: https://plot.ly/python/heatmaps/
Data source: https://www1.ncdc.noaa.gov/pub/data/uscrn/products/hourly02/2010/
Exercises: Plotly Basics
Ex1-Scatterplot.py
Objective: Create a scatterplot of 1000 random data points.
x-axis values should come from a normal distribution using np.random.randn(1000).
y-axis values should come from a uniform distribution over [0,1) using np.random.rand(1000)
Ex2-Linechart.py
Objective: Using the file 2010YumaAZ.csv, develop a Line Chart that plots seven days’ worth of temperature data on
one graph. You can use a list comprehension to assign each day to its own trace.
See https://pandas.pydata.org/pandas-docs/stable/generated/pandas.unique.html
for help on finding unique values with pandas
Ex3-Barchart.py
Objective: Create a stacked bar chart from the file ../data/mocksurvey.csv. Note that questions appear in the index
(and should be used for the x-axis), while responses appear as column labels.
Extra Credit: make a horizontal bar chart!
See https://plot.ly/python/horizontal-bar-charts/ for extra credit help.
Ex4-Bubblechart.py
Objective: Create a bubble chart that compares three other features from the mpg.csv dataset.
Fields include: 'mpg', 'cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'model_year', 'origin', 'name'
Ex5-Boxplot.py
Objective: Make a DataFrame using the Abalone dataset (../data/abalone.csv). Take two independent random
samples of different sizes from the 'rings' field. HINT: np.random.choice(df['rings'],10,replace=False) takes 10
random values.
Use box plots to show that the samples do derive from the same population.
Ex6-Histogram.py
Objective: Create a histogram that plots the 'length' field from the Abalone dataset (../data/abalone.csv).
Set the range from 0 to 1, with a bin size of 0.02
Ex7-Distplot.py
Objective: Using the iris dataset, develop a Distplot that compares the petal lengths of each class.
File: '../data/iris.csv'
Fields: 'sepal_length','sepal_width','petal_length','petal_width','class'
Classes: 'Iris-setosa','Iris-versicolor','Iris-virginica'
Ex8-Heatmap.py
Objective: Using the "flights" dataset available from Python's Seaborn module (see
https://seaborn.pydata.org/generated/seaborn.heatmap.html) create a heatmap with the following parameters:
x-axis="year"
y-axis="month"
z-axis(color)="passengers"
Dash Basics - Layout
Introduction to Dash Basics
If you haven’t already done so, follow the installation instructions in Lecture 4.
As a quick review, the Dash installation steps are:
pip install dash==0.21.0 # The core dash backend
pip install dash-renderer==0.11.3 # The dash front-end
pip install dash-html-components==0.9.0 # HTML components
pip install dash-core-components==0.21.2 # Supercharged components
pip install plotly --upgrade # Plotly graphing library used in examples
Dash apps are composed of two parts. The first part is the layout of the app and it describes what the application
looks like. The second part describes the interactivity of the application.
The good news is that you don’t need to know any HTML or CSS to use Dash. Most html tags are provided as Python
classes. For example, typing html.H1(children='Hello Dash') into your Dash script results in the HTML element
<h1>Hello Dash</h1>.
Dash offers two distinct component libraries. The code above comes from the dash_html_components library which
has a component for every HTML tag, like the first-level heading H1. Another library, dash_core_components, offers
higher-level, interactive components that are generated with JavaScript, HTML, and CSS through the React.js library.
Dash components - be they html or core - are described entirely through keyword attributes. Dash is declarative:
you will primarily describe your application through these attributes.
Dash Layout
Let’s create a simple HTML page that displays a bar chart. Create a file called layout1.py and enter the following:
# -*- coding: utf-8 -*-
import dash
import dash_core_components as dcc
import dash_html_components as html

app = dash.Dash()

app.layout = html.Div(children=[
    html.H1(children='Hello Dash'),
    html.Div(children='Dash: A web application framework for Python.'),
    dcc.Graph(
        id='example-graph',
        figure={
            'data': [
                {'x': [1, 2, 3], 'y': [4, 1, 2], 'type': 'bar', 'name': 'SF'},
                {'x': [1, 2, 3], 'y': [2, 4, 5], 'type': 'bar', 'name': u'Montréal'},
            ],
            'layout': {
                'title': 'Dash Data Visualization'
            }
        }
    )
])

if __name__ == '__main__':
    app.run_server()
Run the app from the terminal:
$ python layout1.py
...Running on http://127.0.0.1:8050/ (Press CTRL+C to quit)
and visit http://127.0.0.1:8050/ in your web browser. You should see a page that looks like this:
Note: the interactive portions only appear when your cursor hovers over a bar.
TROUBLESHOOTING: Some text editors do not properly encode utf. If you receive an error message that states
File "app.py", line 17
SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xe9 in position 5: invalid continuation byte
the problem is likely with the extended Unicode character in u'Montréal'. Change this to a regular e instead.
Save the file and try running it again as shown above.
2. import dash
import dash_core_components as dcc
import dash_html_components as html
We import Dash and both of its component libraries.
3. app = dash.Dash()
We launch a Dash application. “app” is just a convenient name for our Dash instance.
4. app.layout = html.Div(children=[
html.H1(children='Hello Dash'),
html.Div(children='Dash: A web application framework for Python.'),
Here we start to define the application layout.
H1 and Div are component classes that map to the corresponding HTML tags.
H1 we’ve seen; it creates a level one heading. Div creates a <div> tag which is like an HTML container.
children is a property of HTML components (we’ll use this keyword later when we add interactivity to our
dashboards). By default this is the first property listed, so we don’t really need to add children= to our
code.
5. dcc.Graph(
id='example-graph',
figure={
'data': [
{'x': [1, 2, 3], 'y': [4, 1, 2], 'type': 'bar', 'name': 'SF'},
{'x': [1, 2, 3], 'y': [2, 4, 5], 'type': 'bar', 'name': u'Montréal'},
],
'layout': {
'title': 'Dash Data Visualization'
This is all one core component!
The 'data' and 'layout' keyword attributes should look familiar as they’re taken directly from Plotly.
Graph components have a figure property in place of children. This is the same figure used in Plotly.
6. if __name__ == '__main__':
app.run_server()
This last section launches a local server only if layout1.py is run as a script.
If we import this file into another program, this line of code is ignored.
It’s important to note that, unlike Plotly, layout1.py is an active script that requires a local web server running in the
background. If you should make changes to layout1.py that prevent it from running properly then the terminal will
display an error and shut down the server.
Dash uses Flask as its server back end. You can pass debug=True into the server call to enable some diagnostic
services (you wouldn’t want to do this in production!). The code would look like this:
if __name__ == '__main__':
app.run_server(debug=True)
Before we move on, let’s make some changes to the HTML on our page. Open a new file called layout2.py and copy
the contents of layout1 to layout2. (You can simply duplicate layout1.py if you want).
Next, add the colors dictionary and the style arguments shown below; the rest of the code is unchanged from layout1.py:
# -*- coding: utf-8 -*-
import dash
import dash_core_components as dcc
import dash_html_components as html

app = dash.Dash()

colors = {
    'background': '#111111',
    'text': '#7FDBFF'
}

app.layout = html.Div(children=[
    html.H1(
        children='Hello Dash',
        style={
            'textAlign': 'center',
            'color': colors['text']
        }
    ),
    html.Div(
        children='Dash: A web application framework for Python.',
        style={
            'textAlign': 'center',
            'color': colors['text']
        }
    ),
    dcc.Graph(
        id='example-graph',
        figure={
            'data': [
                {'x': [1, 2, 3], 'y': [4, 1, 2], 'type': 'bar', 'name': 'SF'},
                {'x': [1, 2, 3], 'y': [2, 4, 5], 'type': 'bar', 'name': u'Montréal'},
            ],
            'layout': {
                'plot_bgcolor': colors['background'],
                'paper_bgcolor': colors['background'],
                'font': {
                    'color': colors['text']
                },
                'title': 'Dash Data Visualization'
            }
        }
    )],
    style={'backgroundColor': colors['background']}
)

if __name__ == '__main__':
    app.run_server()
In this version we add a dictionary of color styles. These are referenced in the style properties added to each
component. Run python layout2.py in the terminal and you should see this page:
In this example, we modified the inline styles of the html.Div and html.H1 components with the style property.
html.H1('Hello Dash', style={'textAlign':'center', 'color':'#7FDBFF'}) is rendered in
the Dash application as <h1 style="text-align:center; color:#7FDBFF">Hello Dash</h1>.
There are a few important differences between the dash_html_components and the HTML attributes:
1. The style property in HTML is a semicolon-separated string. In Dash, you can just supply a dictionary.
2. The keys in the style dictionary are camelCased. So, instead of text-align, it's textAlign.
3. The HTML class attribute is className in Dash. We’ll see this in upcoming examples.
4. The children of an HTML tag are specified through the children keyword argument.
By convention, this is always the first argument and so it is often omitted.
html.H1(children='Hello Dash') is the same as html.H1('Hello Dash').
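The first two differences can be made concrete with a small, Dash-free sketch. The helper name style_to_css is hypothetical and not part of Dash; it only imitates the conversion described above, assuming that camelCased keys become hyphenated CSS names and that bare numbers are treated as pixel values. The set of unitless properties is a short illustrative list, not a complete one.

```python
import re

# Illustrative subset of CSS properties that take plain numbers (no px suffix).
UNITLESS = {'opacity', 'zIndex', 'fontWeight', 'lineHeight', 'flex'}

def style_to_css(style):
    """Convert a Dash-style dictionary into an HTML inline style string."""
    parts = []
    for key, value in style.items():
        # textAlign -> text-align
        css_name = re.sub(r'([A-Z])', lambda m: '-' + m.group(1).lower(), key)
        # Bare numbers are treated as pixel lengths unless the property is unitless.
        if isinstance(value, (int, float)) and key not in UNITLESS:
            value = '{}px'.format(value)
        parts.append('{}:{}'.format(css_name, value))
    return '; '.join(parts)

print(style_to_css({'textAlign': 'center', 'color': '#7FDBFF'}))
print(style_to_css({'borderRadius': 5, 'opacity': 0.7}))
```

The first call reproduces the style string from the html.H1 example above. The second shows why a number such as 5 can be written without the px unit.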
That’s it! You’ve just created your first dashboard! Up next, we’ll convert a simple Plotly plot to Dash.
Converting Simple Plotly Plot to Dashboard with Dash
For this exercise we’ll go back to our scatter3.py script. This involved a scatter plot of 100 random data points.
We’ll seed the random number generator so that everyone sees the same result.
Open a new file, and name it plotly1.py. Enter the following code:
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go
import numpy as np

app = dash.Dash()

np.random.seed(42)
random_x = np.random.randint(1,101,100)
random_y = np.random.randint(1,101,100)

app.layout = html.Div([
    dcc.Graph(
        id='scatter3',
        figure={
            'data': [
                go.Scatter(
                    x = random_x,
                    y = random_y,
                    mode = 'markers',
                    marker = {
                        'size': 12,
                        'color': 'rgb(51,204,153)',
                        'symbol': 'pentagon',
                        'line': {'width': 2}
                    }
                )
            ],
            'layout': go.Layout(
                title = 'Random Data Scatterplot',
                xaxis = {'title': 'Some random x-values'},
                yaxis = {'title': 'Some random y-values'},
                hovermode='closest'
            )
        }
    )
])

if __name__ == '__main__':
    app.run_server()
As you can see, much of this is the same code as was used in scatter3.py. In Dash, the dash_core_components
library includes a component called Graph, which renders interactive data visualizations using Plotly’s JavaScript
graphing library. In fact, the figure argument in the dcc.Graph component is the same figure argument that is used
by Plotly.
Once you have saved the file, run python plotly1.py in the terminal. Open your browser again to
http://127.0.0.1:8050/ and the following page should appear:
Simple Dashboard Exercise Solution
This is our suggested solution:
# Perform imports here:
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go
import pandas as pd
If all goes well, your finished dashboard should open a page like this:
The plot shows a clear correlation between an eruption’s duration and the expected wait to the next eruption!
Dash Components
Dash components are provided by two libraries: dash_html_components which we usually abbreviate to html, and
dash_core_components, usually abbreviated to dcc. Normally, html components describe the layout of the page,
including placement and alignment of different graphs. dcc components describe the individual graphs themselves.
HTML Components
For a description of Dash’s HTML components, visit https://dash.plot.ly/dash-html-components
Common components include:
● html.Div([ section ]) applies CSS to section of page <div> </div>
● html.P() paragraph <p> </p>
● html.H1(‘text’) heading (level 1) <h1> </h1>
● html.Label(‘text’) label <label> </label>
HTML elements and Dash classes are mostly the same but there are a few key differences:
● The style property is a dictionary
● Properties in the style dictionary are camelCased
● The class key is renamed as className
● Style properties in pixel units can be supplied as just numbers without the px unit
To provide an example of how dash_html_components can be laid out on a page, create a file called
HTMLComponents.py and add the following code:
import dash
import dash_html_components as html

app = dash.Dash()

app.layout = html.Div([
    'This is the outermost Div',
    html.Div(
        'This is an inner Div',
        style={'color':'blue', 'border':'2px blue solid', 'borderRadius':5,
               'padding':10, 'width':220}
    ),
    html.Div(
        'This is another inner Div',
        style={'color':'green', 'border':'2px green solid',
               'margin':10, 'width':220}
    ),
],
# this styles the outermost Div:
style={'width':500, 'height':200, 'color':'red', 'border':'2px red dotted'})

if __name__ == '__main__':
    app.run_server()
Note how style can be individually applied to each Div, providing color, borders, padding and margins.
Core Components
For a complete description of Dash’s core components, visit https://dash.plot.ly/dash-core-components
Here we describe a few useful tools.
Create a file called CoreComponents.py and add the following code.
Keep this file handy - you may want to add components to it that you find useful!
CoreComponents.py
import dash
import dash_core_components as dcc
import dash_html_components as html

app = dash.Dash()

app.layout = html.Div([

    # DROPDOWN https://dash.plot.ly/dash-core-components/dropdown
    html.Label('Dropdown'),
    dcc.Dropdown(
        options=[
            {'label': 'New York City', 'value': 'NYC'},
            {'label': u'Montréal', 'value': 'MTL'},
            {'label': 'San Francisco', 'value': 'SF'}
        ],
        value='MTL'
    ),

    html.Label('Multi-Select Dropdown'),
    dcc.Dropdown(
        options=[
            {'label': 'New York City', 'value': 'NYC'},
            {'label': u'Montréal', 'value': 'MTL'},
            {'label': 'San Francisco', 'value': 'SF'}
        ],
        value=['MTL', 'SF'],
        multi=True
    ),

    # SLIDER https://dash.plot.ly/dash-core-components/slider
    html.Label('Slider'),
    html.P(
        dcc.Slider(
            min=-5,
            max=10,
            step=0.5,
            # mark labels are given as strings
            marks={i: str(i) for i in range(-5,11)},
            value=-3
        )),

    # RADIO ITEMS https://dash.plot.ly/dash-core-components/radioitems
    html.Label('Radio Items'),
    dcc.RadioItems(
        options=[
            {'label': 'New York City', 'value': 'NYC'},
            {'label': u'Montréal', 'value': 'MTL'},
            {'label': 'San Francisco', 'value': 'SF'}
        ],
        value='MTL'
    )
])

if __name__ == '__main__':
    app.run_server()
● We put the Slider inside an html paragraph html.P() to prevent the radio buttons beneath it from overwriting
the slider marks.
Before we get there, let’s investigate the Markdown component (a shortcut for writing HTML text),
and Dash’s Help() method.
Markdown
While Dash exposes HTML through the dash_html_components library, it can be tedious to write your copy in
HTML. For writing blocks of text, you can use the Markdown component in the dash_core_components library.
Create a file called markdown.py and add the following code:
import dash
import dash_core_components as dcc
import dash_html_components as html
app = dash.Dash()
markdown_text = '''
### Dash and Markdown
Markdown includes syntax for things like **bold text** and *italics*,
[links](http://commonmark.org/help), inline `code` snippets, lists,
quotes, and more.
'''

app.layout = html.Div([
    dcc.Markdown(children=markdown_text)
])

if __name__ == '__main__':
    app.run_server()
Run the program at the terminal, open a browser to http://127.0.0.1:8050/ and you should see the following page:
Notice how in the code three hash marks ### translate to an <h3> tag on the page.
Notice also that the line breaks inside the paragraph (for example, the one after “*italics*,”) are ignored
on the rendered page.
To start a new paragraph on the page requires a blank line.
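The paragraph rule can be imitated in a few lines of plain Python. This is only a simplified sketch of how a Markdown renderer groups text, not the CommonMark algorithm itself (headings, lists and hard line breaks are ignored), and split_paragraphs is a hypothetical helper name.

```python
import re

def split_paragraphs(markdown_text):
    """Group Markdown source into paragraphs, roughly as a renderer would."""
    # One or more blank lines separate paragraphs.
    blocks = re.split(r'\n\s*\n', markdown_text.strip())
    # Inside a paragraph a single newline is only a soft break: join with a space.
    return [' '.join(line.strip() for line in block.splitlines()) for block in blocks]

text = '''
First line of a paragraph,
continued on a second source line.

A blank line starts a new paragraph.
'''

paragraphs = split_paragraphs(text)
print(len(paragraphs))
for paragraph in paragraphs:
    print(paragraph)
```

The three source lines collapse into two paragraphs: the first two lines are joined, and only the blank line creates a break.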
Using Help() with Dash
Dash components are declarative: every configurable aspect of these components is set during instantiation as a
keyword argument. Call help in your Python console on any of the components to learn more about a component
and its available arguments.
>>> help(html.Div)
class Div(dash.development.base_component.Component)
| A Div component.
|
|
| Keyword arguments:
| - children (optional): The children of this component
| - id (optional): The ID of this component, used to identify dash components
| in callbacks. The ID needs to be unique across all of the
| components in an app.
| - n_clicks (optional): An integer that represents the number of times
| that this element has been clicked on.
| - key (optional): A unique identifier for the component, used to improve
| performance by React.js while rendering components
| See https://reactjs.org/docs/lists-and-keys.html for more info
| - accessKey (optional): Defines a keyboard shortcut to activate or add focus to the element.
| - className (optional): Often used with CSS to style elements with common properties.
| - contentEditable (optional): Indicates whether the element's content is editable.
| - contextMenu (optional): Defines the ID of a <menu> element which will serve as the
-- More --
Hit <space> to see more content on this topic.
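If paging through the console is inconvenient, the same text can be captured as a string with the standard library's pydoc module. The sketch below uses the built-in dict so that it runs without Dash installed; the helper name help_text is hypothetical, and passing a component such as html.Div instead is assumed to work the same way.

```python
import pydoc

def help_text(obj):
    """Return the text that help(obj) would display, as a plain string."""
    # render_doc builds the help page; plain() strips the bold overstrike characters.
    return pydoc.plain(pydoc.render_doc(obj))

doc = help_text(dict)   # with Dash installed, try help_text(html.Div)
print(doc.splitlines()[0])
print('{} lines of help text'.format(len(doc.splitlines())))
```

Once the help is a string it can be searched, for example for a keyword argument name, or written to a file for later reference.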
This is what help looks like in the Python console:
Dash - Interactive Components
Interactive Components Overview
The first part of this tutorial covered the layout of Dash apps:
● The layout of a Dash app describes what the app looks like. It is a hierarchical tree of components.
● The dash_html_components library provides classes for all of the HTML tags and the keyword arguments
describe the HTML attributes like style, className, and id.
● The dash_core_components library generates higher-level components like controls and graphs.
The second part of the tutorial describes how to make your Dash apps interactive.
Let's get started with a simple example.
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output

app = dash.Dash()

app.layout = html.Div([
    dcc.Input(id='my-id', value='initial value', type='text'),
    html.Div(id='my-div')
])

@app.callback(
    Output(component_id='my-div', component_property='children'),
    [Input(component_id='my-id', component_property='value')]
)
def update_output_div(input_value):
    return 'You\'ve entered "{}"'.format(input_value)

if __name__ == '__main__':
    app.run_server()
Run the script, open a browser to http://127.0.0.1:8050/ and you should see:
Now type something into the input box. Immediately you should see the output change to reflect the input!
1. We set up our dcc.Input in the usual way, except that we assigned an id to it, and added another Div after it with
an assigned id ('my-id' and 'my-div' respectively)
2. app.callback is called as a decorator function over update_output_div. The "inputs" and "outputs" of our
application interface are described declaratively through the app.callback decorator.
For more on Python decorators visit https://en.wikipedia.org/wiki/Python_syntax_and_semantics#Decorators
3. Inside @app.callback, Output and Input are abbreviated forms of dash.dependencies.Output and
dash.dependencies.Input. Note how we imported them from dash.dependencies by name.
4. In Dash, the inputs and outputs of our application are simply the properties of a particular component.
In this example, our input is the "value" property of the component that has the ID "my-id".
Our output is the "children" property of the component with the ID "my-div".
5. Whenever an input property changes, the function that the callback decorator wraps will get called
automatically. Dash provides the function with the new value of the input property as an input argument and
Dash updates the property of the output component with whatever was returned by the function.
6. The component_id and component_property keywords inside Output and Input are optional (there are only
two arguments for each of those objects). We included them here for clarity but we’ll omit them from here on
out for brevity and readability.
7. Don't confuse the dash.dependencies.Input object inside app.callback with the dash_core_components.Input
object inside app.layout. The former is just used in these callbacks and the latter is an actual component.
8. Notice how we don't set a value for the children property of the my-div component in the layout. When the
Dash app starts, it automatically calls all of the callbacks with the initial values of the input components in order
to populate the initial state of the output components. In this example, if you specified something like
html.Div(id='my-div', children='Hello world'), it would get overwritten when the app
starts.
It's sort of like programming with Microsoft Excel: whenever an input cell changes, all of the cells that depend on
that cell will get updated automatically. This is called "Reactive Programming".
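To make the reactive idea tangible, here is a toy model written in plain Python. It is not how Dash is implemented; ToyApp, set_prop and the tuple-based ids are all invented for illustration. It only mimics two behaviours described above: the callback runs once at start-up to populate the output, and it runs again whenever one of its input properties changes.

```python
class ToyApp:
    """A toy stand-in for a Dash app: component properties plus callbacks."""

    def __init__(self):
        self.props = {}        # (component_id, property) -> current value
        self.callbacks = []    # (output, inputs, function)

    def callback(self, output, inputs):
        def decorator(func):
            self.callbacks.append((output, inputs, func))
            # Like Dash at start-up: run once to populate the output.
            self._run(output, inputs, func)
            return func
        return decorator

    def _run(self, output, inputs, func):
        args = [self.props.get(i) for i in inputs]
        self.props[output] = func(*args)

    def set_prop(self, component_id, prop, value):
        """Simulate a user changing a property in the browser."""
        self.props[(component_id, prop)] = value
        for output, inputs, func in self.callbacks:
            if (component_id, prop) in inputs:
                self._run(output, inputs, func)


app = ToyApp()
app.props[('my-id', 'value')] = 'initial value'

@app.callback(('my-div', 'children'), [('my-id', 'value')])
def update_output_div(input_value):
    return 'You\'ve entered "{}"'.format(input_value)

print(app.props[('my-div', 'children')])   # set by the initial call
app.set_prop('my-id', 'value', 'Dash')
print(app.props[('my-div', 'children')])   # updated automatically
```

The function update_output_div is never called directly; registering it with the decorator is enough, which is the same division of labour that app.callback provides in a real Dash script.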
Remember how every component was described entirely through its set of keyword arguments? Those properties
are important now. With Dash interactivity, we can dynamically update any of those properties through a callback
function. Frequently we'll update the children of an html component to display new text or the figure of a
dcc.Graph component to display new data, but we could also update the style of a component or even the available
options of a dcc.Dropdown component!
Create a file called callback2.py and add the following code:
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import pandas as pd

df = pd.read_csv('../data/gapminderDataFiveYear.csv')

app = dash.Dash()

# https://dash.plot.ly/dash-core-components/dropdown
# We need to construct a list of dropdown option dictionaries for the years
year_options = []
for year in df['year'].unique():
    year_options.append({'label':str(year),'value':year})

app.layout = html.Div([
    dcc.Graph(id='graph-with-slider'),
    dcc.Dropdown(id='year-picker',options=year_options,value=df['year'].min())
])

@app.callback(Output('graph-with-slider', 'figure'),
              [Input('year-picker', 'value')])
def update_figure(selected_year):
    # keep only the rows for the selected year
    filtered_df = df[df['year'] == selected_year]
    # build one trace per continent
    traces = []
    for continent_name in filtered_df['continent'].unique():
        df_by_continent = filtered_df[filtered_df['continent'] == continent_name]
        traces.append(go.Scatter(
            x=df_by_continent['gdpPercap'],
            y=df_by_continent['lifeExp'],
            text=df_by_continent['country'],
            mode='markers',
            opacity=0.7,
            marker={'size': 15},
            name=continent_name
        ))

    return {
        'data': traces,
        'layout': go.Layout(
            xaxis={'type': 'log', 'title': 'GDP Per Capita'},
            yaxis={'title': 'Life Expectancy'},
            hovermode='closest'
        )
    }

if __name__ == '__main__':
    app.run_server()
Run the script, open a browser to http://127.0.0.1:8050/ and you should see:
You can hover over any data point to reveal its Country, Continent and axis data. More importantly, you can use the
dropdown to change the displayed graph!
Concerning style:
Before we discuss the connectivity, let’s look at some of the style choices made:
● the x-axis is logarithmic, becoming denser as values increase
● we use the pandas .unique() method to extract the years for the dropdown (similar to our Plotly Linechart
exercise!)
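The loop that builds year_options can also be written as a list comprehension. The sketch below uses a short made-up list in place of df['year'] so it runs without pandas or the CSV file. Note one assumed difference: sorted(set(...)) orders the years, whereas pandas' .unique() keeps them in order of first appearance.

```python
# Stand-in for df['year']; in callback2.py these values come from the CSV file.
years = [1952, 1957, 1952, 1962, 1957]

# One option dictionary per distinct year, with a string label and the raw value.
year_options = [{'label': str(year), 'value': year} for year in sorted(set(years))]

print(year_options)
```

The label is what the user sees in the dropdown, while the value is what the callback receives. That is why the value keeps its numeric type: it can be compared directly against the 'year' column.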
Concerning connectivity:
In this example, the "value" property of the Dropdown is the input of the app and the output of the app is the
"figure" property of the Graph. Whenever the value of the Dropdown changes, Dash calls the callback function
update_figure with the new value. The function filters the DataFrame with this new value, constructs a figure
object, and returns it to the Dash application.
● We're using the Pandas library for importing and filtering datasets in memory.
● We load our DataFrame at the start of the app: df = pd.read_csv('...'). This DataFrame df is in
the global state of the app and can be read inside the callback functions.
● Loading data into memory can be expensive. By loading or querying data at the start of the app instead of inside
the callback functions, we ensure that this operation is only done when the app server starts. When a user visits
the app or interacts with the app, that data (the df) is already in memory. If possible, expensive initialization
(like downloading or querying data) should be done in the global scope of the app instead of within the callback
functions.
● The callback does not modify the original data; it just creates filtered copies of the DataFrame through pandas
filters. This is important: your callbacks should never mutate variables outside of their scope. If your callbacks
modify global state, then one user's session might affect the next user's session, and when the app is deployed
on multiple processes or threads, those modifications will not be shared across sessions.
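The difference between filtering and mutating can be sketched without pandas. In the example below a list of dictionaries stands in for the DataFrame, the rows are made-up values, and both function names are hypothetical. The first function is the pattern to follow inside a callback; the second is included only to show what to avoid, and is never called.

```python
# Loaded once when the script starts (a list of dicts stands in for the DataFrame).
DATA = [
    {'country': 'A', 'year': 1952, 'lifeExp': 40.0},
    {'country': 'B', 'year': 1952, 'lifeExp': 50.0},
    {'country': 'A', 'year': 1957, 'lifeExp': 42.0},
]

def rows_for_year(selected_year):
    """Safe: builds a new list and leaves DATA untouched."""
    return [row for row in DATA if row['year'] == selected_year]

def rows_for_year_unsafe(selected_year):
    """Unsafe: overwrites the shared data, so later calls see fewer rows."""
    DATA[:] = [row for row in DATA if row['year'] == selected_year]
    return DATA

print(len(rows_for_year(1952)), len(DATA))
```

After the safe call the shared data still holds all three rows, so a second user selecting 1957 gets the correct answer. With the unsafe version, the 1957 row would already be gone.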
Multiple Inputs
Input parameters are passed to the callback decorator as a list. For this reason, we can include multiple inputs in our
dashboard to affect the same output through a callback function. For this example we’ll use the mpg.csv dataset with
two input components, both dropdowns, that let us set the x-axis and y-axis features from our dataset.
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import pandas as pd

app = dash.Dash()

df = pd.read_csv('../data/mpg.csv')

features = df.columns

app.layout = html.Div([
    html.Div([
        html.Div([
            dcc.Dropdown(
                id='xaxis',
                options=[{'label': i, 'value': i} for i in features],
                value='displacement'
            )
        ],
        style={'width': '48%', 'display': 'inline-block'}),

        html.Div([
            dcc.Dropdown(
                id='yaxis',
                options=[{'label': i, 'value': i} for i in features],
                value='acceleration'
            )
        ],style={'width': '48%', 'float': 'right', 'display': 'inline-block'})
    ]),

    dcc.Graph(id='feature-graphic')
], style={'padding':10})

@app.callback(
    Output('feature-graphic', 'figure'),
    [Input('xaxis', 'value'),
     Input('yaxis', 'value')])
def update_graph(xaxis_name, yaxis_name):
    # arguments arrive in the same order as the Input list
    return {
        'data': [go.Scatter(
            x=df[xaxis_name],
            y=df[yaxis_name],
            text=df['name'],
            mode='markers',
            marker={
                'size': 15,
                'opacity': 0.5,
                'line': {'width': 0.5, 'color': 'white'}
            }
        )],
        'layout': go.Layout(
            xaxis={'title': xaxis_name},
            yaxis={'title': yaxis_name},
            margin={'l': 40, 'b': 40, 't': 10, 'r': 0},
            hovermode='closest'
        )
    }

if __name__ == '__main__':
    app.run_server()
● We set a variable features equal to the column names in our dataset. An alternative would be to set it to a
recurring value in one dataset column. Note that setting this variable is optional - we could just as easily pass
df.columns wherever features is used.
● Nothing new has happened in the layout section. Inside a Div we set our two dropdown components, followed
by our Graph.
● Notice, though, that app.callback now has two Input parameters, one for each dropdown.
● Other than two inputs, however, the returning update is relatively straightforward. We set up a Scatter plot with
our x- and y-axes.
Run the script, open a browser to http://127.0.0.1:8050/ and you should see:
You can change either dropdown entry and immediately the x-axis and y-axis features change!
As a quick formatting choice, what if we wanted our features to appear capitalized? Even though our dataset column
name is “displacement”, how do we make “Displacement” appear on our graph both in the dropdown list and the
axis title? This is actually a quick fix:
...
app.layout = html.Div([
html.Div([
html.Div([
dcc.Dropdown(
id='xaxis',
options=[{'label': i.title(), 'value': i} for i in features],
value='displacement'
...
html.Div([
dcc.Dropdown(
id='yaxis',
options=[{'label': i.title(), 'value': i} for i in features],
value='acceleration'
)
...
def update_graph(xaxis_name, yaxis_name):
...
'layout': go.Layout(
xaxis={'title': xaxis_name.title()},
yaxis={'title': yaxis_name.title()},
margin={'l': 40, 'b': 40, 't': 10, 'r': 0},
hovermode='closest'
...
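One detail is worth checking before relying on .title() for every column: Python's str.title() capitalizes the first letter of each word and treats an underscore as a word boundary, but it does not remove the underscore. The helper below, pretty_label, is a hypothetical addition rather than part of the course script; it shows one possible way to tidy names such as 'model_year'.

```python
def pretty_label(column_name):
    """Turn a column name such as 'model_year' into 'Model Year'."""
    return column_name.replace('_', ' ').title()

for name in ['displacement', 'model_year', 'mpg']:
    print(name.title(), '|', pretty_label(name))
```

Only the label shown to the user changes. The option's value stays equal to the original column name, so df[xaxis_name] inside the callback keeps working.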
For another example of multiple inputs, visit the Dash documentation at https://dash.plot.ly/getting-started-part-2.
This shows not only dropdown lists but also radio buttons and a slider used as simultaneous input choices on the
same graph.
Multiple Outputs
In the version of Dash used in this course, each callback function can only update a single Output property. In the
above examples we showed how to pass multiple inputs inside an Input list parameter. To update multiple Outputs,
just write multiple functions.
For this example, we’ll set up two sets of radio buttons, and two separate output areas.
Next, we’ll add a third output that’s determined by the combination of radio buttons selected!
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import pandas as pd

app = dash.Dash()

df = pd.read_csv('../data/wheels.csv')

app.layout = html.Div([
    dcc.RadioItems(
        id='wheels',
        options=[{'label': i, 'value': i} for i in df['wheels'].unique()],
        value=1
    ),
    html.Div(id='wheels-output'),
    dcc.RadioItems(
        id='colors',
        options=[{'label': i, 'value': i} for i in df['color'].unique()],
        value='blue'
    ),
    html.Div(id='colors-output')
])

@app.callback(
    Output('wheels-output', 'children'),
    [Input('wheels', 'value')])
def callback_a(wheels_value):
    return 'You\'ve selected "{}"'.format(wheels_value)

@app.callback(
    Output('colors-output', 'children'),
    [Input('colors', 'value')])
def callback_b(colors_value):
    return 'You\'ve selected "{}"'.format(colors_value)

if __name__ == '__main__':
    app.run_server()
Run the script, open a browser to http://127.0.0.1:8050/ and you should see:
Resources: https://dash.plot.ly/getting-started-part-2
Let’s expand this example, and have an output be determined by both inputs.
Make a duplicate of callback4.py and name it callback5.py. Add the new code shown below: the base64 import, the
encode_image function and the callback_image callback.
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import pandas as pd
import base64

app = dash.Dash()

df = pd.read_csv('../data/wheels.csv')

def encode_image(image_file):
    # read the image as bytes and wrap it in a base64 data URI
    with open(image_file, 'rb') as f:
        encoded = base64.b64encode(f.read())
    return 'data:image/png;base64,{}'.format(encoded.decode())

app.layout = html.Div([
    dcc.RadioItems(
        id='wheels',
        options=[{'label': i, 'value': i} for i in df['wheels'].unique()],
        value=1
    ),
    html.Div(id='wheels-output'),
    dcc.RadioItems(
        id='colors',
        options=[{'label': i, 'value': i} for i in df['color'].unique()],
        value='blue'
    ),
    html.Div(id='colors-output'),
    html.Img(id='display-image')
])

@app.callback(
    Output('wheels-output', 'children'),
    [Input('wheels', 'value')])
def callback_a(wheels_value):
    return 'You\'ve selected "{}"'.format(wheels_value)

@app.callback(
    Output('colors-output', 'children'),
    [Input('colors', 'value')])
def callback_b(colors_value):
    return 'You\'ve selected "{}"'.format(colors_value)

@app.callback(
    Output('display-image', 'src'),
    [Input('wheels', 'value'),
     Input('colors', 'value')])
def callback_image(wheel, color):
    path = '../data/images/'
    # select the row matching both inputs and take its image filename
    return encode_image(path+df[(df['wheels']==wheel) &
                                (df['color']==color)]['image'].values[0])

if __name__ == '__main__':
    app.run_server()
Now when you run the script, the default values of 1 and blue display an image of a blue unicycle. Change either
input to change the displayed image!
A couple of interesting techniques were introduced here:
● As of this writing, Dash doesn’t serve up static files gracefully. To display images stored on the hard drive
requires a conversion to base64. For this we defined a conversion function named “encode_image” and then
used it inside our callback function.
● For our Output, ‘display-image’ is the component ID, and ‘src’ is the component_property we’re affecting.
● We used pandas to obtain the name of our image file from the dataset using conditional selection. Note that the
table only includes the filename, not the PATH. For this we set our own path variable inside the callback
function. This way, we can modify our script to fit any other file structure.
● As of this writing, html.Img takes a height= argument, but not an alt= for providing alternate text in the event an
image can’t be retrieved.
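The base64 step can be tried on its own, without Dash or an image file on disk. The sketch below applies the same encoding as encode_image to a few raw bytes; the helper name to_data_uri and its mime_type parameter are illustrative additions, not part of the course script.

```python
import base64

def to_data_uri(raw_bytes, mime_type='image/png'):
    """Wrap raw file bytes in a base64 data URI suitable for an <img> src."""
    encoded = base64.b64encode(raw_bytes).decode('ascii')
    return 'data:{};base64,{}'.format(mime_type, encoded)

# The 8-byte signature that begins every PNG file, used here instead of a real image.
sample = b'\x89PNG\r\n\x1a\n'
uri = to_data_uri(sample)
print(uri)

# Decoding the part after the comma recovers the original bytes.
payload = uri.split(',', 1)[1]
print(base64.b64decode(payload) == sample)
```

Because the whole image travels inside the src string, this approach is best suited to small files. The .decode() call matters: b64encode returns bytes, and formatting bytes directly into the string would embed a b'...' literal in the URI.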
Exercise: Interactive Components
For this exercise we want to take two or more integer inputs, and output their product. Be creative! You can use
radio buttons, dropdowns, even a RangeSlider to obtain two input values. Use a callback to return the product of the
two values. Don’t forget to assign IDs to each component. Good luck!
● range(min, max+1) won’t work here. It has to be hardcoded unless min & max are defined outside of layout.
Controlling Callbacks with Dash State
In the previous interactive examples we’ve seen how inputs immediately affect outputs. As soon as values are
entered, the page updates to reflect any changes.
What if we wanted to wait before displaying the page? What if we wanted time to enter a series of changes before
submitting them? This is where dash.dependencies.State comes in. Dash offers the ability to store saved changes,
and send them back on command. Consider this very basic example of Input/Output with a callback:
callback6.py
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output

app = dash.Dash()

app.layout = html.Div([
    dcc.Input(
        id='number-in',
        value=1,
        style={'fontSize':28}
    ),
    html.H1(id='number-out')
])

@app.callback(
    Output('number-out', 'children'),
    [Input('number-in', 'value')])
def output(number):
    return number

if __name__ == '__main__':
    app.run_server()
As soon as you type characters into the Input box, they appear below as an HTML header.
Now let’s add a Submit button, and store characters until the button is pressed:
callback6a.py (adds the State import, the Submit button and the State argument to the callback)
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output, State

app = dash.Dash()

app.layout = html.Div([
    dcc.Input(
        id='number-in',
        value=1,
        style={'fontSize':28}
    ),
    html.Button(
        id='submit-button',
        n_clicks=0,
        children='Submit',
        style={'fontSize':28}
    ),
    html.H1(id='number-out')
])

@app.callback(
    Output('number-out', 'children'),
    [Input('submit-button', 'n_clicks')],
    [State('number-in', 'value')])
def output(n_clicks, number):
    return number

if __name__ == '__main__':
    app.run_server()
Now our Input is the action of clicking the html.Button element. The value typed into the Input box is stored inside
of State, and is not passed to our Output until the Input registers a button click!
So what is n_clicks? It turns out, this stores the number of clicks that have occurred during the session.
We can show this as part of our output if we want:
callback6b.py
...
@app.callback(
    Output('number-out', 'children'),
    [Input('submit-button', 'n_clicks')],
    [State('number-in', 'value')])
def output(n_clicks, number):
    return '{} displayed after {} clicks!'.format(number,n_clicks)

if __name__ == '__main__':
    app.run_server()
Each time you submit a new value, the page also reports the number of times the button has been clicked!
It should be noted that any HTML element can be assigned an 'n_clicks' property.
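One detail worth knowing: Dash normally runs each callback once when the page first loads, while n_clicks is still 0, so the message appears before anyone has pressed the button. The sketch below shows one way to suppress that first message. The helper name format_submission is hypothetical, and the function is written without Dash so it can be tried on its own.

```python
def format_submission(n_clicks, number):
    """Return the header text, or an empty string before the first click."""
    if not n_clicks:  # 0 (or None) means the button has not been pressed yet
        return ''
    return '{} displayed after {} clicks!'.format(number, n_clicks)


print(repr(format_submission(0, 1)))   # nothing to show yet
print(format_submission(2, 'abc'))     # message after the second click
```

Inside the app, the body of the output function could simply return format_submission(n_clicks, number).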
Resources: https://dash.plot.ly/state
Interacting with Visualizations
Introduction to Interacting with Visualizations
The first part of this tutorial covered the layout of Dash apps:
● The layout of a Dash app describes what the app looks like. It is a hierarchical tree of components.
● The dash_html_components library provides classes for all of the HTML tags and the keyword arguments
describe the HTML attributes like style, className, and id.
● The dash_core_components library generates higher-level components like controls and graphs.
In this next section, we revisit dash_core_components.Graph, and take a deep dive back into Plotly charts.
Resources: https://dash.plot.ly/interactive-graphing
Here we’ll show how simply hovering over a data point can immediately affect another part of the figure!
We’ll start by building a 3x3 scatterplot from our wheels.csv file. Recall that there are 3 x-axis values
(red, yellow, blue) and 3 y-axis values (1,2,3).
Next we’ll add a callback that takes in 'hoverData', and displays that data to the screen as a JSON object.
Create a file called hover1.py and add the following code:
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import pandas as pd
import json
app = dash.Dash()
df = pd.read_csv('../data/wheels.csv')
app.layout = html.Div([
html.Div([
dcc.Graph(
id='wheels-plot',
figure={
'data': [
go.Scatter(
x = df['color'],
y = df['wheels'],
dy = 1,
mode = 'markers',
marker = {
'size': 12,
'color': 'rgb(51,204,153)',
'line': {'width': 2}
}
)
],
'layout': go.Layout(
title = 'Wheels & Colors Scatterplot',
xaxis = {'title': 'Color'},
yaxis = {'title': '# of Wheels','nticks':3},
hovermode='closest'
)
}
)], style={'width':'30%', 'float':'left'}),
html.Div([
html.Pre(id='hover-data', style={'paddingTop':35})
], style={'width':'30%'})
])
@app.callback(
Output('hover-data', 'children'),
[Input('wheels-plot', 'hoverData')])
def callback_image(hoverData):
return json.dumps(hoverData, indent=2)
if __name__ == '__main__':
app.run_server()
Some things of note:
● we import json so that we can display the captured hoverData as a formatted JSON string (the output of json.dumps).
● we label our output box 'hover-data' only for convenience - this could be anything.
● Our input from 'wheels-plot' captures 'hoverData' and we then pass hoverData into our callback function.
These tags are important!
● We display the hoverData inside an html <pre> tag, which allows for pre-formatting (which we didn’t use) and
displays the contents in a fixed-width font, preserving spaces and line breaks.
● We added 'nticks':3 to the y-axis layout property. Without it the ticks would be [1, 1.5, 2, 2.5, 3]
Run the script, open a browser to http://127.0.0.1:8050/ and you should see something like this:
● Note that the initial state of the JSON output is “null”, and only changes once a point is hovered over.
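To make the shape of hoverData concrete, here is a small sketch that needs no Dash at all. The sample dictionary is made up for illustration (the keys follow the 'points' list described in this section), and point_from_hover is a hypothetical helper that also copes with the initial "null" state.

```python
import json


def point_from_hover(hover_data):
    """Return (x, y) for the first hovered point, or None if nothing is hovered."""
    if not hover_data or not hover_data.get('points'):
        return None
    first = hover_data['points'][0]
    return first['x'], first['y']


# Illustrative payload only; real values depend on the point under the cursor.
sample = {'points': [{'curveNumber': 0, 'pointNumber': 4, 'pointIndex': 4,
                      'x': 'yellow', 'y': 2}]}

print(json.dumps(sample, indent=2))   # what the <pre> tag would display
print(json.dumps(None))               # the initial state, shown as null
print(point_from_hover(sample))
print(point_from_hover(None))
```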
Make a duplicate of hover1.py and call it hover2.py. Add the following code (shown in bold).
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import pandas as pd
import base64
app = dash.Dash()
df = pd.read_csv('../data/wheels.csv')
def encode_image(image_file):
encoded = base64.b64encode(open(image_file, 'rb').read())
return 'data:image/png;base64,{}'.format(encoded.decode())
app.layout = html.Div([
html.Div([
dcc.Graph(
id='wheels-plot',
figure={
'data': [
go.Scatter(
x = df['color'],
y = df['wheels'],
dy = 1,
mode = 'markers',
marker = {
'size': 12,
'color': 'rgb(51,204,153)',
'line': {'width': 2}
}
)
],
'layout': go.Layout(
title = 'Wheels & Colors Scatterplot',
xaxis = {'title': 'Color'},
yaxis = {'title': '# of Wheels','nticks':3},
hovermode='closest'
)
}
)], style={'width':'30%', 'float':'left'}),
html.Div([
html.Img(id='hover-image', src='children', height=300)
], style={'paddingTop':35})
])
@app.callback(
Output('hover-image', 'src'),
[Input('wheels-plot', 'hoverData')])
def callback_image(hoverData):
wheel=hoverData['points'][0]['y']
color=hoverData['points'][0]['x']
path = '../data/images/'
return encode_image(path+df[(df['wheels']==wheel) & \
(df['color']==color)]['image'].values[0])
if __name__ == '__main__':
app.run_server()
The sections in blue bold are merely for handling images (recall that we have to convert files to base64 first).
Note how we use hoverData['points'][0]['y'] to obtain the y-axis value.
We feed that into Pandas to retrieve the corresponding image file.
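The base64 step can be tried in isolation. The sketch below is a variation on encode_image, not the course's exact code: it splits the work into a bytes-level helper (encode_image_bytes, a hypothetical name) and reads the file inside a with block so the file handle is closed promptly.

```python
import base64


def encode_image_bytes(raw, mime='image/png'):
    """Turn raw image bytes into a data URI usable as an <img> src."""
    encoded = base64.b64encode(raw).decode('ascii')
    return 'data:{};base64,{}'.format(mime, encoded)


def encode_image(image_file):
    """Read an image file and return it as a base64 data URI."""
    with open(image_file, 'rb') as f:
        return encode_image_bytes(f.read())


# Three arbitrary bytes stand in for real image data here.
print(encode_image_bytes(b'abc'))
```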
Run the script, open a browser to http://127.0.0.1:8050/, hover over any of the data points
and you should see something like this:
Click Data
Click Data is handled nearly the same way as Hover Data - it’s simply an attribute of the graph that can be accessed
using dictionary calls.
Make a duplicate of hover2.py and name it click1.py. Changes are shown in bold:
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import pandas as pd
import base64
app = dash.Dash()
df = pd.read_csv('../data/wheels.csv')
def encode_image(image_file):
encoded = base64.b64encode(open(image_file, 'rb').read())
return 'data:image/png;base64,{}'.format(encoded.decode())
app.layout = html.Div([
html.Div([
dcc.Graph(
id='wheels-plot',
figure={
'data': [
go.Scatter(
x = df['color'],
y = df['wheels'],
dy = 1,
mode = 'markers',
marker = {
'size': 12,
'color': 'rgb(51,204,153)',
'line': {'width': 2}
}
)
],
'layout': go.Layout(
title = 'Wheels & Colors Scatterplot',
xaxis = {'title': 'Color'},
yaxis = {'title': '# of Wheels','nticks':3},
hovermode='closest'
)
}
)], style={'width':'30%', 'float':'left'}),
html.Div([
html.Img(id='click-image', src='children', height=300)
], style={'paddingTop':35})
])
@app.callback(
Output('click-image', 'src'),
[Input('wheels-plot', 'clickData')])
def callback_image(clickData):
wheel=clickData['points'][0]['y']
color=clickData['points'][0]['x']
path = '../data/images/'
return encode_image(path+df[(df['wheels']==wheel) & \
(df['color']==color)]['image'].values[0])
if __name__ == '__main__':
app.run_server()
Now when you run the script, images appear on the screen as data points are clicked on instead of hovered over.
That’s it! Everything else - including the dictionary call to obtain our x- and y-axis values - remains the same.
Selected Data
Selection Data makes use of the lasso or rectangle tool in the graph’s menu bar:
To see what this looks like, duplicate hover1.py from Section 37, and name it select1.py. Changes are shown in bold:
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import pandas as pd
import json
app = dash.Dash()
df = pd.read_csv('../data/wheels.csv')
app.layout = html.Div([
html.Div([
dcc.Graph(
id='wheels-plot',
figure={
'data': [
go.Scatter(
x = df['color'],
y = df['wheels'],
dy = 1,
mode = 'markers',
marker = {
'size': 12,
'color': 'rgb(51,204,153)',
'line': {'width': 2}
}
)
],
'layout': go.Layout(
title = 'Wheels & Colors Scatterplot',
xaxis = {'title': 'Color'},
yaxis = {'title': '# of Wheels','nticks':3},
hovermode='closest'
)
}
)], style={'width':'30%', 'display':'inline-block'}),
html.Div([
html.Pre(id='selection', style={'paddingTop':25})
], style={'width':'30%', 'display':'inline-block', 'verticalAlign':'top'})
])
@app.callback(
Output('selection', 'children'),
[Input('wheels-plot', 'selectedData')])
def callback_image(selectedData):
return json.dumps(selectedData, indent=2)
if __name__ == '__main__':
app.run_server()
Run the script, open a browser to http://127.0.0.1:8050/ , and use the lasso and rectangle selection tools in the
graph menu bar to select groups of data points. You should see something like this:
The returning dictionary has a key for 'points' and another key for either 'range' or 'lassoPoints'.
Points data is similar to what we saw above for hover and click, only this time the list contains a dictionary
for every encircled point.
Range data contains 'x' and 'y' axis boundaries for the selection box itself.
LassoPoints can be a fairly long list. These are the (x,y) coordinate pairs that define the selection boundary.
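As a rough illustration of the two payload shapes, the sketch below reads the selection boundary from whichever key is present. The sample dictionaries are invented, the 'points' lists are left empty for brevity, and selection_bounds is a hypothetical helper rather than part of Dash.

```python
def selection_bounds(selected_data):
    """Return (min_x, max_x, min_y, max_y) for a box or lasso selection."""
    if not selected_data:
        return None  # nothing has been selected yet
    for key in ('range', 'lassoPoints'):
        if key in selected_data:
            xs = selected_data[key]['x']
            ys = selected_data[key]['y']
            return min(xs), max(xs), min(ys), max(ys)
    return None


# Made-up payloads: a rectangle selection and a three-vertex lasso.
box = {'points': [], 'range': {'x': [0.5, 2.5], 'y': [1, 3]}}
lasso = {'points': [], 'lassoPoints': {'x': [0, 2, 1], 'y': [0, 0, 3]}}

print(selection_bounds(box))
print(selection_bounds(lasso))
```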
Let’s put this to use! One problem with scatter plots is that overlapping data points can be difficult to identify.
Setting opacity helps (two points occupying the same space will be darker than one point alone), but it is not foolproof.
For this example we’ll make an artificial dataset, plot points, and use Selected Data to determine the density of
points in a given region of the plot.
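The listing that follows expects three DataFrames named df1, df2 and df3 to exist already. One possible way to build them is sketched below; the seed, ranges and sizes are assumptions chosen for illustration, not necessarily the values used in the course files. df1 and df2 are identical, so every point on the left half of the plot is doubled.

```python
import numpy as np
import pandas as pd

np.random.seed(10)                      # assumed seed, only for repeatability
x_left = np.linspace(0.1, 5, 50)        # left half of the x-axis
x_right = np.linspace(5.1, 10, 50)      # right half of the x-axis
y = np.random.randint(0, 50, 50)        # shared random y values

df1 = pd.DataFrame({'x': x_left, 'y': y})
df2 = pd.DataFrame({'x': x_left, 'y': y})    # exact copy: overlaps df1
df3 = pd.DataFrame({'x': x_right, 'y': y})   # single points on the right

print(len(pd.concat([df1, df2, df3])))
```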
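import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import pandas as pd
app = dash.Dash()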
# combine them into one DataFrame (df1 and df2 points overlap!)
df = pd.concat([df1,df2,df3])
app.layout = html.Div([
html.Div([
dcc.Graph(
id='plot',
figure={
'data': [
go.Scatter(
x = df['x'],
y = df['y'],
mode = 'markers'
)
],
'layout': go.Layout(
title = 'Random Scatterplot',
hovermode='closest'
)
}
)], style={'width':'30%', 'display':'inline-block'}),
html.Div([
html.H1(id='density', style={'paddingTop':25})
], style={'width':'30%', 'display':'inline-block', 'verticalAlign':'top'})
])
@app.callback(
    Output('density', 'children'),
    [Input('plot', 'selectedData')])
def find_density(selectedData):
    # selectedData is None until the user makes a selection
    if not selectedData:
        return ''
    pts = len(selectedData['points'])
    # the remaining key is either 'range' (box) or 'lassoPoints' (lasso)
    rng_or_lp = list(selectedData.keys())
    rng_or_lp.remove('points')
    max_x = max(selectedData[rng_or_lp[0]]['x'])
    min_x = min(selectedData[rng_or_lp[0]]['x'])
    max_y = max(selectedData[rng_or_lp[0]]['y'])
    min_y = min(selectedData[rng_or_lp[0]]['y'])
    area = (max_x-min_x)*(max_y-min_y)
    d = pts/area
    return 'Density = {:.2f}'.format(d)

if __name__ == '__main__':
    app.run_server()
Run the script, open a browser to http://127.0.0.1:8050/ , and use the lasso and rectangle selection tools in the
graph menu bar to select groups of data points on either side of the graph. You should see something like this:
Everything we did here resembles the JSON output script, except for finding the density. Because Selected Data
returns either a “range” key or a “lassoPoints” key depending on the tool used, we had to get creative with how we
mined the size of the selection. Note that lassos will always have overstated areas, since essentially we’re just
building a box around the min and max “x” and “y” values of the blob.
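If the overstated lasso area matters, one alternative (not used in the course script) is to compute the area of the lasso polygon itself with the shoelace formula. The sketch below assumes numeric x and y coordinates and compares the two approaches on a made-up triangular lasso; both function names are hypothetical.

```python
def polygon_area(xs, ys):
    """Area of a simple polygon from its vertex coordinates (shoelace formula)."""
    n = len(xs)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to the first
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0


def bounding_box_area(xs, ys):
    """Area of the box around the min and max values, as in the script above."""
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# A right triangle with legs 4 and 3: true area 6, bounding box area 12.
tri_x, tri_y = [0, 4, 0], [0, 0, 3]
print(polygon_area(tri_x, tri_y))
print(bounding_box_area(tri_x, tri_y))
```

For a rectangle selection the two functions agree, so the shoelace version would only change the result for lasso selections.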
In this example, the points on the left half of the plot are doubled up (wherever you see a point, there are actually
two overlapping points). The right half of the plot is occupied by single points. Thus, the calculated density is twice
as high on the left as on the right for similar selections of points.
If you’re curious what the JSON output looks like for this chart, run the select2a.py file that’s included in
the course materials.
Updating Graphs on Interactions
So far in this section on Interacting with Visualizations, we’ve only used Hover, Click and Select to display new data
on the screen. In this next part we show how to apply these tools to one graph, and have them trigger changes to
other graphs in the same dashboard.
For this exercise we revisit the mpg.csv dataset since it has a convenient number of data points we can hover over.
To set up a useful scatter plot we’ll want to spread the points out along the x-axis. Model Year is a good feature, but
we’ll add an artificial “jitter” to the data so that points don’t all line up along distinct verticals.
To the right of our scatter plot we’ll create a line plot that represents the acceleration of a selected vehicle. The
steeper the line, the quicker the acceleration. We’ll remove the x- and y-axis ticks - all we want is for the line to
show relative comparisons.
Some math: recall that the dataset has a column for acceleration that represents the time in seconds to go from
zero to sixty miles per hour. To translate this into slope we divide sixty by that time, so quicker cars get steeper lines:
slope = 60 / acceleration
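As a quick numeric check of that conversion, the sketch below applies the same 60/acceleration expression that appears in the callback code later in this section. The function name and the sample times are made up for illustration.

```python
def acceleration_to_slope(seconds_to_sixty):
    """Convert a 0-60 mph time (seconds) into the slope of the line we draw."""
    return 60 / seconds_to_sixty


# Illustrative 0-60 times: a quick car, an average car and a slow car.
for seconds in (8.0, 12.0, 20.0):
    print(seconds, acceleration_to_slope(seconds))
```

A shorter time produces a larger slope, which is why quicker vehicles appear as steeper lines.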
updating1.py
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go
import pandas as pd
from numpy import random

app = dash.Dash()
df = pd.read_csv('../data/mpg.csv')
# Add a random "jitter" to model_year to spread out the plot
df['year'] = random.randint(-4,5,len(df))*0.10 + df['model_year']
app.layout = html.Div([
    dcc.Graph(
        id='mpg_scatter',
        figure={
            'data': [go.Scatter(
                x = df['year']+1900,  # our "jittered" data
                y = df['mpg'],
                text = df['name'],
                hoverinfo = 'text',
                mode = 'markers'
            )],
            'layout': go.Layout(
                title = 'mpg.csv dataset',
                xaxis = {'title': 'model year'},
                yaxis = {'title': 'miles per gallon'},
                hovermode='closest'
            )
        }
    )
])

if __name__ == '__main__':
    app.run_server()
Run the script, open a browser to http://127.0.0.1:8050/ , and you should see something like:
We used random values for our jitter, so yours may look slightly different.
If we hadn’t added the jitter, the graph would have looked like this:
Next, we’ll add a line graph representing acceleration, and tie it back to our scatter plot with hoverData.
Copy updating1.py and name the new file updating2.py. Add the following code (shown in bold):
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import pandas as pd
from numpy import random
app = dash.Dash()
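df = pd.read_csv('../data/mpg.csv')
# Add a random "jitter" to model_year to spread out the plot
df['year'] = random.randint(-4,5,len(df))*0.10 + df['model_year']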
app.layout = html.Div([
html.Div([ # this Div contains our scatter plot
dcc.Graph(
id='mpg_scatter',
figure={
'data': [go.Scatter(
x = df['year']+1900, # our "jittered" data
y = df['mpg'],
text = df['name'],
hoverinfo = 'text',
mode = 'markers'
)],
'layout': go.Layout(
title = 'mpg.csv dataset',
xaxis = {'title': 'model year'},
yaxis = {'title': 'miles per gallon'},
hovermode='closest'
)
}
# add style to the Div to make room for our output graph
)], style={'width':'50%','display':'inline-block'}),
# add a new Div for our output graph
html.Div([ # this Div contains our output graph
dcc.Graph(
id='mpg_line',
figure={
'data': [go.Scatter(
x = [0,1],
y = [0,1],
mode = 'lines'
)],
'layout': go.Layout(
title = 'acceleration',
margin = {'l':0}
)
}
)
], style={'width':'20%', 'height':'50%','display':'inline-block'})
])
# add a callback
@app.callback(
Output('mpg_line', 'figure'),
[Input('mpg_scatter', 'hoverData')])
def callback_graph(hoverData):
v_index = hoverData['points'][0]['pointIndex']
fig = {
'data': [go.Scatter(
x = [0,1],
y = [0,60/df.iloc[v_index]['acceleration']],
mode='lines',
line={'width':2*df.iloc[v_index]['cylinders']}
)],
'layout': go.Layout(
title = df.iloc[v_index]['name'],
xaxis = {'visible':False},
yaxis = {'visible':False, 'range':[0,60/df['acceleration'].min()]},
margin = {'l':0},
height = 300
)
}
return fig
if __name__ == '__main__':
app.run_server()
Run the script, open a browser to http://127.0.0.1:8050/ , and you should see something like:
As you hover over different vehicles, the graph on the right changes pitch (higher for quicker cars), and thickness
depending on the number of cylinders.
Let’s add one more feature, and have vehicle statistics appear as a dcc.Markdown element.
Copy updating2.py and name the new file updating3.py. Add the following code (shown in bold):
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import pandas as pd
from numpy import random
app = dash.Dash()
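df = pd.read_csv('../data/mpg.csv')
# Add a random "jitter" to model_year to spread out the plot
df['year'] = random.randint(-4,5,len(df))*0.10 + df['model_year']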
app.layout = html.Div([
html.Div([ # this Div contains our scatter plot
dcc.Graph(
id='mpg_scatter',
figure={
'data': [go.Scatter(
x = df['year']+1900, # our "jittered" data
y = df['mpg'],
text = df['name'],
hoverinfo = 'text',
mode = 'markers'
)],
'layout': go.Layout(
title = 'mpg.csv dataset',
xaxis = {'title': 'model year'},
yaxis = {'title': 'miles per gallon'},
hovermode='closest'
)
}
)], style={'width':'50%','display':'inline-block'}),
html.Div([ # this Div contains our output graph and vehicle stats
dcc.Graph(
id='mpg_line',
figure={
'data': [go.Scatter(
x = [0,1],
y = [0,1],
mode = 'lines'
)],
'layout': go.Layout(
title = 'acceleration',
margin = {'l':0}
)
}
),
# add a Markdown section
dcc.Markdown(
id='mpg_stats'
)
], style={'width':'20%', 'height':'50%','display':'inline-block'})
])
@app.callback(
Output('mpg_line', 'figure'),
[Input('mpg_scatter', 'hoverData')])
def callback_graph(hoverData):
v_index = hoverData['points'][0]['pointIndex']
fig = {
'data': [go.Scatter(
x = [0,1],
y = [0,60/df.iloc[v_index]['acceleration']],
mode='lines',
line={'width':2*df.iloc[v_index]['cylinders']}
)],
'layout': go.Layout(
title = df.iloc[v_index]['name'],
xaxis = {'visible':False},
yaxis = {'visible':False, 'range':[0,60/df['acceleration'].min()]},
margin = {'l':0},
height = 300
)
}
return fig
# add a second callback for our Markdown
@app.callback(
Output('mpg_stats', 'children'),
[Input('mpg_scatter', 'hoverData')])
def callback_stats(hoverData):
v_index = hoverData['points'][0]['pointIndex']
stats = """
{} cylinders
{}cc displacement
0 to 60mph in {} seconds
""".format(df.iloc[v_index]['cylinders'],
df.iloc[v_index]['displacement'],
df.iloc[v_index]['acceleration'])
return stats
if __name__ == '__main__':
app.run_server()
Run the script, open a browser to http://127.0.0.1:8050/ , and you should see something like:
That’s it! Now we’ve used hover to dynamically change another graph on the same page, and populate a Markdown
section at the same time.
Code Along Milestone Project
We offer a culminating project here. Some new material may be covered, so we encourage research into pandas,
plotly, and the Dash documentation.
This project develops a Stock Ticker Dashboard that allows the user to either enter a ticker symbol into an input box,
or to select item(s) from a dropdown list, and uses pandas_datareader to look up and display stock data on a graph.
The final project will include a DatePicker to set the start and end dates for the graph:
Introduction to Live Updating
So far we’ve shown a lot of ways to work with static data. For convenience we have provided the .csv files
themselves, but in most cases we could just as easily have programmed the source websites into our graphs.
But what if the information on the web is constantly changing? For this section we can’t supply a .csv file because
our data source, https://www.flightradar24.com, updates its information every 8 seconds!
This section introduces the dash_core_components.Interval component. Instead of waiting for some user
interaction to update the page, Interval lets you update components in your application every few seconds or
minutes.
Resources: https://dash.plot.ly/live-updates
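layoutupdate0.py
import dash
import dash_html_components as html

app = dash.Dash()
crash_free = 0
crash_free += 1
app.layout = html.H1('Crash free for {} refreshes'.format(crash_free))

if __name__ == '__main__':
    app.run_server()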
Run the script to display the page, and then refresh the page several times. Note that the layout doesn’t change.
Copy layoutupdate0.py to a new file layoutupdate1.py and add the following code (shown in bold):
import dash
import dash_html_components as html

app = dash.Dash()
crash_free = 0

def refresh_layout():
    global crash_free
    crash_free += 1
    return html.H1('Crash free for {} refreshes'.format(crash_free))

app.layout = refresh_layout

if __name__ == '__main__':
    app.run_server()
Run the script. Now you should see that refreshing the page does update the layout.
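The difference between the two scripts comes down to when the layout is evaluated: a value is computed once at start-up, while a function is called again on every page load. The plain-Python sketch below imitates that behaviour without Dash; serve and build_text are hypothetical stand-ins, not Dash functions.

```python
counter = 0


def build_text():
    """Count one more refresh and return the header text."""
    global counter
    counter += 1
    return 'Crash free for {} refreshes'.format(counter)


static_layout = build_text()   # evaluated once, like layoutupdate0.py
dynamic_layout = build_text    # a function object, like layoutupdate1.py


def serve(layout):
    """Mimic a page load: call the layout if it is callable, else reuse it."""
    return layout() if callable(layout) else layout


print(serve(static_layout), '|', serve(static_layout))    # same text twice
print(serve(dynamic_layout), '|', serve(dynamic_layout))  # count increases
```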
Now it’s time to make the page refresh at regular intervals automatically.
Copy layoutupdate1.py to a new file layoutupdate2.py and add the following code (shown in bold):
import dash
import dash_html_components as html
import dash_core_components as dcc
from dash.dependencies import Input, Output

app = dash.Dash()
app.layout = html.Div([
    html.H1(id='live-update-text'),
    dcc.Interval(
        id='interval-component',
        interval=2000,  # 2000 milliseconds = 2 seconds
        n_intervals=0
    )
])

@app.callback(Output('live-update-text', 'children'),
              [Input('interval-component', 'n_intervals')])
def update_layout(n):
    return 'Crash free for {} refreshes'.format(n)

if __name__ == '__main__':
    app.run_server()
Here we’re using a callback Input (dcc.Interval) to trigger a callback Output (our html.H1 tag) at regular intervals.
Run the script, and the layout should update automatically every 2 seconds!
Remember, the IDs we assign our Input and Output elements are arbitrary ('live-update-text' and
'interval-component' in this case). However, the property names we use are important. We want to input the
'n_intervals' property of the dcc.Interval component, and in this situation we want to return a 'children' property
to our html.H1 component (here it’s the string that will become the Header text).
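Because n_intervals is just a counter, it can be converted into an approximate elapsed time by multiplying by the interval length. The sketch below shows that arithmetic on its own; elapsed_seconds and update_text are hypothetical helpers, and the result is only approximate because the timer runs in the browser.

```python
def elapsed_seconds(n_intervals, interval_ms=2000):
    """Approximate seconds elapsed after n_intervals ticks of interval_ms each."""
    return n_intervals * interval_ms / 1000


def update_text(n_intervals):
    """Header text that reports both the tick count and the elapsed time."""
    return 'Crash free for {} refreshes ({:.0f} s)'.format(
        n_intervals, elapsed_seconds(n_intervals))


print(update_text(0))
print(update_text(3))
```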
In this next example we’ll scrape a website that updates every eight seconds. The site
https://www.flightradar24.com receives flight data from around the world and continually updates its page by
plotting real time flight data on top of Google maps.
The data we care about is only going to be the total number of active flights worldwide. This is shown in the upper
left corner of the screen, right next to the number of flights contained in the current view. It’s worth noting that
flightradar24 data arrives from a number of sources, including radar stations (ADS-B, FLARM, MLAT, FAA) as well as
estimated numbers.
It would be nice to be able to scrape the opening page and grab this data. The script would look something like this:
import bs4, requests
res = requests.get('https://www.flightradar24.com', headers={'User-Agent': 'Mozilla/5.0'})
soup = bs4.BeautifulSoup(res.text,'lxml')
soup.select('#statTotal')
Unfortunately, most of the data displayed on flightradar24’s page is derived from JavaScript calls!
Fortunately, we can still handle this with a little JSON parsing. If you’re curious where the url we’re about to use
came from, simply inspect the #statTotal element in developer tools, open Network, and take a look at the
various JavaScript calls that are going on.
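The JSON parsing itself is small. The sketch below adds up the per-source counts found under the "stats" and "total" keys, which are the keys the scripts in this section read. The sample payload is invented for illustration (the source names and numbers are not real data), and total_flights is a hypothetical helper.

```python
def total_flights(feed):
    """Sum the per-source flight counts found under feed['stats']['total']."""
    totals = feed.get('stats', {}).get('total', {})
    return sum(totals.values())


# Made-up payload with the same nesting as the feed used below.
sample_feed = {'stats': {'total': {'ads-b': 9000, 'mlat': 1500, 'estimated': 300}}}

print(total_flights(sample_feed))
print(total_flights({}))   # a missing 'stats' section counts as zero
```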
Open a new file and name it liveupdating1.py. Add the following code:
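import dash
import dash_html_components as html
import requests

url = ("https://data-live.flightradar24.com/zones/fcgi/feed.js?faa=1"
       "&mlat=1&flarm=1&adsb=1&gnd=1&air=1&vehicles=1&estimated=1&stats=1")
# A fake header is necessary to access the site:
res = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
data = res.json()
counter = 0
for element in data["stats"]["total"]:
    counter += data["stats"]["total"][element]

app = dash.Dash()
app.layout = html.Div([
    html.Div([
        html.Iframe(src = 'https://www.flightradar24.com', height = 500, width = 1200)
    ]),
    html.Div([
        html.Pre('Active flights worldwide: {}'.format(counter))
    ])
])

if __name__ == '__main__':
    app.run_server()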
Here we’ve embedded the flightradar24 website itself into our own page, followed by the counter value obtained
via web scraping! Note that if you refresh the page, the counter value doesn’t change. Once set by our script, that
value remains until the script is halted and restarted.
Next, let’s add a dcc.Interval component.
Make a duplicate of liveupdating1.py and name it liveupdating2.py. Add the following code (shown in bold):
import dash
import dash_html_components as html
import dash_core_components as dcc
from dash.dependencies import Input, Output
import requests
app = dash.Dash()
app.layout = html.Div([
html.Div([
html.Iframe(src = 'https://www.flightradar24.com', height = 500, width = 1200)
]),
html.Div([
html.Pre(
id='counter_text',
children='Active flights worldwide:'
),
dcc.Interval(
id='interval-component',
interval=6000, # 6000 milliseconds = 6 seconds
n_intervals=0
)])
])
@app.callback(Output('counter_text', 'children'),
[Input('interval-component', 'n_intervals')])
def update_layout(n):
url = "https://data-live.flightradar24.com/zones/fcgi/feed.js?faa=1\
&mlat=1&flarm=1&adsb=1&gnd=1&air=1&vehicles=1&estimated=1&stats=1"
# A fake header is necessary to access the site:
res = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
data = res.json()
counter = 0
for element in data["stats"]["total"]:
counter += data["stats"]["total"][element]
return 'Active flights worldwide: {}'.format(counter)
if __name__ == '__main__':
app.run_server()
Note that we simply moved the url request section to inside the update_layout function definition.
Run the script, and you’ll notice that the flight total updates every six seconds. It won’t be in perfect sync with
flightradar24, but it will be close.
Make a duplicate of liveupdating2.py and name it liveupdating3.py. Add the following code (shown in bold):
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.graph_objs as go
import requests
app = dash.Dash()
app.layout = html.Div([
html.Div([
html.Iframe(src = 'https://www.flightradar24.com', height = 500, width = 1200)
]),
html.Div([
html.Pre(
id='counter_text',
children='Active flights worldwide:'
),
dcc.Graph(id='live-update-graph',style={'width':1200}),
dcc.Interval(
id='interval-component',
interval=6000, # 6000 milliseconds = 6 seconds
n_intervals=0
)])
])
counter_list = []
@app.callback(Output('counter_text', 'children'),
[Input('interval-component', 'n_intervals')])
def update_layout(n):
url = "https://data-live.flightradar24.com/zones/fcgi/feed.js?faa=1\
&mlat=1&flarm=1&adsb=1&gnd=1&air=1&vehicles=1&estimated=1&stats=1"
# A fake header is necessary to access the site:
res = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
data = res.json()
counter = 0
for element in data["stats"]["total"]:
counter += data["stats"]["total"][element]
counter_list.append(counter)
return 'Active flights worldwide: {}'.format(counter)
@app.callback(Output('live-update-graph','figure'),
[Input('interval-component', 'n_intervals')])
def update_graph(n):
fig = go.Figure(
data = [go.Scatter(
x = list(range(len(counter_list))),
y = counter_list,
mode='lines+markers'
)])
return fig
if __name__ == '__main__':
app.run_server()
Run the script, and now we have a constantly updating line chart beneath the website!
Notice we haven’t done anything with datetime. This graph simply plots the data we’ve stored
since the page was opened, letting us see the trend in the number of active flights worldwide.
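One thing to keep in mind is that a plain list grows for as long as the script runs. A possible refinement, not part of the course script, is to keep only the most recent readings and to store a timestamp with each one. The sketch below uses collections.deque for that; MAX_POINTS, record and series are hypothetical names, and the small limit is chosen only so the effect is visible.

```python
from collections import deque
from datetime import datetime

MAX_POINTS = 5  # assumed limit; a real dashboard would likely keep more

history = deque(maxlen=MAX_POINTS)  # old readings fall off automatically


def record(counter, when=None):
    """Store one reading together with the time it was taken."""
    history.append((when or datetime.now(), counter))


def series():
    """Return parallel lists of timestamps and counts for plotting."""
    times = [t for t, _ in history]
    counts = [c for _, c in history]
    return times, counts


for value in range(8):   # eight fake readings; only the last five survive
    record(value)
print(series()[1])
```

The two lists returned by series could be used as the x and y values of the go.Scatter trace.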
After a while, your page may look something like this:
Good job!
Deployment
Introduction to Deploying Apps
In this section we’ll look at the final phase of dashboard development - deployment! We show how to deploy your
app on Heroku, and how to add user authentication to your app so that only invited guests can view its contents.
Before deploying your app, you may decide to add user authentication (username and password).
App Authorization
From the Dash documentation:
Authentication for dash apps is provided through a separate dash-auth package.
dash-auth provides two methods of authentication: HTTP Basic Auth and Plotly OAuth.
HTTP Basic Auth is one of the simplest forms of authentication on the web. As a Dash developer, you hardcode a set of
usernames and passwords in your code and send those usernames and passwords to your viewers. There are a few limitations to
HTTP Basic Auth:
● You are responsible for sending the usernames and passwords to your viewers over a secure channel
● Your viewers can not create their own account and cannot change their password
● You are responsible for safely storing the username and password pairs in your code.
Plotly OAuth provides authentication through your online Plotly account or through your company's Plotly On-Premise server.
As a Dash developer, this requires a paid Plotly subscription. Here's where you can subscribe to Plotly Cloud, and here's where
you can contact us about Plotly On-Premise. The viewers of your app will need a Plotly account but they do not need to upgrade
to a paid subscription.
Plotly OAuth allows you to share your apps with other users who have Plotly accounts. With Plotly On-Premise, this includes
sharing apps through the integrated LDAP system. Apps that you have saved will appear in your list of files at
https://plot.ly/organize and you can manage the permissions of the apps there. Viewers create and manage their own accounts.
HTTP Basic Auth will be sufficient for our purposes. To add authentication to your app, first make sure that both
dash and dash-auth are installed on your system:
$ pip install dash
$ pip install dash-auth
Next, pick an app from earlier in the course that you would like to deploy. We’re going to use the solution to our
Interactive Components exercise since it’s a fairly short script (it returns the product of two values submitted by a
range slider).
Create a new file called auth1.py and add the following code:
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
app = dash.Dash()
app.layout = html.Div([
dcc.RangeSlider(
id='range-slider',
min=-5,
max=6,
marks={i:str(i) for i in range(-5, 7)},
value=[-3, 4]
),
html.H1(id='product') # this is the output
], style={'width':'50%'})
@app.callback(
Output('product', 'children'),
[Input('range-slider', 'value')])
def update_value(value_list):
return value_list[0]*value_list[1]
if __name__ == '__main__':
app.run_server()
Run the script just to make sure it works, then add the following code (shown in bold):
import dash
import dash_auth
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
USERNAME_PASSWORD_PAIRS = [
['JamesBond', '007'],['LouisArmstrong', 'satchmo']
]
app = dash.Dash()
auth = dash_auth.BasicAuth(app,USERNAME_PASSWORD_PAIRS)
app.layout = html.Div([
dcc.RangeSlider(
id='range-slider',
min=-5,
max=6,
marks={i:str(i) for i in range(-5, 7)},
value=[-3, 4]
),
html.H1(id='product') # this is the output
], style={'width':'50%'})
@app.callback(
Output('product', 'children'),
[Input('range-slider', 'value')])
def update_value(value_list):
return value_list[0]*value_list[1]
if __name__ == '__main__':
app.run_server()
That’s it! Run the script, open a browser to http://127.0.0.1:8050/ , and you should be prompted for a username
and password before the app will load. We should point out a couple of things:
● The username is case sensitive. JamesBond will work, but jamesbond will not.
● In production, you should store your USERNAME_PASSWORD_PAIRS in a separate file or database, and not
inside your source code as we have it.
● The field name is arbitrary; we used USERNAME_PASSWORD_PAIRS but you can name yours anything you want
so long as the same name is passed into dash_auth.BasicAuth.
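As one illustration of keeping credentials out of the source code, the sketch below reads the pairs from an environment variable. The variable name DASH_AUTH_PAIRS and the "user:password,user:password" format are assumptions made for this example, not a dash-auth convention. The fallback string only repeats the demo pairs from above and should not be used in production.

```python
import os


def load_pairs(raw):
    """Parse 'user1:pass1,user2:pass2' into a list of [username, password] pairs."""
    pairs = []
    for item in raw.split(','):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        username, _, password = item.partition(':')
        pairs.append([username, password])
    return pairs


# Hypothetical variable name; the default is for demonstration only.
raw = os.environ.get('DASH_AUTH_PAIRS', 'JamesBond:007,LouisArmstrong:satchmo')
USERNAME_PASSWORD_PAIRS = load_pairs(raw)
print(len(USERNAME_PASSWORD_PAIRS), 'credential pairs loaded')
```

The resulting list has the same shape as the hardcoded one, so it could be passed to dash_auth.BasicAuth in the same way. Note that this simple format cannot represent passwords that contain a comma.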
Resources: https://dash.plot.ly/authentication
Deploying App to Heroku
Every Dash script so far has used app.run_server() to launch the app. By default the app runs on localhost,
and you can only see it on your own machine.
The good news is that Dash uses Flask as its web framework, so anywhere you can deploy Flask, you can deploy
Dash. While there are many options out there including Digital Ocean, PythonAnywhere, Google Cloud, Amazon
Web Services, Azure, etc., we’ll walk through an app deployment on Heroku.
3. Click on Python. On the next screen select Set Up. An option should appear to download the Heroku Command
Line Interface (CLI). Choose your operating system from the dropdown list and follow the instructions to install
the utility. You should have the option to install Git as well.
4. If git was not installed with Heroku CLI, you can download it directly from https://git-scm.com/downloads and
follow the instructions for your operating system.
STEP 5 (WINDOWS) - Create, Activate and Populate a virtualenv
see below for macOS/Linux instructions!
8. Create a virtual environment. We’re calling ours “venv” but you can use any name you want:
C:\my_dash_app>python -m virtualenv venv
10. Install dash and any desired dependencies into your virtual environment
(venv) C:\my_dash_app>pip install dash
(venv) C:\my_dash_app>pip install dash-auth
(venv) C:\my_dash_app>pip install dash-renderer
(venv) C:\my_dash_app>pip install dash-core-components
(venv) C:\my_dash_app>pip install dash-html-components
(venv) C:\my_dash_app>pip install plotly (requirement may be satisfied, see below)
STEP 5 (macOS/Linux) - Create, Activate and Populate a virtualenv
10. Install dash and any desired dependencies into your virtual environment
$ pip install dash
$ pip install dash-auth
$ pip install dash-renderer
$ pip install dash-core-components
$ pip install dash-html-components
$ pip install plotly (requirement may be satisfied, see above)
STEP 6 - Add Files to the Development Folder
The following files need to be added:
app1.py
Copy the file used in the Basic Authorization section (or any file you’d like to deploy) and add the line
server = app.server, which exposes the underlying Flask server for deployment:
import dash
import dash_auth
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
USERNAME_PASSWORD_PAIRS = [
    ['JamesBond', '007'],['LouisArmstrong', 'satchmo']
]
app = dash.Dash()
auth = dash_auth.BasicAuth(app,USERNAME_PASSWORD_PAIRS)
server = app.server
app.layout = html.Div([
    dcc.RangeSlider(
        id='range-slider',
        min=-5,
        max=6,
        marks={i:str(i) for i in range(-5, 7)},
        value=[-3, 4]
    ),
    html.H1(id='product') # this is the output
], style={'width':'50%'})
@app.callback(
    Output('product', 'children'),
    [Input('range-slider', 'value')])
def update_value(value_list):
    return value_list[0]*value_list[1]
if __name__ == '__main__':
    app.run_server()
.gitignore
venv
*.pyc
.DS_Store
.env
Procfile
web: gunicorn app1:server
app1 refers to the filename of our application (app1.py) and server refers to the variable server inside that
file.
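The module:variable convention in the Procfile can be illustrated with a short sketch. This is a simplified picture of how such a target string can be resolved, not gunicorn's actual implementation; the helper name resolve_target is hypothetical.

```python
import importlib


def resolve_target(target):
    """Split 'module:variable' and return the named object from the module."""
    module_name, _, attr_name = target.partition(":")
    module = importlib.import_module(module_name)   # e.g. imports app1.py
    return getattr(module, attr_name)               # e.g. the server variable


# Demonstrated with a standard-library module so the sketch runs anywhere;
# for the Procfile above, the target string would be "app1:server".
dumps = resolve_target("json:dumps")
print(dumps({"ok": True}))
```

If app1.py did not define a variable named server, the getattr step would fail, which is why the extra line in app1.py is needed.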
requirements.txt
This can be automatically generated by running pip freeze > requirements.txt at the terminal.
Make sure to do it from inside the development folder with the virtual environment activated.
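Because the Procfile starts the app with gunicorn, gunicorn has to appear in requirements.txt alongside dash. A minimal sketch of a pre-deployment check follows; the helper missing_requirements and the version numbers in the sample text are made up for illustration.

```python
def missing_requirements(requirements_text, needed=("dash", "gunicorn")):
    """Return the names in `needed` that requirements_text does not list."""
    listed = set()
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # pip freeze writes lines such as "dash==0.21.1"; keep the name part
        name = line.split("==")[0].strip().lower()
        listed.add(name)
    return [pkg for pkg in needed if pkg.lower() not in listed]


sample = "dash==0.21.1\nplotly==2.5.1\n"   # made-up versions for illustration
print(missing_requirements(sample))
```

In practice the text would come from reading requirements.txt in the development folder.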
STEP 7 - Initialize Heroku, add files to Git, and Deploy
(venv) C:\my_dash_app>heroku create my-dash-app
You have to change my-dash-app to a unique name. The name must start with a letter
and can only contain lowercase letters, numbers, and dashes.
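The naming rule can be checked locally before running heroku create. The sketch below encodes only the rule stated here (a leading letter, then lowercase letters, numbers, and dashes); it cannot tell whether a name is already taken, and the helper name is_valid_app_name is hypothetical.

```python
import re

# Starts with a lowercase letter, then lowercase letters, digits, or dashes.
APP_NAME_PATTERN = re.compile(r"[a-z][a-z0-9-]*")


def is_valid_app_name(name):
    """Return True if name satisfies the naming rule described above."""
    return APP_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["my-dash-app", "My-Dash-App", "1st-app", "my_dash_app"]:
    print(candidate, is_valid_app_name(candidate))
```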
(venv) C:\my_dash_app>git add .
Note the period at the end. This adds all files to Git (except those listed in .gitignore).
(venv) C:\my_dash_app>git commit -m "Initial launch"
Every Git commit should include a brief descriptive message. Depending on your operating system, this
message may require double-quotes (not single-quotes).
(venv) C:\my_dash_app>git push heroku master
This deploys your current code to Heroku. The first push may take a while, as Heroku has to set up Python and
all your dependencies on the remote server.
(venv) C:\my_dash_app>heroku ps:scale web=1
Scaling dynos... done, now running web at 1:Free
This runs the app on one Heroku "dyno".
In all cases:
$ git status # view the changes (optional)
$ git add . # add all the changes
$ git commit -m "a description of the changes"
$ git push heroku master
TROUBLESHOOTING
If your app won’t launch on Heroku, follow this checklist:
☐ If unable to trace locally, visit your Heroku dashboard and click on More / View logs
Resources: https://dash.plot.ly/deployment
APPENDIX I - EXAMPLES CODE:
Plotly Basics
Plotly Basics Overview
basic1.py
#######
# This script creates a static matplotlib plot
######
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
basic2.py
#######
# This script creates the same type of plot as basic1.py,
# but in Plotly. Note that it creates an .html file!
######
import numpy as np
import pandas as pd
import plotly.offline as pyo
import plotly.graph_objs as go
Scatter Plots
scatter1.py
#######
# This plots 100 random data points (set the seed to 42 to
# obtain the same points we do!) between 1 and 100 in both
# vertical and horizontal directions.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import numpy as np
np.random.seed(42)
random_x = np.random.randint(1,101,100)
random_y = np.random.randint(1,101,100)
data = [go.Scatter(
x = random_x,
y = random_y,
mode = 'markers',
)]
pyo.plot(data, filename='scatter1.html')
scatter2.py
#######
# This plots 100 random data points (set the seed to 42 to
# obtain the same points we do!) between 1 and 100 in both
# vertical and horizontal directions.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import numpy as np
np.random.seed(42)
random_x = np.random.randint(1,101,100)
random_y = np.random.randint(1,101,100)
data = [go.Scatter(
x = random_x,
y = random_y,
mode = 'markers',
)]
layout = go.Layout(
title = 'Random Data Scatterplot', # Graph title
xaxis = dict(title = 'Some random x-values'), # x-axis label
yaxis = dict(title = 'Some random y-values'), # y-axis label
hovermode ='closest' # handles multiple points landing on the same vertical
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='scatter2.html')
scatter3.py
#######
# This plots 100 random data points (set the seed to 42 to
# obtain the same points we do!) between 1 and 100 in both
# vertical and horizontal directions.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import numpy as np
np.random.seed(42)
random_x = np.random.randint(1,101,100)
random_y = np.random.randint(1,101,100)
data = [go.Scatter(
x = random_x,
y = random_y,
mode = 'markers',
marker = dict( # change the marker style
size = 12,
color = 'rgb(51,204,153)',
symbol = 'pentagon',
line = dict(
width = 2,
)
)
)]
layout = go.Layout(
title = 'Random Data Scatterplot', # Graph title
xaxis = dict(title = 'Some random x-values'), # x-axis label
yaxis = dict(title = 'Some random y-values'), # y-axis label
hovermode ='closest' # handles multiple points landing on the same vertical
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='scatter3.html')
Line Charts
line1.py
#######
# This line chart displays the same data
# three different ways along the y-axis.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import numpy as np
np.random.seed(56)
x_values = np.linspace(0, 1, 100) # 100 evenly spaced values
y_values = np.random.randn(100) # 100 random values
# create traces
trace0 = go.Scatter(
x = x_values,
y = y_values+5,
mode = 'markers',
name = 'markers'
)
trace1 = go.Scatter(
x = x_values,
y = y_values,
mode = 'lines+markers',
name = 'lines+markers'
)
trace2 = go.Scatter(
x = x_values,
y = y_values-5,
mode = 'lines',
name = 'lines'
)
data = [trace0, trace1, trace2] # assign traces to data
layout = go.Layout(
title = 'Line chart showing three different modes'
)
fig = go.Figure(data=data,layout=layout)
pyo.plot(fig, filename='line1.html')
line2.py
#######
# This line chart shows U.S. Census Bureau
# population data from six New England states.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
# create traces
traces = [go.Scatter(
x = df.columns,
y = df.loc[name],
mode = 'markers+lines',
name = name
) for name in df.index]
layout = go.Layout(
title = 'Population Estimates of the Six New England States'
)
fig = go.Figure(data=traces,layout=layout)
pyo.plot(fig, filename='line2.html')
line3.py
#######
# This line chart shows U.S. Census Bureau
# population data from six New England states.
# THIS PLOT USES PANDAS TO EXTRACT DESIRED DATA FROM THE SOURCE
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../sourcedata/nst-est2017-alldata.csv')
# Alternatively:
# df = pd.read_csv('https://www2.census.gov/programs-surveys/popest/datasets/2010-2017/national/totals/nst-est2017-alldata.csv')
traces=[go.Scatter(
x = df2.columns,
y = df2.loc[name],
mode = 'markers+lines',
name = name
) for name in df2.index]
layout = go.Layout(
title = 'Population Estimates of the Six New England States'
)
fig = go.Figure(data=traces,layout=layout)
pyo.plot(fig, filename='line3.html')
Bar Charts
bar1.py
#######
# A basic bar chart showing the total number of
# 2018 Winter Olympics Medals won by Country.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
data = [go.Bar(
x=df['NOC'], # NOC stands for National Olympic Committee
y=df['Total']
)]
layout = go.Layout(
title='2018 Winter Olympic Medals by Country'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='bar1.html')
bar2.py
#######
# This is a grouped bar chart showing three traces
# (gold, silver and bronze medals won) for each country
# that competed in the 2018 Winter Olympics.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/2018WinterOlympics.csv')
trace1 = go.Bar(
x=df['NOC'], # NOC stands for National Olympic Committee
y=df['Gold'],
name = 'Gold',
marker=dict(color='#FFD700') # set the marker color to gold
)
trace2 = go.Bar(
x=df['NOC'],
y=df['Silver'],
name='Silver',
marker=dict(color='#9EA0A1') # set the marker color to silver
)
trace3 = go.Bar(
x=df['NOC'],
y=df['Bronze'],
name='Bronze',
marker=dict(color='#CD7F32') # set the marker color to bronze
)
data = [trace1, trace2, trace3]
layout = go.Layout(
title='2018 Winter Olympic Medals by Country'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='bar2.html')
bar3.py
#######
# This is a stacked bar chart showing three traces
# (gold, silver and bronze medals won) for each country
# that competed in the 2018 Winter Olympics.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/2018WinterOlympics.csv')
trace1 = go.Bar(
x=df['NOC'], # NOC stands for National Olympic Committee
y=df['Gold'],
name = 'Gold',
marker=dict(color='#FFD700') # set the marker color to gold
)
trace2 = go.Bar(
x=df['NOC'],
y=df['Silver'],
name='Silver',
marker=dict(color='#9EA0A1') # set the marker color to silver
)
trace3 = go.Bar(
x=df['NOC'],
y=df['Bronze'],
name='Bronze',
marker=dict(color='#CD7F32') # set the marker color to bronze
)
data = [trace1, trace2, trace3]
layout = go.Layout(
title='2018 Winter Olympic Medals by Country',
barmode='stack'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='bar3.html')
Bubble Charts
bubble1.py
#######
# A bubble chart is simply a scatter plot
# with the added feature that the size of the
# marker can be set by the data.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/mpg.csv')
layout = go.Layout(
title='Vehicle mpg vs. horsepower',
hovermode='closest'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='bubble1.html')
bubble2.py
#######
# A bubble chart is simply a scatter plot
# with the added feature that the size of the
# marker can be set by the data.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/mpg.csv')
data = [go.Scatter(
x=df['horsepower'],
y=df['mpg'],
text=df['text2'], # use the new column for the hover text
mode='markers',
marker=dict(size=1.5*df['cylinders'])
)]
layout = go.Layout(
title='Vehicle mpg vs. horsepower',
hovermode='closest'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='bubble2.html')
Box Plots
box1.py
#######
# This simple box plot places the box beside
# the original data points on the same graph.
######
import plotly.offline as pyo
import plotly.graph_objs as go
data = [
go.Box(
y=y,
boxpoints='all', # display the original data points
jitter=0.3, # spread them out so they all appear
pointpos=-1.8 # offset them to the left of the box
)
]
pyo.plot(data, filename='box1.html')
box2.py
#######
# This simple box plot displays outliers
# above and below the box.
######
import plotly.offline as pyo
import plotly.graph_objs as go
data = [
go.Box(
y=y,
boxpoints='outliers' # display only outlying data points
)
]
pyo.plot(data, filename='box2.html')
box3.py
#######
# This plot compares sample distributions
# of three-letter-words in the works of
# Quintus Curtius Snodgrass and Mark Twain
######
import plotly.offline as pyo
import plotly.graph_objs as go
snodgrass = [.209,.205,.196,.210,.202,.207,.224,.223,.220,.201]
twain = [.225,.262,.217,.240,.230,.229,.235,.217]
data = [
go.Box(
y=snodgrass,
name='QCS'
),
go.Box(
y=twain,
name='MT'
)
]
layout = go.Layout(
title = 'Comparison of three-letter-word frequencies<br>\
between Quintus Curtius Snodgrass and Mark Twain'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='box3.html')
Histograms
hist1.py
#######
# This histogram looks back at the mpg dataset
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/mpg.csv')
data = [go.Histogram(
x=df['mpg']
)]
layout = go.Layout(
title="Miles per Gallon Frequencies of<br>\
1970's Era Vehicles"
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='basic_histogram.html')
hist2.py
#######
# This histogram has wider bins than the previous hist1.py
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/mpg.csv')
data = [go.Histogram(
x=df['mpg'],
xbins=dict(start=8,end=50,size=6),
)]
layout = go.Layout(
title="Miles per Gallon Frequencies of<br>\
1970's Era Vehicles"
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='wide_histogram.html')
hist3.py
#######
# This histogram has narrower bins than the previous hist1.py
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/mpg.csv')
data = [go.Histogram(
x=df['mpg'],
xbins=dict(start=8,end=50,size=1),
)]
layout = go.Layout(
title="Miles per Gallon Frequencies of<br>\
1970's Era Vehicles"
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='narrow_histogram.html')
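The effect of the size value in xbins (hist2.py versus hist3.py) comes down to simple arithmetic, sketched below in plain Python. The helper bin_edges is hypothetical and only lists the boundaries implied by start, end and size; how Plotly treats values that fall exactly on a boundary is not covered here.

```python
def bin_edges(start, end, size):
    """List the bin boundaries implied by xbins=dict(start, end, size)."""
    edges = []
    edge = start
    while edge <= end:
        edges.append(edge)
        edge += size
    return edges


wide = bin_edges(8, 50, 6)     # settings used in hist2.py
narrow = bin_edges(8, 50, 1)   # settings used in hist3.py
print(len(wide) - 1, "bins vs", len(narrow) - 1, "bins")
```

Fewer, wider bins smooth the shape of the distribution, while many narrow bins expose more detail and more noise.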
hist4.py
#######
# This histogram displays the number of Reddit button presses
# over the two months of their social experiment.
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/thebutton_presses.csv')
data = [go.Histogram(
x=df['press time']
)]
layout = go.Layout(
title="Number of presses per timeslot"
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='button_presses.html')
histBONUS.py
#######
# This bar chart mimics a histogram as the x-axis
# is a continuous time series, and the y-axis sums
# a frequency that is already part of the dataset
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/FremontBridgeBicycles.csv')
trace1 = go.Bar(
x=df2.index,
y=df2['Fremont Bridge West Sidewalk'],
name="Southbound",
width=1 # eliminates space between adjacent bars
)
trace2 = go.Bar(
x=df2.index,
y=df2['Fremont Bridge East Sidewalk'],
name="Northbound",
width=1
)
data = [trace1, trace2]
layout = go.Layout(
title='Fremont Bridge Bicycle Traffic by Hour',
barmode='stack'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='fremont_bridge.html')
Distplots
dist1.py
#######
# This distplot uses plotly's Figure Factory
# module in place of Graph Objects
######
import plotly.offline as pyo
import plotly.figure_factory as ff
import numpy as np
x = np.random.randn(1000)
hist_data = [x]
group_labels = ['distplot']
dist2.py
#######
# This distplot demonstrates that random samples
# seldom fit a "normal" distribution.
######
import plotly.offline as pyo
import plotly.figure_factory as ff
import numpy as np
x1 = np.random.randn(200)-2
x2 = np.random.randn(200)
x3 = np.random.randn(200)+2
x4 = np.random.randn(200)+4
hist_data = [x1,x2,x3,x4]
group_labels = ['Group1','Group2','Group3','Group4']
dist3.py
#######
# This distplot looks back at the Mark Twain/
# Quintus Curtius Snodgrass data and tries
# to compare them.
######
import plotly.offline as pyo
import plotly.figure_factory as ff
snodgrass = [.209,.205,.196,.210,.202,.207,.224,.223,.220,.201]
twain = [.225,.262,.217,.240,.230,.229,.235,.217]
hist_data = [snodgrass,twain]
group_labels = ['Snodgrass','Twain']
Heatmaps
heat1.py
#######
# Heatmap of temperatures for Santa Barbara, California
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/2010SantaBarbaraCA.csv')
data = [go.Heatmap(
x=df['DAY'],
y=df['LST_TIME'],
z=df['T_HR_AVG'].values.tolist(),
colorscale='Jet'
)]
layout = go.Layout(
title='Hourly Temperatures, June 1-7, 2010 in<br>\
Santa Barbara, CA USA'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='Santa_Barbara.html')
heat2.py
#######
# Heatmap of temperatures for Yuma, Arizona
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/2010YumaAZ.csv')
data = [go.Heatmap(
x=df['DAY'],
y=df['LST_TIME'],
z=df['T_HR_AVG'].values.tolist(),
colorscale='Jet'
)]
layout = go.Layout(
title='Hourly Temperatures, June 1-7, 2010 in<br>\
Yuma, AZ USA'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='Yuma.html')
heat3.py
#######
# Heatmap of temperatures for Sitka, Alaska
######
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
df = pd.read_csv('../data/2010SitkaAK.csv')
data = [go.Heatmap(
x=df['DAY'],
y=df['LST_TIME'],
z=df['T_HR_AVG'].values.tolist(),
colorscale='Jet'
)]
layout = go.Layout(
title='Hourly Temperatures, June 1-7, 2010 in<br>\
Sitka, AK USA'
)
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='Sitka.html')
heat4.py
#######
# Side-by-side heatmaps for Sitka, Alaska,
# Santa Barbara, California and Yuma, Arizona
# using a shared temperature scale.
######
import plotly.offline as pyo
import plotly.graph_objs as go
from plotly import tools
import pandas as pd
df1 = pd.read_csv('../data/2010SitkaAK.csv')
df2 = pd.read_csv('../data/2010SantaBarbaraCA.csv')
df3 = pd.read_csv('../data/2010YumaAZ.csv')
trace1 = go.Heatmap(
x=df1['DAY'],
y=df1['LST_TIME'],
z=df1['T_HR_AVG'].values.tolist(),
colorscale='Jet',
zmin = 5, zmax = 40 # add max/min color values to make each plot consistent
)
trace2 = go.Heatmap(
x=df2['DAY'],
y=df2['LST_TIME'],
z=df2['T_HR_AVG'].values.tolist(),
colorscale='Jet',
zmin = 5, zmax = 40
)
trace3 = go.Heatmap(
x=df3['DAY'],
y=df3['LST_TIME'],
z=df3['T_HR_AVG'].values.tolist(),
colorscale='Jet',
zmin = 5, zmax = 40
)
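heat4.py hard-codes zmin = 5 and zmax = 40 so that all three heatmaps share one color scale. As an alternative, the shared limits could be derived from the data. The sketch below shows the idea in plain Python; the helper shared_color_range and the temperature values are made up for illustration.

```python
def shared_color_range(*series):
    """Return (lowest, highest) across all the given sequences of numbers."""
    lowest = min(min(values) for values in series)
    highest = max(max(values) for values in series)
    return lowest, highest


# Made-up hourly temperatures standing in for the three T_HR_AVG columns
sitka = [7.1, 9.4, 12.0]
santa_barbara = [13.5, 17.2, 21.8]
yuma = [24.0, 33.6, 39.9]

zmin, zmax = shared_color_range(sitka, santa_barbara, yuma)
print(zmin, zmax)
```

Passing the same zmin and zmax to every trace keeps a given color tied to the same temperature in each plot.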
Plotly Basics Exercise Solutions
Sol1-Scatterplot.py
#######
# Objective: Create a scatterplot of 1000 random data points.
# x-axis values should come from a normal distribution using
# np.random.randn(1000)
# y-axis values should come from a uniform distribution over [0,1) using
# np.random.rand(1000)
######
# Create a fig from data and layout, and plot the fig
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='solution1.html')
A Note About the Line Chart Exercise:
By itself, a for loop over the DAY column won’t work as expected! The code
data = []
for day in df['DAY']:
    trace = go.Scatter(x=df['LST_TIME'],
                       y=df[df['DAY']==day]['T_HR_AVG'],
                       mode='lines',
                       name=day)
    data.append(trace)
creates one trace per row rather than one per day, because df['DAY'] holds a value for every row of the file.
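One way around the problem is to loop over the distinct day names only. The sketch below uses a tiny made-up DataFrame with the same column names as 2010YumaAZ.csv, and writes each trace as a plain dict so that it needs only pandas; in the exercise script the same x, y, mode and name arguments would be passed to go.Scatter.

```python
import pandas as pd

# Tiny made-up frame with the same column names as 2010YumaAZ.csv
df = pd.DataFrame({
    'DAY': ['TUESDAY', 'TUESDAY', 'WEDNESDAY', 'WEDNESDAY'],
    'LST_TIME': ['0:00', '1:00', '0:00', '1:00'],
    'T_HR_AVG': [25.2, 24.1, 26.3, 25.0],
})

data = []
for day in df['DAY'].unique():        # one pass per distinct day
    day_rows = df[df['DAY'] == day]   # only that day's rows
    trace = dict(                     # plain dict standing in for go.Scatter
        x=day_rows['LST_TIME'].tolist(),
        y=day_rows['T_HR_AVG'].tolist(),
        mode='lines',
        name=day,
    )
    data.append(trace)

print(len(data))   # number of traces equals the number of distinct days
```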
Sol2a-Linechart.py
#######
# Objective: Using the file 2010YumaAZ.csv, develop a Line Chart
# that plots seven days worth of temperature data on one graph.
# You can use a list comprehension to assign each day to its own trace.
######
# Perform imports here:
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
# Use a for loop to create the traces for the seven days
# There are many ways to do this!
data = []
# Create a fig from data and layout, and plot the fig
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='solution2a.html')
Sol2b-Linechart.py
####################
## NOTE: ADVANCED SOLUTION THAT USES ONLY PURE DF CALLS
## THIS IS FOR MORE ADVANCED PANDAS USERS TO TAKE A LOOK AT! :)
#######
# Objective: Using the file 2010YumaAZ.csv, develop a Line Chart
# that plots seven days worth of temperature data on one graph.
# You can use a list comprehension to assign each day to its own trace.
######
# Perform imports here:
import plotly.offline as pyo
import plotly.graph_objs as go
import pandas as pd
# Create a fig from data and layout, and plot the fig
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='solution2b.html')
Sol3a-Barchart.py
#######
# Objective: Create a stacked bar chart from
# the file ../data/mocksurvey.csv. Note that questions appear in
# the index (and should be used for the x-axis), while responses
# appear as column labels. Extra Credit: make a horizontal bar chart!
######
# create a fig from data & layout, and plot the fig
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='solution3a.html')
Sol3b-Barchart.py
#######
# Objective: Create a stacked bar chart from
# the file ../data/mocksurvey.csv. Note that questions appear in
# the index (and should be used for the x-axis), while responses
# appear as column labels. Extra Credit: make a horizontal bar chart!
######
# create a fig from data & layout, and plot the fig
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='solution3b.html')
Sol4-Bubblechart.py
#######
# Objective: Create a bubble chart that compares three other features
# from the mpg.csv dataset. Fields include: 'mpg', 'cylinders', 'displacement'
# 'horsepower', 'weight', 'acceleration', 'model_year', 'origin', 'name'
######
# create a fig from data & layout, and plot the fig
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='solution4.html')
#######
# So what happened?? Why is the trend sloping downward?
# Remember that acceleration is the number of seconds to go from 0 to 60mph,
# so fewer seconds means faster acceleration!
######
Sol5-Boxplot.py
#######
# Objective: Make a DataFrame using the Abalone dataset (../data/abalone.csv).
# Take two independent random samples of different sizes from the 'rings' field.
# HINT: np.random.choice(df['rings'],10,replace=False) takes 10 random values
# Use box plots to show that the samples do derive from the same population.
######
# add a layout
layout = go.Layout(
title = 'Comparison of two samples taken from the same population'
)
# create a fig from data & layout, and plot the fig
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='solution5.html')
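The hint in the objective relies on np.random.choice with replace=False. A minimal sketch of drawing two samples of different sizes this way is shown below; the population array is a made-up stand-in for df['rings'] and the seed is arbitrary.

```python
import numpy as np

np.random.seed(42)                  # arbitrary seed, for repeatability
population = np.arange(1, 30)       # made-up stand-in for df['rings']

# replace=False means no position in the population is picked twice
# within a single sample; the two samples are drawn independently.
sample_a = np.random.choice(population, 10, replace=False)
sample_b = np.random.choice(population, 20, replace=False)

print(len(sample_a), len(sample_b))
```

Each sample would then be passed as the y argument of its own go.Box trace.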
Sol6-Histogram.py
#######
# Objective: Create a histogram that plots the 'length' field
# from the Abalone dataset (../data/abalone.csv).
# Set the range from 0 to 1, with a bin size of 0.02
######
# add a layout
layout = go.Layout(
title="Shell lengths from the Abalone dataset"
)
# create a fig from data & layout, and plot the fig
fig = go.Figure(data=data, layout=layout)
pyo.plot(fig, filename='solution6.html')
Sol7-Distplot.py
#######
# Objective: Using the iris dataset, develop a Distplot
# that compares the petal lengths of each class.
# File: '../data/iris.csv'
# Fields: 'sepal_length','sepal_width','petal_length','petal_width','class'
# Classes: 'Iris-setosa','Iris-versicolor','Iris-virginica'
######
# Create a fig from data and layout, and plot the fig
fig = ff.create_distplot(hist_data, group_labels)
pyo.plot(fig, filename='solution7.html')
########
# Great! This shows that if given a flower with a petal length
# between 1-2cm, it is almost certainly an Iris Setosa!
######
Sol8-Heatmap.py
#######
# Objective: Using the "flights" dataset available from Python's
# Seaborn module (see https://seaborn.pydata.org/generated/seaborn.heatmap.html)
# create a heatmap with the following parameters:
# x-axis="year"
# y-axis="month"
# z-axis(color)="passengers"
######
#######
# Excellent! This shows two distinct trends - an increase in
# passengers flying over the years, and a greater number of
# passengers flying in the summer months.
######
APPENDIX II – DASH CORE COMPONENTS
https://dash.plot.ly/dash-core-components
Dropdown
import dash_core_components as dcc
dcc.Dropdown(
options=[
{'label': 'New York City', 'value': 'NYC'},
{'label': 'Montréal', 'value': 'MTL'},
{'label': 'San Francisco', 'value': 'SF'}
],
value='MTL'
)
Single dropdown list, value sets initial displayed entry
dcc.Dropdown(
options=[
{'label': 'New York City', 'value': 'NYC'},
{'label': 'Montréal', 'value': 'MTL'},
{'label': 'San Francisco', 'value': 'SF'}
],
multi=True,
value=['MTL']
)
multi permits multiple selection
Slider
import dash_core_components as dcc
dcc.Slider(
min=-5,
max=10,
step=0.5,
value=-3,
)
Basic slider
dcc.Slider(
min=0,
max=9,
marks={i: 'Label {}'.format(i) for i in range(10)},
value=5,
)
Lets you set labels as Label 0, Label 1, etc.
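The marks argument is an ordinary Python dictionary built with a comprehension, so it can be inspected on its own. The sketch below evaluates the comprehension from the Slider example and the numeric variant used in app1.py.

```python
# The same comprehension used in the Slider example
marks = {i: 'Label {}'.format(i) for i in range(10)}

# The variant from app1.py: plain numbers as labels
numeric_marks = {i: str(i) for i in range(-5, 7)}

print(marks[0], marks[9])
print(min(numeric_marks), max(numeric_marks))
```

Because range excludes its end value, range(10) pairs with max=9 and range(-5, 7) pairs with max=6.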
RangeSlider
import dash_core_components as dcc
dcc.RangeSlider(
count=1,
min=-5,
max=10,
step=0.5,
value=[-3, 7]
)
dcc.RangeSlider(
marks={i: 'Label {}'.format(i) for i in range(-5, 7)},
min=-5,
max=6,
value=[-3, 4]
)
Input
import dash_core_components as dcc
dcc.Input(
placeholder='Enter a value...',
type='text',
value=''
)
Textarea
import dash_core_components as dcc
dcc.Textarea(
placeholder='Enter a value...',
value='This is a TextArea component',
style={'width': '100%'}
)
Checklists
import dash_core_components as dcc
dcc.Checklist(
options=[
{'label': 'New York City', 'value': 'NYC'},
{'label': 'Montréal', 'value': 'MTL'},
{'label': 'San Francisco', 'value': 'SF'}
],
values=['MTL', 'SF']
)
Vertical list
dcc.Checklist(
options=[
{'label': 'New York City', 'value': 'NYC'},
{'label': 'Montréal', 'value': 'MTL'},
{'label': 'San Francisco', 'value': 'SF'}
],
values=['MTL', 'SF'],
labelStyle={'display': 'inline-block'}
)
Horizontal array
Radio Items
import dash_core_components as dcc
dcc.RadioItems(
options=[
{'label': 'New York City', 'value': 'NYC'},
{'label': 'Montréal', 'value': 'MTL'},
{'label': 'San Francisco', 'value': 'SF'}
],
value='MTL'
)
Vertical list
dcc.RadioItems(
options=[
{'label': 'New York City', 'value': 'NYC'},
{'label': 'Montréal', 'value': 'MTL'},
{'label': 'San Francisco', 'value': 'SF'}
],
value='MTL',
labelStyle={'display': 'inline-block'}
)
Horizontal array
Button
import dash
import dash_html_components as html
import dash_core_components as dcc
from dash.dependencies import Input, Output, State
app = dash.Dash()
app.layout = html.Div([
html.Div(dcc.Input(id='input-box', type='text')),
html.Button('Submit', id='button'),
html.Div(id='output-container-button',
children='Enter a value and press submit')
])
@app.callback(
Output('output-container-button', 'children'),
[Input('button', 'n_clicks')],
[State('input-box', 'value')])
def update_output(n_clicks, value):
    return 'The input value was "{}" and the button has been clicked {} times'.format(
        value,
        n_clicks
    )
if __name__ == '__main__':
    app.run_server(debug=True)
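The callback in this example may also run when the page first loads, before any click, and in that case n_clicks and value can arrive as None. A minimal sketch of guarding against that is shown below as a plain function; the name describe_clicks is hypothetical, and the function body could be used inside update_output.

```python
def describe_clicks(n_clicks, value):
    """Build the message for the output Div, tolerating a None click count."""
    if n_clicks is None:            # no click has been registered yet
        return 'Enter a value and press submit'
    return 'The input value was "{}" and the button has been clicked {} times'.format(
        value, n_clicks)


print(describe_clicks(None, None))
print(describe_clicks(2, 'abc'))
```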
DatePickerSingle
import dash_core_components as dcc
from datetime import datetime as dt
dcc.DatePickerSingle(
id='date-picker-single',
date=dt(1997, 5, 10)
)
DatePickerRange
import dash_core_components as dcc
from datetime import datetime as dt
dcc.DatePickerRange(
id='date-picker-range',
start_date=dt(1997, 5, 3),
end_date_placeholder_text='Select a date!'
)
Markdown
import dash_core_components as dcc
dcc.Markdown('''
#### Dash and Markdown
''')
Graphs
The Graph component shares the same syntax as the open-source plotly.py library. View the plotly.py docs to learn
more.
Still in Development
Interactive Tables
The dash_html_components library exposes all of the HTML tags. This includes the Table, Tr, and Tbody tags that can be
used to create an HTML table. See Create Your First Dash App, Part 1 for an example.
Dash is currently incubating an interactive table component that provides built-in filtering, row-selection, editing, and
sorting. Prototypes of this component are being developed in the dash-table-experiments repository. Join the discussion
in the Dash Community Forum.
Upload Component
The dcc.Upload component allows users to upload files into your app through drag-and-drop or the system's native file
explorer.
Tabs
The dcc.Tabs component is currently available in the prerelease channel of the dash-core-components package. To try it
out, see the tab component Pull Request on GitHub.
APPENDIX III - ADDITIONAL RESOURCES
Plotly User Guide for Python
● Scatter
● ScatterGL
● Bar
● Box
● Pie
● Area
● Heatmap
● Contour
● Histogram
● Histogram 2D
● Histogram 2D Contour
● OHLC
● Candlestick
● Table
3D Charts:
● Scatter3D
● Surface
● Mesh
Maps:
● Scatter Geo
● Choropleth
● Scatter Mapbox
Advanced Charts:
● Carpet
● Scatter Carpet
● Contour Carpet
● Parallel Coordinates
● Scatter Ternary
● Sankey
Dash User Guide
Dash Tutorial
● Part 1 - Installation
● Part 2 - Dash Layout
● Part 3 - Basic Callbacks
● Part 4 - Dash State
● Part 5 - Interactive Graphing and Crossfiltering
● Part 6 - Sharing Data Between Callbacks