Module 4 - Data Exploration and Visualization

Uploaded by

Rachell Ann Uson


Exploring and Visualizing Data

Stephen F Elston | Principal Consultant, Quantia Analytics, LLC

Module Outline

• Exploring Data
• Visualizing Data
Exploring Data

• Introduction to R and Python for Data Science

• Working with Data Frames in R and Python
• Working with Data Frames in Azure ML
• Working with Metadata
Data Frames

• Available in R and Python Pandas
– Map to and from Azure ML tables
• Rectangular tables
– Each column of one type
• Common Tasks:
– Subsetting by rows and columns
– Logical filtering of rows and columns

Column1 Column2 … ColumnN
1       ABC     … 12.2
2       XYZ     … 13.1
3       ABC     … 12.8
4       XYZ     … 10.9
5       ABC     … 3.75
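
The common tasks listed above can be sketched in pandas as follows. This is a minimal illustration, assuming a small in-memory table modeled on the Col1/Col2/Col3 example used later in this module; the variable names (frame, cols, rows, filtered, both) are hypothetical.

```python
import pandas as pd

# Toy table; values borrowed from the Col1/Col2/Col3 example in this module.
frame = pd.DataFrame({
    "Col1": [2012, 2013, 2013, 2014],
    "Col2": [14, 13, 34, 23],
    "Col3": [45, 76, 65, 47],
})

# Subset by columns: keep only Col1 and Col3.
cols = frame[["Col1", "Col3"]]

# Subset by rows: the first two rows, selected by position.
rows = frame.iloc[:2]

# Logical filtering of rows: keep rows where Col1 equals 2013.
filtered = frame[frame["Col1"] == 2013]

# Rows and columns together: a boolean mask plus a column list with .loc.
both = frame.loc[frame["Col2"] > 20, ["Col1", "Col2"]]
```

Using .loc for the combined case keeps the row condition and the column selection in one expression, which tends to be easier to read than chaining two separate subsetting steps.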
Dplyr
library(dplyr)
Col1 Col2 Col3
2012 14 45
2013 13 76
2013 34 65
2014 23 47

dir <- "C:/data"

file <- "values.csv"
path <- file.path(dir, file)
frame1 <- read.csv(path, header=TRUE, stringsAsFactors = FALSE)
Col1 Col2 Col3
2013 13 76
2013 34 65

frame1 <- filter(frame1, Col1 == 2013)

Col1 Col3
2012 45
2013 76
2013 65
2014 47

frame1 <- select(frame1, Col1, Col3)

Col1 Col2 Col3 Col4
2012 14 45 59
2013 13 76 89
2013 34 65 99
2014 23 47 70

frame1 <- mutate(frame1, Col4 = Col2 + Col3)

Other useful dplyr verbs include:

frame1 <- group_by(frame1, Col1)

frame1 <- distinct(frame1, Col1)
frame1 <- sample_frac(frame1, 0.5)
frame1 <- sample_n(frame1, 500)
frame1 <- summarize(frame1, m1 = mean(Col1))
Col1 Col2 Col3 Col4
2013 13 76 89
2013 34 65 99

frame1 <- frame1 %>%
  filter(Col1 == 2013) %>%
  mutate(Col4 = Col2 + Col3)
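
For comparison, a rough pandas analogue of the piped filter-then-mutate chain might look like the sketch below. It assumes the same toy table; query() and assign() are used here as stand-ins for filter() and mutate(), which is one reasonable mapping rather than an official equivalence.

```python
import pandas as pd

# Same toy table as in the dplyr example.
frame1 = pd.DataFrame({
    "Col1": [2012, 2013, 2013, 2014],
    "Col2": [14, 13, 34, 23],
    "Col3": [45, 76, 65, 47],
})

# Method chaining plays the role of the %>% operator.
result = (
    frame1
    .query("Col1 == 2013")                              # roughly filter()
    .assign(Col4=lambda df: df["Col2"] + df["Col3"])    # roughly mutate()
)
```

Passing a lambda to assign() lets the new column be computed from the already filtered frame, so the chain does not need an intermediate variable.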
Pandas
Col1 Col2 Col3
2012 14 45
2013 13 76
2013 34 65
2014 23 47

import pandas as pd
import os
dir = r"c:\data"
file = "values.csv"
path = os.path.join(dir, file)
frame1 = pd.read_csv(path)
Col1 Col2 Col3
2012 14 45
2013 13 76
2013 34 65
2014 23 47

frame1 = frame1["Col2"]
Col1 Col2 Col3
2012 14 45
2013 13 76
2013 34 65
2014 23 47

frame1 = frame1[["Col1", "Col2"]]

Col1 Col2 Col3
2013 13 76
2013 34 65

frame1 = frame1[1:3:1]
Col1 Col2 Col3
2012 14 45
2013 13 76
2013 34 65
2014 23 47

frame1 = frame1[:3]
Col1 Col2 Col3
2012 14 45
2013 13 76
2013 34 65
2014 23 47

frame1 = frame1["Col2"][1:2]
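
The bracket forms shown on the preceding slides behave differently depending on what goes inside the brackets. The sketch below collects them on one toy table so the results can be compared; it assumes the default integer row index that pandas assigns, and the result names are hypothetical.

```python
import pandas as pd

frame1 = pd.DataFrame({
    "Col1": [2012, 2013, 2013, 2014],
    "Col2": [14, 13, 34, 23],
    "Col3": [45, 76, 65, 47],
})

# A single column name returns a Series.
one_col = frame1["Col2"]

# A list of column names returns a DataFrame.
two_cols = frame1[["Col1", "Col2"]]

# A slice selects rows by position: start 1, stop 3 (exclusive), step 1.
middle = frame1[1:3:1]

# Omitting the start means "from the first row".
head3 = frame1[:3]

# Column first, then a positional slice of the resulting Series.
one_value = frame1["Col2"][1:2]
```

Note that each statement here assigns to a new name. The slides reassign frame1 every time, which would make later steps operate on the already reduced data.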
      Col1      Col2      Col3
count 4         4         4
mean  2013      21        58.25
std   0.816497  9.763879  14.863266
min   2012      13        45
25%   2012.75   13.75     46.5
…

frame1 = frame1.describe()
Col1 Col2 Col3 Col4
2012 14 45 59
2013 13 76 89
2013 34 65 99
2014 23 47 70

frame1["Col4"] = frame1["Col2"] + frame1["Col3"]

Col1 Col2 Col3
2012 14 45
2013 13 76
2013 34 65
2014 23 47

frame1.drop("Col3", axis=1, inplace=True)

Other Useful Methods

isnull()
groupby(key|expression, axis)
copy()
where(Boolean)
Other Operations

pandas.DataFrame.apply(function, axis)
pandas.Series.map(function | dictionary | series)
pandas.DataFrame.applymap(function)
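
A short sketch of apply and map on the toy table may help show how the axis argument and the mapping argument behave. The lambda functions and the year labels are made up for illustration; applymap is left out because newer pandas releases deprecate it.

```python
import pandas as pd

frame1 = pd.DataFrame({
    "Col1": [2012, 2013, 2013, 2014],
    "Col2": [14, 13, 34, 23],
    "Col3": [45, 76, 65, 47],
})

# axis=0: the function receives each column as a Series.
col_ranges = frame1.apply(lambda col: col.max() - col.min(), axis=0)

# axis=1: the function receives each row as a Series.
row_sums = frame1.apply(lambda row: row["Col2"] + row["Col3"], axis=1)

# Series.map with a dictionary: look up each value (hypothetical labels).
labels = frame1["Col1"].map({2012: "first", 2013: "middle", 2014: "last"})

# Series.map with a function: transform each value.
doubled = frame1["Col2"].map(lambda v: v * 2)
```

For simple arithmetic such as the row sums, a vectorized expression like frame1["Col2"] + frame1["Col3"] is usually faster than apply with axis=1; apply is shown here only to illustrate the axis argument.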
Col1 Col2 Col3
2012 14 45
2013 47 141
2014 23 47

frame1 = frame1.groupby("Col1").sum()
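
The grouped sum above can be reproduced with the following sketch, which also shows two common variations. It assumes the same toy table; the names totals, means, and flat are hypothetical.

```python
import pandas as pd

frame1 = pd.DataFrame({
    "Col1": [2012, 2013, 2013, 2014],
    "Col2": [14, 13, 34, 23],
    "Col3": [45, 76, 65, 47],
})

# Sum every remaining column within each Col1 group.
# The group key becomes the index of the result.
totals = frame1.groupby("Col1").sum()

# Aggregate a single column with a different function.
means = frame1.groupby("Col1")["Col2"].mean()

# Keep the group key as an ordinary column instead of the index.
flat = frame1.groupby("Col1", as_index=False).sum()
```

Whether the key ends up in the index or in a column matters when the result is passed on to another step that expects a plain rectangular table.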
R Data Frames in Azure ML
Azure ML

Dataset

Azure ML Table

Execute R Script

Data Frame
1 2

frame1 <- maml.mapInputPort(1)

frame2 <- maml.mapInputPort(2)
source("src/myScript.R")
print("Hello world")
maml.mapOutputPort("frame1")

R Device Port
Python Data Frames in Azure ML
Azure ML

Dataset

Azure ML Table

Execute Python Script

Data Frame
1 2

def azureml_main(frame1, frame2):
    import myModule as mm
    print("Hello world")
    return frame1

Device Port
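
To try out a function with this shape before placing it in an Execute Python Script module, it can be called locally with an ordinary pandas data frame. The sketch below assumes only the function name and arguments shown on the slide; the derived column and the sample data are hypothetical, and how Azure ML itself invokes the function is not modeled here.

```python
import pandas as pd

def azureml_main(frame1=None, frame2=None):
    """Entry point with the signature shown above; the body is illustrative."""
    # Hypothetical processing step: add a derived column to the first input.
    # Work on a copy so the caller's data frame is left unchanged.
    out = frame1.copy()
    out["Col4"] = out["Col2"] + out["Col3"]
    return out

# Local stand-in for the table that would arrive on the first input port.
sample = pd.DataFrame({
    "Col1": [2012, 2013, 2013, 2014],
    "Col2": [14, 13, 34, 23],
    "Col3": [45, 76, 65, 47],
})

result = azureml_main(sample)
```

Copying the input first is a design choice: it keeps the function free of side effects, which makes local testing more predictable.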
Data Types and Metadata

Stephen F Elston | Principal Consultant, Quantia Analytics, LLC

Chapter Overview

• Data types
• Continuous and discrete values
• Categorical variables
• Azure ML tools
• Quantization of continuous variables
Azure ML Table Data Types
• Numeric: Floating Point
• Numeric: Integer
• Boolean
• String
• Categorical
• Date-time
• Time-Span
• Image

Data type is Metadata

Continuous vs discrete variables

• Continuous variables can take on any value, within the resolution of the measurement

– Temperature
– Distance
– Weight
• Discrete variables have fixed values
– Number of people
– Number of wheels on a vehicle
Categorical variables

• Categories are metadata

• Too many categories can lead to problems
– Not enough data per category
– Too many dimensions in a model
• Often need to combine categories
– Reduce number of categories
– Group like categories
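
One way to combine categories is to fold rarely seen values into a single catch-all level. The sketch below assumes a made-up color column, a threshold of two observations, and the label "other"; all three are hypothetical choices, not something prescribed by this module.

```python
import pandas as pd

# Hypothetical categorical column with a few rare levels.
colors = pd.Series(
    ["red", "blue", "red", "green", "blue", "red", "teal", "plum"]
)

# Count observations per category.
counts = colors.value_counts()

# Categories seen fewer than 2 times are treated as rare.
rare = counts[counts < 2].index

# Keep frequent categories; replace rare ones with a single "other" level.
combined = colors.where(~colors.isin(rare), "other")
```

After this step the column has three levels instead of five, which means more data per category and fewer dimensions once the variable is encoded for a model.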
Continuous vs categorical variables

• Categories are metadata

• Metadata includes:

– Data type
– Categories of categorical data
– Field type: feature, label, etc.
– Column name
• Editor enables manipulation of metadata
Quantizing Continuous Variables

• Convert continuous variable to categorical

• Bin values into categories
– Small, medium, large
– Hot, cold
– Income groups
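
Binning a continuous variable can be sketched with pandas.cut. The income values, bin edges, and group labels below are hypothetical and chosen only to show the mechanics.

```python
import pandas as pd

# Hypothetical continuous variable.
income = pd.Series([12000, 35000, 58000, 91000, 150000])

# Bin edges; by default each bin includes its right edge, e.g. (0, 30000].
bins = [0, 30000, 80000, float("inf")]
labels = ["low", "middle", "high"]

# The result is an ordered categorical Series, one label per input value.
income_group = pd.cut(income, bins=bins, labels=labels)
```

The choice of bin edges determines how much information is lost, so it is usually worth inspecting a histogram of the continuous variable before fixing them.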
Visualizing Data
Overview

• Exploratory data analysis through visualization

• The R ggplot2 package
• Python plotting with Pandas and the matplotlib package
Exploratory data analysis

• Explore the data with visualization

• Understand the relationships in the data
• Create multiple views of data
• Aesthetics to project multiple dimensions
• Conditioning to project multiple dimensions
• Understand sources of model errors
John Tukey, Exploratory Data Analysis, 1977, Addison-Wesley
Views of data

• Relationships in data can be complex

• Data exploration requires multiple views
• Views reveal different aspects of the relationships
• Different plots highlight different relationships
Different plots for different views

• Scatter
• Scatter plot matrix
• Line plots
• Bar plots
• Histograms
• Box plots
• Violin plots
• Q-Q plots
Aesthetics for visualization
• Allow projection of additional dimensions
• But don’t overdo it!
• Color
• Shape
• Size
• Transparency
• Aesthetics specific to plot type
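
The aesthetics listed above can be combined in a single scatter plot. The sketch below uses matplotlib directly and assumes a made-up data set with a numeric weight column and a two-level group column; the colors, markers, and column names are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: x and y positions plus two extra dimensions.
data = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2.0, 2.9, 4.1, 4.8, 6.2, 6.9],
    "weight": [10, 40, 20, 80, 30, 60],
    "group": ["a", "b", "a", "b", "a", "b"],
})

fig, ax = plt.subplots(figsize=(6, 6))

# Color and shape encode the category; size encodes the weight;
# alpha (transparency) keeps overlapping points visible.
styles = {"a": ("tab:blue", "o"), "b": ("tab:orange", "^")}
for name, sub in data.groupby("group"):
    color, marker = styles[name]
    ax.scatter(sub["x"], sub["y"], s=sub["weight"], c=color,
               marker=marker, alpha=0.3, label=name)

ax.legend(title="group")
```

Here color and shape both encode the same variable, which is redundant but can make the plot readable in grayscale. Encoding four different variables at once is usually more than a reader can take in.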
Scatter plot
Scatter plot (larger point size)
Scatter plot (+ color by category)
Scatter plot (+ shape by category)
Scatter plot (+ alpha = 0.3)
Scatter plot matrix
Line plot
Bar Plot - unordered
Bar Plot - ordered
Histogram
Box Plot (group by category)
Violin Plot (group by category)
Q-Q Normal Plot
Conditioned Plots
Conditioned plots

• How can you project multiple dimensions?

• Analogous to conditional probability: p(d | g)
• Plots of subsets (group by)
• Also known as faceted plots

William S. Cleveland, Visualizing Data, 1993, Hobart Press
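
A conditioned (faceted) plot can be sketched with matplotlib by drawing one panel per value of the conditioning variable. The example below assumes a made-up data frame in which d is plotted against x, conditioned on a two-level variable g; all names and values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: d is plotted against x, conditioned on g.
data = pd.DataFrame({
    "x": [1, 2, 3, 1, 2, 3],
    "d": [1.0, 2.1, 2.9, 3.2, 2.0, 1.1],
    "g": ["a", "a", "a", "b", "b", "b"],
})

groups = sorted(data["g"].unique())

# One panel per group; shared axes make the panels directly comparable.
fig, axes = plt.subplots(1, len(groups), figsize=(8, 4),
                         sharex=True, sharey=True)

for ax, name in zip(axes, groups):
    subset = data[data["g"] == name]   # the "group by" subset for this panel
    ax.scatter(subset["x"], subset["d"])
    ax.set_title(f"g = {name}")
```

Sharing the x and y axes across panels is the important design choice: without it, each panel rescales independently and differences between groups become hard to judge.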

Conditioned plots (faceting)
One conditioning variable
Conditioned plots (faceting)
With two dimensions of conditioning
Conditioning (faceting)
With scatter plot
Conditioning (faceting)
With two conditioning categorical variables
Conditioning (faceting)
With three conditioning categorical variables
Another view
Different views reveal different relationships
Introduction to ggplot2
Overview of ggplot2

• Produces presentation quality charts

• Uses grammar of graphics
• Operators define graphics properties
• Operators chained to create complex plots
The Grammar of Graphics

1. Import library
library(ggplot2)

2. Chain methods to define plot

ggplot(dataframe, aes(x = xcol, y = ycol, by = opt)) +
  geom_plottype(arguments)

3. Add attributes to chain

+ xlab("X label") + ylab("Y label") + ggtitle("Title") +
other_properties()
ggplot2 Types

geom_bar
geom_boxplot
geom_histogram
geom_line
geom_point
stat_smooth
stat_hexbin
ggplot2 Options and Aesthetics

facet_grid()
xlab(), ylab()
ggtitle()
shape
color
alpha
size
Execute R Script
Azure ML Tables zip file

myFrame <- maml.mapInputPort(1)  # port number: 1 or 2

source("src/myScript.R")

maml.mapOutputPort("myFrame")

plots

Azure ML Table R Device Port

Introduction to pandas plotting and matplotlib
Python plotting
• matplotlib underpins plotting in Python
e.g. matplotlib.pyplot
• pandas.DataFrame.plot built on matplotlib.pyplot
• Other libraries built on matplotlib
• For some plot types, or for more control, use matplotlib.pyplot directly
Pandas Plotting
1. Import libraries

import matplotlib.pyplot as plt

2. Define and clear a figure

fig1 = plt.figure(figsize=(9, 9))
fig1.clf()

3. Define one or more axes

ax = fig1.gca()

4. Apply plot method

frame1.plot(kind = 'someType', ax = ax, …)

5. Save figure

fig1.savefig('scatter2.png')
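
The five steps above can be put together into one runnable sketch. It assumes the toy Col1/Col2/Col3 table, picks 'scatter' as the plot type, and writes the image to a temporary directory; the Agg backend line and the output path are assumptions added so the example runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import pandas as pd

frame1 = pd.DataFrame({
    "Col1": [2012, 2013, 2013, 2014],
    "Col2": [14, 13, 34, 23],
    "Col3": [45, 76, 65, 47],
})

# Step 2: define and clear a figure.
fig1 = plt.figure(figsize=(9, 9))
fig1.clf()

# Step 3: define the axis to plot on.
ax = fig1.gca()

# Step 4: apply the plot method, drawing onto the axis defined above.
frame1.plot(kind="scatter", x="Col2", y="Col3", ax=ax, alpha=0.5)

# Step 5: save the figure (hypothetical location in a temp directory).
out_path = os.path.join(tempfile.mkdtemp(), "scatter2.png")
fig1.savefig(out_path)
```

Passing ax explicitly is what ties the pandas plot to the figure that gets saved; without it, pandas would create its own figure and fig1 would be written out empty.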
Python Plotting in Azure ML
def azureml_main(frame1):
    import matplotlib.pyplot as plt  ## Import libraries

    fig1 = plt.figure(figsize=(9, 9))  ## Define a figure
    fig1.clf()  ## Clear the current figure
    ax = fig1.gca()  ## Define axis to plot

    frame1.plot(kind = 'someType', ax = ax, …)

    fig1.savefig('scatter2.png')  ## Save figure in a file for output

    return frame1  ## Must return a Pandas dataframe
Types for pandas.DataFrame.plot()
• 'line': line plot (default)
• 'bar': vertical bar plot
• 'barh': horizontal bar plot
• 'kde' or 'density': Kernel Density Estimation plot
• 'scatter': scatter plot
Options and Aesthetics for pandas.DataFrame.plot()

• ax – pyplot axis
• x, y – coordinates
• color – line or symbol color
• s – size by value
• shape
• alpha – transparency
Execute Python Script
Azure ML Tables zip file

def azureml_main(inFrame1, inFrame2):

import my_package

return myFrame

fig.savefig('fig.png')

Azure ML Table Python Device Port

©2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Office, Azure, System Center, Dynamics and other product names are or may be registered trademarks and/or trademarks in the
U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft
must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after
the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
