100% found this document useful (4 votes)

247 views877 pages

Finance Fundamentals in Python

Learn the finance and Python fundamentals you need to make data-driven financial decisions. There’s no prior coding experience needed. In this track, you’ll learn about data types, lists, arrays, and the time value of money, before discovering how to work with time series data to evaluate index performance. Throughout the track, you’ll work with popular Python packages, including pandas, NumPy, statsmodels, and pyfolio, as you learn to import and manage financial data from different sources, including Excel files and from the web. Hands-on exercises will reinforce your new skills, as you work with real-world data, including NASDAQ stock data, AMEX, investment portfolios, and data from the S&P 100. By the end of the track, you'll be ready to navigate the world of finance using Python—having learned how to work with investment portfolios, calculate measures of risk, and calculate an optimal portfolio based on risk and return. https://ebooks-tech.sellfy.store/p/finance-fundamentals-in-python/

Uploaded by

jcmayac

We take content rights seriously. If you suspect this is your content, claim it here.

100% found this document useful (4 votes)

247 views877 pages

Finance Fundamentals in Python

Uploaded by

jcmayac

We take content rights seriously. If you suspect this is your content, claim it here.

You are on page 1/ 877

Introduction to

Python for Finance

INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Why Python for Finance?
Easy to Learn and Flexible
General purpose

Dynamic

High-level language

Integrates with other languages

Open source
Accessible to anyone

INTRODUCTION TO PYTHON FOR FINANCE

Python Shell
In [1]:

Calculations in IPython

In [1]: 1 + 1

INTRODUCTION TO PYTHON FOR FINANCE

INTRODUCTION TO PYTHON FOR FINANCE
Common mathematical operators
Operator Meaning
+ Add
- Subtract
* Multiply
/ Divide
% Modulus (remainder of division)
** Exponent

INTRODUCTION TO PYTHON FOR FINANCE

Common mathematical operators
In [1]: 8 + 4

Out [1]: 12

In [2]: 8 / 4

Out [2]: 2

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Comments and
variables
INTRODUCTION TO PYTHON FOR FINANCE

Name Surname
Instructor
Any comments?
# Example, do not modify!
print(8 / 2 )
print(2**2)

# Put code below here

print(1.0 + 0.10)

INTRODUCTION TO PYTHON FOR FINANCE

Outputs in IPython vs. script.py
IPython Shell script.py

In [1]: 1 + 1 1 + 1

Out[1]: 2 # No output

In [1]: print(1 + 1) print(1 + 1)

2 <script.py> output:
2

INTRODUCTION TO PYTHON FOR FINANCE

Variables
Variable names

Names can be upper or lower case le ers, digits, and underscores

Variables cannot start with a digit

Some variable names are reserved in Python (e.g., class or type) and should be avoided

INTRODUCTION TO PYTHON FOR FINANCE

Variable example
# Correct
day_2 = 5

# Incorrect, variable name starts with a digit

2_day = 5

INTRODUCTION TO PYTHON FOR FINANCE

Using variables to evaluate stock trends
Market price
Price to earning ratio =
Earnings per share

price = 200
earnings = 5
pe_ratio = price / earnings
print(pe_ratio)

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Variable Data Types
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Python Data Types
Variable Types Example
Strings 'hello world'
Integers 40
Floats 3.1417
Booleans True or False

INTRODUCTION TO PYTHON FOR FINANCE

Variable Types
Variable Types Example Abbreviations

Strings 'Tuesday' str

Integers 40 int

Floats 3.1417 float

Booleans True or False bool

INTRODUCTION TO PYTHON FOR FINANCE

What data type is a variable: type()
To identify the type, we can use the function type() :

type(variable_name)

pe_ratio = 40
print(type(pe_ratio))

INTRODUCTION TO PYTHON FOR FINANCE

Booleans
operators descriptions

== equal

!= does not equal

> greater than

< less than

INTRODUCTION TO PYTHON FOR FINANCE

Boolean Example
print(1 == 1)

True

print(type(1 == 1))

INTRODUCTION TO PYTHON FOR FINANCE

Variable manipulations
x = 5 y = 'stock'
print(x * 3) print(y * 3)

15 'stockstockstock'

print(x + 3) print(y + 3)

8 TypeError: must be str, not int

INTRODUCTION TO PYTHON FOR FINANCE

Changing variable types
pi = 3.14159
print(type(pi))

pi_string = str(pi)
print(type(pi_string))

print('I love to eat ' + pi_string + '!')

I love to eat 3.14159!

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Lists in Python
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Lists - square brackets [ ]
months = ['January', 'February', 'March', 'April', 'May', 'June']

INTRODUCTION TO PYTHON FOR FINANCE

Python is zero-indexed

INTRODUCTION TO PYTHON FOR FINANCE

Subset lists
months = ['January', 'February', 'March', 'April', 'May', 'June']

months[0]

'January'

months[2]

'March'

INTRODUCTION TO PYTHON FOR FINANCE

Negative indexing of lists
months = ['January', 'February', 'March', 'April', 'May', 'June']

months[-1]

'June'

months[-2]

'May'

INTRODUCTION TO PYTHON FOR FINANCE

Subsetting multiple list elements with slicing
Slicing syntax

# Includes the start and up to (but not including) the end

mylist[startAt:endBefore]

Example

months = ['January', 'February', 'March', 'April', 'May', 'June']

months[2:5]

['March', 'April', 'May']

months[-4:-1]

['March', 'April', 'May']

INTRODUCTION TO PYTHON FOR FINANCE

Extended slicing with lists
months = ['January', 'February', 'March', 'April', 'May', 'June']

months[3:]

['April', 'May', 'June']

months[:3]

['January', 'February', 'March']

INTRODUCTION TO PYTHON FOR FINANCE

Slicing with Steps
# Includes the start and up to (but not including) the end
mylist[startAt:endBefore:step]

months = ['January', 'February', 'March', 'April', 'May', 'June']

months[0:6:2]

['January', 'March', 'May']

months[0:6:3]

['January', 'April']

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Lists in Lists
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Lists in Lists
Lists can contain various data types, including lists themselves.

Example: a nested list describing the month and its associated consumer price index

cpi = [['Jan', 'Feb', 'Mar'], [238.11, 237.81, 238.91]]

INTRODUCTION TO PYTHON FOR FINANCE

Subsetting Nested Lists
months = ['Jan', 'Feb', 'Mar']
print(months[1])

'Feb'

cpi = [['Jan', 'Feb', 'Mar'], [238.11, 237.81, 238.91]]

print(cpi[1])

[238.11, 237.81, 238.91]

INTRODUCTION TO PYTHON FOR FINANCE

More on Subsetting Nested Lists
How would one subset out a speci c price index?

cpi = [['Jan', 'Feb', 'Mar'], [238.11, 237.81, 238.91]]

print(cpi[1])

[238.11, 237.81, 238.91]

print(cpi[1][0])

238.11

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Methods and
functions
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Methods vs. Functions
Methods Functions
All methods are functions Not all functions are methods

List methods are a subset of built-in

functions in Python

Used on an object Requires an input of an object

prices.sort() type(prices)

INTRODUCTION TO PYTHON FOR FINANCE

List Methods - sort
Lists have several built-in methods that can help retrieve and manipulate data

Methods can be accessed as list.method()

list.sort() sorts list elements in ascending order

prices = [238.11, 237.81, 238.91]

prices.sort()
print(prices)

[237.81, 238.11, 238.91]

INTRODUCTION TO PYTHON FOR FINANCE

Adding to a list with append and extend
list.append() adds a single element to a list

months = ['January', 'February', 'March']

months.append('April')
print(months)

['January', 'February', 'March', 'April']

list.extend() adds each element to a list

months.extend(['May', 'June', 'July'])

print(months)

['January', 'February', 'March', 'April', 'May', 'June', 'July']

INTRODUCTION TO PYTHON FOR FINANCE

Useful list methods - index
list.index(x) returns the lowest index where the element x appears

months = ['January', 'February', 'March']

prices = [238.11, 237.81, 238.91]

months.index('February')

print(prices[1])

237.81

INTRODUCTION TO PYTHON FOR FINANCE

More functions ...
min(list) : returns the smallest element

max(list) : returns the largest element

INTRODUCTION TO PYTHON FOR FINANCE

Find the month with smallest CPI
months = ['January', 'February', 'March']
prices = [238.11, 237.81, 238.91]

# Identify min price

min_price = min(prices)
# Identify min price index
min_index = prices.index(min_price)
# Identify the month with min price
min_month = months[min_index]
print(min_month)

February

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Arrays
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Installing packages
pip3 install package_name_here

pip3 install numpy

INTRODUCTION TO PYTHON FOR FINANCE

Importing packages
import numpy

INTRODUCTION TO PYTHON FOR FINANCE

NumPy and Arrays
import numpy
my_array = numpy.array([0, 1, 2, 3, 4])
print(my_array)

[0, 1, 2, 3, 4]

print(type(my_array))

INTRODUCTION TO PYTHON FOR FINANCE

Using an alias
import package_name
package_name.function_name(...)

import numpy as np
my_array = np.array([0, 1, 2, 3, 4])
print(my_array)

[0, 1, 2, 3, 4]

INTRODUCTION TO PYTHON FOR FINANCE

Why use an array for financial analysis?
Arrays can handle very large datasets e ciently
Computationally-memory e cient

Faster calculations and analysis than lists

Diverse functionality (many functions in Python packages)

INTRODUCTION TO PYTHON FOR FINANCE

What's the difference?
NumPy arrays Lists

my_array = np.array([3, 'is', True]) my_list = [3, 'is', True]

print(my_array) print(my_list)

['3' 'is' 'True'] [3, 'is', True]

INTRODUCTION TO PYTHON FOR FINANCE

Array operations
Arrays Lists

import numpy as np list_A = [1, 2, 3]

list_B = [4, 5, 6]
array_A = np.array([1, 2, 3])
array_B = np.array([4, 5, 6]) print(list_A + list_B)

print(array_A + array_B) [1, 2, 3, 4, 5, 6]

[5 7 9]

INTRODUCTION TO PYTHON FOR FINANCE

Array indexing
import numpy as np

months_array = np.array(['Jan', 'Feb', 'March', 'Apr', 'May'])

print(months_array[3])

Apr

print(months_array[2:5])

['March' 'Apr' 'May']

INTRODUCTION TO PYTHON FOR FINANCE

Array slicing with steps
import numpy as np

months_array = np.array(['Jan', 'Feb', 'March', 'Apr', 'May'])

print(months_array[0:5:2])

['Jan' 'March' 'May']

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Two Dimensional
Arrays
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Two-dimensional arrays
import numpy as np

months = [1, 2, 3]
prices = [238.11, 237.81, 238.91]

cpi_array = np.array([months, prices])

print(cpi_array)

[[ 1. 2. 3. ]
[ 238.11 237.81 238.91]]

INTRODUCTION TO PYTHON FOR FINANCE

Array Methods
print(cpi_array)

[[ 1. 2. 3. ]
[ 238.11 237.81 238.91]]

.shape gives you dimensions of the array

print(cpi_array.shape)

(2, 3)

.size gives you total number of elements in the array

print(cpi_array.size)

INTRODUCTION TO PYTHON FOR FINANCE

Array Functions
import numpy as np

prices = [238.11, 237.81, 238.91]

prices_array = np.array(prices)

np.mean() calculates the mean of an input

print(np.mean(prices_array))

238.27666666666667

np.std() calculates the standard deviation of an input

print(np.std(prices_array))

0.46427960923946671

INTRODUCTION TO PYTHON FOR FINANCE

The `arange()` function
numpy.arange() creates an array with start, end, step

import numpy as np

months = np.arange(1, 13)

print(months)

[ 1 2 3 4 5 6 7 8 9 10 11 12]

months_odd = np.arange(1, 13, 2)

print(months_odd)

[ 1 3 5 7 9 11]

INTRODUCTION TO PYTHON FOR FINANCE

The `transpose()` function
numpy.transpose() switches rows and columns of a numpy array

print(cpi_array)

[[ 1. 2. 3. ]
[ 238.11 237.81 238.91]]

cpi_transposed = np.transpose(cpi_array)

print(cpi_transposed)

[[ 1. 238.11]
[ 2. 237.81]
[ 3. 238.91]]

INTRODUCTION TO PYTHON FOR FINANCE

Array Indexing for 2D arrays
print(cpi_array)

[[ 1. 2. 3. ]
[ 238.11 237.81 238.91]]

# row index 1, column index 2

cpi_array[1, 2]

238.91

# all row slice, third column

print(cpi_array[:, 2])

[ 3. 238.91]

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Using Arrays for
Analyses
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Indexing Arrays
import numpy as np

months_array = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'])

indexing_array = np.array([1, 3, 5])

months_subset = months_array[indexing_array]
print(months_subset)

['Feb' 'Apr' 'Jun']

INTRODUCTION TO PYTHON FOR FINANCE

More on indexing arrays
import numpy as np

months_array = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'])

negative_index = np.array([-1, -2])

print(months_array[negative_index])

['Jun' 'May']

INTRODUCTION TO PYTHON FOR FINANCE

Boolean arrays
import numpy as np

months_array = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'])

boolean_array = np.array([True, True, True, False, False, False])

print(months_array[boolean_array])

['Jan' 'Feb' 'Mar']

INTRODUCTION TO PYTHON FOR FINANCE

More on Boolean arrays
prices_array = np.array([238.11, 237.81, 238.91])
# Create a Boolean array
boolean_array = (prices_array > 238)

print(boolean_array)

[ True False True]

print(prices_array[boolean_array])

[ 238.11 238.91]

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Visualization in
Python
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Matplotlib: A visualization package
See more of the Matplotlib gallery by clicking this link.

INTRODUCTION TO PYTHON FOR FINANCE

matplotlib.pyplot - diverse plotting functions
import matplotlib.pyplot as plt

INTRODUCTION TO PYTHON FOR FINANCE

matplotlib.pyplot - diverse plotting functions
plt.plot()
takes arguments that describe the data to be plo ed

plt.show()
displays plot to screen

INTRODUCTION TO PYTHON FOR FINANCE

Plotting with pyplot
import matplotlib.pyplot as plt
plt.plot(months, prices)
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Plot result

INTRODUCTION TO PYTHON FOR FINANCE

Red solid line
import matplotlib.pyplot as plt
plt.plot(months, prices, color = 'red')
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Plot result

INTRODUCTION TO PYTHON FOR FINANCE

Dashed line
import matplotlib.pyplot as plt
plt.plot(months, prices, color = 'red', linestyle = '--')
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Plot result

INTRODUCTION TO PYTHON FOR FINANCE

Colors and linestyles
color linestyle
'green' green '-' solid line
'red' red '--' dashed line
'cyan' cyan '-.' dashed dot line
'blue' blue ':' do ed

INTRODUCTION TO PYTHON FOR FINANCE

Adding Labels and Titles
import matplotlib.pyplot as plt
plt.plot(months, prices, color = 'red', linestyle = '--')

# Add labels
plt.xlabel('Months')
plt.ylabel('Consumer Price Indexes, $')
plt.title('Average Monthly Consumer Price Indexes')

# Show plot
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Plot result

INTRODUCTION TO PYTHON FOR FINANCE

Adding additional lines
import matplotlib.pyplot as plt
plt.plot(months, prices, color = 'red', linestyle = '--')

# adding an additional line

plt.plot(months, prices_new, color = 'green', linestyle = '--')

plt.xlabel('Months')
plt.ylabel('Consumer Price Indexes, $')
plt.title('Average Monthly Consumer Price Indexes')
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Plot result

INTRODUCTION TO PYTHON FOR FINANCE

Scatterplots
import matplotlib.pyplot as plt
plt.scatter(x = months, y = prices, color = 'red')
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Scatterplot result

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Histograms
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Why histograms for financial analysis?

INTRODUCTION TO PYTHON FOR FINANCE

Histograms and Data
Is your data skewed?

Is your data centered around the average?

Do you have any abnormal data points (outliers) in your data?

INTRODUCTION TO PYTHON FOR FINANCE

Histograms and matplotlib.pyplot
import matplotlib.pyplot as plt
plt.hist(x=prices, bins=3)
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Changing the number of bins
import matplotlib.pyplot as plt
plt.hist(prices, bins=6)
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Normalizing histogram data
import matplotlib.pyplot as plt
plt.hist(prices, bins=6, normed=1)
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Layering histograms on a plot
import matplotlib.pyplot as plt
plt.hist(x=prices, bins=6, normed=1)
plt.hist(x=prices_new, bins=6, normed=1)
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Histogram result

INTRODUCTION TO PYTHON FOR FINANCE

Alpha: Changing transparency of histograms
import matplotlib.pyplot as plt
plt.hist(x=prices, bins=6, normed=1, alpha=0.5)
plt.hist(x=prices_new, bins=6, normed=1, alpha=0.5)
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Histogram result

INTRODUCTION TO PYTHON FOR FINANCE

Adding a legend
import matplotlib.pyplot as plt
plt.hist(x=prices, bins=6, normed=1, alpha=0.5, label="Prices 1")
plt.hist(x=prices_new, bins=6, normed=1, alpha=0.5, label="Prices New")
plt.legend()
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Histogram result

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Introducing the
dataset
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Overall Review
Python shell and scripts

Variables and data types

Lists

Arrays

Methods and functions

Indexing and subse ing

Matplotlib

INTRODUCTION TO PYTHON FOR FINANCE

S&P 100 Companies
Standard and Poor's S&P 100:

made up of major companies that span multiple industry groups

used to measure stock performance of large companies

INTRODUCTION TO PYTHON FOR FINANCE

S&P 100 Case Study
Sectors of Companies within the S&P 100 in 2017

INTRODUCTION TO PYTHON FOR FINANCE

The data

INTRODUCTION TO PYTHON FOR FINANCE

Price to Earnings Ratio
Market price
Price to earning ratio =
Earnings per share
The ratio for valuing a company that measures its current share price relative to its per-
share earnings

In general, higher P/E ratio indicates higher growth expectations

INTRODUCTION TO PYTHON FOR FINANCE

Your mission
GIVEN
Lists of data describing the S&P 100: names, prices, earnings, sectors

OBJECTIVE PART I
Explore and analyze the S&P 100 data, speci cally the P/E ratios of S&P 100 companies

INTRODUCTION TO PYTHON FOR FINANCE

Step 1: examine the lists
In [1]: my_list = [1, 2, 3, 4, 5]

# first element
In [2]: print(my_list[0])

# last element
In [3]: print(my_list[-1])

# range of elements
In [4]: print(my_list[0:3])

[1, 2, 3]

INTRODUCTION TO PYTHON FOR FINANCE

Step 2: Convert lists to arrays
# Convert lists to arrays
import numpy as np
my_array = np.array(my_list)

INTRODUCTION TO PYTHON FOR FINANCE

Step 3: Elementwise array operations
# Elementwise array operations
array_ratio = array1 / array2

INTRODUCTION TO PYTHON FOR FINANCE

Let's analyze!
INTRODUCTION TO PYTHON FOR FINANCE
A closer look at the
sectors
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Your mission
GIVEN
NumPy arrays of data describing the S&P 100: names, prices, earnings, sectors

OBJECTIVE PART II
Explore and analyze sector-speci c P/E ratios within companies of the S&P 100

INTRODUCTION TO PYTHON FOR FINANCE

Step 1: Create a boolean filtering array
stock_prices = np.array([100, 200, 300])
filter_array = (stock_prices >= 150)
print(filter_array)

[ False True True]

INTRODUCTION TO PYTHON FOR FINANCE

Step 2: Apply filtering array to subset another array
stock_prices = np.array([100, 200, 300])
filter_array = (stock_prices >= 150)
print(stock_prices[filter_array])

[200 300]

INTRODUCTION TO PYTHON FOR FINANCE

Step 3: Summarize P/E ratios
Calculate the average and standard deviation of these sector-speci c P/E ratios

import numpy as np
average_value = np.mean(my_array)
std_value = np.std(my_array)

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Visualizing trends
INTRODUCTION TO PYTHON FOR FINANCE

Adina Howe
Instructor
Your mission - outlier?

INTRODUCTION TO PYTHON FOR FINANCE

Step 1: Make a histogram
import matplotlib.pyplot as plt
plt.hist(hist_data, bins = 8)
plt.show()

INTRODUCTION TO PYTHON FOR FINANCE

Step 2: Identify the Outlier
Identify the outlier P/E ratio

Create a boolean array lter to subset this company

Filter out this company information from the provided datasets

INTRODUCTION TO PYTHON FOR FINANCE

Let's practice!
INTRODUCTION TO PYTHON FOR FINANCE
Representing time
with datetimes
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Datetimes

INTERMEDIATE PYTHON FOR FINANCE

Datetimes

INTERMEDIATE PYTHON FOR FINANCE

Datetimes
from datetime import datetime

black_monday = datetime(1987, 10, 19)

print(black_monday)

datetime.datetime(1987, 10, 19, 0, 0)

INTERMEDIATE PYTHON FOR FINANCE

Datetime now
datetime.now()

datetime.datetime(2019, 11, 6, 3, 48, 30, 886713)

INTERMEDIATE PYTHON FOR FINANCE

Datetime from string
black_monday_str = "Monday, October 19, 1987. 9:30 am"
format_str = "%A, %B %d, %Y. %I:%M %p"
datetime.datetime.strptime(black_monday_str, format_str)

datetime.datetime(1987, 10, 19, 9, 30)

INTERMEDIATE PYTHON FOR FINANCE

Datetime from string
Year

%y Without century (01, 02, ..., 98, 99)

%Y With century (0001, 0002, ..., 1998, 1999, ..., 9999)

Month

%b Abbreviated names (Jan, Feb, ..., Nov, Dec)

%B Full names (January, February, ... November, December)

%m As numbers (01, 02, ..., 11, 12)

Day of Month

%d (01, 02, ..., 30, 31)

INTERMEDIATE PYTHON FOR FINANCE

Datetime from string
Weekday

%a Abbreviated name (Sun, ... Sat)

%A Full name (Sunday, ... Saturday)

%w Number (0, ..., 6)

Hour

%H 24 hour (00, 01, ... 23)

%I 12 hour (01, 02, ... 12)

%M (01, 02, ..., 59)

INTERMEDIATE PYTHON FOR FINANCE

Datetime from string
Seconds

%S (00, 01, ... 59)

Micro-seconds

%f (000000, 000001, ... 999999)

AM/PM

%p (AM, PM)

INTERMEDIATE PYTHON FOR FINANCE

Datetime from string
%m Months

%M Minutes

INTERMEDIATE PYTHON FOR FINANCE

Datetime from string
"1837-05-10"

"%Y-%m-%d"

INTERMEDIATE PYTHON FOR FINANCE

Datetime from string
"Friday, 17 May 01"

"%A, %d %B %y"

INTERMEDIATE PYTHON FOR FINANCE

String from datetime
dt.strftime(format_string)

INTERMEDIATE PYTHON FOR FINANCE

String from datetime
great_depression_crash = datetime.datetime(1929, 10, 29)
great_depression_crash

datetime.datetime(1929, 10, 29, 0, 0)

great_depression_crash.strftime("%a, %b %d, %Y")

'Tue, Oct 29, 1929'

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Working with
datetimes
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Datetime attributes
now.year now.hour
now.month now.minute
now.day now.second

2019 22
11 34
13 56

INTERMEDIATE PYTHON FOR FINANCE

Comparing datetimes
equals ==

less than <

more than >

INTERMEDIATE PYTHON FOR FINANCE

Comparing datetimes
from datetime import datetime
asian_crisis = datetime(1997, 7, 2)
world_mini_crash = datetime(1997, 10, 27)

asian_crisis > world_mini_crash

False

asian_crisis < world_mini_crash

True

INTERMEDIATE PYTHON FOR FINANCE

Comparing datetimes
asian_crisis = datetime(1997, 7, 2)
world_mini_crash = datetime(1997, 10, 27)

text = "10/27/1997"
format_str = "%m/%d/%Y"
sell_date = datetime.strptime(text, format_str)

sell_date == world_mini_crash

True

INTERMEDIATE PYTHON FOR FINANCE

Difference between datetimes
Compare with < , > , or == .

Subtraction returns a timedelta object.

timedelta a ributes: weeks, days, minutes, seconds, microseconds

INTERMEDIATE PYTHON FOR FINANCE

Difference between datetimes
delta = world_mini_crash - asian_crisis

type(delta)

datetime.timedelta

delta.days

117

INTERMEDIATE PYTHON FOR FINANCE

Creating relative datetimes
dt

datetime.datetime(2019, 1, 14, 0, 0)

datetime(dt.year, dt.month, dt.day - 7)

datetime.datetime(2019, 1, 7, 0, 0)

datetime(dt.year, dt.month, dt.day - 15)

ValueError Traceback (most recent call last)

<ipython-input-28-804001f45cdb> in <module>()
-> 1 datetime(dt.year, dt.month, dt.day - 15)
ValueError: day is out of range for month

INTERMEDIATE PYTHON FOR FINANCE

Creating relative datetimes
delta = world_mini_crash - asian_crisis
type(delta)

datetime.timedelta

INTERMEDIATE PYTHON FOR FINANCE

Creating relative datetimes
from datetime import timedelta

offset = timedelta(weeks = 1)
offset

datetime.timedelta(7)

dt - offset

datetime.datetime(2019, 1, 7, 0, 0)

INTERMEDIATE PYTHON FOR FINANCE

Creating relative datetimes
offset = timedelta(days=16)
dt - offset

datetime.datetime(2018, 12, 29, 0, 0)

cur_week = last_week + timedelta(weeks=1)

# Do some work with date
# set last week variable to cur week and repeat
last_week = cur_week

source_dt = event_dt - timedelta(weeks=4)

# Use source datetime to look up market factors

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Dictionaries
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Lookup by index
my_list = ['a','b','c','d']

0 1 2 3
['a','b','c','d']

my_list[0]

'a'

my_list.index('c')

INTERMEDIATE PYTHON FOR FINANCE

Lookup by key
Dictionaries

INTERMEDIATE PYTHON FOR FINANCE

Representation
{ 'key-1':'value-1', 'key-2':'value-2', 'key-3':'value-3'}

INTERMEDIATE PYTHON FOR FINANCE

Creating dictionaries
my_dict = {}
my_dict

{}

my_dict = dict()
my_dict

{}

INTERMEDIATE PYTHON FOR FINANCE

Creating dictionaries
ticker_symbols = {'AAPL':'Apple', 'F':'Ford', 'LUV':'Southwest'}
print(ticker_symbols)

{'AAPL':'Apple', 'F':'Ford', 'LUV':'Southwest'}

ticker_symbols = dict([['APPL','Apple'],['F','Ford'],['LUV','Southwest']])
print(ticker_symbols)

{'AAPL':'Apple', 'F':'Ford', 'LUV':'Southwest'}

INTERMEDIATE PYTHON FOR FINANCE

Adding to dictionaries
ticker_symbols['XON'] = 'Exxon'
ticker_symbols

{'APPL': 'Apple', 'F': 'Ford', 'LUV': 'Southwest', 'XON': 'Exxon'}

ticker_symbols['XON'] = 'Exxon OLD'

ticker_symbols

{'APPL': 'Apple','F': 'Ford','LUV': 'Southwest','XON': 'Exxon OLD'}

INTERMEDIATE PYTHON FOR FINANCE

Accessing values
ticker_symbols['F']

'Ford'

INTERMEDIATE PYTHON FOR FINANCE

Accessing values
ticker_symbols['XOM']

KeyError Traceback (most recent call last)

<ipython-input-6-782fbf617bf7> in <module>()
-> 1 ticker_symbols['XOM']

KeyError: 'XOM'

INTERMEDIATE PYTHON FOR FINANCE

Accessing values
company = ticker_symbols.get('LUV')
print(company)

'Southwest'

company = ticker_symbols.get('XOM')
print(company)

None

company = ticker_symbols.get('XOM', 'MISSING')

print(company)

'MISSING'

INTERMEDIATE PYTHON FOR FINANCE

Deleting from dictionaries
ticker_symbols

{'APPL': 'Apple', 'F': 'Ford', 'LUV': 'Southwest', 'XON': 'Exxon OLD'}

del(ticker_symbols['XON'])

ticker_symbols

{'APPL': 'Apple', 'F': 'Ford', 'LUV': 'Southwest'}

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Comparison
operators
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Python comparison operators
Equality: == , !=

Order: < , > , <= , >=

INTERMEDIATE PYTHON FOR FINANCE

Equality operator vs assignment
Test equality: ==

Assign value: =

INTERMEDIATE PYTHON FOR FINANCE

Equality operator vs assignment
13 == 13

True

count = 13
print(count)

INTERMEDIATE PYTHON FOR FINANCE

Equality comparisons
datetimes

numbers ( oats, ints)

dictionaries

strings

almost anything else

INTERMEDIATE PYTHON FOR FINANCE

Comparing datetimes
date_close_high = datetime(2019, 11, 27)
date_intra_high = datetime(2019, 11, 27)
print(date_close_high == date_intra_high)

True

INTERMEDIATE PYTHON FOR FINANCE

Comparing dictionaries
d1 = {'high':56.88, 'low':33.22, 'closing':56.88}
d2 = {'high':56.88, 'low':33.22, 'closing':56.88}
print(d1 == d2)

True

d1 = {'high':56.88, 'low':33.22, 'closing':56.88}

d2 = {'high':56.88, 'low':33.22, 'closing':12.89}
print(d1 == d2)

False

INTERMEDIATE PYTHON FOR FINANCE

Comparing different types
print(3 == 3.0)

True

print(3 == '3')

False

INTERMEDIATE PYTHON FOR FINANCE

Not equal operator
print(3 != 4)

True

print(3 != 3)

False

INTERMEDIATE PYTHON FOR FINANCE

Order operators
Less than <

Less than or equal <=

Greater than >

Greater than or equal >=

INTERMEDIATE PYTHON FOR FINANCE

Less than operator
print(3 < 4)

True

print(3 < 3.6)

True

print('a' < 'b')

True

INTERMEDIATE PYTHON FOR FINANCE

Less than operator
date_close_high = datetime(2019, 11, 27)
date_intra_high = datetime(2019, 11, 27)
print(date_close_high < date_intra_high)

False

INTERMEDIATE PYTHON FOR FINANCE

Less than or equal operator
print(1 <= 4)

True

print(1.0 <= 1)

True

print('e' <= 'a')

False

INTERMEDIATE PYTHON FOR FINANCE

Greater than operator
print(6 > 5)
print(4 > 4)

True

False

INTERMEDIATE PYTHON FOR FINANCE

Greater than or equal operator
print(6 >= 5)
print(4 >= 4)

True

INTERMEDIATE PYTHON FOR FINANCE

Order comparison across types
print(3.45454 < 90)

True

print('a' < 23)

<hr />----------------------------------------------
TypeError Traceback (most recent call last)
...
TypeError: '<' not supported between instances of 'str' and 'int'

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Boolean operators
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Boolean logic

INTERMEDIATE PYTHON FOR FINANCE

What are Boolean operations?
1. and

2. or

3. not

INTERMEDIATE PYTHON FOR FINANCE

Object evaluation
Evaluates as False Evaluates as True
Constants: Almost everything else
False

None

Numeric zero:
0

0.0

Length of zero
""

[]

{}

INTERMEDIATE PYTHON FOR FINANCE

The AND operator
True and True

True

True and False

False

INTERMEDIATE PYTHON FOR FINANCE

The OR operator
False or True

True

True or True

True

False or False

False

INTERMEDIATE PYTHON FOR FINANCE

Short circuit.
is_current() and is_investment()

False

is_current() or is_investment()

True

INTERMEDIATE PYTHON FOR FINANCE

The NOT operator
not True

False

not False

True

INTERMEDIATE PYTHON FOR FINANCE

Order of operations with NOT
True == False

False

not True == False

True

INTERMEDIATE PYTHON FOR FINANCE

Object evaluation
"CUSIP" and True

True

INTERMEDIATE PYTHON FOR FINANCE

Object evaluation
[] or False

False

INTERMEDIATE PYTHON FOR FINANCE

Object evaluation
not {}

True

INTERMEDIATE PYTHON FOR FINANCE

Returning objects
"Federal" and "State"

"State"

[] and "State"

[]

INTERMEDIATE PYTHON FOR FINANCE

Returning objects.
13 or "account number"

0.0 or {"balance": 2200}

{"balance": 2200}

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
If statements
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Printing sales only
trns = { 'symbol': 'TSLA', 'type':'BUY', 'amount': 300}

print(trns['amount'])

300

INTERMEDIATE PYTHON FOR FINANCE

Compound statements
control statement
statement 1
statement 2
statement 3

INTERMEDIATE PYTHON FOR FINANCE

Control Statement
if <expression> :

if x < y:

if x in y:

if x and y:

if x:

INTERMEDIATE PYTHON FOR FINANCE

Code blocks
if <expression>:
statement
statement
statement

if <expression>: statement;statement;statement

INTERMEDIATE PYTHON FOR FINANCE

Printing sales only
trns = { 'symbol': 'TSLA', 'type':'BUY', 'amount': 300}

if trns['type'] == 'SELL':
print(trns['amount'])

trns['type'] == 'SELL'

False

INTERMEDIATE PYTHON FOR FINANCE

Printing sales only.
trns = { 'symbol': 'APPL', 'type':'SELL', 'amount': 200}

if trns['type'] == 'SELL':
print(trns['amount'])

200

INTERMEDIATE PYTHON FOR FINANCE

Else
if x in y:
print("I found x in y")
else:
print("No x in y")

INTERMEDIATE PYTHON FOR FINANCE

Elif
if x == y:
print("equals")
elif x < y:
print("less")

INTERMEDIATE PYTHON FOR FINANCE

Elif
if x == y:
print("equals")
elif x < y:
print("less")
elif x > y:
print("more")
elif x == 0
print("zero")

INTERMEDIATE PYTHON FOR FINANCE

Else with elif
if x == y:
print("equals")
elif x < y:
print("less")
elif x > y:
print("more")
elif x == 0
print("zero")
else:
print("None of the above")

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
For and while loops
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Repeating a code block
CUSIP SYMBOL

037833100 AAPL

17275R102 CSCO

68389X105 ORCL

INTERMEDIATE PYTHON FOR FINANCE

Loops.
For loop While loop

INTERMEDIATE PYTHON FOR FINANCE

Statement components
<Control Statement>
<Code Block>

execution 1

execution 2

execution 3

INTERMEDIATE PYTHON FOR FINANCE

For loops
for <variable> in <sequence>:

for x in [0, 1, 2]:

d = {'key': 'value1'}
for x in d:

for x in "ORACLE":

INTERMEDIATE PYTHON FOR FINANCE

List example
for x in [0, 1, 2]:
print(x)

0
1
2

INTERMEDIATE PYTHON FOR FINANCE

Dictionary example
symbols = {'037833100': 'AAPL',
'17275R102': 'CSCO'
'68389X105': 'ORCL'}
for k in symbols:
print(symbols[k])

AAPL
CSCO
ORCL

INTERMEDIATE PYTHON FOR FINANCE

String example
for x in "ORACLE":
print(x)

O
R
A
C
L
E

INTERMEDIATE PYTHON FOR FINANCE

While control statements
while <expression>:

INTERMEDIATE PYTHON FOR FINANCE

While example
x = 0

while x < 5:
print(x)
x = (x + 1)

0
1
2
3
4

INTERMEDIATE PYTHON FOR FINANCE

Infinite loops
x = 0

while x <= 5:
print(x)

INTERMEDIATE PYTHON FOR FINANCE

Skipping with continue
for x in [0, 1, 2, 3]:
if x == 2:
continue
print(x)

0
1
3

INTERMEDIATE PYTHON FOR FINANCE

Stopping with break.
while True:
transaction = get_transaction()
if transaction['symbol'] == 'ORCL':
print('The current symbol is ORCL, break now')
break
print('Not ORCL')

Not ORCL
Not ORCL
Not ORCL
The current symbol is ORCL, break now

INTERMEDIATE PYTHON FOR FINANCE

Let's practice 'for'
and 'while' loops!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Creating a
DataFrame
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Pandas
import pandas as pd

print(pd)

<module 'pandas' from '.../pandas/init.py'>

INTERMEDIATE PYTHON FOR FINANCE

Pandas DataFrame
pd.DataFrame()

INTERMEDIATE PYTHON FOR FINANCE

Pandas DataFrame
Col 1 Col 2 Col 3
0 v1 a 00
1 v2 b 01
2 v3 c 13.02

INTERMEDIATE PYTHON FOR FINANCE

From dict
data = {'Bank Code': ['BA', 'AAD', 'BA'],
'Account#': ['ajfdk2', '1234nmk', 'mm3d90'],
'Balance':[1222.00, 390789.11, 13.02]}

df = pd.DataFrame(data=data)

INTERMEDIATE PYTHON FOR FINANCE

From dict
data = {'Bank Code': ['BA', 'AAD', 'BA'],
'Account#': ['ajfdk2', '1234nmk', 'mm3d90'],
'Balance':[1222.00, 390789.11, 13.02]}

df = pd.DataFrame(data=data)

Bank Code Account# Balance

0 BA ajfdk2 1222.00
1 AAD 1234nmk 390789.11
1 BA mm3d90 13.02

INTERMEDIATE PYTHON FOR FINANCE

From list of dicts
data = [{'Bank Code': 'BA', 'Account#': 'ajfdk2', 'Balance': 1222.00},
{'Bank Code': 'AAD', 'Account#': '1234nmk', 'Balance': 390789.11},
{'Bank Code': 'BA', 'Account#': 'mm3d90', 'Balance': 13.02}]
df = pd.DataFrame(data=data)

INTERMEDIATE PYTHON FOR FINANCE

Bank Code Account# Balance

0 BA ajfdk2 1222.00
1 AAD 1234nmk 390789.11
1 BA mm3d90 13.02

INTERMEDIATE PYTHON FOR FINANCE

From list of lists
data = [['BA', 'ajfdk2', 1222.00],
['AAD', '1234nmk', 390789.11],
['BA', 'mm3d90', 13.02]]
df = pd.DataFrame(data=data)

INTERMEDIATE PYTHON FOR FINANCE

From list of lists
data = [['BA', 'ajfdk2', 1222.00],
['AAD', '1234nmk', 390789.11],
['BA', 'mm3d90', 13.02]]
df = pd.DataFrame(data=data)

0 1 2
0 BA ajfdk2 1222.00
1 AAD 1234nmk 390789.11
1 BA mm3d90 13.02

INTERMEDIATE PYTHON FOR FINANCE

From list of lists with column names
data = [['BA', 'ajfdk2', 1222.00],
['AAD', '1234nmk', 390789.11],
['BA', 'mm3d90', 13.02]]
columns = ['Bank Code', 'Account#', 'Balance']
df = pd.DataFrame(data=data, columns=columns)

Bank Code Account# Balance

0 BA ajfdk2 1222.00
1 AAD 1234nmk 390789.11
1 BA mm3d90 13.02

INTERMEDIATE PYTHON FOR FINANCE

Bank Code Account# Balance

0 BA ajfdk2 1222.00
1 AAD 1234nmk 390789.11
2 BA mm3d90 13.02

INTERMEDIATE PYTHON FOR FINANCE

Reading data
Excel pd.read_excel

JSON pd.read_json

HTML pd.read_html

Pickle pd.read_pickle

Sql pd.read_sql

Csv pd.read_csv

INTERMEDIATE PYTHON FOR FINANCE

CSV
Comma separated values

client id,trans type, amount

14343,buy,23.0
0574,sell,2000
7093,dividend,2234

INTERMEDIATE PYTHON FOR FINANCE

Reading a csv file
df = pd.read_csv('/data/daily/transactions.csv')

INTERMEDIATE PYTHON FOR FINANCE

Reading a csv file
df = pd.read_csv('/data/daily/transactions.csv')

client id trans type amount

14343 buy 23.0
0574 sell 2000
7093 dividend 2234

INTERMEDIATE PYTHON FOR FINANCE

Non-comma csv
client id|trans type| amount
14343|buy|23.0
0574|sell|2000
7093|dividend|2234

INTERMEDIATE PYTHON FOR FINANCE

Non-comma csv
df = pd.read_csv('/data/daily/transactions.csv', sep='|')

INTERMEDIATE PYTHON FOR FINANCE

Non-comma csv
df = pd.read_csv('/data/daily/transactions.csv', sep='|')

client id trans type amount

14343 buy 23.0
0574 sell 2000
7093 dividend 2234

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Accessing Data
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Account Balance

INTERMEDIATE PYTHON FOR FINANCE

Introducing lesson data
Bank Code Account# Balance
a BA ajfdk2 1222.00
b AAD 1234nmk 390789.11
c BA mm3d90 13.02

accounts

INTERMEDIATE PYTHON FOR FINANCE

Access column using brackets
accounts['Balance']

INTERMEDIATE PYTHON FOR FINANCE

Access column using brackets
accounts['Balance']

a 1222.00
b 390789.11
c 13.02

Name: Balance, dtype: oat6

INTERMEDIATE PYTHON FOR FINANCE

Access column using dot-syntax
accounts.Balance

Balance
a 1222.00
b 390789.11
c 13.02

INTERMEDIATE PYTHON FOR FINANCE

Access multiple columns
accounts[['Bank Code', 'Account#']]

INTERMEDIATE PYTHON FOR FINANCE

Access multiple columns
accounts[['Bank Code', 'Account#']]

Bank Code Account#

a BA ajfdk2
b AAD 1234nmk
c BA mm3d90

INTERMEDIATE PYTHON FOR FINANCE

Access rows using brackets
accounts[0:2]

INTERMEDIATE PYTHON FOR FINANCE

Access rows using brackets
accounts[0:2]

Bank Code Account# Balance

a BA ajfdk2 1222.00
b AAD 1234nmk 390789.11

INTERMEDIATE PYTHON FOR FINANCE

Access rows using brackets
accounts[[True, False, True]]

INTERMEDIATE PYTHON FOR FINANCE

Access rows using brackets
accounts[[True, False, True]]

Bank Code Account# Balance

a BA ajfdk2 1222.00
c BA mm3d90 13.02

INTERMEDIATE PYTHON FOR FINANCE

loc and iloc
loc access by name

iloc access by position

INTERMEDIATE PYTHON FOR FINANCE

loc
accounts.loc['b']

Bank Code AAD

Account# 1234nmk
Balance 390789

Name: b, dtype: object

INTERMEDIATE PYTHON FOR FINANCE

loc
accounts.loc[['a','c']] df.loc[[True, False, True]]

Bank Code Account# Balance Bank Code Account# Balance

a BA ajfdk2 1222.00 a BA ajfdk2 1222.00
c BA mm3d90 13.02 c BA mm3d90 13.02

INTERMEDIATE PYTHON FOR FINANCE

Columns with loc
accounts.loc['a':'c','Balance']

accounts.loc['a':'c', ['Balance','Account#']]

accounts.loc['a':'c',[True,False,True]]

accounts.loc['a':'c','Bank Code':'Balance']

INTERMEDIATE PYTHON FOR FINANCE

Columns with loc
accounts.loc['a':'c',['Bank Code', 'Balance']]

INTERMEDIATE PYTHON FOR FINANCE

Columns with loc
accounts.loc['a':'c',['Bank Code', 'Balance']]

Bank Code Balance

a BA 1222.00
b AAD 390789.11
c BA 13.02

INTERMEDIATE PYTHON FOR FINANCE

iloc
accounts.iloc[0:2, [0,2]]

INTERMEDIATE PYTHON FOR FINANCE

iloc
accounts.iloc[0:2, [0,2]]

INTERMEDIATE PYTHON FOR FINANCE

iloc
accounts.iloc[0:2, [0,2]]

Bank Code Balance

a BA 1222.00
b AAD 390789.11

INTERMEDIATE PYTHON FOR FINANCE

Setting a single value
Bank Code Account# Balance
a BA ajfdk2 1222.00
b AAD 1234nmk 390789.11
c BA mm3d90 13.02

accounts.loc['a', 'Balance'] = 0

INTERMEDIATE PYTHON FOR FINANCE

Setting a single value
Bank Code Account# Balance
a BA ajfdk2 0.00
b AAD 1234nmk 390789.11
c BA mm3d90 13.02

accounts.loc['a', 'Balance'] = 0

INTERMEDIATE PYTHON FOR FINANCE

Setting multiple values
Bank Code Account# Balance
a BA ajfdk2 1222.00
b AAD 1234nmk 390789.11
c BA mm3d90 13.02

accounts.iloc[:2, 1:] = 'NA'

INTERMEDIATE PYTHON FOR FINANCE

Setting multiple columns
Bank Code Account# Balance
a BA NA NA
b AAD NA NA
c BA mm3d90 13.02

accounts.iloc[:2, 1:] = 'NA'

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Aggregating and
summarizing
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
DataFrame methods
.count() .sum()

.min() .prod()

.max() .mean()

.first() .median()

.last() .std()

.var()

INTERMEDIATE PYTHON FOR FINANCE

Axis
Rows Columns
default axis=1

axis=0 axis='columns'

axis='rows'

INTERMEDIATE PYTHON FOR FINANCE

Count
AAD GDDL IMA df.count()
2020-10-03 300.22 75.32 39.90
2020-10-04 301.49 79.99 44.99 AAD 4

2020-10-05 300.00 80.00 45.33 GDDL 4

IMA 4
2020-10-07 302.90 82.92 49.00
dtype: int64

INTERMEDIATE PYTHON FOR FINANCE

Sum
AAD GDDL IMA df.sum(axis=1)
2020-10-03 300.22 75.32 39.90
2020-10-04 301.49 79.99 44.99 2020-10-03 415.44

2020-10-05 300.00 80.00 45.33 2020-10-04 426.47

2020-10-05 425.33
2020-10-07 302.90 82.92 49.00
2020-10-07 434.82
dtype: float64

INTERMEDIATE PYTHON FOR FINANCE

Product
AAD GDDL IMA df.prod(axis='columns')
2020-10-03 300.22 75.32 39.90
2020-10-04 301.49 79.99 44.99 2020-10-03 9.022416e+05

2020-10-05 300.00 80.00 45.33 2020-10-04 1.084987e+06

2020-10-05 1.087920e+06
2020-10-07 302.90 82.92 49.00
2020-10-07 1.230707e+06
dtype: float64

INTERMEDIATE PYTHON FOR FINANCE

Mean
AAD GDDL IMA df.mean()
2020-10-03 300.22 75.32 39.90
2020-10-04 301.49 79.99 44.99 AAD 301.1525

2020-10-05 300.00 80.00 45.33 GDDL 79.5575

IMA 44.8050
2020-10-07 302.90 82.92 49.00
dtype: float64

INTERMEDIATE PYTHON FOR FINANCE

Median
AAD GDDL IMA df.median()
2020-10-03 300.22 75.32 39.90
2020-10-04 301.49 79.99 44.99 AAD 300.855

2020-10-05 300.00 80.00 45.33 GDDL 79.995

IMA 45.160
2020-10-07 302.90 82.92 49.00
dtype: float64

INTERMEDIATE PYTHON FOR FINANCE

Standard deviation
AAD GDDL IMA df.std()
2020-10-03 300.22 75.32 39.90
2020-10-04 301.49 79.99 44.99 AAD 1.337345

2020-10-05 300.00 80.00 45.33 GDDL 3.143548

IMA 3.740183
2020-10-07 302.90 82.92 49.00
dtype: float64

INTERMEDIATE PYTHON FOR FINANCE

Variance
AAD GDDL IMA df.var()
2020-10-03 300.22 75.32 39.90
2020-10-04 301.49 79.99 44.99 AAD 1.788492

2020-10-05 300.00 80.00 45.33 GDDL 9.881892

IMA 13.988967
2020-10-07 302.90 82.92 49.00
dtype: float64

INTERMEDIATE PYTHON FOR FINANCE

Columns and rows
AAD GDDL IMA df.loc[:,'AAD'].max()
2020-10-03 300.22 75.32 39.90
2020-10-04 301.49 79.99 44.99 302.9

2020-10-05 300.00 80.00 45.33

df.iloc[0].min()
2020-10-07 302.90 82.92 49.00

39.9

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Extending and
manipulating data
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
PCE
Personal consumption expenditures (PCE)

PCE =

INTERMEDIATE PYTHON FOR FINANCE

PCE
Personal consumption expenditures (PCE)

PCE = PCDG

Durable goods

1 By cactus cowboy 2 Open Clipart, CC0, h ps://commons.wikimedia.org/w/index.php?curid=64953673

INTERMEDIATE PYTHON FOR FINANCE

PCE
Personal consumption expenditures (PCE)

PCE = PCDG + PCNDG

Non-durable goods

1By Smart Servier 2 h ps://smart.servier.com/, CC BY 3.0, h ps://commons.wikimedia.org/w/index.php?

curid=74765623

INTERMEDIATE PYTHON FOR FINANCE

PCE
Personal consumption expenditures (PCE)

PCE = PCDG + PCNDG + PCESV

Services

1By Clip Art by Vector Toons 2 Own work, CC BY-SA 4.0, h ps://commons.wikimedia.org/w/index.php?
curid=65937611

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
DATE PCDGA
1929-01-01 9.829
1930-01-01 7.661
1931-01-01 5.911
1932-01-01 3.959

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
pce['PCND'] = [[33.941,
30.503,
25.798000000000002,
20.169]

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
pce

DATE PCDG PCND

1929-01-01 9.829 33.941
1930-01-01 7.661 30.503
1931-01-01 5.911 25.798
1932-01-01 3.959 20.169

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
pce pcesv

DATE PCDG PCND PCESV

1929-01-01 9.829 33.941 0 33.613
1930-01-01 7.661 30.503 1 31.972
1931-01-01 5.911 25.798 2 28.963
1932-01-01 3.959 20.169 3 24.587

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
pce['PCESV'] = pcesv pce

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
pce['PCESV'] = pcesv pce

DATE PCDG PCND PCESV

1929-01-01 9.829 33.941 33.613
1930-01-01 7.661 30.503 31.972
1931-01-01 5.911 25.798 28.963
1932-01-01 3.959 20.169 24.587

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
pce['PCE'] = pce['PCDG'] + pce['PCND'] + pce['PCESV']

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
pce['PCE'] = pce['PCDG'] + pce['PCND'] + pce['PCESV']

DATE PCDG PCND PCESV PCE

1929-01-01 9.829 33.941 33.613 77.383
1930-01-01 7.661 30.503 31.972 70.136
1931-01-01 5.911 25.798 28.963 60.672
1932-01-01 3.959 20.169 24.587 48.715

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
pce.drop(columns=['PCDG', 'PCND', 'PCESV'],
axis=1,
inplace=True)

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing columns
pce.drop(columns=['PCDG', 'PCND', 'PCESV'],
axis=1,
inplace=True)

DATE PCE
1929-01-01 77.383
1930-01-01 70.136
1931-01-01 60.672
1932-01-01 48.715

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing rows
new_row

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing rows
new_row pce.append(new_row)

DATE PCE
1933-01-01 45.945

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing rows
new_row pce.append(new_row)

DATE PCE DATE PCE

1933-01-01 45.945 1929-01-01 77.383
1930-01-01 70.136
1931-01-01 60.672
1932-01-01 48.715
1933-01-01 45.945

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing rows
Adding multiple rows

new_rows = [ row1, row2, row3

]
for row in new_rows:
pce = pce.append(row)

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing rows
Adding multiple rows DATE PCE
1929-01-01 77.383
for row in new_rows:
1930-01-01 70.136
pce = pce.append(row)
1931-01-01 60.672
1932-01-01 48.715
1933-01-01 45.945
1934-01-01 51.461
1935-01-01 55.933

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing rows
pce.drop(['1934-01-01',
'1935-01-01',
'1936-01-01',
'1937-01-01',
'1938-01-01',
'1939-01-01'],
inplace=True)

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing rows
pce.drop(['1934-01-01', DATE PCE
'1935-01-01', 1929-01-01 77.383
'1936-01-01', 1930-01-01 70.136
'1937-01-01',
1931-01-01 60.672
'1938-01-01',
1932-01-01 48.715
'1939-01-01'],
inplace=True) 1933-01-01 45.945

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing rows
all_rows = [row1, row2, row3, pce]

pd.concat(all_rows)

INTERMEDIATE PYTHON FOR FINANCE

PCE - adding and removing rows
all_rows = [row1, row2, row3, pce] DATE PCE
1929-01-01 77.383
pd.concat(all_rows) 1930-01-01 70.136
1931-01-01 60.672
1932-01-01 48.715
1933-01-01 45.945
1934-01-01 51.461
1935-01-01 55.933

INTERMEDIATE PYTHON FOR FINANCE

PCE - operations on DataFrames
ec = 0.88
pce * ec

INTERMEDIATE PYTHON FOR FINANCE

PCE - operations on DataFrames
ec = 0.88
pce * ec

DATE PCE
1934-01-01 45.28568
1935-01-01 49.22104
1936-01-01 54.72544
1937-01-01 58.81832

INTERMEDIATE PYTHON FOR FINANCE

PCE - map
def convert_to_euro(x):
return x * 0.88

pce['EURO'] = pce['PCE'].map(convert_to_euro)

INTERMEDIATE PYTHON FOR FINANCE

PCE - map
def convert_to_euro(x):
return x * 0.88

pce['EURO'] = pce['PCE'].map(convert_to_euro)

DATE PCE EURO

1934-01-01 51.461 45.28568
1935-01-01 55.933 49.22104
1936-01-01 62.188 54.72544

INTERMEDIATE PYTHON FOR FINANCE

Gross Domestic Product (GDP)
GDP = PCE + GE + GPDI + NE

PCE: Personal Consumption Expenditures

GE: Government Expenditures

GPDI: Gross Private Domestic Investment

NE: Net Exports

INTERMEDIATE PYTHON FOR FINANCE

GDP - apply
map - Elements in a column (series)

apply - Across rows or columns

INTERMEDIATE PYTHON FOR FINANCE

GDP - apply
GCE GPDI NE PCE
DATE
1929-01-01 9.622 17.170 0.383 77.383
1930-01-01 10.273 11.428 0.323 70.136
1931-01-01 10.169 6.549 0.001 60.672
1932-01-01 8.946 1.819 0.043 48.715

INTERMEDIATE PYTHON FOR FINANCE

GDP - apply
gdp.apply(np.sum, axis=1)

INTERMEDIATE PYTHON FOR FINANCE

GDP - apply
gdp['GDP'] = gdp.apply(np.sum, axis=1)

GCE GPDI NE PCE GDP

DATE
1929-01-01 9.622 17.170 0.383 77.383 104.558
1930-01-01 10.273 11.428 0.323 70.136 92.160
1931-01-01 10.169 6.549 0.001 60.672 77.391
1932-01-01 8.946 1.819 0.043 48.715 59.523

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Peeking at data with
head, tail, and
describe
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Understanding your data
Data is loaded correctly

Understand the data's shape

INTERMEDIATE PYTHON FOR FINANCE

First look at data
aapl

INTERMEDIATE PYTHON FOR FINANCE

First look at data
aapl

Date
03/27/2020
03/26/2020
03/25/2020
03/24/2020

INTERMEDIATE PYTHON FOR FINANCE

First look at data
aapl

Price
Date
03/27/2020 247.74
03/26/2020 258.44
03/25/2020 245.52
03/24/2020 246.88

INTERMEDIATE PYTHON FOR FINANCE

First look at data
aapl

Price Volume
Date
03/27/2020 247.74 51054150
03/26/2020 258.44 63140170
03/25/2020 245.52 75900510
03/24/2020 246.88 71882770

INTERMEDIATE PYTHON FOR FINANCE

First look at data
aapl

Price Volume Trend

Date
03/27/2020 247.74 51054150 Down
03/26/2020 258.44 63140170 Up
03/25/2020 245.52 75900510 Down
03/24/2020 246.88 71882770 Up

INTERMEDIATE PYTHON FOR FINANCE

Head
aapl.head()

Price Volumne Trend

Date
03/27/2020 247.74 51054150 Down
03/26/2020 258.44 63140170 Up
03/25/2020 245.52 75900510 Down
03/24/2020 246.88 71882770 Up
03/23/2020 224.37 84188210 Down

INTERMEDIATE PYTHON FOR FINANCE

Head
aapl.head()

INTERMEDIATE PYTHON FOR FINANCE

Head
aapl.head(3)

```out
Price Volumne Trend
Date
03/27/2020 247.74 51054150 Down
03/26/2020 258.44 63140170 Up
03/25/2020 245.52 75900510 Down

INTERMEDIATE PYTHON FOR FINANCE

Tail
aapl.tail()

Price Volumne Trend

Date
03/05/2020 292.92 46893220 Down
03/04/2020 302.74 54794570 Up
03/03/2020 289.32 79868850 Down
03/02/2020 298.81 85349340 Up
02/28/2020 273.36 106721200 Down

INTERMEDIATE PYTHON FOR FINANCE

Describe
aapl.describe()

Price Volume
count 21.000000 2.100000e+01
mean 263.715714 7.551468e+07
std 23.360598 1.669757e+07
min 224.370000 4.689322e+07
25% 246.670000 6.409497e+07
50% 258.440000 7.505841e+07
75% 285.340000 8.418821e+07
max 302.740000 1.067212e+08

INTERMEDIATE PYTHON FOR FINANCE

Include
aapl.describe(include='object')

Trend
count 21
unique 2
top Down
freq 14

INTERMEDIATE PYTHON FOR FINANCE

Include
aapl.describe(include='all')

Price Volumne Trend

count 21.000000 2.100000e+01 21
unique NaN NaN 2
top NaN NaN Down
freq NaN NaN 14
mean 263.715714 7.551468e+07 NaN
std 23.360598 1.669757e+07 NaN
min 224.370000 4.689322e+07 NaN
25% 246.670000 6.409497e+07 NaN

INTERMEDIATE PYTHON FOR FINANCE

aapl.describe(include=['float', 'object'])

Price Trend
count 21.000000 21
unique NaN 2
top NaN Down
freq NaN 14
mean 263.715714 NaN
std 23.360598 NaN
min 224.370000 NaN
25% 246.670000 NaN
50% 258.440000 NaN
75% 285.340000 NaN
max 302.740000 NaN

INTERMEDIATE PYTHON FOR FINANCE

Percentiles
aapl.describe(percentiles=[.1, .5, .9])

Price Volumne
count 21.000000 2.100000e+01
mean 263.715714 7.551468e+07
std 23.360598 1.669757e+07
min 224.370000 4.689322e+07
10% 242.210000 5.479457e+07
50% 258.440000 7.505841e+07
90% 292.920000 1.004233e+08
max 302.740000 1.067212e+08

INTERMEDIATE PYTHON FOR FINANCE

Exclude
aapl.describe(exclude='float')

Volumne Trend
count 2.100000e+01 21
unique NaN 2
top NaN Down
freq NaN 14
mean 7.551468e+07 NaN
std 1.669757e+07 NaN
min 4.689322e+07 NaN
25% 6.409497e+07 NaN

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Filtering data
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Introducing the data
prices.head()

INTERMEDIATE PYTHON FOR FINANCE

Introducing the data
prices.head()

Date Symbol High

0 2020-04-03 AAPL 245.70
1 2020-04-02 AAPL 245.15
2 2020-04-01 AAPL 248.72
3 2020-03-31 AAPL 262.49
4 2020-03-30 AAPL 255.52

INTERMEDIATE PYTHON FOR FINANCE

Introducing the data
prices.describe()

INTERMEDIATE PYTHON FOR FINANCE

Introducing the data
prices.describe()

High
count 378.000000
mean 881.593138
std 720.771922
min 227.490000
max 2185.950000

INTERMEDIATE PYTHON FOR FINANCE

Introducing the data
prices.describe(include='object')

Symbol
count 378
unique 3
top AMZN
freq 126

INTERMEDIATE PYTHON FOR FINANCE

Comparison operators
< <= > >= == !=

INTERMEDIATE PYTHON FOR FINANCE

Column comparison
prices.High > 2160

INTERMEDIATE PYTHON FOR FINANCE

Column comparison
prices.High > 2160

0 False
1 False
2 False
3 False
4 False
...
374 False
375 False
376 False
377 False

INTERMEDIATE PYTHON FOR FINANCE

Column comparison
prices.Symbol == 'AAPL'

INTERMEDIATE PYTHON FOR FINANCE

Column comparison
prices.Symbol == 'AAPL'

0 True
1 True
2 True
3 True
4 True
...
374 False
375 False
376 False
377 False

INTERMEDIATE PYTHON FOR FINANCE

Masking by symbol
mask_symbol = prices.Symbol == 'AAPL'
aapl = prices.loc[mask_symbol]

INTERMEDIATE PYTHON FOR FINANCE

Masking by symbol
mask_symbol = prices.Symbol == 'AAPL'
aapl = prices.loc[mask_symbol]
aapl.describe(include='object')

Symbol
count 126
unique 1
top AAPL
freq 126

INTERMEDIATE PYTHON FOR FINANCE

Masking by price
mask_high = prices.High > 2160
big_price = prices.loc[mask_high]

INTERMEDIATE PYTHON FOR FINANCE

Masking by price
big_price.describe()

High
count 6.000000
mean 2177.406567
std 7.999334
min 2166.070000
max 2185.95000

INTERMEDIATE PYTHON FOR FINANCE

Pandas Boolean operators
And &

Or |

Not ~

INTERMEDIATE PYTHON FOR FINANCE

Combining conditions
mask_prices = prices['Symbol'] != 'AMZN'

mask_date = historical_highs['Date'] > datetime(2020, 4, 1)

mask_amzn = mask_prices & mask_date

prices.loc[mask_amzn]

INTERMEDIATE PYTHON FOR FINANCE

Combining conditions
Date Symbol High
0 2020-04-03 AAPL 245.7000
1 2020-04-02 AAPL 245.1500
252 2020-04-03 TSLA 515.4900
253 2020-04-02 TSLA 494.2599

INTERMEDIATE PYTHON FOR FINANCE

Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Plotting data
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Look at your data

INTERMEDIATE PYTHON FOR FINANCE

exxon.head()

INTERMEDIATE PYTHON FOR FINANCE

Introducing the data
exxon.head()

Date High Volume Month

0 2015-05-01 90.089996 198924100 May
1 2015-06-01 85.970001 238808600 Jun
2 2015-07-01 83.529999 274029000 Jul
3 2015-08-01 79.290001 387523600 Aug
4 2015-09-01 75.470001 316644500 Sep

INTERMEDIATE PYTHON FOR FINANCE

Matplotlib
my_dataframe.plot()

INTERMEDIATE PYTHON FOR FINANCE

Line plot
exxon.plot(x='Date',
y='High' )

INTERMEDIATE PYTHON FOR FINANCE

INTERMEDIATE PYTHON FOR FINANCE
Rotate
exxon.plot(x='Date',
y='High',
rot=90 )

INTERMEDIATE PYTHON FOR FINANCE

INTERMEDIATE PYTHON FOR FINANCE
Title
exxon.plot(x='Date',
y='High',
rot=90,
title='Exxon Stock Price')

INTERMEDIATE PYTHON FOR FINANCE

INTERMEDIATE PYTHON FOR FINANCE
Index
exxon.set_index('Date', inplace=True)
exxon.plot(y='High',
rot=90,
title='Exxon Stock Price')

INTERMEDIATE PYTHON FOR FINANCE

INTERMEDIATE PYTHON FOR FINANCE
Plot types
line density

bar area

barh pie

hist scatter

box hexbin

kde

INTERMEDIATE PYTHON FOR FINANCE

Bar
exxon2018.plot(x='Month',
y='Volume',
kind='bar',
title='Exxon 2018')

INTERMEDIATE PYTHON FOR FINANCE

INTERMEDIATE PYTHON FOR FINANCE
Hist
exxon.plot(y='High',kind='hist')

INTERMEDIATE PYTHON FOR FINANCE

INTERMEDIATE PYTHON FOR FINANCE
Let's practice!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Wrapping up
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E

Kennedy Behrman
Data Engineer, Author, Founder
Chapter 1
Representing time Mapping data

datetime dict()

INTERMEDIATE PYTHON FOR FINANCE

Chapter 2
Comparison operators If statements

< <= > >=

if a < b:
print(a)
Equality operators

== != Loops

Boolean operators while a < b:

and or not a = a + 1

for a in c:
print(a)

INTERMEDIATE PYTHON FOR FINANCE

Chapter 3
Creating a DataFrame Aggregating, summarizing

DataFrame(data=data) stocks.mean()
pd.read_csv('/data.csv') stocks.median()

Accessing data Extending, manipulating

stocks.loc['a', 'Values'] pce['PCESV'] = pcesv

stocks.iloc[2:22, 12] gdp.apply(np.sum, axis=1)

INTERMEDIATE PYTHON FOR FINANCE

Chapter 4
Peeking Plo ing

aapl.head() exxon.plot(x='Date',
aapl.tail() y='High' )
aapl.describe()

Filtering

mask = prices.High > 216

prices.loc[mask]

INTERMEDIATE PYTHON FOR FINANCE

Congratulations!
I N T E R M E D I AT E P Y T H O N F O R F I N A N C E
Fundamental
financial concepts
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Course objectives
The Time Value of Money

Compound Interest

Discounting and Projecting Cash Flows

Making Rational Economic Decisions

Mortgage Structures

Interest and Equity

The Cost of Capital

Wealth Accumulation

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Calculating Return on Investment (% Gain)
vt2 − vt1
Return (% Gain) = =r
vt1
vt1 : The initial value of the investment at time
vt2 : The nal value of the investment at time

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Example
You invest $10,000 at time = year 1

At time = 2, your investment is worth $11,000

$11, 000 − $10, 000
∗ 100 = 10% annual return (gain) on y
$10, 000

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Calculating Return on Investment (Dollar Value)
vt2 = vt1 ∗ (1 + r)

vt1 : The initial value of the investment at time

vt2 : The nal value of the investment at time

r: The rate of return of the investment per period t

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Example
Annual rate of return = 10% = 10/100

You invest $10,000 at time = year 1

10
$10,000 ∗ (1 + ) = $11,000
100

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Cumulative growth (or depreciation)
r: The investment's expected rate of return (growth rate)

t: The lifespan of the investment (time)

vt0 : The initial value of the investment at time 0

Investment Value = vt0 ∗ (1 + r)t

If the growth rate r is negative, the investment's value will

depreciate (shrink) over time.

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Discount factors
1
df =
(1 + r)t
v = f v ∗ df

df : Discount factor
r: The rate of depreciation per period t
t: Time periods
v : Initial value of the investment
f v : Future value of the investment

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Compound interest
r t∗c
Investment Value = vt0 ∗ (1 + )
c
r: The investment's annual expected rate of return (growth
rate)

t: The lifespan of the investment

vt0 : The initial value of the investment at time 0

c: The number of compounding periods per year

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

The power of compounding returns
Consider a $1,000 investment with a 10% annual return,
compounded quarterly (every 3 months, 4 times per year):

0.10 1∗4
$1, 000 ∗ (1 + ) = $1, 103.81
4
Compare this with no compounding:

0.10 1∗1
$1, 000 ∗ (1 + ) = $1, 100.00
1
Notice the extra $3.81 due to the quarterly compounding?

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Exponential growth
Compounded Quarterly Over 30 Years:

0.10 30∗4
$1, 000 ∗ (1 + ) = $19, 358.15
4
Compounded Annually Over 30 Years:

0.10 30∗1
$1, 000 ∗ (1 + ) = $17, 449.40
1
Compounding quarterly generates an extra $1,908.75 over 30
years

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
Present and future
value
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
The non-static value of money
Situation 1

Option A: $100 in your pocket today

Option B: $100 in your pocket tomorrow

Situation 2

Option A: $10,000 dollars in your pocket today

Option B: $10,500 dollars in your pocket one year from now

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Time is money
Your Options

A: Take the $10,000, stash it in the bank at 1% interest per

year, risk free

B: Invest the $10,000 in the stock market and earn an

average 8% per year

C: Wait 1 year, take the $10,500 instead

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Comparing future values
A: 10,000 * (1 + 0.01) = 10,100 future dollars

B: 10,000 * (1 + 0.08) = 10,800 future dollars

C: 10,500 future dollars

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Present value in Python
Calculate the present value of $100 received 3 years from now
at a 1.0% in ation rate.

import numpy as np
np.pv(rate=0.01, nper=3, pmt=0, fv=100)

-97.05

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Future value in Python
Calculate the future value of $100 invested for 3 years at a
5.0% average annual rate of return.

import numpy as np
np.fv(rate=0.05, nper=3, pmt=0, pv=-100)

115.76

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
Net present value
and cash flows
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Cash flows
Cash ows are a series of gains or losses from an investment
over time.

Year Project 1 Cash Flows Project 2 Cash Flows

0 -$100 $100
1 $100 $100
2 $125 -$100
3 $150 $200
4 $175 $300

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Assume a 3% discount rate

Cash Present
Year Formula
Flows Value
pv(rate=0.03, nper=0, pmt=0,
0 -$100 -100
fv=-100)
pv(rate=0.03, nper=1, pmt=0,
1 $100 97.09
fv=100)
pv(rate=0.03, nper=2, pmt=0,
2 $125 117.82
fv=125)
pv(rate=0.03, nper=3, pmt=0,
3 $150 137.27
fv=150)
pv(rate=0.03, nper=4, pmt=0,
4 $175 155.49
fv=175)

Sum of all present values = 407.67

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Arrays in NumPy
Example:

import numpy as np
array_1 = np.array([100,200,300])
print(array_1*2)

[200 400 600]

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Net Present Value
Project 1

import numpy as np
np.npv(rate=0.03, values=np.array([-100, 100, 125, 150, 175]))

407.67

Project 2

import numpy as np
np.npv(rate=0.03, values=np.array([100, 100, -100, 200, 300]))

552.40

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
A tale of two project
proposals
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Common profitability analysis methods
Net Present Value (NPV)

Internal Rate of Return (IRR)

Equivalent Annual Annuity (EAA)

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Net Present Value (NPV)
NPV is equal to the sum of all discounted cash ows:

Ct
N P V = ∑Tt=1 (1+r)t
− C0

Ct : Cash ow C at time t

r: Discount rate

NPV is a simple cash ow valuation measure that does not allow

for the comparison of di erent sized projects or lengths.

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Internal Rate of Return (IRR)
The internal rate of return must be computed by solving for IRR
in the NPV equation when set equal to 0.

Ct
N P V = ∑Tt=1 (1+IRR)t
− C0 = 0

Ct : Cash ow C at time t

IRR: Internal Rate of Return

IRR can be used to compare projects of di erent sizes and

lengths but requires an algorithmic solution and does not
measure total value.

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

IRR in NumPy
You can use the NumPy function .irr(values) to compute the
internal rate of return of an array of values.

Example:

import numpy as np
project_1 = np.array([-100,150,200])
np.irr(project_1)

1.35

Project 1 has an IRR of 135%

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
The Weighted
Average Cost of
Capital (WACC)
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
What is WACC?
W ACC = FEquity ∗ CEquity + FDebt ∗ CDebt ∗ (1 − T R)

FEquity : The proportion (%) of a company's nancing via

equity

FDebt : The proportion (%) of a company's nancing via debt

CEquity : The cost of a company's equity

CDebt : The cost of a company's debt
T R : The corporate tax rate

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Proportion of financing
The proportion (%) of nancing can be calculated as follows:

MEquity
FEquity = MT otal
MDebt
FDebt = MT otal

MT otal = MDebt + MEquity

MDebt : Market value of a company's debt

MEquity : Market value of a company's equity
MT otal : Total value of a company's nancing

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Example:

Calculate the WACC of a company with a 12% cost of debt, 14%

cost of equity, 20% debt nancing and 80% equity nancing.
Assume a 35% e ective corporate tax rate.

percent_equity = 0.80
percent_debt = 0.20
cost_equity = 0.14
cost_debt = 0.12
tax_rate = 0.35
wacc = (percent_equity*cost_equity) + (percent_debt*cost_debt) *
(1 - tax_rate)
print(wacc)

0.1276

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Discounting using WACC
Example:

Calculate the NPV of a project that produces $100 in cash ow

every year for 5 years. Assume a WACC of 13%.

cf_project1 = np.repeat(100, 5)
npv_project1 = np.npv(0.13, cf_project1)
print(npv_project1)

397.45

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
Comparing two
projects of different
life spans
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Different NPVs and IRRs
Year Project 1 Project 2 Project comparison

1 -$100 -$125
NPV IRR Length
2 $200 $100
#1 362.58 200% 3
3 $300 $100
#2 453.64 78.62% 8
4 N/A $100
Notice how you could
5 N/A $100
undertake multiple Project 1's
6 N/A $100 over 8 years? Are the NPVs fair
7 N/A $100 to compare?

8 N/A $100

Assume a 5% discount rate for

both projects

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Equivalent Annual Annuity (EAA) can be used to compare two
projects of di erent lifespans in present value terms.

Apply the EAA method to the previous two projects using the
computed NPVs * -1:

import numpy as np
npv_project1 = 362.58
npv_project2 = 453.64
np.pmt(rate=0.05, nper=3, pv=-1*npv_project1, fv=0)

133.14

np.pmt(rate=0.05, nper=8, pv=-1*npv_project2, fv=0)

70.18

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
Mortgage basics
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Taking out a mortgage
A mortage is a loan that covers the remaining cost of a home
a er paying a percentage of the home value as a down
payment.

A typical down payment in the US is at least 20% of the

home value

A typical US mortgage loan is paid o over 30 years

Example:

$500,000 house

20% down ($100,000)

$400,000 remaining as a 30 year mortgage loan

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Converting from an annual rate
To convert from an annual rate Example:
to a periodic rate:
Convert a 12% annual interest
1
RP eriodic = (1 + RAnnual ) −
N rate to the equivalent monthly
rate.
R: Rate of Return (or Interest
1
Rate) (1 + 0.12) − 1 = 0.949% m
12

N: Number of Payment
Periods Per Year

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Mortgage loan payments
You can use the NumPy function .pmt(rate, nper, pv) to
compute the periodic mortgage loan payment.

Example:

Calculate the monthly mortgage payment of a $400,000 30

year loan at 3.8% interest:

import numpy as np
monthly_rate = ((1+0.038)**(1/12) - 1)
np.pmt(rate=monthly_rate, nper=12*30, pv=400000)

-1849.15

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
Amortization,
interest and
principal
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Amortization
Principal (Equity): The amount PP: Principal Payment
of your mortgage paid that
MP: Mortgage Payment
counts towards the value of
IP: Interest Payment
the house itself
R: Mortgage Interest Rate
Interest Payment (IP P eriodic )
(Periodic)

= RM B ∗ RP eriodic RMB: Remaining Mortgage

Balance
Principal Payment (
P P P eriodic )

= M P P eriodic − IP P eriodic

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Accumulating values via for loops in Python
Example:

accumulator = 0
for i in range(3):
if i == 0:
accumulator = accumulator + 3
else:
accumulator = accumulator + 1
print(str(i)+": Loop value: "+str(accumulator))

0: Loop value: 3
1: Loop value: 4
2: Loop value: 5

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
Home ownership,
equity and
forecasting
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Ownership
To calculate the percentage of the home you actually own
(home equity):

ECumulative,t
Percent Equity Ownedt = PDown + VHome

ECumulative,t = ∑Tt=1 PP rincipal,t

ECumulative,t : Cumulative home equity at time t

PP rincipal,t : Principal payment at time t
VHome : Total home value
PDown : Initial down payment

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Underwater mortgage
An underwater mortgage is when the remaining amount you
owe on your mortgage is actually higher than the value of the
house itself.

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Cumulative operations in NumPy
Cumulative Sum

import numpy as np
np.cumsum(np.array([1, 2, 3]))

array([1, 3, 6])

Cumulative Product

import numpy as np
np.cumprod(np.array([1, 2, 3]))

array([1, 2, 6])

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Forecasting cumulative growth
Example:

What is the cumulative value at each point in time of a $100

investment that grows by 3% in period 1, then 3% again in
period 2, and then by 5% in period 3?

import numpy as np
np.cumprod(1 + np.array([0.03, 0.03, 0.05]))

array([ 1.03, 1.0609, 1.113945])

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
Budgeting project
proposal
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Project proposal
Your budget will have to take into account the following:

Rent

Food expenses

Entertainment expenses

Emergency fund

You will have to adjust for the following:

Taxes

Salary growth

In ation (for all expenses)

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Constant cumulative growth forecast
What is the cumulative growth of an investment that grows by
3% per year for 3 years?

import numpy as np
np.cumprod(1 + np.repeat(0.03, 3)) - 1

array([ 0.03, 0.0609, 0.0927])

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Forecasting values from growth rates
Compute the value at each point in time of an initial $100
investment that grows by 3% per year for 3 years?

import numpy as np
100*np.cumprod(1 + np.repeat(0.03, 3))

array([ 103, 106.09, 109.27])

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's build it!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
Net worth and
valuation in your
personal financial
life
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Net Worth
Net Worth = Assets - Liabilities = Equity

This is the basis of modern accounting

A point in time measurement

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Valuation
NPV(discount rate, cash ows)

Take into account future cash ows, salary and expenses

Adjust for in ation

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Reaching financial goals
Saving will only earn you a low rate of return

In ation will destroy most of your savings over time if you let
it

The best way to combat in ation is to invest

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

The basics of investing
Investing is a risk-reward tradeo

Diversify

Plan for the worst

Invest as early as possible

Invest continuously over time

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's simulate it!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
The power of time
and compound
interest
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
The power of time
Goal: Save $1.0 million over 40 years. Assume an average 7%
rate of return per year.

import numpy as np
np.pmt(rate=((1+0.07)**1/12 - 1), nper=12*40, pv=0, fv=1000000)

-404.61

What if your investments only returned 5% on average?

import numpy as np
np.pmt(rate=((1+0.05)**1/12 - 1), nper=12*40, pv=0, fv=1000000)

-674.53

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

The power of time
Goal: Save $1.0 million over 25 years. Assume an average 7%
rate of return per year.

import numpy as np
np.pmt(rate=((1+0.07)**1/12 - 1), nper=12*25, pv=0, fv=1000000)

-1277.07

What if your investments only returned 5% on average?

import numpy as np
np.pmt(rate=((1+0.05)**1/12 - 1), nper=12*40, pv=0, fv=1000000)

-1707.26

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Inflation adjusting
Assume an average rate of in ation of 3% per year

import numpy as np
np.fv(rate=-0.03, nper=25, pv=-1000000, pmt=0)

466974.70

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Let's practice!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
Financial concepts
in your daily life
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Dakota Wixom
Quantitative Finance Analyst
Congratulations
The Time Value of Money

Compound Interest

Discounting and Projecting Cash Flows

Making Rational Economic Decisions

Mortgage Structures

Interest and Equity

The Cost of Capital

Wealth Accumulation

INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON

Congratulations!
INTRODUCTION TO FINANCIAL CONCEPTS IN PYTHON
How to use dates &
times with pandas
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Date & time series functionality
At the root: data types for date & time information
Objects for points in time and periods

A ributes & methods re ect time-related details

Sequences of dates & periods:

Series or DataFrame columns

Index: convert object into Time Series

Many Series/DataFrame methods rely on time information in

the index to provide time-series functionality

MANIPULATING TIME SERIES DATA IN PYTHON

Basic building block: pd.Timestamp
import pandas as pd # assumed imported going forward
from datetime import datetime # To manually create dates
time_stamp = pd.Timestamp(datetime(2017, 1, 1))
pd.Timestamp('2017-01-01') == time_stamp

True # Understands dates as strings

time_stamp # type: pandas.tslib.Timestamp

Timestamp('2017-01-01 00:00:00')

MANIPULATING TIME SERIES DATA IN PYTHON

Basic building block: pd.Timestamp
Timestamp object has many a ributes to store time-speci c
information

time_stamp.year

2017

time_stamp.day_name()

'Sunday'

MANIPULATING TIME SERIES DATA IN PYTHON

More building blocks: pd.Period & freq
period = pd.Period('2017-01')
period # default: month-end

Period object has freq

Period('2017-01', 'M') a ribute to store frequency
info
period.asfreq('D') # convert to daily

Period('2017-01-31', 'D')
Convert pd.Period() to
period.to_timestamp().to_period('M') pd.Timestamp() and back

Period('2017-01', 'M')

MANIPULATING TIME SERIES DATA IN PYTHON

More building blocks: pd.Period & freq
period + 2 Frequency info enables
basic date arithmetic
Period('2017-03', 'M')

pd.Timestamp('2017-01-31', 'M') + 1

Timestamp('2017-02-28 00:00:00', freq='M')

MANIPULATING TIME SERIES DATA IN PYTHON

Sequences of dates & times
pd.date_range : start , end , periods , freq

index = pd.date_range(start='2017-1-1', periods=12, freq='M')

index

DatetimeIndex(['2017-01-31', '2017-02-28', '2017-03-31', ...,

'2017-09-30', '2017-10-31', '2017-11-30', '2017-12-31'],
dtype='datetime64[ns]', freq='M')

pd.DateTimeIndex : sequence of Timestamp objects with

frequency info

MANIPULATING TIME SERIES DATA IN PYTHON

Sequences of dates & times
index[0]

Timestamp('2017-01-31 00:00:00', freq='M')

index.to_period()

PeriodIndex(['2017-01', '2017-02', '2017-03', '2017-04', ...,

'2017-11', '2017-12'], dtype='period[M]', freq='M')

MANIPULATING TIME SERIES DATA IN PYTHON

Create a time series: pd.DateTimeIndex
pd.DataFrame({'data': index}).info()

RangeIndex: 12 entries, 0 to 11
Data columns (total 1 columns):
data 12 non-null datetime64[ns]
dtypes: datetime64[ns](1)

MANIPULATING TIME SERIES DATA IN PYTHON

Create a time series: pd.DateTimeIndex
np.random.random :
Random numbers: [0,1]

12 rows, 2 columns

data = np.random.random((size=12,2))
pd.DataFrame(data=data, index=index).info()

DatetimeIndex: 12 entries, 2017-01-31 to 2017-12-31

Freq: M
Data columns (total 2 columns):
0 12 non-null float64
1 12 non-null float64
dtypes: float64(2)

MANIPULATING TIME SERIES DATA IN PYTHON

Frequency aliases & time info

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Indexing &
resampling time
series
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Time series transformation
Basic time series transformations include:

Parsing string dates and convert to datetime64

Selecting & slicing for speci c subperiods

Se ing & changing DateTimeIndex frequency

Upsampling vs Downsampling

MANIPULATING TIME SERIES DATA IN PYTHON

Getting GOOG stock prices
google = pd.read_csv('google.csv') # import pandas as pd
google.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 504 entries, 0 to 503
Data columns (total 2 columns):
date 504 non-null object
price 504 non-null float64
dtypes: float64(1), object(1)

google.head()

date price
0 2015-01-02 524.81
1 2015-01-05 513.87
2 2015-01-06 501.96
3 2015-01-07 501.10
4 2015-01-08 502.68

MANIPULATING TIME SERIES DATA IN PYTHON

Converting string dates to datetime64
pd.to_datetime() :
Parse date string

Convert to datetime64

google.date = pd.to_datetime(google.date)
google.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 504 entries, 0 to 503
Data columns (total 2 columns):
date 504 non-null datetime64[ns]
price 504 non-null float64
dtypes: datetime64[ns](1), float64(1)

MANIPULATING TIME SERIES DATA IN PYTHON

Converting string dates to datetime64
.set_index() :
Date into index

inplace :
don't create copy

google.set_index('date', inplace=True)
google.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 504 entries, 2015-01-02 to 2016-12-30
Data columns (total 1 columns):
price 504 non-null float64
dtypes: float64(1)

MANIPULATING TIME SERIES DATA IN PYTHON

Plotting the Google stock time series
google.price.plot(title='Google Stock Price')
plt.tight_layout(); plt.show()

MANIPULATING TIME SERIES DATA IN PYTHON

Partial string indexing
Selecting/indexing using strings that parse to dates

google['2015'].info() # Pass string for part of date

DatetimeIndex: 252 entries, 2015-01-02 to 2015-12-31

Data columns (total 1 columns):
price 252 non-null float64
dtypes: float64(1)

google['2015-3': '2016-2'].info() # Slice includes last month

DatetimeIndex: 252 entries, 2015-03-02 to 2016-02-29

Data columns (total 1 columns):
price 252 non-null float64
dtypes: float64(1)
memory usage: 3.9 KB

MANIPULATING TIME SERIES DATA IN PYTHON

Partial string indexing
google.loc['2016-6-1', 'price'] # Use full date with .loc[]

734.15

MANIPULATING TIME SERIES DATA IN PYTHON

.asfreq(): set frequency
.asfreq('D') :
Convert DateTimeIndex to calendar day frequency

google.asfreq('D').info() # set calendar day frequency

DatetimeIndex: 729 entries, 2015-01-02 to 2016-12-30

Freq: D
Data columns (total 1 columns):
price 504 non-null float64
dtypes: float64(1)

MANIPULATING TIME SERIES DATA IN PYTHON

.asfreq(): set frequency
Upsampling:
Higher frequency implies new dates => missing data

google.asfreq('D').head()

price
date
2015-01-02 524.81
2015-01-03 NaN
2015-01-04 NaN
2015-01-05 513.87
2015-01-06 501.96

MANIPULATING TIME SERIES DATA IN PYTHON

.asfreq(): reset frequency
.asfreq('B') :
Convert DateTimeIndex to business day frequency

google = google.asfreq('B') # Change to calendar day frequency

google.info()

DatetimeIndex: 521 entries, 2015-01-02 to 2016-12-30

Freq: B
Data columns (total 1 columns):
price 504 non-null float64
dtypes: float64(1)

MANIPULATING TIME SERIES DATA IN PYTHON

.asfreq(): reset frequency
google[google.price.isnull()] # Select missing 'price' values

price
date
2015-01-19 NaN
2015-02-16 NaN
...
2016-11-24 NaN
2016-12-26 NaN

Business days that were not trading days

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Lags, changes, and
returns for stock
price series
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Basic time series calculations
Typical Time Series manipulations include:
Shi or lag values back or forward back in time

Get the di erence in value for a given time period

Compute the percent change over any number of periods

pandas built-in methods rely on pd.DateTimeIndex

MANIPULATING TIME SERIES DATA IN PYTHON

Getting GOOG stock prices
Let pd.read_csv() do the parsing for you!

google = pd.read_csv('google.csv', parse_dates=['date'], index_col='date')

google.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 504 entries, 2015-01-02 to 2016-12-30
Data columns (total 1 columns):
price 504 non-null float64
dtypes: float64(1)

MANIPULATING TIME SERIES DATA IN PYTHON

Getting GOOG stock prices
google.head()

price
date
2015-01-02 524.81
2015-01-05 513.87
2015-01-06 501.96
2015-01-07 501.10
2015-01-08 502.68

MANIPULATING TIME SERIES DATA IN PYTHON

.shift(): Moving data between past & future
.shift() :
defaults to periods=1

1 period into future

google['shifted'] = google.price.shift() # default: periods=1

google.head(3)

price shifted
date
2015-01-02 542.81 NaN
2015-01-05 513.87 542.81
2015-01-06 501.96 513.87

MANIPULATING TIME SERIES DATA IN PYTHON

.shift(): Moving data between past & future
.shift(periods=-1) :
lagged data

1 period back in time

google['lagged'] = google.price.shift(periods=-1)
google[['price', 'lagged', 'shifted']].tail(3)

price lagged shifted

date
2016-12-28 785.05 782.79 791.55
2016-12-29 782.79 771.82 785.05
2016-12-30 771.82 NaN 782.79

MANIPULATING TIME SERIES DATA IN PYTHON

Calculate one-period percent change
xt / xt−1
google['change'] = google.price.div(google.shifted)
google[['price', 'shifted', 'change']].head(3)

price shifted change

Date
2017-01-03 786.14 NaN NaN
2017-01-04 786.90 786.14 1.000967
2017-01-05 794.02 786.90 1.009048

MANIPULATING TIME SERIES DATA IN PYTHON

Calculate one-period percent change
google['return'] = google.change.sub(1).mul(100)
google[['price', 'shifted', 'change', 'return']].head(3)

price shifted change return

date
2015-01-02 524.81 NaN NaN NaN
2015-01-05 513.87 524.81 0.98 -2.08
2015-01-06 501.96 513.87 0.98 -2.32

MANIPULATING TIME SERIES DATA IN PYTHON

.diff(): built-in time-series change
Di erence in value for two adjacent periods

xt − xt−1
google['diff'] = google.price.diff()
google[['price', 'diff']].head(3)

price diff
date
2015-01-02 524.81 NaN
2015-01-05 513.87 -10.94
2015-01-06 501.96 -11.91

MANIPULATING TIME SERIES DATA IN PYTHON

.pct_change(): built-in time-series % change
Percent change for two adjacent periods
xt
xt−1

google['pct_change'] = google.price.pct_change().mul(100)
google[['price', 'return', 'pct_change']].head(3)

price return pct_change

date
2015-01-02 524.81 NaN NaN
2015-01-05 513.87 -2.08 -2.08
2015-01-06 501.96 -2.32 -2.32

MANIPULATING TIME SERIES DATA IN PYTHON

Looking ahead: Get multi-period returns
google['return_3d'] = google.price.pct_change(periods=3).mul(100)
google[['price', 'return_3d']].head()

price return_3d
date
2015-01-02 524.81 NaN
2015-01-05 513.87 NaN
2015-01-06 501.96 NaN
2015-01-07 501.10 -4.517825
2015-01-08 502.68 -2.177594

Percent change for two periods, 3 trading days apart

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Compare time series
growth rates
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Comparing stock performance
Stock price series: hard to compare at di erent levels

Simple solution: normalize price series to start at 100

Divide all prices by rst in series, multiply by 100

Same starting point

All prices relative to starting point

Di erence to starting point in percentage points

MANIPULATING TIME SERIES DATA IN PYTHON

Normalizing a single series (1)
google = pd.read_csv('google.csv', parse_dates=['date'], index_col='date')
google.head(3)

price
date
2010-01-04 313.06
2010-01-05 311.68
2010-01-06 303.83

first_price = google.price.iloc[0] # int-based selection

first_price

313.06

first_price == google.loc['2010-01-04', 'price']

True

MANIPULATING TIME SERIES DATA IN PYTHON

Normalizing a single series (2)
normalized = google.price.div(first_price).mul(100)
normalized.plot(title='Google Normalized Series')

MANIPULATING TIME SERIES DATA IN PYTHON

Normalizing multiple series (1)
prices = pd.read_csv('stock_prices.csv',
parse_dates=['date'],
index_col='date')
prices.info()

DatetimeIndex: 1761 entries, 2010-01-04 to 2016-12-30

Data columns (total 3 columns):
AAPL 1761 non-null float64
GOOG 1761 non-null float64
YHOO 1761 non-null float64
dtypes: float64(3)

prices.head(2)

AAPL GOOG YHOO

Date
2010-01-04 30.57 313.06 17.10
2010-01-05 30.63 311.68 17.23

MANIPULATING TIME SERIES DATA IN PYTHON

Normalizing multiple series (2)
prices.iloc[0]

AAPL 30.57
GOOG 313.06
YHOO 17.10
Name: 2010-01-04 00:00:00, dtype: float64

normalized = prices.div(prices.iloc[0])
normalized.head(3)

AAPL GOOG YHOO

Date
2010-01-04 1.000000 1.000000 1.000000
2010-01-05 1.001963 0.995592 1.007602
2010-01-06 0.985934 0.970517 1.004094

.div() : automatic alignment of Series index & DataFrame

columns

MANIPULATING TIME SERIES DATA IN PYTHON

Comparing with a benchmark (1)
index = pd.read_csv('benchmark.csv', parse_dates=['date'], index_col='date')
index.info()

DatetimeIndex: 1826 entries, 2010-01-01 to 2016-12-30

Data columns (total 1 columns):
SP500 1762 non-null float64
dtypes: float64(1)

prices = pd.concat([prices, index], axis=1).dropna()

prices.info()

DatetimeIndex: 1761 entries, 2010-01-04 to 2016-12-30

Data columns (total 4 columns):
AAPL 1761 non-null float64
GOOG 1761 non-null float64
YHOO 1761 non-null float64
SP500 1761 non-null float64
dtypes: float64(4)

MANIPULATING TIME SERIES DATA IN PYTHON

Comparing with a benchmark (2)
prices.head(1)

AAPL GOOG YHOO SP500

2010-01-04 30.57 313.06 17.10 1132.99

normalized = prices.div(prices.iloc[0]).mul(100)
normalized.plot()

MANIPULATING TIME SERIES DATA IN PYTHON

Plotting performance difference
diff = normalized[tickers].sub(normalized['SP500'], axis=0)

GOOG YHOO AAPL

2010-01-04 0.000000 0.000000 0.000000
2010-01-05 -0.752375 0.448669 -0.115294
2010-01-06 -3.314604 0.043069 -1.772895

.sub(..., axis=0) : Subtract a Series from each DataFrame

column by aligning indexes

MANIPULATING TIME SERIES DATA IN PYTHON

Plotting performance difference
diff.plot()

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Changing the time
series frequency:
resampling
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Changing the frequency: resampling
DateTimeIndex : set & change freq using .asfreq()

But frequency conversion a ects the data

Upsampling: ll or interpolate missing data

Downsampling: aggregate existing data

pandas API:
.asfreq() , .reindex()

.resample() + transformation method

MANIPULATING TIME SERIES DATA IN PYTHON

Getting started: quarterly data
dates = pd.date_range(start='2016', periods=4, freq='Q')
data = range(1, 5)
quarterly = pd.Series(data=data, index=dates)
quarterly

2016-03-31 1
2016-06-30 2
2016-09-30 3
2016-12-31 4
Freq: Q-DEC, dtype: int64 # Default: year-end quarters

MANIPULATING TIME SERIES DATA IN PYTHON

Upsampling: quarter => month
monthly = quarterly.asfreq('M') # to month-end frequency

2016-03-31 1.0
2016-04-30 NaN
2016-05-31 NaN
2016-06-30 2.0
2016-07-31 NaN
2016-08-31 NaN
2016-09-30 3.0
2016-10-31 NaN
2016-11-30 NaN
2016-12-31 4.0
Freq: M, dtype: float64

Upsampling creates missing values

monthly = monthly.to_frame('baseline') # to DataFrame

MANIPULATING TIME SERIES DATA IN PYTHON

Upsampling: fill methods
monthly['ffill'] = quarterly.asfreq('M', method='ffill')
monthly['bfill'] = quarterly.asfreq('M', method='bfill')
monthly['value'] = quarterly.asfreq('M', fill_value=0)

MANIPULATING TIME SERIES DATA IN PYTHON

Upsampling: fill methods
bfill : back ll

ffill : forward ll

baseline ffill bfill value

2016-03-31 1.0 1 1 1
2016-04-30 NaN 1 2 0
2016-05-31 NaN 1 2 0
2016-06-30 2.0 2 2 2
2016-07-31 NaN 2 3 0
2016-08-31 NaN 2 3 0
2016-09-30 3.0 3 3 3
2016-10-31 NaN 3 4 0
2016-11-30 NaN 3 4 0
2016-12-31 4.0 4 4 4

MANIPULATING TIME SERIES DATA IN PYTHON

Add missing months: .reindex()
dates = pd.date_range(start='2016', quarterly.reindex(dates)
periods=12,
freq='M')
2016-01-31 NaN
2016-02-29 NaN
DatetimeIndex(['2016-01-31', 2016-03-31 1.0
'2016-02-29', 2016-04-30 NaN
..., 2016-05-31 NaN
'2016-11-30', 2016-06-30 2.0
'2016-12-31'], 2016-07-31 NaN
dtype='datetime64[ns]', freq='M') 2016-08-31 NaN
2016-09-30 3.0
2016-10-31 NaN
.reindex() : 2016-11-30 NaN

conform DataFrame to 2016-12-31 4.0

new index

same lling logic as

.asfreq()

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Upsampling &
interpolation with
.resample()
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Frequency conversion & transformation methods
.resample() : similar to .groupby()

Groups data within resampling period and applies one or

several methods to each group

New date determined by o set - start, end, etc

Upsampling: ll from existing or interpolate values

Downsampling: apply aggregation to existing data

MANIPULATING TIME SERIES DATA IN PYTHON

Getting started: monthly unemployment rate
unrate = pd.read_csv('unrate.csv', parse_dates['Date'], index_col='Date')
unrate.info()

DatetimeIndex: 208 entries, 2000-01-01 to 2017-04-01

Data columns (total 1 columns):
UNRATE 208 non-null float64 # no frequency information
dtypes: float64(1)

unrate.head()

UNRATE
DATE
2000-01-01 4.0
2000-02-01 4.1
2000-03-01 4.0
2000-04-01 3.8
2000-05-01 4.0

Reporting date: 1st day of month

MANIPULATING TIME SERIES DATA IN PYTHON

Resampling Period & Frequency Offsets
Resample creates new date for frequency o set

Several alternatives to calendar month end

Frequency Alias Sample Date

Calendar Month End M 2017-04-30
Calendar Month Start MS 2017-04-01
Business Month End BM 2017-04-28
Business Month Start BMS 2017-04-03

MANIPULATING TIME SERIES DATA IN PYTHON

Resampling logic

MANIPULATING TIME SERIES DATA IN PYTHON

Resampling logic

MANIPULATING TIME SERIES DATA IN PYTHON

Assign frequency with .resample()
unrate.asfreq('MS').info()

DatetimeIndex: 208 entries, 2000-01-01 to 2017-04-01

Freq: MS
Data columns (total 1 columns):
UNRATE 208 non-null float64
dtypes: float64(1)

unrate.resample('MS') # creates Resampler object

DatetimeIndexResampler [freq=<MonthBegin>, axis=0, closed=left,

label=left, convention=start, base=0]

MANIPULATING TIME SERIES DATA IN PYTHON

Assign frequency with .resample()
unrate.asfreq('MS').equals(unrate.resample('MS').asfreq())

True

.resample() : returns data only when calling another method

MANIPULATING TIME SERIES DATA IN PYTHON

Quarterly real GDP growth
gdp = pd.read_csv('gdp.csv')
gdp.info()

DatetimeIndex: 69 entries, 2000-01-01 to 2017-01-01

Data columns (total 1 columns):
gpd 69 non-null float64 # no frequency info
dtypes: float64(1)

gdp.head(2)

gpd
DATE
2000-01-01 1.2
2000-04-01 7.8

MANIPULATING TIME SERIES DATA IN PYTHON

Interpolate monthly real GDP growth
gdp_1 = gdp.resample('MS').ffill().add_suffix('_ffill')

gpd_ffill
DATE
2000-01-01 1.2
2000-02-01 1.2
2000-03-01 1.2
2000-04-01 7.8

MANIPULATING TIME SERIES DATA IN PYTHON

Interpolate monthly real GDP growth
gdp_2 = gdp.resample('MS').interpolate().add_suffix('_inter')

gpd_inter
DATE
2000-01-01 1.200000
2000-02-01 3.400000
2000-03-01 5.600000
2000-04-01 7.800000

.interpolate() : nds points on straight line between

existing data

MANIPULATING TIME SERIES DATA IN PYTHON

Concatenating two DataFrames
df1 = pd.DataFrame([1, 2, 3], columns=['df1'])
df2 = pd.DataFrame([4, 5, 6], columns=['df2'])
pd.concat([df1, df2])

df1 df2
0 1.0 NaN
1 2.0 NaN
2 3.0 NaN
0 NaN 4.0
1 NaN 5.0
2 NaN 6.0

MANIPULATING TIME SERIES DATA IN PYTHON

Concatenating two DataFrames
pd.concat([df1, df2], axis=1)

df1 df2
0 1 4
1 2 5
2 3 6

axis=1 : concatenate horizontally

MANIPULATING TIME SERIES DATA IN PYTHON

Plot interpolated real GDP growth
pd.concat([gdp_1, gdp_2], axis=1).loc['2015':].plot()

MANIPULATING TIME SERIES DATA IN PYTHON

Combine GDP growth & unemployment
pd.concat([unrate, gdp_inter], axis=1).plot();

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Downsampling &
aggregation
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Downsampling & aggregation methods
So far: upsampling, ll logic & interpolation

Now: downsampling
hour to day

day to month, etc

How to represent the existing values at the new date?

Mean, median, last value?

MANIPULATING TIME SERIES DATA IN PYTHON

Air quality: daily ozone levels
ozone = pd.read_csv('ozone.csv',
parse_dates=['date'],
index_col='date')
ozone.info()

DatetimeIndex: 6291 entries, 2000-01-01 to 2017-03-31

Data columns (total 1 columns):
Ozone 6167 non-null float64
dtypes: float64(1)

ozone = ozone.resample('D').asfreq()
ozone.info()

DatetimeIndex: 6300 entries, 1998-01-05 to 2017-03-31

Freq: D
Data columns (total 1 columns):
Ozone 6167 non-null float64
dtypes: float64(1)

MANIPULATING TIME SERIES DATA IN PYTHON

Creating monthly ozone data
ozone.resample('M').mean().head() ozone.resample('M').median().head()

Ozone Ozone
date date
2000-01-31 0.010443 2000-01-31 0.009486
2000-02-29 0.011817 2000-02-29 0.010726
2000-03-31 0.016810 2000-03-31 0.017004
2000-04-30 0.019413 2000-04-30 0.019866
2000-05-31 0.026535 2000-05-31 0.026018

.resample().mean() : Monthly
average, assigned to end of
calendar month

MANIPULATING TIME SERIES DATA IN PYTHON

Creating monthly ozone data
ozone.resample('M').agg(['mean', 'std']).head()

Ozone
mean std
date
2000-01-31 0.010443 0.004755
2000-02-29 0.011817 0.004072
2000-03-31 0.016810 0.004977
2000-04-30 0.019413 0.006574
2000-05-31 0.026535 0.008409

.resample().agg() : List of aggregation functions like

groupby

MANIPULATING TIME SERIES DATA IN PYTHON

Plotting resampled ozone data
ozone = ozone.loc['2016':]
ax = ozone.plot()
monthly = ozone.resample('M').mean()
monthly.add_suffix('_monthly').plot(ax=ax)

MANIPULATING TIME SERIES DATA IN PYTHON

Resampling multiple time series
data = pd.read_csv('ozone_pm25.csv',
parse_dates=['date'],
index_col='date')
data = data.resample('D').asfreq()
data.info()

DatetimeIndex: 6300 entries, 2000-01-01 to 2017-03-31

Freq: D
Data columns (total 2 columns):
Ozone 6167 non-null float64
PM25 6167 non-null float64
dtypes: float64(2)

MANIPULATING TIME SERIES DATA IN PYTHON

Resampling multiple time series
data = data.resample('BM').mean()
data.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 207 entries, 2000-01-31 to 2017-03-31
Freq: BM
Data columns (total 2 columns):
ozone 207 non-null float64
pm25 207 non-null float64
dtypes: float64(2)

MANIPULATING TIME SERIES DATA IN PYTHON

Resampling multiple time series
df.resample('M').first().head(4)

Ozone PM25
date
2000-01-31 0.005545 20.800000
2000-02-29 0.016139 6.500000
2000-03-31 0.017004 8.493333
2000-04-30 0.031354 6.889474

df.resample('MS').first().head()

Ozone PM25
date
2000-01-01 0.004032 37.320000
2000-02-01 0.010583 24.800000
2000-03-01 0.007418 11.106667
2000-04-01 0.017631 11.700000
2000-05-01 0.022628 9.700000

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Rolling window
functions with
pandas
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Window functions in pandas
Windows identify sub periods of your time series

Calculate metrics for sub periods inside the window

Create a new time series of metrics

Two types of windows:

Rolling: same size, sliding (this video)

Expanding: contain all prior values (next video)

MANIPULATING TIME SERIES DATA IN PYTHON

Calculating a rolling average
data = pd.read_csv('google.csv', parse_dates=['date'], index_col='date')

DatetimeIndex: 1761 entries, 2010-01-04 to 2016-12-30

Data columns (total 1 columns):
price 1761 non-null float64
dtypes: float64(1)

MANIPULATING TIME SERIES DATA IN PYTHON

Calculating a rolling average
# Integer-based window size
data.rolling(window=30).mean() # fixed # observations

DatetimeIndex: 1761 entries, 2010-01-04 to 2017-05-24

Data columns (total 1 columns):
price 1732 non-null float64
dtypes: float64(1)

window=30 : # business days

min_periods : choose value < 30 to get results for rst days

MANIPULATING TIME SERIES DATA IN PYTHON

Calculating a rolling average
# Offset-based window size
data.rolling(window='30D').mean() # fixed period length

DatetimeIndex: 1761 entries, 2010-01-04 to 2017-05-24

Data columns (total 1 columns):
price 1761 non-null float64
dtypes: float64(1)

30D : # calendar days

MANIPULATING TIME SERIES DATA IN PYTHON

90 day rolling mean
r90 = data.rolling(window='90D').mean()
google.join(r90.add_suffix('_mean_90')).plot()

MANIPULATING TIME SERIES DATA IN PYTHON

90 & 360 day rolling means
data['mean90'] = r90
r360 = data['price'].rolling(window='360D'.mean()
data['mean360'] = r360; data.plot()

MANIPULATING TIME SERIES DATA IN PYTHON

Multiple rolling metrics (1)
r = data.price.rolling('90D').agg(['mean', 'std'])
r.plot(subplots = True)

MANIPULATING TIME SERIES DATA IN PYTHON

Multiple rolling metrics (2)
rolling = data.google.rolling('360D')
q10 = rolling.quantile(0.1).to_frame('q10')
median = rolling.median().to_frame('median')
q90 = rolling.quantile(0.9).to_frame('q90')
pd.concat([q10, median, q90], axis=1).plot()

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Expanding window
functions with
pandas
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Expanding windows in pandas
From rolling to expanding windows

Calculate metrics for periods up to current date

New time series re ects all historical values

Useful for running rate of return, running min/max

Two options with pandas:

.expanding() - just like .rolling()

.cumsum() , .cumprod() , cummin() / max()

MANIPULATING TIME SERIES DATA IN PYTHON

The basic idea
df = pd.DataFrame({'data': range(5)})
df['expanding sum'] = df.data.expanding().sum()
df['cumulative sum'] = df.data.cumsum()
df

data expanding sum cumulative sum

0 0 0.0 0
1 1 1.0 1
2 2 3.0 3
3 3 6.0 6
4 4 10.0 10

MANIPULATING TIME SERIES DATA IN PYTHON

Get data for the S&P 500
data = pd.read_csv('sp500.csv', parse_dates=['date'], index_col='date')

DatetimeIndex: 2519 entries, 2007-05-24 to 2017-05-24

Data columns (total 1 columns):
SP500 2519 non-null float64

MANIPULATING TIME SERIES DATA IN PYTHON

How to calculate a running return
Single period return rt : current price over last price minus 1:
Pt
rt = −1
Pt−1
Multi-period return: product of (1 + rt ) for all periods,
minus 1:

RT = (1 + r1 )(1 + r2 )...(1 + rT ) − 1

For the period return: .pct_change()

For basic math .add() , .sub() , .mul() , .div()

For cumulative product: .cumprod()

MANIPULATING TIME SERIES DATA IN PYTHON

Running rate of return in practice
pr = data.SP500.pct_change() # period return
pr_plus_one = pr.add(1)
cumulative_return = pr_plus_one.cumprod().sub(1)
cumulative_return.mul(100).plot()

MANIPULATING TIME SERIES DATA IN PYTHON

Getting the running min & max
data['running_min'] = data.SP500.expanding().min()
data['running_max'] = data.SP500.expanding().max()
data.plot()

MANIPULATING TIME SERIES DATA IN PYTHON

Rolling annual rate of return
def multi_period_return(period_returns):
return np.prod(period_returns + 1) - 1
pr = data.SP500.pct_change() # period return
r = pr.rolling('360D').apply(multi_period_return)
data['Rolling 1yr Return'] = r.mul(100)
data.plot(subplots=True)

MANIPULATING TIME SERIES DATA IN PYTHON

Rolling annual rate of return
data['Rolling 1yr Return'] = r.mul(100)
data.plot(subplots=True)

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Case study: S&P500
price simulation
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Random walks & simulations
Daily stock returns are hard to predict

Models o en assume they are random in nature

Numpy allows you to generate random numbers

From random returns to prices: use .cumprod()

Two examples:
Generate random returns

Randomly selected actual SP500 returns

MANIPULATING TIME SERIES DATA IN PYTHON

Generate random numbers
from numpy.random import normal, seed
from scipy.stats import norm
seed(42)
random_returns = normal(loc=0, scale=0.01, size=1000)
sns.distplot(random_returns, fit=norm, kde=False)

MANIPULATING TIME SERIES DATA IN PYTHON

Create a random price path
return_series = pd.Series(random_returns)
random_prices = return_series.add(1).cumprod().sub(1)
random_prices.mul(100).plot()

MANIPULATING TIME SERIES DATA IN PYTHON

S&P 500 prices & returns
data = pd.read_csv('sp500.csv', parse_dates=['date'], index_col='date')
data['returns'] = data.SP500.pct_change()
data.plot(subplots=True)

MANIPULATING TIME SERIES DATA IN PYTHON

S&P return distribution
sns.distplot(data.returns.dropna().mul(100), fit=norm)

MANIPULATING TIME SERIES DATA IN PYTHON

Generate random S&P 500 returns
from numpy.random import choice
sample = data.returns.dropna()
n_obs = data.returns.count()
random_walk = choice(sample, size=n_obs)
random_walk = pd.Series(random_walk, index=sample.index)
random_walk.head()

DATE
2007-05-29 -0.008357
2007-05-30 0.003702
2007-05-31 -0.013990
2007-06-01 0.008096
2007-06-04 0.013120

MANIPULATING TIME SERIES DATA IN PYTHON

Random S&P 500 prices (1)
start = data.SP500.first('D')

DATE
2007-05-25 1515.73
Name: SP500, dtype: float64

sp500_random = start.append(random_walk.add(1))
sp500_random.head())

DATE
2007-05-25 1515.730000
2007-05-29 0.998290
2007-05-30 0.995190
2007-05-31 0.997787
2007-06-01 0.983853
dtype: float64

MANIPULATING TIME SERIES DATA IN PYTHON

Random S&P 500 prices (2)
data['SP500_random'] = sp500_random.cumprod()
data[['SP500', 'SP500_random']].plot()

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Relationships
between time series:
correlation
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Correlation & relations between series
So far, focus on characteristics of individual variables

Now: characteristic of relations between variables

Correlation: measures linear relationships

Financial markets: important for prediction and risk

management

pandas & seaborn have tools to compute & visualize

MANIPULATING TIME SERIES DATA IN PYTHON

Correlation & linear relationships
Correlation coe cient: how similar is the pairwise movement
of two variables around their averages?
∑N (x −x̄)(yi − ȳ )
Varies between -1 and +1 r= i=1 i
sx sy

MANIPULATING TIME SERIES DATA IN PYTHON

Importing five price time series
data = pd.read_csv('assets.csv', parse_dates=['date'],
index_col='date')
data = data.dropna().info()

DatetimeIndex: 2469 entries, 2007-05-25 to 2017-05-22

Data columns (total 5 columns):
sp500 2469 non-null float64
nasdaq 2469 non-null float64
bonds 2469 non-null float64
gold 2469 non-null float64
oil 2469 non-null float64

MANIPULATING TIME SERIES DATA IN PYTHON

Visualize pairwise linear relationships
daily_returns = data.pct_change()
sns.jointplot(x='sp500', y='nasdaq', data=data_returns);

MANIPULATING TIME SERIES DATA IN PYTHON

Calculate all correlations
correlations = returns.corr()
correlations

bonds oil gold sp500 nasdaq

bonds 1.000000 -0.183755 0.003167 -0.300877 -0.306437
oil -0.183755 1.000000 0.105930 0.335578 0.289590
gold 0.003167 0.105930 1.000000 -0.007786 -0.002544
sp500 -0.300877 0.335578 -0.007786 1.000000 0.959990
nasdaq -0.306437 0.289590 -0.002544 0.959990 1.000000

MANIPULATING TIME SERIES DATA IN PYTHON

Visualize all correlations
sns.heatmap(correlations, annot=True)

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Select index
components &
import data
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Market value-weighted index
Composite performance of various stocks

Components weighted by market capitalization

Share Price x Number of Shares => Market Value

Larger components get higher percentage weightings

Key market indexes are value-weighted:

S&P 500 , NASDAQ , Wilshire 5000 , Hang Seng

MANIPULATING TIME SERIES DATA IN PYTHON

Build a cap-weighted Index
Apply new skills to construct value-weighted index
Select components from exchange listing data

Get component number of shares and stock prices

Calculate component weights

Calculate index

Evaluate performance of components and index

MANIPULATING TIME SERIES DATA IN PYTHON

Load stock listing data
nyse = pd.read_excel('listings.xlsx', sheet_name='nyse',
na_values='n/a')
nyse.info()

RangeIndex: 3147 entries, 0 to 3146

Data columns (total 7 columns):
Stock Symbol 3147 non-null object # Stock Ticker
Company Name 3147 non-null object
Last Sale 3079 non-null float64 # Latest Stock Price
Market Capitalization 3147 non-null float64
IPO Year 1361 non-null float64 # Year of listing
Sector 2177 non-null object
Industry 2177 non-null object
dtypes: float64(3), object(4)

MANIPULATING TIME SERIES DATA IN PYTHON

Load & prepare listing data
nyse.set_index('Stock Symbol', inplace=True)
nyse.dropna(subset=['Sector'], inplace=True)
nyse['Market Capitalization'] /= 1e6 # in Million USD

Index: 2177 entries, DDD to ZTO

Data columns (total 6 columns):
Company Name 2177 non-null object
Last Sale 2175 non-null float64
Market Capitalization 2177 non-null float64
IPO Year 967 non-null float64
Sector 2177 non-null object
Industry 2177 non-null object
dtypes: float64(3), object(3)

MANIPULATING TIME SERIES DATA IN PYTHON

Select index components
components = nyse.groupby(['Sector'])['Market Capitalization'].nlargest(1)
components.sort_values(ascending=False)

Sector Stock Symbol

Health Care JNJ 338834.390080
Energy XOM 338728.713874
Finance JPM 300283.250479
Miscellaneous BABA 275525.000000
Public Utilities T 247339.517272
Basic Industries PG 230159.644117
Consumer Services WMT 221864.614129
Consumer Non-Durables KO 183655.305119
Technology ORCL 181046.096000
Capital Goods TM 155660.252483
Transportation UPS 90180.886756
Consumer Durables ABB 48398.935676
Name: Market Capitalization, dtype: float64

MANIPULATING TIME SERIES DATA IN PYTHON

Import & prepare listing data
tickers = components.index.get_level_values('Stock Symbol')
tickers

Index(['PG', 'TM', 'ABB', 'KO', 'WMT', 'XOM', 'JPM', 'JNJ', 'BABA', 'T',
'ORCL', ‘UPS'], dtype='object', name='Stock Symbol’)

tickers.tolist()

['PG',
'TM',
'ABB',
'KO',
'WMT',
...
'T',
'ORCL',
'UPS']

MANIPULATING TIME SERIES DATA IN PYTHON

Stock index components
columns = ['Company Name', 'Market Capitalization', 'Last Sale']
component_info = nyse.loc[tickers, columns]
pd.options.display.float_format = '{:,.2f}'.format

Company Name Market Capitalization Last Sale

Stock Symbol
PG Procter & Gamble Company (The) 230,159.64 90.03
TM Toyota Motor Corp Ltd Ord 155,660.25 104.18
ABB ABB Ltd 48,398.94 22.63
KO Coca-Cola Company (The) 183,655.31 42.79
WMT Wal-Mart Stores, Inc. 221,864.61 73.15
XOM Exxon Mobil Corporation 338,728.71 81.69
JPM J P Morgan Chase & Co 300,283.25 84.40
JNJ Johnson & Johnson 338,834.39 124.99
BABA Alibaba Group Holding Limited 275,525.00 110.21
T AT&T Inc. 247,339.52 40.28
ORCL Oracle Corporation 181,046.10 44.00
UPS United Parcel Service, Inc. 90,180.89 103.74

MANIPULATING TIME SERIES DATA IN PYTHON

Import & prepare listing data
data = pd.read_csv('stocks.csv', parse_dates=['Date'],
index_col='Date').loc[:, tickers.tolist()]
data.info()

DatetimeIndex: 252 entries, 2016-01-04 to 2016-12-30

Data columns (total 12 columns):
ABB 252 non-null float64
BABA 252 non-null float64
JNJ 252 non-null float64
JPM 252 non-null float64
KO 252 non-null float64
ORCL 252 non-null float64
PG 252 non-null float64
T 252 non-null float64
TM 252 non-null float64
UPS 252 non-null float64
WMT 252 non-null float64
XOM 252 non-null float64
dtypes: float64(12)

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Build a market-cap
weighted index
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Build your value-weighted index
Key inputs:
number of shares

stock price series

MANIPULATING TIME SERIES DATA IN PYTHON

Build your value-weighted index
Key inputs:
number of shares

stock price series

Normalize index to start

at 100

MANIPULATING TIME SERIES DATA IN PYTHON

Stock index components
components

Company Name Market Capitalization Last Sale

MANIPULATING TIME SERIES DATA IN PYTHON

Number of shares outstanding
shares = components['Market Capitalization'].div(components['Last Sale'])

Stock Symbol
PG 2,556.48 # Outstanding shares in million
TM 1,494.15
ABB 2,138.71
KO 4,292.01
WMT 3,033.01
XOM 4,146.51
JPM 3,557.86
JNJ 2,710.89
BABA 2,500.00
T 6,140.50
ORCL 4,114.68
UPS 869.30
dtype: float64

Market Capitalization = Number of Shares x Share Price

MANIPULATING TIME SERIES DATA IN PYTHON

Historical stock prices
data = pd.read_csv('stocks.csv', parse_dates=['Date'],
index_col='Date').loc[:, tickers.tolist()]
market_cap_series = data.mul(no_shares)
market_series.info()

DatetimeIndex: 252 entries, 2016-01-04 to 2016-12-30

Data columns (total 12 columns):
ABB 252 non-null float64
BABA 252 non-null float64
JNJ 252 non-null float64
JPM 252 non-null float64
...
TM 252 non-null float64
UPS 252 non-null float64
WMT 252 non-null float64
XOM 252 non-null float64
dtypes: float64(12)

MANIPULATING TIME SERIES DATA IN PYTHON

From stock prices to market value
market_cap_series.first('D').append(market_cap_series.last('D'))

ABB BABA JNJ JPM KO ORCL \\

Date
2016-01-04 37,470.14 191,725.00 272,390.43 226,350.95 181,981.42 147,099.95
2016-12-30 45,062.55 219,525.00 312,321.87 307,007.60 177,946.93 158,209.60
PG T TM UPS WMT XOM
Date
2016-01-04 200,351.12 210,926.33 181,479.12 82,444.14 186,408.74 321,188.96
2016-12-30 214,948.60 261,155.65 175,114.05 99,656.23 209,641.59 374,264.34

MANIPULATING TIME SERIES DATA IN PYTHON

Aggregate market value per period
agg_mcap = market_cap_series.sum(axis=1) # Total market cap
agg_mcap(title='Aggregate Market Cap')

MANIPULATING TIME SERIES DATA IN PYTHON

Value-based index
index = agg_mcap.div(agg_mcap.iloc[0]).mul(100) # Divide by 1st value
index.plot(title='Market-Cap Weighted Index')

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Evaluate index
performance
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Evaluate your value-weighted index
Index return:
Total index return

Contribution by component

Performance vs Benchmark
Total period return

Rolling returns for sub periods

MANIPULATING TIME SERIES DATA IN PYTHON

Value-based index - recap
agg_market_cap = market_cap_series.sum(axis=1)
index = agg_market_cap.div(agg_market_cap.iloc[0]).mul(100)
index.plot(title='Market-Cap Weighted Index')

MANIPULATING TIME SERIES DATA IN PYTHON

Value contribution by stock
agg_market_cap.iloc[-1] - agg_market_cap.iloc[0]

315,037.71

MANIPULATING TIME SERIES DATA IN PYTHON

Value contribution by stock
change = market_cap_series.first('D').append(market_cap_series.last('D'))
change.diff().iloc[-1].sort_values() # or: .loc['2016-12-30']

TM -6,365.07
KO -4,034.49
ABB 7,592.41
ORCL 11,109.65
PG 14,597.48
UPS 17,212.08
WMT 23,232.85
BABA 27,800.00
JNJ 39,931.44
T 50,229.33
XOM 53,075.38
JPM 80,656.65
Name: 2016-12-30 00:00:00, dtype: float64

MANIPULATING TIME SERIES DATA IN PYTHON

Market-cap based weights
market_cap = components['Market Capitalization']
weights = market_cap.div(market_cap.sum())
weights.sort_values().mul(100)

Stock Symbol
ABB 1.85
UPS 3.45
TM 5.96
ORCL 6.93
KO 7.03
WMT 8.50
PG 8.81
T 9.47
BABA 10.55
JPM 11.50
XOM 12.97
JNJ 12.97
Name: Market Capitalization, dtype: float64

MANIPULATING TIME SERIES DATA IN PYTHON

Value-weighted component returns
index_return = (index.iloc[-1] / index.iloc[0] - 1) * 100

14.06

weighted_returns = weights.mul(index_return)
weighted_returns.sort_values().plot(kind='barh')

MANIPULATING TIME SERIES DATA IN PYTHON

Performance vs benchmark
data = index.to_frame('Index') # Convert pd.Series to pd.DataFrame
data['SP500'] = pd.read_csv('sp500.csv', parse_dates=['Date'],
index_col='Date')
data.SP500 = data.SP500.div(data.SP500.iloc[0], axis=0).mul(100)

MANIPULATING TIME SERIES DATA IN PYTHON

Performance vs benchmark: 30D rolling return
def multi_period_return(r):
return (np.prod(r + 1) - 1) * 100
data.pct_change().rolling('30D').apply(multi_period_return).plot()

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Index correlation &
exporting to Excel
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Some additional analysis of your index
Daily return correlations:

Calculate among all components

Visualize the result as heatmap

Write results to excel using .xls and .xlsx formats:

Single worksheet

Multiple worksheets

MANIPULATING TIME SERIES DATA IN PYTHON

Index components - price data
data = DataReader(tickers, 'google', start='2016', end='2017')['Close']
data.info()

DatetimeIndex: 252 entries, 2016-01-04 to 2016-12-30

MANIPULATING TIME SERIES DATA IN PYTHON

Index components: return correlations
daily_returns = data.pct_change()
correlations = daily_returns.corr()

ABB BABA JNJ JPM KO ORCL PG T TM UPS WMT XOM

ABB 1.00 0.40 0.33 0.56 0.31 0.53 0.34 0.29 0.48 0.50 0.15 0.48
BABA 0.40 1.00 0.27 0.27 0.25 0.38 0.21 0.17 0.34 0.35 0.13 0.21
JNJ 0.33 0.27 1.00 0.34 0.30 0.37 0.42 0.35 0.29 0.45 0.24 0.41
JPM 0.56 0.27 0.34 1.00 0.22 0.57 0.27 0.13 0.49 0.56 0.14 0.48
KO 0.31 0.25 0.30 0.22 1.00 0.31 0.62 0.47 0.33 0.50 0.25 0.29
ORCL 0.53 0.38 0.37 0.57 0.31 1.00 0.41 0.32 0.48 0.54 0.21 0.42
PG 0.34 0.21 0.42 0.27 0.62 0.41 1.00 0.43 0.32 0.47 0.33 0.34
T 0.29 0.17 0.35 0.13 0.47 0.32 0.43 1.00 0.28 0.41 0.31 0.33
TM 0.48 0.34 0.29 0.49 0.33 0.48 0.32 0.28 1.00 0.52 0.20 0.30
UPS 0.50 0.35 0.45 0.56 0.50 0.54 0.47 0.41 0.52 1.00 0.33 0.45
WMT 0.15 0.13 0.24 0.14 0.25 0.21 0.33 0.31 0.20 0.33 1.00 0.21
XOM 0.48 0.21 0.41 0.48 0.29 0.42 0.34 0.33 0.30 0.45 0.21 1.00

MANIPULATING TIME SERIES DATA IN PYTHON

Index components: return correlations
sns.heatmap(correlations, annot=True)
plt.xticks(rotation=45)
plt.title('Daily Return Correlations')

MANIPULATING TIME SERIES DATA IN PYTHON

Saving to a single Excel worksheet
correlations.to_excel(excel_writer= 'correlations.xls',
sheet_name='correlations',
startrow=1,
startcol=1)

MANIPULATING TIME SERIES DATA IN PYTHON

Saving to multiple Excel worksheets
data.index = data.index.date # Keep only date component
with pd.ExcelWriter('stock_data.xlsx') as writer:
corr.to_excel(excel_writer=writer, sheet_name='correlations')
data.to_excel(excel_writer=writer, sheet_name='prices')
data.pct_change().to_excel(writer, sheet_name='returns')

MANIPULATING TIME SERIES DATA IN PYTHON

Let's practice!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Congratulations!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N

Stefan Jansen
Founder & Lead Data Scientist at
Applied Arti cial Intelligence
Congratulations!
M A N I P U L AT I N G T I M E S E R I E S D ATA I N P Y T H O N
Reading, inspecting,
and cleaning data
from CSV
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Import and clean data
Ensure that pd.DataFrame() is same as CSV source file
Stock exchange listings: amex-listings.csv

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

How pandas stores data
Each column has its own data format ( dtype )

dtype affects your calculation and visualization

pandas dtype Column characteristics

object Text, or a mix of text and numeric data

int64 Numeric: whole numbers - 64 bits (≤ 264 )

float64 Numeric: Decimals, or whole numbers with missing values

datetime64 Date and time information

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Import & inspect
import pandas as pd
amex = pd.read_csv('amex-listings.csv')
amex.info() # To inspect table structure & data types

RangeIndex: 360 entries, 0 to 359

Data columns (total 8 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 Stock Symbol 360 non-null object
1 Company Name 360 non-null object
2 Last Sale 346 non-null float64
3 Market Capitalization 360 non-null float64
4 IPO Year 105 non-null float64
5 Sector 238 non-null object
6 Industry 238 non-null object
7 Last Update 360 non-null object
dtypes: float64(3), object(5)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Dealing with missing values
# Replace 'n/a' with np.nan
amex = pd.read_csv('amex-listings.csv', na_values='n/a')
amex.info()

RangeIndex: 360 entries, 0 to 359

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Properly parsing dates
amex = pd.read_csv('amex-listings.csv',
na_values='n/a',
parse_dates=['Last Update'])
amex.info()

RangeIndex: 360 entries, 0 to 359

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Showing off the result
amex.head(2) # Show first n rows (default: 5)

Stock Symbol Company Name

0 XXII 22nd Century Group, Inc
1 FAX Aberdeen Asia-Pacific Income Fund Inc

Last Sale Market Capitalization IPO Year

0 1.3300 1.206285e+08 NaN
1 5.0000 1.266333e+09 1986.0

Sector Industry Last Update

0 Non-Durables Farming/Seeds/Milling 2017-04-26
1 NaN NaN 2017-04-25

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Read data from
Excel worksheets
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Import data from Excel

pd.read_excel(file, sheet_name=0)
Select first sheet by default with sheet_name=0

Select by name with sheet_name='amex'

Import several sheets with list such as sheet_name=['amex', 'nasdaq']

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Import data from one sheet
amex = pd.read_excel('listings.xlsx',
sheet_name='amex',
na_values='n/a')
amex.info()

RangeIndex: 360 entries, 0 to 359

Data columns (total 7 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 Stock Symbol 360 non-null object
1 Company Name 360 non-null object
2 Last Sale 346 non-null float64
3 Market Capitalization 360 non-null float64
4 IPO Year 105 non-null float64

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Import data from two sheets
listings = pd.read_excel('listings.xlsx',
sheet_name=['amex', 'nasdaq'], # keys = sheet name
na_values='n/a') # values = DataFrame
listings['nasdaq'].info()

# Column Non-Null Count Dtype

-- ------ -------------- -----
0 Stock Symbol 3167 non-null object
1 Company Name 3167 non-null object
2 Last Sale 3165 non-null float64
3 Market Capitalization 3167 non-null float64
4 IPO Year 1386 non-null float64
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Get sheet names
xls = pd.ExcelFile('listings.xlsx') # pd.ExcelFile object
exchanges = xls.sheet_names
exchanges

['amex', 'nasdaq', 'nyse']

nyse = pd.read_excel(xls,
sheet_name=exchanges[2],
na_values='n/a')

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Get sheet names
nyse.info()

RangeIndex: 3147 entries, 0 to 3146

Data columns (total 7 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 Stock Symbol 3147 non-null object
1 Company Name 3147 non-null object
... ...
6 Industry 2177 non-null object
dtypes: float64(3), object(4)
memory usage: 172.2+ KB

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Combine data from
multiple worksheets
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Combine DataFrames
Concatenate or "stack" a list of pd.DataFrame s
Syntax: pd.concat([amex, nasdaq, nyse])

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Combine DataFrames
Concatenate or "stack" a list of pd.DataFrame s
Syntax: pd.concat([amex, nasdaq, nyse])

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Combine DataFrames
Concatenate or "stack" a list of pd.DataFrame s
Syntax: pd.concat([amex, nasdaq, nyse])

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Concatenate two DataFrames
amex = pd.read_excel('listings.xlsx',
sheet_name='amex',
na_values=['n/a'])
nyse = pd.read_excel('listings.xlsx',
sheet_name='nyse',
na_values=['n/a'])
pd.concat([amex, nyse]).info()

Int64Index: 3507 entries, 0 to 3146

Data columns (total 7 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 Stock Symbol 3507 non-null object
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Add a reference column
amex['Exchange'] = 'AMEX' # Add column to reference source
nyse['Exchange'] = 'NYSE'
listings = pd.concat([amex, nyse])
listings.head(2)

Stock Symbol ... Exchange

0 XXII ... AMEX
1 FAX ... AMEX

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Combine three DataFrames
xls = pd.ExcelFile('listings.xlsx')
exchanges = xls.sheet_names
# Create empty list to collect DataFrames
listings = []
for exchange in exchanges:
listing = pd.read_excel(xls, sheet_name=exchange)
# Add reference col
listing['Exchange'] = exchange
# Add DataFrame to list
listings.append(listing)
# List of DataFrames
combined_listings = pd.concat(listings)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Combine three DataFrames
combined_listings.info()

Int64Index: 6674 entries, 0 to 3146

Data columns (total 8 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 Stock Symbol 6674 non-null object
1 Company Name 6674 non-null object
2 Last Sale 6590 non-null float64
3 Market Capitalization 6674 non-null float64
4 IPO Year 2852 non-null float64
5 Sector 5182 non-null object
6 Industry 5182 non-null object
7 Exchange 6674 non-null object
dtypes: float64(3), object(5)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
The DataReader:
Access financial
data online
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
pandas_datareader
Easy access to various financial internet data sources
Little code needed to import into a pandas DataFrame

Available sources include:

IEX and Yahoo! Finance (including derivatives)

Federal Reserve

World Bank, OECD, Eurostat

OANDA

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Stock prices: Yahoo! Finance
from pandas_datareader.data import DataReader
from datetime import date # Date & time functionality

start = date(2015, 1, 1) # Default: Jan 1, 2010

end = date(2016, 12, 31) # Default: today
ticker = 'GOOG'
data_source = 'yahoo'
stock_data = DataReader(ticker, data_source, start, end)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Stock prices: Yahoo! Finance
stock_data.info()

DatetimeIndex: 504 entries, 2015-01-02 to 2016-12-30

Data columns (total 6 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 High 504 non-null float64 # First price
1 Low 504 non-null float64 # Highest price
2 Open 504 non-null float64 # Lowest price
3 Close 504 non-null float64 # Last price
4 Volume 504 non-null float64 # No shares traded
5 Adj Close 504 non-null float64 # Adj. price
dtypes: float64(6)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Stock prices: Yahoo! Finance
pd.concat([stock_data.head(3), stock_data.tail(3)])

High Low Open Close Volume Adj Close

Date
2015-01-02 26.49 26.13 26.38 26.17 28951268 26.17
2015-01-05 26.14 25.58 26.09 25.62 41196796 25.62
2015-01-06 25.74 24.98 25.68 25.03 57998800 25.03
2016-12-28 39.71 39.16 39.69 39.25 23076000 39.25
2016-12-29 39.30 38.95 39.17 39.14 14886000 39.14
2016-12-30 39.14 38.52 39.14 38.59 35400000 38.59

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Stock prices: Visualization
import matplotlib.pyplot as plt
stock_data['Close'].plot(title=ticker)
plt.show()

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Economic data from
the Federal Reserve
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Economic data from FRED

Federal Reserve Economic Data

500,000 series covering a range of categories:

Economic growth & employment

Monetary & fiscal policy

Demographics, industries, commodity prices

Daily, monthly, annual frequencies

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Get data from FRED

1 https://fred.stlouisfed.org/

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Get data from FRED

1 https://fred.stlouisfed.org/

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Get data from FRED

1 https://fred.stlouisfed.org/

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Interest rates
from pandas_datareader.data import DataReader
from datetime import date
series_code = 'DGS10' # 10-year Treasury Rate
data_source = 'fred' # FED Economic Data Service
start = date(1962, 1, 1)
data = DataReader(series_code, data_source, start)
data.info()

DatetimeIndex: 15754 entries, 1962-01-02 to 2022-05-20

Data columns (total 1 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 DGS10 15083 non-null float64
dtypes: float64(1)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Stock prices: Visualization
.rename(columns={old_name: new_name})

series_name = '10-year Treasury'

data = data.rename(columns={series_code: series_name})
data.plot(title=series_name); plt.show()

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Combine stock and economic data
start = date(2000, 1, 1)
series = 'DCOILWTICO' # West Texas Intermediate Oil Price
oil = DataReader(series, 'fred', start)
ticker = 'XOM' # Exxon Mobile Corporation
stock = DataReader(ticker, 'yanoo', start)
data = pd.concat([stock[['Close']], oil], axis=1)
data.info()

DatetimeIndex: 5841 entries, 2000-01-03 to 2022-05-23

Data columns (total 2 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 Close 5634 non-null float64
1 DCOILWTICO 5615 non-null float64

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Combine stock and economic data
data.columns = ['Exxon', 'Oil Price']
data.plot()
plt.show()

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Select stocks and
get data from
Yahoo! Finance
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Select stocks based on criteria
Use the listing information to select specific stocks
As criteria:
Stock Exchange

Sector or Industry

IPO Year

Market Capitalization

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Get ticker for largest company
nyse = pd.read_excel('listings.xlsx',sheet_name='nyse', na_values='n/a')
nyse = nyse.sort_values('Market Capitalization', ascending=False)
nyse[['Stock Symbol', 'Company Name']].head(3)

Stock Symbol Company Name

1586 JNJ Johnson & Johnson
1125 XOM Exxon Mobil Corporation
1548 JPM J P Morgan Chase & Co

largest_by_market_cap = nyse.iloc[0] # 1st row

largest_by_market_cap['Stock Symbol'] # Select row label

'JNJ'

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Get ticker for largest company
nyse = nyse.set_index('Stock Symbol') # Stock ticker as index
nyse.info()

Index: 3147 entries, JNJ to EAE

Data columns (total 6 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 Company Name 3147 non-null object
1 Last Sale 3079 non-null float64
2 Market Capitalization 3147 non-null float64
...

nyse['Market Capitalization'].idxmax() # Index of max value

'JNJ'

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Get ticker for largest tech company
nyse['Sector'].unique() # Unique values as numpy array

array(['Technology', 'Health Care', ...], dtype=object)

tech = nyse.loc[nyse.Sector == 'Technology']

tech['Company Name'].head(2)

Stock Symbol Company Name

ORCL Oracle Corporation
TSM Taiwan Semiconductor Manufacturing

nyse.loc[nyse.Sector=='Technology', 'Market Capitalization'].idxmax()

'ORCL'

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Get data for largest tech company with 2017 IPO
ticker = nyse.loc[(nyse.Sector=='Technology') & (nyse['IPO Year']==2017),
'Market Capitalization'].idxmax()
data = DataReader(ticker, 'yahoo') # Start: 2010/1/1
data = data.loc[:, ['Close', 'Volume']]

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Visualize price and volume on two axes
import matplotlib.pyplot as plt
data.plot(title=ticker, secondary_y='Volume')
plt.tight_layout(); plt.show()

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Get several stocks &
manage a
MultiIndex
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Get data for several stocks
Use the listing information to select multiple stocks
E.g. largest 3 stocks per sector

Use Yahoo! Finance to retrieve data for several stocks

Learn how to manage a pandas MultiIndex , a powerful tool to deal with more complex
data sets

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Load prices for top 5 companies
nasdaq = pd.read_excel('listings.xlsx', sheet_name='nasdaq', na_values='n/a')
nasdaq.set_index('Stock Symbol', inplace=True)
top_5 = nasdaq['Market Capitalization'].nlargest(n=5) # Top 5
top_5.div(1000000) # Market Cap in million USD

AAPL 740024.467000
GOOG 569426.124504
... ...
Name: Market Capitalization, dtype: float64

tickers = top_5.index.tolist() # Convert index to list

['AAPL', 'GOOG', 'MSFT', 'AMZN', 'FB']

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Load prices for top 5 companies
df = DataReader(tickers, 'yahoo', start=date(2020, 1, 1))

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 712 entries, 2020-01-02 to 2022-10-27
Data columns (total 30 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 (Adj Close, AAPL) 712 non-null float64
1 (Adj Close, GOOG) 712 non-null float64
2 (Adj Close, MSFT) 712 non-null float64
...
28 (Volume, AMZN) 712 non-null float64
29 (Volume, FB) 253 non-null float64
dtypes: float64(30)
memory usage: 172.4 KB

df = df.stack()

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Load prices for top 5 companies
df.info()

MultiIndex: 3101 entries, (Timestamp('2020-01-02 00:00:00'), 'AAPL') to (Timestamp('

Data columns (total 6 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 Adj Close 3101 non-null float64
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Reshape your data: .unstack()
unstacked = df['Close'].unstack()
unstacked.info()

DatetimeIndex: 712 entries, 2020-01-02 to 2022-10-27

Data columns (total 5 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 AAPL 712 non-null float64
1 GOOG 712 non-null float64
2 MSFT 712 non-null float64
3 AMZN 712 non-null float64
4 FB 253 non-null float64
dtypes: float64(5)
memory usage: 33.4 KB

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

From long to wide format
unstacked = df['Close'].unstack() # Results in DataFrame

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Stock prices: Visualization
unstacked.plot(subplots=True)
plt.tight_layout(); plt.show()

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Summarize your
data with
descriptive stats
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Be on top of your data
Goal: Capture key quantitative characteristics
Important angles to look at:
Central tendency: Which values are "typical"?

Dispersion: Are there outliers?

Overall distribution of individual variables

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Central tendency
n
1
Mean (average): x̄ = ∑ xi
n
i=1
Median: 50% of values smaller/larger

Mode: most frequent value

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Central tendency
n
1
Mean (average): x̄ = ∑ xi
n
i=1
Median: 50% of values smaller/larger

Mode: most frequent value

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Central tendency
n
1
Mean (average): x̄ = ∑ xi
n
i=1
Median: 50% of values smaller/larger

Mode: most frequent value

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Calculate summary statistics
nasdaq = pd.read_excel('listings.xlsx', sheet_name='nasdaq', na_values='n/a')
market_cap = nasdaq['Market Capitalization'].div(10**6)

market_cap.mean()

3180.7126214953805

market_cap.median()

225.9684285

market_cap.mode()

0.0

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Calculate summary statistics

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Dispersion
Variance: Sum all of the squared differences from mean and divide by n − 1
n
1
var = ∑(xi − x̄)2
n−1
i=1
Standard deviation: Square root of variance
sd = √var

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Calculate variance and standard deviation
variance = market_cap.var()
print(variance)

648773812.8182

np.sqrt(variance)

25471.0387

market_cap.std()

25471.0387

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Describe the
distribution of your
data with quantiles
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Describe data distributions
First glance: Central tendency and standard deviation
How to get a more granular view of the distribution?

Calculate and plot quantiles

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

More on dispersion: quantiles
Quantiles: Groups with equal share of observations
Quartiles: 4 groups, 25% of data each

Deciles: 10 groups, 10% of data each

Interquartile range: 3rd quartile - 1st quartile

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Quantiles with pandas
market_cap = nasdaq['Market Capitalization'].div(10**6)
median = market_cap.quantile(.5)
median == market_cap.median()

True

quantiles = market_cap.quantile([.25, .75])

0.25 43.375930
0.75 969.905207

quantiles[.75] - quantiles[.25] # Interquartile Range

926.5292771575

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Quantiles with pandas & numpy
deciles = np.arange(start=.1, stop=.91, step=.1)
deciles

array([ 0.1, 0.2, 0.3, 0.4, ..., 0.7, 0.8, 0.9])

market_cap.quantile(deciles)

0.1 4.884565
0.2 26.993382
0.3 65.714547
0.4 124.320644
0.5 225.968428
0.6 402.469678
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Visualize quantiles with bar chart
title = 'NASDAQ Market Capitalization (million USD)'
market_cap.quantile(deciles).plot(kind='bar', title=title)
plt.tight_layout(); plt.show()

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

All statistics in one go
market_cap.describe()

count 3167.000000
mean 3180.712621
std 25471.038707
min 0.000000
25% 43.375930 # 1st quantile
50% 225.968428 # Median
75% 969.905207 # 3rd quantile
max 740024.467000
Name: Market Capitalization

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

All statistics in one go
market_cap.describe(percentiles=np.arange(.1, .91, .1))

count 3167.000000
mean 3180.712621
std 25471.038707
min 0.000000
10% 4.884565
20% 26.993382
30% 65.714547
40% 124.320644
50% 225.968428
60% 402.469678
70% 723.163197
80% 1441.071134
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Visualize the
distribution of your
data
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Always look at your data!
Identical metrics can represent very different data

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Introducing seaborn plots
Many attractive and insightful statistical plots
Based on matplotlib

Swiss Army knife: seaborn.distplot()

Histogram

Kernel Density Estimation (KDE)

Rugplot

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

10 year treasury: trend and distribution
ty10 = web.DataReader('DGS10', 'fred', date(1962, 1, 1))
ty10.info()

DatetimeIndex: 15754 entries, 1962-01-02 to 2022-05-20

Data columns (total 1 columns):
# Column Non-Null Count Dtype
-- ------ -------------- -----
0 DGS10 15083 non-null float64

ty10.describe()

DGS10
mean 6.291073
std 2.851161
min 1.370000
25% 4.190000
50% 6.040000
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

10 year treasury: time series trend
ty10.dropna(inplace=True) # Avoid creation of copy
ty10.plot(title='10-year Treasury'); plt.tight_layout()

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

10 year treasury: historical distribution
import seaborn as sns
sns.distplot(ty10)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

10 year treasury: trend and distribution
ax = sns.distplot(ty10)
ax.axvline(ty10['DGS10'].median(), color='black', ls='--')

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Summarize
categorical
variables
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
From categorical to quantitative variables
So far, we have analyzed quantitative variables
Categorical variables require a different approach

Concepts like average don't make much sense

Instead, we'll rely on their frequency distribution

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Categorical listing information
amex = pd.read_excel('listings.xlsx', sheet_name='amex',
na_values=['n/a'])
amex.info()

RangeIndex: 360 entries, 0 to 359

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Categorical listing information
amex = amex['Sector'].nunique()

apply() : call function on each column

lambda : "anonymous function", receives each column as argument x

amex.Sector.apply(lambda x: x.nunique())

Stock Symbol 360

Company Name 326
Last Sale 323
Market Capitalization 317
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

How many observations per sector?
amex['Sector'].value_counts()

Health Care 49 # Mode

Basic Industries 44
Energy 28
Consumer Services 27
Capital Goods 24
Technology 20
Consumer Non-Durables 13
Finance 12
Public Utilities 11
Miscellaneous 5
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

How many IPOs per year?
amex['IPO Year'].value_counts()

2002.0 19 # Mode
2015.0 11
1999.0 9
1993.0 7
2014.0 6
2013.0 5
2017.0 5
...
2009.0 1
1990.0 1
1991.0 1
Name: IPO Year, dtype: int64

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Convert IPO Year to int
ipo_by_yr = amex['IPO Year'].dropna().astype(int).value_counts()
ipo_by_yr

2002 19
2015 11
1999 9
1993 7
2014 6
2004 5
2003 5
2017 5
...
1987 1
Name: IPO Year, dtype: int64

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Convert IPO Year to int
ipo_by_yr.plot(kind='bar', title='IPOs per Year')
plt.xticks(rotation=45)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Aggregate your
data by category
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Summarize numeric data by category
So far: Summarize individual variables
Compute descriptive statistic like mean, quantiles

Split data into groups, then summarize groups

Examples:
Largest company by exchange

Median market capitalization per IPO year

Average market capitalization per sector

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Group your data by sector
nasdaq.info()

RangeIndex: 3167 entries, 0 to 3166

Data columns (total 7 columns):
# Column Non-Null Count Dtype
-- --- -------------- -----
0 Stock Symbol 3167 non-null object
1 Company Name 3167 non-null object
2 Last Sale 3165 non-null float64
3 Market Capitalization 3167 non-null float64
4 IPO Year 1386 non-null float64
5 Sector 2767 non-null object
6 Industry 2767 non-null object
dtypes: float64(3), object(4)
memory usage: 173 3+ KB

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Group your data by sector
nasdaq['market_cap_m'] = nasdaq['Market Capitalization'].div(1e6)
nasdaq = nasdaq.drop('Market Capitalization', axis=1) # Drop column
nasdaq_by_sector = nasdaq.groupby('Sector') # Create groupby object
for sector, data in nasdaq_by_sector:
print(sector, data.market_cap_m.mean())

Basic Industries 724.899933858

Capital Goods 1511.23737278
Consumer Durables 839.802606627
Consumer Non-Durables 3104.05120552
...
Public Utilities 2357.86531507
Technology 10883.4342135
Transportation 2869.66000673

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Keep it simple and skip the loop
mcap_by_sector = nasdaq_by_sector.market_cap_m.mean()
mcap_by_sector

Sector
Basic Industries 724.899934
Capital Goods 1511.237373
Consumer Durables 839.802607
Consumer Non-Durables 3104.051206
Consumer Services 5582.344175
Energy 826.607608
Finance 1044.090205
Health Care 1758.709197
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Visualize category summaries
title = 'NASDAQ = Avg. Market Cap by Sector'
mcap_by_sector.plot(kind='barh', title=title)
plt.xlabel('USD mn')

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Aggregate summary for all numeric columns
nasdaq_by_sector.mean()

Last Sale IPO Year market_cap_m

Sector
Basic Industries 21.597679 2000.766667 724.899934
Capital Goods 26.188681 2001.324675 1511.237373
Consumer Durables 24.363391 2003.222222 839.802607
Consumer Non-Durables 25.749565 2000.609756 3104.051206
Consumer Services 34.917318 2004.104575 5582.344175
Energy 15.496834 2008.034483 826.607608
Finance 29.644242 2010.321101 1044.090205
Health Care 19.462531 2009.240409 1758.709197
Miscellaneous 46.094369 2004.333333 3445.655935
Public Utilities 18.643705 2006.040000 2357.865315
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
More ways to
aggregate your
data
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Many ways to aggregate
Last segment: Group by one variable and aggregate

More detailed ways to summarize your data:

Group by two or more variables

Apply multiple aggregations

Examples
Median market cap by sector and IPO year

Mean & standard deviation of stock price by year

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Several aggregations by category
nasdaq['market_cap_m'] = nasdaq['Market Capitalization'].div(1e6)
by_sector = nasdaq.groupby('Sector')
by_sector.market_cap_m.agg(['size', 'mean']).sort_values('size')

Sector size mean

Transportation 52 2869.660007
Energy 66 826.607608
Public Utilities 66 2357.865315
Basic Industries 78 724.899934
...
Consumer Services 348 5582.344175
Technology 433 10883.434214
Finance 627 1044.090205
Health Care 645 1758.709197

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Several aggregations plus new labels
by_sector.market_cap_m.agg(['size', 'mean'])
.rename(columns={'size': '#Obs', 'mean': 'Average'})

Sector #Obs Average

Basic Industries 78 724.899934
Capital Goods 172 1511.237373
Consumer Durables 88 839.802607
Consumer Non-Durables 103 3104.051206
Consumer Services 348 5582.344175
...
Health Care 645 1758.709197
Miscellaneous 89 3445.655935
Public Utilities 66 2357.865315
Technology 433 10883.434214
Transportation 52 2869.660007

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Different statistics by column
by_sector.agg({'market_cap_m': 'size', 'IPO Year':'median'})

Sector market_cap_m IPO Year

Basic Industries 78 1972.0
Capital Goods 172 1972.0
Consumer Durables 88 1983.0
Consumer Non-Durables 103 1972.0
Consumer Services 348 1981.0
...
Health Care 645 1981.0
Miscellaneous 89 1987.0
Public Utilities 66 1981.0
Technology 433 1972.0
Transportation 52 1986.0

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Aggregate by two categories
by_sector_year = nasdaq.groupby(['Sector', 'IPO Year'])
by_sector_year.market_cap_m.mean()

Sector IPO Year

Basic Industries 1972.0 877.240005
1973.0 1445.697371
1986.0 1396.817381
...
Transportation 1986.0 1176.179710
1991.0 6646.778622
1992.0 56.074572
...
2009.0 552.445919
2011.0 3711.638317
2013.0 125.740421

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Select from MultiIndex
mcap_sector_year = by_sector_year.market_cap_m.mean()
mcap_sect_year.loc['Basic Industries']

IPO Year
1972.0 877.240005
1973.0 1445.697371
1986.0 1396.817381
1988.0 24.847526
...
2012.0 381.796074
2013.0 22.661533
2015.0 260.075564
2016.0 81.288336
Name: market_cap_m, dtype: float64

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Select from MultiIndex
mcap_sect_year.loc[['Basic Industries', 'Transportation']]

Sector IPO Year

Basic Industries 1972.0 877.240005
1973.0 1445.697371
1986.0 1396.817381
...
Transportation 1986.0 1176.179710
1991.0 6646.778622
1992.0 56.074572
...

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Summary statistics
by category with
seaborn
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Categorical plots with seaborn
Specialized ways to plot combinations of categorical and numerical variables
Visualize estimates of summary statistics per category

Understand how categories impact numerical variables

Compare using key metrics of distributional characteristics

Example: Mean Market Cap per Sector or IPO Year with indication of dispersion

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

The basics: countplot
sns.countplot(x='Sector', data=nasdaq)
plt.xticks(rotation=45)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

countplot, sorted
sector_size = nasdaq.groupby('Sector').size()
order = sector_size.sort_values(ascending=False)
order.head()

Sector
Health Care 645
Finance 627
Technology 433
...

order = order.index.tolist()

['Health Care', 'Finance', ..., 'Energy', 'Transportation']

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

countplot, sorted
sns.countplot(x='Sector', data=nasdaq, order=order)
plt.xticks(rotation=45)
plt.title('# Observations per Sector’)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

countplot, multiple categories
recent_ipos = nasdaq[nasdaq['IPO Year'] > 2014]
recent_ipos['IPO Year'] = recent_ipos['IPO Year'].astype(int)
sns.countplot(x='Sector', hue='IPO Year', data=recent_ipos)

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Compare stats with PointPlot
nasdaq['IPO'] = nasdaq['IPO Year'].apply(lambda x: 'After 2000' if x > 2000 else 'Before 2000')
sns.pointplot(x='Sector', y='market_cap_m', hue='IPO', data=nasdaq)
plt.xticks(rotation=45); plt.title('Mean Market Cap')

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Distributions by
category with
seaborn
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
Distributions by category
Last segment: Summary statistics
Number of observations, mean per category

Now: Visualize distribution of a variable by levels of a categorical variable to facilitate

comparison

Example: Distribution of Market Cap by Sector or IPO Year

More detail than summary stats

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Clean data: removing outliers
nasdaq = pd.read_excel('listings.xlsx', sheet_name='nasdaq',
na_values='n/a')
nasdaq['market_cap_m'] = nasdaq['Market Capitalization'].div(1e6)
nasdaq = nasdaq[nasdaq.market_cap_m > 0] # Active companies only
outliers = nasdaq.market_cap_m.quantile(.9) # Outlier threshold
nasdaq = nasdaq[nasdaq.market_cap_m < outliers] # Remove outliers

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Boxplot: quartiles and outliers
import seaborn as sns
sns.boxplot(x='Sector', y='market_cap_m', data=nasdaq)
plt.xticks(rotation=75);

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

A variation: SwarmPlot
sns.swarmplot(x='Sector', y='market_cap_m', data=nasdaq)
plt.xticks(rotation=75)
plt.show()

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Let's practice!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Congratulations!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N

Stefan Jansen
Instructor
What you learned
Import data from Excel and online sources

Combine datasets

Summarize and aggregate data

IMPORTING AND MANAGING FINANCIAL DATA IN PYTHON

Keep learning!
I M P O R T I N G A N D M A N A G I N G F I N A N C I A L D ATA I N P Y T H O N
Welcome to Portfolio
Analysis!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Hi! My name is Charlotte

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

What is a portfolio

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Why do we need portfolio analysis

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Portfolio versus fund versus index
Portfolio: a collection of investments (stocks, bonds, commodities, other funds) o en owned
by an individual

Fund: a pool of investments that is managed by a professional fund manager. Individual

investors buy "units" of the fund and the manager invests the money

Index: A smaller sample of the market that is representative of the whole, e.g. S&P500,
Nasdaq, Russell 2000, MSCI World Index

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Active versus passive investing
Passive investing: following a benchmark as
closely as possible

Active investing: taking active "bets" that

are di erent from a benchmark

Long only strategies: small deviations from

a benchmark

Hedgefunds: no benchmark but 'total return

strategies'

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Diversification
1. Single stock investments expose you to: a
sudden change in management,
disappointing nancial performance, weak
economy, an industry slump, etc

2. Good diversi cation means combining

stocks that are di erent: risk, cyclical,
counter-cyclical, industry, country

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Typical portfolio strategies
Equal weighted portfolios

Market-cap weighted portfolios

Risk-return optimized portfolios

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Portfolio returns
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
What are portfolio weights?
Weight is the percentage composition of a particular asset in a portfolio

All weights together have to sum up to 100%

Weights and diversi cation (few large investments versus many small investments)

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculating portfolio weights

Calculate by dividing the value of a security by total value of the portfolio

Equal weighted portfolio, or market cap weighted portfolio

Weights determine your investment strategy, and can be set to optimize risk and expected
return

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Portfolio returns
Changes in value over time
Vt −Vt−1
Returnt = Vt−1

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Portfolio returns

Vt −Vt−1
Returnt = Vt−1
Historic average returns o en used to calculate expected return

Warning for confusion: average return, cumulative return, active return, and annualized
return

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculating returns from pricing data
df.head(2)
AAPL AMZN TSLA
date
2018-03-25 13.88 114.74 92.48
2018-03-26 13.35 109.95 89.79

# Calculate returns over each day

returns = df.pct_change()

returns.head(2)
AAPL AMZN TSLA
date
2018-03-25 NaN NaN NaN
2018-03-26 -0.013772 0.030838 0.075705

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculating returns from pricing data
weights = np.array([0, 0.50, 0.25])

# Calculate average return for each stock

meanDailyReturns = returns.mean()

# Calculate portfolio return

portReturn = np.sum(meanDailyReturns*weights)
print (portReturn)

0.05752375881537723

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculating cumulative returns
# Calculate daily portfolio returns
returns['Portfolio']= returns.dot(weights)

# Let's see what it looks like

returns.head(3)

AAPL AMZN TSLA Portfolio

date
2018-03-23 -0.020974 -0.026739 -0.029068 -0.025880
2018-03-26 -0.013772 0.030838 0.075705 0.030902

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculating cumulative returns
# Compound the percentage returns over time
daily_cum_ret=(1+returns).cumprod()

# Plot your cumulative return

daily_cum_ret.Portfolio.plot()

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Cumulative return plot

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Measuring risk of a
portfolio
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Risk of a portfolio
Investing is risky: individual assets will go up or down

Expected return is a random variable

Returns spread around the mean is measured by the variance σ 2 and is a common measure
of volatility
N
2
∑ (X−μ)
σ2 = i=1
N

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Variance

Variance of an individual asset varies: some

have more or less spread around the mean

Variance of the portfolio is not simply the

weighted variances of the underlying assets

Because returns of assets are correlated, it

becomes complex

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

How do variance and correlation relate to portfolio
risk?

The correlation between asset 1 and 2 is denoted by ρ1,2 , and tells us to which extend assets
move together

The portfolio variance takes into account the individual assets' variances (σ12 , σ22 , etc), the
weights of the assets in the portfolio (w1 , w2 ), as well as their correlation to each other

The standard deviation (σ ) is equal to the square root of variance (σ 2 ), both are a measure
of volatility

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculating portfolio variance

ρ1,2 σ1 σ2 is called the covariance between asset 1 and 2

The covariance can also be wri en as σ1,2
This let's us write:

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Re-writing the portfolio variance shorter

This can be re-wri en in matrix notation, which you can use more easily in code:

In words, what we need to calculate in python is: Portfolio variance = Weights transposed x
(Covariance matrix x Weights)

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Portfolio variance in python
price_data.head(2)

ticker AAPL FB GE GM WMT

date
2018-03-21 171.270 169.39 13.88 37.58 88.18
2018-03-22 168.845 164.89 13.35 36.35 87.14

# Calculate daily returns from prices

daily_returns = df.pct_change()

# Construct a covariance matrix for the daily returns data

cov_matrix_d = daily_returns.cov()

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Portfolio variance in python
# Construct a covariance matrix from the daily_returns
cov_matrix_d = (daily_returns.cov())*250
print (cov_matrix_d)

AAPL FB GE GM WMT
AAPL 0.053569 0.026822 0.013466 0.018119 0.010798
FB 0.026822 0.062351 0.015298 0.017250 0.008765
GE 0.013466 0.015298 0.045987 0.021315 0.009513
GM 0.018119 0.017250 0.021315 0.058651 0.011894
WMT 0.010798 0.008765 0.009513 0.011894 0.041520

weights = np.array([0.2, 0.2, 0.2, 0.2, 0.2])

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Portfolio variance in python
# Calculate the variance with the formula
port_variance = np.dot(weights.T, np.dot(cov_matrix_a, weights))
print (port_variance)

0.022742232726360567

# Just converting the variance float into a percentage

print(str(np.round(port_variance, 3) * 100) + '%')

2.3%

port_stddev = np.sqrt(np.dot(weights.T, np.dot(cov_matrix_a, weights)))

print(str(np.round(port_stddev, 3) * 100) + '%')
15.1%

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Annualized returns
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Comparing returns
1. Annual Return: Total return earned over a period of one calendar year

2. Annualized return: Yearly rate of return inferred from any time period

3. Average Return: Total return realized over a longer period, spread out evenly over the
(shorter) periods.

4. Cumulative (compounding) return: A return that includes the compounded results of re-
investing interest, dividends, and capital gains.

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Why annualize returns?

Average return = (100 - 50) / 2 = 25%

Actual return = 0% so average return is not

a good measure for performance!

How to compare portfolios with di erent

time lengths?

How to account for compounding e ects

over time?

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculating annualized returns

N in years: rate = (1 + Return)1/N − 1

N in months: rate = (1 + Return)12/N − 1
Convert any time length to an annual rate:

Return is the total return you want to annualize.

N is number of periods so far.

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Annualized returns in python
# Check the start and end of timeseries
apple_price.head(1)

date
2015-01-06 105.05
Name: AAPL, dtype: float64

apple_price.tail(1)

date
2018-03-29 99.75
Name: AAPL, dtype: float64

# Assign the number of months

months = 38

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Annualized returns in python
# Calculate the total return
total_return = (apple_price[-1] - apple_price[0]) /
apple_price[0]

print (total_return)

0.5397420653068692

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Annualized returns in python
# Calculate the annualized returns over months
annualized_return=((1 + total_return)**(12/months))-1
print (annualized_return)

0.14602501482708763

# Select three year period

apple_price = apple_price.loc['2015-01-01':'2017-12-31']
apple_price.tail(3)

date
2017-12-27 170.60
2017-12-28 171.08
2017-12-29 169.23
Name: AAPL, dtype: float64

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Annualized return in Python
# Calculate annualized return over 3 years
annualized_return = ((1 + total_return)**(1/3))-1

print (annualized_return)

0.1567672968419047

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Risk adjusted returns
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Choose a portfolio

Portfolio 1 Portfolio 2

Annual return of 14% Annual return of 6%

Volatility (standard deviation) is 8% Volatility is 3%

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Risk adjusted return

It de nes an investment's return by measuring how much risk is involved in producing that
return

It's usually a ratio

Allows you to objectively compare across di erent investment options

Tells you whether the return justi es the underlying risk

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Sharpe ratio
Sharpe ratio is the most commonly used risk adjusted return ratio

It's calculated as follows:

Rp −Rf
Sharpe ratio = σp

Where: Rp is the portfolio return, Rf is the risk free rate and σp is the portfolio standard
deviation

Remember the formula for the portfolio σp ?

σp = √(W eights transposed(Covariance matrix ∗ W eights) )

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Annualizing volatility

Annualized standard deviation is calculated as follows: σa = σm ∗ √T

σm is the measured standard deviation
σa is the annualized standard deviation
T is the number of data points per year

Alternatively, when using variance instead of standard deviation; σa2 = σm

2
∗T

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculating the Sharpe Ratio
# Calculate the annualized standard deviation
annualized_vol = apple_returns.std()*np.sqrt(250)
print (annualized_vol)

0.2286248397870068

# Define the risk free rate

risk_free = 0.01

# Calcuate the sharpe ratio

sharpe_ratio = (annualized_return - risk_free) / annualized_vol
print (sharpe_ratio)

0.6419569149994251

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Which portfolio did you choose?

Portfolio 1 Portfolio 2

Annual return of 14% Annual return of 6%

Volatility (standard deviation) is 8% Volatility is 3%

Sharpe ratio of 1.75 Sharpe ratio of 2

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Non-normal
distribution of
returns
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
In a perfect world returns are distributed normally

1 Source: Distribution of monthly returns from the S&P500 from evestment.com

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

But using mean and standard deviations can be
deceiving

1 Source: “An Introduction to Omega, Con Keating and William Shadwick, The Finance Development Center, 2002

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Skewness: leaning towards the negative

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Pearson’s Coefficient of Skewness
3(mean−median)
Skewness = σ

Rule of thumb:

Skewness < −1 or Skewness > 1 ⇒ Highly skewed distribution

−1 < Skewness < −0.5 or 0.5 < Skewness < 1 ⇒ Moderately skewed distribution
−0.5 < Skewness < 0.5 ⇒ Approximately symmetric distribution

1 Source: h ps://brownmath.com/stat/shape.htm

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Kurtosis: Fat tailed distribution

1 Source: Pimco

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Interpreting kurtosis
“Higher kurtosis means more of the variance is the result of infrequent extreme deviations, as
opposed to frequent modestly sized deviations.”

A normal distribution has kurtosis of exactly 3 and is called (mesokurtic)

A distribution with kurtosis <3 is called platykurtic. Tails are shorter and thinner, and central
peak is lower and broader.

A distribution with kurtosis >3 is called leptokurtic: Tails are longer and fa er, and central
peak is higher and sharper (fat tailed)

1 Source: h ps://brownmath.com/stat/shape.htm

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculating skewness and kurtosis
apple_returns=apple_price.pct_change()
apple_returns.head(3)

date
2015-01-02 NaN
2015-01-05 -0.028172
2015-01-06 0.000094
Name: AAPL, dtype: float64

apple_returns.hist()

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON
Calculating skewness and kurtosis
print("mean : ", apple_returns.mean())
print("vol : ", apple_returns.std())
print("skew : ", apple_returns.skew())
print("kurt : ", apple_returns.kurtosis())

mean : 0.0006855391415724799
vol : 0.014459504468360529
skew : -0.012440851735057878
kurt : 3.197244607586669

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Alternative
measures of risk
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Looking at downside risk

A good risk measure should focus on potential losses i.e. downside risk

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Sortino ratio

Similar to the Sharpe ratio, just with a

di erent standard deviation
Rp −Rf
Sortino Ratio = σd
σd is the standard deviation of the
downside.

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Sortino ratio in python
# Define risk free rate and target return of 0
rfr = 0
target_return = 0

# Calcualte the daily returns from price data

apple_returns=pd.DataFrame(apple_price.pct_change())

# Select the negative returns only

negative_returns = apple_returns.loc[apple_returns['AAPL'] < target]

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

# Calculate expected return and std dev of downside returns
expected_return = apple_returns['AAPL'].mean()
down_stdev = negative_returns.std()

# Calculate the sortino ratio

sortino_ratio = (expected_return - rfr)/down_stdev
print(sortino_ratio)

0.07887683763760528

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Maximum draw-down
The largest percentage loss from a market peak to trough

Dependent on the chosen time window

The recovery time: time it takes to get back to break-even

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Maximum daily draw-down in Python
# Calculate the maximum value of returns using rolling().max()
roll_max = apple_price.rolling(min_periods=1,window=250).max()
# Calculate daily draw-down from rolling max
daily_drawdown = apple_price/roll_max - 1.0
# Calculate maximum daily draw-down
max_daily_drawdown = daily_drawdown.rolling(min_periods=1,window=250).min()
# Plot the results
daily_drawdown.plot()
max_daily_drawdown.plot()
plt.show()

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Maximum draw-down of Apple

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Comparing against
a benchmark
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Active investing against a benchmark

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Active return for an actively managed portfolio

Active return is the performance of an (active) investment, relative to the investment's

benchmark.

Calculated as the di erence between the benchmark and the actual return.

Active return is achieved by "active" investing, i.e. taking overweight and underweight
positions from the benchmark.

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Tracking error for an index tracker

Passive investment funds, or index trackers, don't use active return as a measure for
performance.

Tracking error is the name used for the di erence in portfolio and benchmark for a passive
investment fund.

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Active weights

1 Source: Schwab Center for Financial Research.

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Active return in Python
# Inspect the data
portfolio_data.head()

mean_ret var pf_w bm_w GICS Sector

Ticker
A 0.146 0.035 0.002 0.005 Health Care
AAL 0.444 0.094 0.214 0.189 Industrials
AAP 0.242 0.029 0.000 0.000 Consumer Discretionary
AAPL 0.225 0.027 0.324 0.459 Information Technology
ABBV 0.182 0.029 0.026 0.010 Health Care

1 Global Industry Classi cation System (GICS)

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Active return in Python
# Calculate mean portfolio return
total_return_pf = (pf_w*mean_ret).sum()

# Calculate mean benchmark return

total_return_bm = (bm_w*mean_ret).sum()

# Calculate active return

active_return = total_return_pf - total_return_bm
print ("Simple active return: ", active_return)

Simple active return: 6.5764

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Active weights in Python
# Group dataframe by GICS sectors
grouped_df=portfolio_data.groupby('GICS Sector').sum()

# Calculate active weights of portfolio

grouped_df['active_weight']=grouped_df['pf_weights']-
grouped_df['bm_weights']

print (grouped_df['active_weight'])

GICS Sector
Consumer Discretionary 20.257
Financials -2.116
...etc

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Risk factors
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
What is a factor?
Factors in portfolios are like nutrients in food

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Factors in portfolios
Di erent types of factors:

Macro factors: interest rates, currency, country, industry

Style factors: momentum, volatility, value and quality

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Using factor models to determine risk exposure

1 Source: h ps://invesco.eu/investment-campus/educational-papers/factor-investing

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Factor exposures
df.head()

date portfolio volatility quality

2015-01-05 -1.827811 1.02 -1.76
2015-01-06 -0.889347 0.41 -0.82
2015-01-07 1.162984 1.07 1.39
2015-01-08 1.788828 0.31 1.93
2015-01-09 -0.840381 0.28 -0.77

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Factor exposures
df.corr()

portfolio volatility quality

portfolio 1.000000 0.056596 0.983416
volatility 0.056596 1.000000 0.092852
quality 0.983416 0.092852 1.000000

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Correlations change over time
# Rolling correlation
df['corr']=df['portfolio'].rolling(30).corr(df['quality'])

# Plot results
df['corr'].plot()

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Rolling correlation with quality

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Factor models
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Using factors to explain performance

Factors are used for risk management.

Factors are used to help explain performance.

Factor models help you relate factors to portfolio returns

Empirical factor models exist that have been tested on historic data.

Fama French 3 factor model is a well-known factor model.

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Fama French Multi Factor model

Rpf = α + βm M KT + βs SM B + βh HM L
MKT is the excess return of the market, i.e. Rm − Rf
SMB (Small Minus Big) a size factor

HML (High Minus Low) a value factor

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Regression model refresher

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Difference between beta and correlation

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Regression model in Python
import statsmodels.api as sm

# Define the model

model = sm.OLS(factor_data['sp500'],
factor_data[['momentum','value']]).fit()

# Get the model predictions

predictions = model.predict(factor_data[['momentum','value']])

b1, b2 = model.params

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

The regression summary output
# Print out the summary statistics
model.summary()

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Obtaining betas quickly
# Get just beta coefficients from linear regression model
b1, b2 = regression.linear_model.OLS(df['returns'],
df[['F1', 'F2']]).fit().params

# Print the coefficients

print 'Sensitivities of active returns to factors:
\nF1: %f\nF2: %f' % (b1, b2)

Sensitivities of active returns to factors:

F1: -0.0381
F2: 0.9858

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Portfolio analysis
tools
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Professional portfolio analysis tools

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Back-testing your strategy
Back-testing: run your strategy on historic data and see how it would have performed

Strategy works on historic data: not guaranteed to work well on future data -> changes in
markets

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Quantopian's pyfolio tool

1 Github: h ps://github.com/quantopian/pyfolio

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Performance and risk analysis in Pyfolio
# Install the package
!pip install pyfolio
# Import the package
import pyfolio as pf

# Read the data as a pandas series

returns=pd.Series(pd.read_csv('pf_returns.csv')
returns.index=pd.to_datetime(returns.index)

# Create a tear sheet on returns

pf.create_returns_tear_sheet(returns)

# If you have backtest and live data

pf.create_returns_tear_sheet(returns, live_start_date='2018-03-01')

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Pyfolio's tear sheet

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Holdings and exposures in Pyfolio
# define our sector mappings
sect_map = {'COST': 'Consumer Goods',
'INTC': 'Technology',
'CERN': 'Healthcare',
'GPS': 'Technology',
'MMM': 'Construction',
'DELL': 'Technology',
'AMD': 'Technology'}

pf.create_position_tear_sheet(returns, positions,
sector_mappings=sect_map)

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Exposure tear sheet results

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Modern portfolio
theory
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Creating optimal portfolios

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

What is Portfolio Optimization?
Meet Harry Markowitz

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

The optimization problem: finding optimal weights
In words:

Minimize the portfolio variance, subject to:

The expected mean return is at least some

target return

The weights sum up to 100%

At least some weights are positive

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Varying target returns leads to the Efficient Frontier

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

PyPortfolioOpt for portfolio optimization
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt import risk_models
from pypfopt import expected_returns

df=pd.read_csv('portfolio.csv')
df.head(2)
XOM RRC BBY MA PFE
date
2010-01-04 54.068794 51.300568 32.524055 22.062426 13.940202
2010-01-05 54.279907 51.993038 33.349487 21.997149 13.741367

# Calculate expected annualized returns and sample covariance

mu = expected_returns.mean_historical_return(df)
Sigma = risk_models.sample_cov(df)

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Get the Efficient Frontier and portfolio weights
# Calculate expected annualized returns and risk
mu = expected_returns.mean_historical_return(df)
Sigma = risk_models.sample_cov(df)

# Obtain the EfficientFrontier

ef = EfficientFrontier(mu, Sigma)

# Select a chosen optimal portfolio

ef.max_sharpe()

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Different optimizations
# Select the maximum Sharpe portfolio
ef.max_sharpe()

# Select an optimal return for a target risk

ef.efficient_risk(2.3)

# Select a minimal risk for a target return

ef.efficient_return(1.5)

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Calculate portfolio risk and performance
# Obtain the performance numbers
ef.portfolio_performance(verbose=True, risk_free_rate = 0.01)

Expected annual return: 21.3%

Annual volatility: 19.5%
Sharpe Ratio: 0.98

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's optimize a
portfolio!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Maximum Sharpe
vs. minimum
volatility
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Remember the Efficient Frontier?

E cient frontier: all portfolios with an

optimal risk and return trade-o

Maximum Sharpe portfolio: the highest

Sharpe ratio on the EF

Minimum volatility portfolio: the lowest

level of risk on the EF

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Adjusting PyPortfolioOpt optimization

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Maximum Sharpe portfolio
Maximum Sharpe portfolio: the highest Sharpe ratio on the EF

from pypfopt.efficient_frontier import EfficientFrontier

# Calculate the Efficient Frontier with mu and S

ef = EfficientFrontier(mu, Sigma)
raw_weights = ef.max_sharpe()

# Get interpretable weights

cleaned_weights = ef.clean_weights()

{'GOOG': 0.01269,'AAPL': 0.09202,'FB': 0.19856,

'BABA': 0.09642,'AMZN': 0.07158,'GE': 0.02456,...}

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Maximum Sharpe portfolio
# Get performance numbers
ef.portfolio_performance(verbose=True)

Expected annual return: 33.0%

Annual volatility: 21.7%
Sharpe Ratio: 1.43

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Minimum Volatility Portfolio
Minimum volatility portfolio: the lowest level of risk on the EF

# Calculate the Efficient Frontier with mu and S

ef = EfficientFrontier(mu, Sigma)

raw_weights = ef.min_volatility()

# Get interpretable weights and performance numbers

cleaned_weights = ef.clean_weights()

{'GOOG': 0.05664, 'AAPL': 0.087, 'FB': 0.1591,

'BABA': 0.09784, 'AMZN': 0.06986, 'GE': 0.0123,...}

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Minimum Volatility Portfolio
ef.portfolio_performance(verbose=True)

Expected annual return: 17.4%

Annual volatility: 13.2%
Sharpe Ratio: 1.28

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's have another look at the Efficient Frontier

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Maximum Sharpe versus Minimum Volatility

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Alternative portfolio
optimization
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Expected risk and return based on historic data

Mean historic returns, or the historic

portfolio variance are not perfect estimates
of mu and Sigma

Weights from portfolio optimization

therefore not guaranteed to work well on
future data

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Historic data

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Exponentially weighted returns

Need be er measures for risk and return

Exponentially weighted risk and return

assigns more importance to the most
recent data

Exponential moving average in the graph:

most weight on t-1 observation

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Exponentially weighted covariance

The exponential covariance matrix: gives

more weight to recent data

In the graph: exponential weighted

volatility in black, follows real volatility
be er than standard volatility in blue

1 Source: h ps://systematicinvestor.github.io/Exponentially-Weighted-Volatility-RCPP

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Exponentially weighted returns
from pypfopt import expected_returns

# Exponentially weighted moving average

mu_ema = expected_returns.ema_historical_return(df,
span=252, frequency=252)
print(mu_ema)

symbol
XOM 0.103030
BBY 0.394629
PFE 0.186058

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Exponentially weighted covariance
from pypfopt import risk_models

# Exponentially weighted covariance

Sigma_ew = risk_models.exp_cov(df, span=180, frequency=252)

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Using downside risk in the optimization

Remember the Sortino ratio: it uses the variance of negative returns only

PyPortfolioOpt allows you to use semicovariance in the optimization, this is a measure for
downside risk:

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Semicovariance in PyPortfolioOpt
Sigma_semi = risk_models.semicovariance(df,
benchmark=0, frequency=252)

print(Sigma_semi)

XOM BBY MA PFE

XOM 0.018939 0.008505 0.006568 0.004058
BBY 0.008505 0.016797 0.009133 0.004404
MA 0.006568 0.009133 0.018711 0.005373
PFE 0.004058 0.004404 0.005373 0.008349

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Let's practice!
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N
Recap
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

Charlo e Werger
Data Scientist
Chapter 1: Calculating risk and return

A portfolio as a collection of weight and assets

Diversi cation

Mean returns versus cumulative returns

Variance, standard deviation, correlations and the covariance matrix

Calculating portfolio variance

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Chapter 2: Diving deep into risk measures

Annualizing returns and risk to compare over di erent periods

Sharpe ratio as a measured of risk adjusted returns

Skewness and Kurtosis: looking beyond mean and variance of a distribution

Maximum draw-down, downside risk and the Sortino ratio

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Chapter 3: Breaking down performance

Compare to benchmark with active weights and active returns

Investment factors: explain returns and sources of risk

Fama French 3 factor model to breakdown performance into explainable factors and alpha

Pyfolio as a portfolio analysis tool

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Chapter 4: Finding the optimal portfolio

Markowitz' portfolio optimization: e cient frontier, maximum Sharpe and minimum volatility
portfolios

Exponentially weighted risk and return, semicovariance

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

Continued learning

Datacamp course on Portfolio Risk Management in Python

Quantopian's lecture series: h ps://www.quantopian.com/lectures

Learning by doing: Pyfolio and PyPortfolioOpt

INTRODUCTION TO PORTFOLIO ANALYSIS IN PYTHON

End of this course
I N T R O D U C T I O N T O P O R T F O L I O A N A LY S I S I N P Y T H O N

AI and Machine Learning in Action Real World Solutions For Coders
No ratings yet
AI and Machine Learning in Action Real World Solutions For Coders
175 pages
Book AI Driven Software Development 13 August
No ratings yet
Book AI Driven Software Development 13 August
219 pages
M1 - Introducing Google Cloud v5.2 - ILT
No ratings yet
M1 - Introducing Google Cloud v5.2 - ILT
69 pages
AI Fundamentals
91% (11)
AI Fundamentals
881 pages
Ceswe Online Review Primer 8-11-21
0% (1)
Ceswe Online Review Primer 8-11-21
9 pages
A Survey On Vision Transformer
No ratings yet
A Survey On Vision Transformer
24 pages
Sample Outline Azure Machine Learning Engineering
No ratings yet
Sample Outline Azure Machine Learning Engineering
17 pages
Machine Learning Cheat Sheet ??? - ?
No ratings yet
Machine Learning Cheat Sheet ??? - ?
231 pages
Scikit Learn Docs PDF
No ratings yet
Scikit Learn Docs PDF
2,387 pages
Go Programming by Example - Go Programming by Example
No ratings yet
Go Programming by Example - Go Programming by Example
165 pages
Pytorch Lightning Manual Readthedocs Io English May2020
No ratings yet
Pytorch Lightning Manual Readthedocs Io English May2020
562 pages
HBase Succinctly PDF
100% (1)
HBase Succinctly PDF
85 pages
Hive and Impala
No ratings yet
Hive and Impala
46 pages
SENG419-python 98745
No ratings yet
SENG419-python 98745
103 pages
Spring Cloud Dataflow Reference
No ratings yet
Spring Cloud Dataflow Reference
130 pages
Duckdb Docs
No ratings yet
Duckdb Docs
721 pages
Lesson 1 - Course - Introduction
No ratings yet
Lesson 1 - Course - Introduction
9 pages
2 Machine Learning Overview
No ratings yet
2 Machine Learning Overview
112 pages
Recommend Courses, Books, Projects, Certification...
No ratings yet
Recommend Courses, Books, Projects, Certification...
2 pages
Flask Restplus
No ratings yet
Flask Restplus
86 pages
Aws Lambda Python Example Github
No ratings yet
Aws Lambda Python Example Github
427 pages
Dynamodb DG
No ratings yet
Dynamodb DG
705 pages
Py Projects
No ratings yet
Py Projects
84 pages
Flask With Aws Cloudwatch
No ratings yet
Flask With Aws Cloudwatch
6 pages
Javanotes9 Linked
No ratings yet
Javanotes9 Linked
773 pages
Introduction To Cloud Infrastructure Technologies
No ratings yet
Introduction To Cloud Infrastructure Technologies
11 pages
Neo4j-Manual-2 0 1
No ratings yet
Neo4j-Manual-2 0 1
593 pages
Introduction To Python - Draft - Final 012024
No ratings yet
Introduction To Python - Draft - Final 012024
760 pages
Applied Coding Track
No ratings yet
Applied Coding Track
10 pages
DSL Pandas
No ratings yet
DSL Pandas
87 pages
Fake News Detection
No ratings yet
Fake News Detection
14 pages
A Quick Introduction To Tensorflow: Machine Learning Spring 2019
100% (1)
A Quick Introduction To Tensorflow: Machine Learning Spring 2019
22 pages
Deep Learning Fundamentals Materials
100% (1)
Deep Learning Fundamentals Materials
216 pages
Kubernetes Deployment Strategies
No ratings yet
Kubernetes Deployment Strategies
11 pages
Testing in Python - Unit Test & Script
No ratings yet
Testing in Python - Unit Test & Script
5 pages
Yogesh Miyani - SFDC Lightning Developer
No ratings yet
Yogesh Miyani - SFDC Lightning Developer
4 pages
Introduction To DevOps
No ratings yet
Introduction To DevOps
146 pages
Scikit Learn User Guide 0.12
100% (1)
Scikit Learn User Guide 0.12
1,049 pages
Python Programming
No ratings yet
Python Programming
10 pages
50 SQL To Python Series Problems
No ratings yet
50 SQL To Python Series Problems
165 pages
The Guide To Learning TypeScript For CPP Programmers
No ratings yet
The Guide To Learning TypeScript For CPP Programmers
417 pages
Api Reference Guide PDF
No ratings yet
Api Reference Guide PDF
440 pages
Mastering Python 100 Exercises With Solutions
No ratings yet
Mastering Python 100 Exercises With Solutions
32 pages
Test Driven Machine Learning - Sample Chapter
100% (1)
Test Driven Machine Learning - Sample Chapter
25 pages
Maths of Machine Learning
No ratings yet
Maths of Machine Learning
75 pages
Getting Started With Building Microservices
No ratings yet
Getting Started With Building Microservices
17 pages
2023 Updated Huawei H12-711 - V40-ENU Exam Dumps - PDF Room
No ratings yet
2023 Updated Huawei H12-711 - V40-ENU Exam Dumps - PDF Room
27 pages
Cassandra
100% (1)
Cassandra
31 pages
Getting Started With TensorFlow - Js - TensorFlow - Medium
No ratings yet
Getting Started With TensorFlow - Js - TensorFlow - Medium
6 pages
Data Science in Python - Regression
No ratings yet
Data Science in Python - Regression
234 pages
Azure Developer Learning Pathway 1122i
No ratings yet
Azure Developer Learning Pathway 1122i
2 pages
Yousef Time Series Analysis in Python 2020
100% (1)
Yousef Time Series Analysis in Python 2020
835 pages
Big Data
No ratings yet
Big Data
22 pages
Introduction To Bash Scripting Light
No ratings yet
Introduction To Bash Scripting Light
169 pages
Rojas-Time Series Analysis and Forecasting-Book16
100% (1)
Rojas-Time Series Analysis and Forecasting-Book16
384 pages
List Methods and Functions - Python
No ratings yet
List Methods and Functions - Python
23 pages
Introduction To Python For Finance: Adina Howe
No ratings yet
Introduction To Python For Finance: Adina Howe
23 pages
Python For Finance: Introduction and Basics of Python
No ratings yet
Python For Finance: Introduction and Basics of Python
45 pages
Python For Finance Abdelfattah Etudiatns
No ratings yet
Python For Finance Abdelfattah Etudiatns
44 pages
00 Info - Python For Finance Web Links
No ratings yet
00 Info - Python For Finance Web Links
6 pages
Investment Formulas: A Simple Introduction
From Everand
Investment Formulas: A Simple Introduction
K.H. Erickson
No ratings yet
Trading Strategies World
From Everand
Trading Strategies World
Tarannum Mujawar
No ratings yet
Supervised Learning With Scikit-Learn
No ratings yet
Supervised Learning With Scikit-Learn
178 pages
Introduction To TensorFlow in Python
100% (3)
Introduction To TensorFlow in Python
146 pages
Introduction and Intermediate Docker
100% (1)
Introduction and Intermediate Docker
255 pages
Introduction To Statistics in Python
100% (2)
Introduction To Statistics in Python
211 pages
Applied Finance in Python
100% (2)
Applied Finance in Python
545 pages
Security and Maintenance Quiz
No ratings yet
Security and Maintenance Quiz
26 pages
Kommander T3 Software
No ratings yet
Kommander T3 Software
4 pages
Digital Signature
No ratings yet
Digital Signature
11 pages
Flutter
No ratings yet
Flutter
1 page
Unit III Deep Learning Chapter Notes
No ratings yet
Unit III Deep Learning Chapter Notes
23 pages
20250225-Part 02-Review For Students-Questions
No ratings yet
20250225-Part 02-Review For Students-Questions
43 pages
9783319706870
100% (2)
9783319706870
383 pages
15 Kategori Daftar Software-Software Untuk GNU Linux
No ratings yet
15 Kategori Daftar Software-Software Untuk GNU Linux
4 pages
Ai Internship Report-9 PDF
No ratings yet
Ai Internship Report-9 PDF
33 pages
Memory Hicorder: High Speed Oscilloscope and Multi-Channel Logger - All in One Powerful Instrument
No ratings yet
Memory Hicorder: High Speed Oscilloscope and Multi-Channel Logger - All in One Powerful Instrument
14 pages
1682049501506-Sad Assigment 01
No ratings yet
1682049501506-Sad Assigment 01
5 pages
A320-GenFam - MasterNotes Iss 2 Rev.1 15 July 2019
100% (2)
A320-GenFam - MasterNotes Iss 2 Rev.1 15 July 2019
482 pages
Watson Studio (Santanu Sasmal)
No ratings yet
Watson Studio (Santanu Sasmal)
54 pages
Kernel Mode Vs User Mode A Comprehensive Overview (1) (Read Only)
No ratings yet
Kernel Mode Vs User Mode A Comprehensive Overview (1) (Read Only)
9 pages
Mod Menu Crash 2022 09 18-08 19 09
No ratings yet
Mod Menu Crash 2022 09 18-08 19 09
3 pages
Roll-Out Training For Lupong Tagapamayapa Incentive Awards Information System
No ratings yet
Roll-Out Training For Lupong Tagapamayapa Incentive Awards Information System
54 pages
M Tech Thesis Topics in Mechanical
100% (3)
M Tech Thesis Topics in Mechanical
7 pages
App Store Optimization Tools 2017 Guide
No ratings yet
App Store Optimization Tools 2017 Guide
19 pages
Rohit Choudhary Resume
No ratings yet
Rohit Choudhary Resume
1 page
I2 CInterview Questions
No ratings yet
I2 CInterview Questions
6 pages
Mobile Ad Hoc Networks A General Perspective: Srivas N. Chennu Vishwas N. 8 Semester CSE Rvce
No ratings yet
Mobile Ad Hoc Networks A General Perspective: Srivas N. Chennu Vishwas N. 8 Semester CSE Rvce
18 pages
Sample Paper 2 Mark Scheme
No ratings yet
Sample Paper 2 Mark Scheme
9 pages
My CV
No ratings yet
My CV
3 pages
Sendquick ConeXa
No ratings yet
Sendquick ConeXa
2 pages
MaxTrax 2017
No ratings yet
MaxTrax 2017
31 pages
IPL Decentralized Fantasy Sports Platform
No ratings yet
IPL Decentralized Fantasy Sports Platform
13 pages
BITH261 Network Programming
No ratings yet
BITH261 Network Programming
121 pages
2.2.5 Basic Properties of Eigenvalue Problems: 2.2 Linear Algebra and Eigenvalues Problems
No ratings yet
2.2.5 Basic Properties of Eigenvalue Problems: 2.2 Linear Algebra and Eigenvalues Problems
57 pages
Operating System
No ratings yet
Operating System
3 pages