Stock Market Data and Analysis in Python

This document provides a comprehensive guide on how to fetch and analyze stock market data using Python. It covers various sources for obtaining data, including Yahoo Finance, Quandl, and Alpha Vantage, along with examples of how to visualize and analyze the data. The article also discusses the advantages and disadvantages of each data source and methods for customizing data frequency.

Stock Market Data And Analysis In Python


Aug 06, 2019

By Ishan Shah

In this article, you will learn how to get stock market data such as price, volume, and fundamental data using Python packages, and how to analyze it.
When backtesting your strategies or analyzing their performance, one of the first hurdles is getting the right stock market data in the right format, isn't it? Don't worry.

After reading this, you will be able to:

Fetch the open, high, low, close, and volume data

Get data at a custom frequency such as 1 minute, 7 minutes or 2 hours

Perform analysis of your portfolio

Get the earnings data, balance sheet data, cash flow statements and various key ratios such as price to earnings (PE) and price to book value (PB)

Get the futures and options data for the Indian stock market

Web sources are generally quite unstable, so you will learn to get the stock market data from multiple web sources.

For easy navigation, this article is divided as below.

1. Price Volume Daily Data

2. Intraday Data

3. Fundamental Data

4. Futures and Options Data

5. Visualization and Analysis

Price Volume Daily Data

Yahoo Finance

One of the first sources from which you can get daily price-volume stock market data is Yahoo Finance. You can use the pandas_datareader or yfinance module to get the data.

In [ ]:

!pip install pandas_datareader==0.7.0

In [22]:

# Import pandas datareader


import pandas_datareader
pandas_datareader.__version__

Out[22]:

'0.7.0'

In [7]:

# Yahoo has recently become an unstable data source.
# If it gives an error, you may run the cell again, or try yfinance
import pandas as pd
# Import the module under an alias so that it is not overwritten by the dataframe below
from pandas_datareader import data as pdr
# Set the start and end date
start_date = '1990-01-01'
end_date = '2019-02-01'
# Set the ticker
ticker = 'AMZN'
# Get the data
data = pdr.get_data_yahoo(ticker, start_date, end_date)
data.head()

Out[7]:

Date        High      Low       Open      Close     Volume      Adj Close
1997-05-15  2.500000  1.927083  2.437500  1.958333  72156000.0  1.958333
1997-05-16  1.979167  1.708333  1.968750  1.729167  14700000.0  1.729167
1997-05-19  1.770833  1.625000  1.760417  1.708333  6106800.0   1.708333
1997-05-20  1.750000  1.635417  1.729167  1.635417  5467200.0   1.635417
1997-05-21  1.645833  1.375000  1.635417  1.427083  18853200.0  1.427083

To visualize the adjusted close price data, you can use the matplotlib library and the plot method as shown below.

In [9]:

import matplotlib.pyplot as plt


%matplotlib inline
data['Adj Close'].plot()
plt.show()

Let us improve the plot by resizing it, giving appropriate labels and adding grid lines for better readability.

In [10]:

# Plot the adjusted close price


data['Adj Close'].plot(figsize=(10, 7))
# Define the label for the title of the figure
plt.title("Adjusted Close Price of %s" % ticker, fontsize=16)
# Define the labels for x-axis and y-axis
plt.ylabel('Price', fontsize=14)
plt.xlabel('Year', fontsize=14)
# Plot the grid lines
plt.grid(which="major", color='k', linestyle='-.', linewidth=0.5)
# Show the plot
plt.show()

Advantages

1. Adjusted close price stock market data is available

2. Most recent stock market data is available

3. Doesn't require API key to fetch the stock market data

Disadvantages

1. It is not a stable source to fetch the stock market data

If fetching the stock market data from Yahoo Finance using pandas_datareader fails, you can use the yfinance package to fetch the data instead.
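The fallback described above can be sketched as a small helper that tries each source in turn. This is only an illustration: fetch_with_fallback, from_pandas_datareader and from_yfinance are hypothetical names introduced here, not part of either library, and the sketch assumes both packages are installed by the time the fetchers are actually called.

```python
def from_pandas_datareader(ticker, start, end):
    # Imported lazily so the helper can be defined without the package installed
    from pandas_datareader import data as pdr
    return pdr.get_data_yahoo(ticker, start, end)


def from_yfinance(ticker, start, end):
    # Imported lazily for the same reason
    import yfinance as yf
    return yf.download(ticker, start=start, end=end)


def fetch_with_fallback(fetchers, ticker, start, end):
    """Return (source_name, data) from the first fetcher that succeeds."""
    errors = []
    for fetch in fetchers:
        try:
            result = fetch(ticker, start, end)
        except Exception as exc:  # web sources can fail in many different ways
            errors.append(f"{fetch.__name__}: {exc!r}")
            continue
        # Treat an empty result as a failure as well
        if result is not None and len(result) > 0:
            return fetch.__name__, result
        errors.append(f"{fetch.__name__}: returned no rows")
    raise RuntimeError("All sources failed: " + "; ".join(errors))


# Example call (needs network access, so it is left commented out):
# source, data = fetch_with_fallback(
#     [from_pandas_datareader, from_yfinance],
#     'AMZN', '1990-01-01', '2019-02-01')
```

Passing the fetchers as a list keeps the order of preference explicit, and another source can be added without changing the helper.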

Quandl

Quandl has many data sources for different types of data. Some are free and some are paid. Wiki is Quandl's free data source for the end-of-day prices of 3000+ US equities.

It is curated by the Quandl community and also provides information about dividends and splits.

To get the stock market data, you first need to install the quandl module, if it is not already installed, using the pip command as shown below.

In [ ]:

!pip install quandl

You need to get your own API key from Quandl to get the stock market data using the code below. If you face issues getting the API key, you can refer to this link.

After you get your key, assign it to the variable QUANDL_API_KEY. Then set the start date, end date and the ticker of the asset whose stock market data you want to fetch.

The quandl get method takes these values as input and returns the open, high, low, close, volume, adjusted values and other information.

In [1]:

# Import quandl
import quandl
# To get your API key, sign up for a free Quandl account.
# Then, you can find your API key on the Quandl account settings page.
QUANDL_API_KEY = 'REPLACE-THIS-TEXT-WITH-A-REAL-API-KEY'
# This is to prompt you to change the Quandl key
if QUANDL_API_KEY == 'REPLACE-THIS-TEXT-WITH-A-REAL-API-KEY':
    raise Exception("Please provide a valid Quandl API key!")
# Set the start and end date
start_date = '1990-01-01'
end_date = '2018-03-01'
# Set the ticker name
ticker = 'AMZN'
# Fetch the data
data = quandl.get('WIKI/' + ticker, start_date=start_date,
                  end_date=end_date, api_key=QUANDL_API_KEY)
# Print the first 5 rows of the data
data.head()

Out[1]:

Date        Open   High   Low    Close  Volume     Ex-Dividend  Split Ratio
1997-05-16  22.38  23.75  20.50  20.75  1225000.0  0.0          1.0
1997-05-19  20.50  21.25  19.50  20.50  508900.0   0.0          1.0
1997-05-20  20.75  21.00  19.63  19.63  455600.0   0.0          1.0
1997-05-21  19.25  19.75  16.50  17.13  1571100.0  0.0          1.0
1997-05-22  17.25  17.38  15.75  16.75  981400.0   0.0          1.0

In [3]:

# Define the figure size for the plot


plt.figure(figsize=(10, 7))
# Plot the adjusted close price
data['Adj. Close'].plot()
# Define the label for the title of the figure
plt.title("Adjusted Close Price of %s" % ticker, fontsize=16)
# Define the labels for x-axis and y-axis
plt.ylabel('Price', fontsize=14)
plt.xlabel('Year', fontsize=14)
# Plot the grid lines
plt.grid(which="major", color='k', linestyle='-.', linewidth=0.5)
plt.show()

Get stock market data for multiple tickers

To get the stock market data of multiple stock tickers, you can create a list of tickers and call the quandl get method for each stock ticker.[1]

For simplicity, I have created a dataframe data to store the adjusted close price of the stocks.

In [4]:

# Import pandas
import pandas as pd
# Define the ticker list
tickers_list = ['AAPL', 'IBM', 'MSFT', 'WMT']
# Create an empty dataframe to store the adjusted close prices
data = pd.DataFrame(columns=tickers_list)
# Fetch the data
for ticker in tickers_list:
    data[ticker] = quandl.get('WIKI/' + ticker, start_date=start_date,
                              end_date=end_date, api_key=QUANDL_API_KEY)['Adj. Close']
# Print first 5 rows of the data
data.head()

Out[4]:

Date AAPL IBM MSFT WMT

1990-01-02 1.118093 14.138144 0.410278 4.054211

1990-01-03 1.125597 14.263656 0.412590 4.054211

1990-01-04 1.129499 14.426678 0.424702 4.033561

1990-01-05 1.133101 14.390611 0.414300 3.990541

1990-01-08 1.140605 14.480057 0.420680 4.043886

In [5]:

# Plot all the close prices


data.plot(figsize=(10, 7))
# Show the legend
plt.legend()
# Define the label for the title of the figure
plt.title("Adjusted Close Price", fontsize=16)
# Define the labels for x-axis and y-axis
plt.ylabel('Price', fontsize=14)
plt.xlabel('Year', fontsize=14)
# Plot the grid lines
plt.grid(which="major", color='k', linestyle='-.', linewidth=0.5)
plt.show()

Advantages

1. It is free of cost

2. Has split and dividend-adjusted stock market data

Disadvantages

1. Only available till 27-March-2018

Intraday Data

Alpha Vantage

Alpha Vantage can be used to get minute-level stock market data. You need to sign up on Alpha Vantage to get a free API key.

In [ ]:

# Install the alpha_vantage if not already installed


!pip install alpha_vantage

Assign your API key to ALPHA_VANTAGE_API_KEY in the code below.

In [12]:

# Import TimeSeries class
from alpha_vantage.timeseries import TimeSeries
ALPHA_VANTAGE_API_KEY = 'REPLACE-THIS-TEXT-WITH-A-REAL-API-KEY'
# This is to prompt you to change the Alpha Vantage key
if ALPHA_VANTAGE_API_KEY == 'REPLACE-THIS-TEXT-WITH-A-REAL-API-KEY':
    raise Exception("Please provide a valid Alpha Vantage API key!")
# Initialize the TimeSeries class with key and output format
ts = TimeSeries(key=ALPHA_VANTAGE_API_KEY, output_format='pandas')
# Get pandas dataframe with the intraday data and information of the data
intraday_data, data_info = ts.get_intraday(
    'GOOGL', outputsize='full', interval='1min')
# Print the information of the data
data_info

Out[12]:

{'1. Information': 'Intraday (1min) open, high, low, close prices and volume',
 '2. Symbol': 'GOOGL',
 '3. Last Refreshed': '2019-08-01 16:00:00',
 '4. Interval': '1min',
 '5. Output Size': 'Full size',
 '6. Time Zone': 'US/Eastern'}

This gives information about the stock market data which is returned. The information includes the type of data returned such as open, high, low and close, the symbol or ticker of the stock, the last refresh time of the data, the frequency of the stock market data and the time zone.

In [13]:

# Print the intraday data


intraday_data.head()

Out[13]:

Date                 Open       High     Low        Close      Volume
2019-07-26 09:31:00  1228.2300  1232.49  1228.0000  1230.7898  407037.0
2019-07-26 09:32:00  1230.9200  1235.13  1230.4301  1233.0000  111929.0
2019-07-26 09:33:00  1233.0000  1237.90  1232.7500  1237.9000  86564.0
2019-07-26 09:34:00  1237.4449  1241.90  1237.0000  1241.9000  105884.0
2019-07-26 09:35:00  1241.9399  1244.49  1241.3500  1243.1300  74444.0

In [19]:

intraday_data['4. close'].plot(figsize=(10, 7))


# Define the label for the title of the figure
plt.title("Close Price", fontsize=16)
# Define the labels for x-axis and y-axis
plt.ylabel('Price', fontsize=14)
plt.xlabel('Time', fontsize=14)
# Plot the grid lines
plt.grid(which="major", color='k', linestyle='-.', linewidth=0.5)
plt.show()

Get data at a custom frequency

During strategy modelling, you may need to work with stock market data at a custom frequency such as 7 minutes or 35 minutes. Candles at such custom frequencies are not provided by data vendors or web sources.

In this case, you can use the pandas resample method to convert the stock market data to the frequency of your choice. The implementation is shown below, where 1-minute frequency data is converted to 10-minute frequency data.
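As a rough, self-contained sketch of this idea, the snippet below resamples synthetic 1-minute bars to 10-minute bars. The data and the column names (open, high, low, close, volume) are made up for illustration, and the aggregation rules (first, max, min, last, sum) are the conventional choice for OHLCV candles, assumed here rather than taken from the article. With real Alpha Vantage output you would use that dataframe's own column names.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute bars standing in for downloaded intraday data
index = pd.date_range("2019-07-26 09:30", periods=30, freq="min")
close = 100 + np.arange(30, dtype=float)
minute_data = pd.DataFrame(
    {
        "open": close - 0.5,
        "high": close + 1.0,
        "low": close - 1.0,
        "close": close,
        "volume": 10.0,
    },
    index=index,
)

# Conversion logic: how each column is aggregated within a 10-minute window
ohlcv_logic = {
    "open": "first",   # first open in the window
    "high": "max",     # highest high in the window
    "low": "min",      # lowest low in the window
    "close": "last",   # last close in the window
    "volume": "sum",   # total volume traded in the window
}

# Convert the 1-minute bars to 10-minute bars
ten_minute_data = minute_data.resample("10min").agg(ohlcv_logic)
print(ten_minute_data)
```

By default pandas labels each 10-minute bar with the start time of its window. Changing "10min" to another offset such as "7min" or "2h" gives the other custom frequencies mentioned earlier.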

The first step is to define the dictionary with the conversion logic. For example, to get the open value the first value will be used, to get the high value the
