Alpha in Quant Trading: Hoàng Tùng


ALPHA IN QUANT TRADING

HOÀNG TÙNG
ALPHA IN QUANT TRADING
• Alpha is a measure of the active return on an
investment: its performance relative to a benchmark
• Idea 1: EDGAR web scraping + NLP + machine learning
Edgar web scraping
• The majority of information relevant to investing is
unstructured: text, images, audio, or video.
• How can unstructured information be transformed into numeric
data in real time?
• Required components: integrated systems, web scraping, data
collection, distributed parallel computing, advanced Natural
Language Processing (NLP), and machine learning techniques.
• In this research, I demonstrate how NLP and machine learning
can be used to process corporate filing data from the
EDGAR database.
Edgar web scraping
• EDGAR (the Electronic Data Gathering, Analysis, and Retrieval
system) is the source of corporate filing information.
• EDGAR is operated by the US SEC (Securities and Exchange
Commission).
Edgar web scraping
• Web scraping EDGAR
• EDGAR provides daily and quarterly master index files that make
it possible to download company filings from its website efficiently.
• A sample master index file for Apple
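A minimal sketch of reading a master index file, assuming the usual pipe-delimited layout (CIK, company name, form type, date filed, filename) that follows a dashed separator line. The sample rows, the `FilingRecord` type, and the `parse_master_index` helper are made up for illustration; they are not taken from the original slide.

```python
from typing import NamedTuple


class FilingRecord(NamedTuple):
    cik: str
    company: str
    form_type: str
    date_filed: str
    filename: str


# Illustrative excerpt in the master-index layout; every row here is invented.
SAMPLE_INDEX = """\
Description:           Master Index of EDGAR Dissemination Feed

CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1|EXAMPLE CORP|10-K|2015-10-28|edgar/data/1/0000000001-15-000001.txt
1|EXAMPLE CORP|8-K|2015-10-27|edgar/data/1/0000000001-15-000002.txt
2|OTHER CO|10-Q|2015-11-02|edgar/data/2/0000000002-15-000003.txt
"""


def parse_master_index(text, form_types=("10-K", "10-Q")):
    """Return the filings whose form type is in form_types."""
    records = []
    in_body = False
    for line in text.splitlines():
        if not in_body:
            # Data rows begin after the dashed separator line.
            if line.startswith("-----"):
                in_body = True
            continue
        parts = line.strip().split("|")
        if len(parts) != 5:
            continue  # skip blank or malformed rows
        record = FilingRecord(*parts)
        if record.form_type in form_types:
            records.append(record)
    return records


filings = parse_master_index(SAMPLE_INDEX)
for f in filings:
    print(f.company, f.form_type, f.date_filed, f.filename)
```

The `filename` field is a path relative to the EDGAR archive root, so each record is enough to locate the filing text for download.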
Edgar web scraping
• MAP-REDUCE FRAMEWORK FOR TEXT MINING
• The mapper functions classify and extract relevant keywords, pairs, and
numbers from the downloaded text files.
• A customized reducer function aggregates words by company, document
type, filing date, reporting period, and section.
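The mapper and reducer roles can be sketched on a single machine as below. This is only an illustration of the idea, not the distributed setup from the slides: the keyword list, the document fields, and the sample documents are all hypothetical.

```python
import re
from collections import Counter, defaultdict

KEYWORDS = {"RISK", "LOSS", "GROWTH"}  # hypothetical keyword list


def mapper(doc):
    """Emit (key, keyword) pairs for one document section."""
    key = (doc["company"], doc["form_type"], doc["date_filed"],
           doc["period"], doc["section"])
    for word in re.findall(r"[A-Za-z]+", doc["text"]):
        word = word.upper()
        if word in KEYWORDS:
            yield key, word


def reducer(pairs):
    """Aggregate keyword counts per (company, type, date, period, section)."""
    counts = defaultdict(Counter)
    for key, word in pairs:
        counts[key][word] += 1
    return dict(counts)


docs = [  # made-up sections of one hypothetical filing
    {"company": "EXAMPLE CORP", "form_type": "10-K", "date_filed": "2015-10-28",
     "period": "FY2015", "section": "RISK FACTORS",
     "text": "Risk of loss. Currency risk."},
    {"company": "EXAMPLE CORP", "form_type": "10-K", "date_filed": "2015-10-28",
     "period": "FY2015", "section": "MD&A",
     "text": "Revenue growth offsets loss."},
]

pairs = (pair for doc in docs for pair in mapper(doc))
result = reducer(pairs)
for key, counter in result.items():
    print(key[-1], dict(counter))
```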
Edgar web scraping
• Natural language processing
• Create a corpus: a collection of documents containing
natural language text.
• Remove punctuation and numbers, then convert all the
text to upper case.
• Remove all English stop words from the text, and
remove words that are not relevant for our needs, such as
generic words and geographies.
• Stemming is the process of reducing words to their base or root
form, so that words with similar meaning are grouped together.
• At the end of this exercise, we should have a list of words and
their frequency counts for each text section.
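The cleaning steps above can be sketched with the standard library alone. The stop-word list, the irrelevant-word list, and the suffix-stripping stemmer are deliberately tiny stand-ins chosen for this example; a real pipeline would more likely use a full stop-word list and an established stemmer such as Porter's.

```python
import re
from collections import Counter

# Small illustrative lists; both are assumptions, not the author's lists.
STOP_WORDS = {"THE", "A", "AN", "AND", "OF", "TO", "IN", "IS", "ARE", "MAY"}
IRRELEVANT_WORDS = {"COMPANY", "UNITED", "STATES"}  # generic words, geographies
SUFFIXES = ("ING", "ED", "ES", "S")


def naive_stem(word):
    """Strip one common suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def word_frequencies(text):
    # Replace punctuation and digits with spaces, then upper-case the text.
    cleaned = re.sub(r"[^A-Za-z\s]", " ", text).upper()
    tokens = [t for t in cleaned.split()
              if t not in STOP_WORDS and t not in IRRELEVANT_WORDS]
    return Counter(naive_stem(t) for t in tokens)


section = ("The company faces increased risks in 2015; increasing costs "
           "may reduce margins in the United States.")
freq = word_frequencies(section)
print(freq.most_common())
```

Here "increased" and "increasing" collapse to the same stem, which is the grouping effect that stemming is meant to achieve.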
Edgar web scraping
• Word cloud of textual documents
Edgar web scraping
• SENTIMENT AND TONE ANALYSIS
• Positive sentiment = total positive words/total words
• Negative sentiment = total negative words/total words
• Net tone = positive sentiment – negative sentiment
• There are naturally more negative words than positive ones
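A short sketch of the three ratios defined above. The word lists are placeholders invented for this example; the slides do not say which sentiment dictionary was used.

```python
# Placeholder dictionaries; the actual word lists are an assumption.
POSITIVE_WORDS = {"GAIN", "IMPROVE", "STRONG"}
NEGATIVE_WORDS = {"LOSS", "DECLINE", "IMPAIRMENT", "LITIGATION"}


def sentiment_scores(tokens):
    """Return (positive sentiment, negative sentiment, net tone)."""
    total = len(tokens)
    if total == 0:
        return 0.0, 0.0, 0.0
    positive = sum(t in POSITIVE_WORDS for t in tokens) / total
    negative = sum(t in NEGATIVE_WORDS for t in tokens) / total
    return positive, negative, positive - negative


tokens = ("STRONG SALES OFFSET LOSS FROM LITIGATION AND LARGE "
          "IMPAIRMENT CHARGES").split()
positive, negative, net_tone = sentiment_scores(tokens)
print(positive, negative, net_tone)
```

With one positive and three negative words out of ten, the net tone of this made-up sentence is negative.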
Edgar web scraping
• DISTANCE MEASURES (YOY CHANGE IN CORPORATE FILING LANGUAGE)
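The slides do not name the distance measures used. As an assumption, the sketch below compares the word counts of two consecutive years with cosine distance and Jaccard distance, two common choices for this kind of comparison; the word counts are invented.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """1 - cosine similarity of two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def jaccard_distance(a, b):
    """1 - (shared vocabulary / combined vocabulary)."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0
    return 1.0 - len(a.keys() & b.keys()) / len(union)


# Made-up word counts for the same filing section in two consecutive years.
last_year = Counter({"RISK": 2, "GROWTH": 1})
this_year = Counter({"RISK": 2, "LITIGATION": 1})

print(cosine_distance(last_year, this_year))
print(jaccard_distance(last_year, this_year))
```

A larger distance means the filing language changed more from one year to the next.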
Edgar web scraping
• American Express is an example
Analyst
Idea 2: Analyst
• Dataset: Estimize, which crowdsources earnings and revenue estimates from
financial professionals and non-professionals.
• Coverage: 1,600 companies in the Russell 3000 (2014).

Some conclusions:
• More analysts => more accurate estimates
• Closer to the earnings announcement day => better estimates
• Analyst experience: more experience => more accurate estimates
• Analyst skill: estimation error = |Actual EPS – Estimated EPS| / Stock Price
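The estimation error formula above translates directly into code. The function names and the sample numbers are illustrative; averaging the error over an analyst's past estimates is one possible way to turn it into a skill score, not something stated in the slides.

```python
def estimation_error(actual_eps, estimated_eps, stock_price):
    """|Actual EPS - Estimated EPS| / Stock Price."""
    if stock_price <= 0:
        raise ValueError("stock_price must be positive")
    return abs(actual_eps - estimated_eps) / stock_price


def mean_estimation_error(records):
    """Average error over (actual_eps, estimated_eps, stock_price) tuples."""
    errors = [estimation_error(*record) for record in records]
    return sum(errors) / len(errors)


# Hypothetical track record of one analyst.
history = [(1.10, 1.00, 50.0), (0.80, 0.90, 40.0)]
print(estimation_error(1.10, 1.00, 50.0))
print(mean_estimation_error(history))
```

Scaling by the stock price makes errors comparable across companies with very different EPS levels.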
Analyst
Pre-earnings announcement strategy
• Before the earnings announcement day, if the estimated EPS goes up => the
actual EPS is likely to beat the estimated EPS => the price is likely to go up
Analyst
Earnings revision

• Upward (downward) EPS revision => positive (negative) return on the
announcement day
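One way to turn the revision idea into a trading signal is sketched below. The slides do not say how a revision is measured; this example assumes it is the change between the first and the latest consensus EPS in the pre-announcement window, and the function name is hypothetical.

```python
def revision_signal(consensus_history):
    """Return +1 for an upward revision, -1 for downward, 0 otherwise.

    consensus_history: consensus EPS estimates in time order,
    all taken before the announcement day (an assumed input format).
    """
    if len(consensus_history) < 2:
        return 0  # not enough history to measure a revision
    change = consensus_history[-1] - consensus_history[0]
    return (change > 0) - (change < 0)


print(revision_signal([1.00, 1.02, 1.05]))  # upward revision
print(revision_signal([1.00, 0.95]))        # downward revision
```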
Post-earnings announcement drift
• For about 3 weeks after the earnings announcement, the stock price tends
to keep moving in the direction of the announcement-day return.
• This signal now decays quickly in the US
Analyst
Low-risk strategy

• Higher dispersion => higher risk and volatility, and lower return on the
announcement day.
• Removing stocks with high dispersion => improved performance
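A small sketch of the dispersion filter. Dispersion is not defined in the slides; here it is assumed to be the sample standard deviation of the individual EPS estimates for a stock. The tickers, estimates, and threshold are made up.

```python
import statistics


def dispersion(estimates):
    """Sample standard deviation of analyst EPS estimates (assumed definition)."""
    if len(estimates) < 2:
        return 0.0
    return statistics.stdev(estimates)


def filter_low_dispersion(estimates_by_ticker, max_dispersion):
    """Keep only tickers whose estimate dispersion is within the threshold."""
    return [ticker for ticker, estimates in estimates_by_ticker.items()
            if dispersion(estimates) <= max_dispersion]


# Hypothetical tickers and estimates.
estimates_by_ticker = {
    "AAA": [1.00, 1.02, 0.98],  # analysts largely agree
    "BBB": [0.50, 1.50, 1.00],  # analysts disagree widely
}
universe = filter_low_dispersion(estimates_by_ticker, max_dispersion=0.1)
print(universe)
```

In practice the threshold would more likely be a cross-sectional cutoff, such as dropping the highest-dispersion quantile, rather than a fixed number.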
