2.1 Analysing Social Media in Python

The document discusses analyzing Twitter data by collecting tweets through the Twitter API using the Python library tweepy. It explains that the Twitter API allows access to tweet text, user profile information, geolocation, and retweets while only providing a 1% sample of all tweets. The document also covers authenticating with the Twitter API, collecting data using tweepy, and understanding the Twitter JSON response format.

Analyzing Twitter Data
ANALYZING SOCIAL MEDIA DATA IN PYTHON

Alex Hanna
Computational Social Scientist
Why Analyze Twitter Data?


Source: Conover et al. (2011)

What you can't analyze
Can't collect data on observers

Free level of access is restrictive

Can't collect historical data

Only a 1% (unverified) sample

What you can analyze
1% sample is still a few million tweets

Within a tweet
Text

User profile information

Geolocation

Retweets and quoted tweets

Let's review!
Collecting data through the Twitter API
Twitter API
API: Application Programming Interface
Method of accessing data

Twitter APIs
Search API

Ads API

Streaming API

Streaming API
Streaming API
Real-time tweets

Filter endpoint
Keywords

User IDs

Locations

Sample endpoint
Random sample

Using tweepy to collect data
tweepy
Python package for accessing Streaming API

SListener
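from tweepy import API
from tweepy.streaming import StreamListener  # tweepy 3.x; removed in tweepy 4.0
import time

class SListener(StreamListener):
    def __init__(self, api=None):
        # Write collected tweets to a timestamped file
        self.output = open('tweets_%s.json' % time.strftime('%Y%m%d-%H%M%S'), 'w')
        # Fall back to a default API object if none is passed in
        self.api = api or API()
    ...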

tweepy authentication
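from tweepy import OAuthHandler
from tweepy import API

# consumer_key, consumer_secret, access_token, and access_token_secret
# are the credentials issued for your Twitter developer app
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = API(auth)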

Collecting data with tweepy
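from tweepy import Stream

listen = SListener(api)        # the SListener class defined earlier
stream = Stream(auth, listen)  # tweepy 3.x signature
stream.sample()                # connect to the sample endpoint (random sample)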

Let's practice!
Understanding Twitter JSON
Contents of Twitter JSON
{ "created_at": "Thu Apr 19 14:25:04 +0000 2018",
  "id": 986973961295720449,
  "id_str": "986973961295720449",
  "text": "Writing out the script of my @DataCamp class and I can't help but mentally read it back to myself in @hugobowne's voice.",
  "retweet_count": 0,
  "favorite_count": 1,
  ... }

How many retweets, favorites

Language

Reply to which tweet

Reply to which user
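
The fields listed above can be read straight from a parsed tweet. The following is a minimal sketch, assuming the v1.1 field names `lang`, `in_reply_to_status_id`, and `in_reply_to_user_id`; the values given for those three fields are illustrative and do not come from the slides, and `summarize` is a hypothetical helper name.

```python
import json
from datetime import datetime

# Abbreviated tweet in the JSON layout shown above. The lang and
# in_reply_to_* values are illustrative assumptions, not from the slides.
raw = '''{
  "created_at": "Thu Apr 19 14:25:04 +0000 2018",
  "id": 986973961295720449,
  "retweet_count": 0,
  "favorite_count": 1,
  "lang": "en",
  "in_reply_to_status_id": null,
  "in_reply_to_user_id": null
}'''

def summarize(tweet):
    """Pull the engagement, language, and reply fields out of one tweet dict."""
    return {
        # created_at uses a fixed English format with a numeric UTC offset
        "created": datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y"),
        "retweets": tweet.get("retweet_count", 0),
        "favorites": tweet.get("favorite_count", 0),
        "language": tweet.get("lang"),
        # A tweet is a reply when it points at another tweet's ID
        "is_reply": tweet.get("in_reply_to_status_id") is not None,
        "reply_to_user": tweet.get("in_reply_to_user_id"),
    }

summary = summarize(json.loads(raw))
print(summary["created"].isoformat(), summary["favorites"], summary["is_reply"])
```

Using `.get()` for the optional fields keeps the helper from raising a `KeyError` on tweets that lack them.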

Child JSON objects
{
  "user": {
    "id": 661613,
    "name": "Alex Hanna, Data Witch",
    "screen_name": "alexhanna",
    "location": "Toronto, ON",
    ...
  }
}

Places, retweets/quoted tweets, and 140+ tweets
place and coordinates
contain geolocation

extended_tweet
contains the full text of tweets over 140 characters

retweeted_status and quoted_status
contain all tweet information of retweets and quoted tweets
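
One way these child objects might be used is sketched below. It assumes the streaming v1.1 layout, in which `extended_tweet` holds a `full_text` key and `coordinates` is a GeoJSON point ordered longitude, latitude. The helper names `full_text` and `lon_lat` and the example tweets are hypothetical, invented for illustration.

```python
def full_text(tweet):
    """Return the untruncated text of a tweet dict."""
    # A retweet's own text is often cut off, so read the original tweet instead.
    if "retweeted_status" in tweet:
        return full_text(tweet["retweeted_status"])
    # Tweets over 140 characters carry their complete text in extended_tweet.
    if "extended_tweet" in tweet:
        return tweet["extended_tweet"]["full_text"]
    return tweet["text"]

def lon_lat(tweet):
    """Return (longitude, latitude) for an exactly geotagged tweet, else None."""
    point = tweet.get("coordinates")  # GeoJSON point, or None when not geotagged
    if point:
        lon, lat = point["coordinates"]
        return lon, lat
    return None

# Hypothetical example tweets, invented for illustration.
long_tweet = {
    "text": "A long tweet that gets cut off...",
    "extended_tweet": {"full_text": "A long tweet that gets cut off in the text field."},
    "coordinates": {"type": "Point", "coordinates": [-79.38, 43.65]},
}
retweet = {"text": "RT @someone: A long tweet that gets...", "retweeted_status": long_tweet}

print(full_text(retweet))
print(lon_lat(long_tweet))
print(lon_lat(retweet))
```

Checking `retweeted_status` first is a design choice: it means a retweet is always represented by the text of the tweet it repeats.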

Accessing JSON
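import json

# Read the raw JSON text; the with block closes the file afterwards
with open('tweet-example.json', 'r') as f:
    tweet_json = f.read()

# Convert the JSON string into a Python dictionary
tweet = json.loads(tweet_json)
tweet['text']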
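
A collected file usually holds many tweets rather than one. The sketch below assumes the file stores one JSON-encoded tweet per line, which the slides do not confirm; `load_tweets` is a hypothetical helper, and the sample tweets are made up so the example runs without real data.

```python
import json
import os
import tempfile

def load_tweets(path):
    """Read a file holding one JSON-encoded tweet per line into a list of dicts."""
    tweets = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines between tweets
                tweets.append(json.loads(line))
    return tweets

# Build a small throwaway file so the sketch runs without real data.
sample = [{"id": 1, "text": "first tweet"}, {"id": 2, "text": "second tweet"}]
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False, encoding="utf-8") as tmp:
    for t in sample:
        tmp.write(json.dumps(t) + "\n")

tweets = load_tweets(tmp.name)
os.remove(tmp.name)  # clean up the temporary file
print([t["text"] for t in tweets])
```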

Child tweet JSON
tweet['user']['screen_name']
tweet['user']['name']
tweet['user']['created_at']
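
For tabular analysis it can help to copy the nested user fields up to the top level. This is a minimal sketch: `flatten_user` and the `user-` key prefix are illustrative choices rather than part of any library, and the example tweet is abbreviated with placeholder text.

```python
def flatten_user(tweet, fields=("screen_name", "name", "created_at", "location")):
    """Copy selected fields of the nested user object to 'user-<field>' keys."""
    user = tweet.get("user") or {}  # tolerate a missing or null user object
    flat = {"text": tweet.get("text")}
    for field in fields:
        # Missing fields become None instead of raising KeyError
        flat["user-" + field] = user.get(field)
    return flat

# Abbreviated tweet; the text is a placeholder.
tweet = {
    "text": "example text",
    "user": {"id": 661613, "screen_name": "alexhanna", "location": "Toronto, ON"},
}

row = flatten_user(tweet)
print(row["user-screen_name"], row["user-created_at"])
```

Direct indexing such as `tweet['user']['name']` raises a `KeyError` when the field is absent, so `.get()` is the safer choice when looping over many tweets.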

Let's practice!