DataCamp Analyzing Social Media Data in Python
ANALYZING SOCIAL MEDIA DATA IN PYTHON
Maps and Twitter data
Alex Hanna
Computational Social Scientist
Why maps?
Geographical scope
Participants or observers?
Differentiating tweets
For or against?
How Twitter gets location data
Location is device-dependent
In practice, aggregate geographical data to the county or state level
Beware selection biases!
Warning: only 1-3% of tweets include geographical data
Limits the generalizability of inference
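One way to gauge how much this bias affects a given dataset is to measure the share of tweets that carry any geographical field. The following is a minimal sketch, assuming the tweets are already parsed into dictionaries with the `coordinates` and `place` keys; the function name `geotagged_share` and the sample tweets are hypothetical.

```python
def geotagged_share(tweets):
    """Return the fraction of tweets with a non-empty 'coordinates' or 'place' field."""
    if not tweets:
        return 0.0
    tagged = sum(
        1 for t in tweets
        if t.get('coordinates') or t.get('place')
    )
    return tagged / len(tweets)


# Hypothetical sample: only the second tweet carries geographical data
sample = [
    {'text': 'a', 'coordinates': None, 'place': None},
    {'text': 'b',
     'coordinates': {'type': 'Point', 'coordinates': [-72.2833, 21.7833]},
     'place': None},
    {'text': 'c', 'coordinates': None, 'place': None},
    {'text': 'd'},
]

print(geotagged_share(sample))
```

A low share suggests that any map built from the data describes only the small group of users who share their location.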
Types of Geographical Data available in Twitter
Twitter text (most imprecise)
User location
Bounding boxes
Coordinates and points (most precise)
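The ordering above suggests a simple fallback rule: use the most precise field a tweet provides. This sketch assumes a tweet parsed into a dictionary with the `coordinates`, `place`, and `user` keys; the function name `best_location` and the returned labels are hypothetical, and location mentioned in the tweet text is left out because it needs separate text processing.

```python
def best_location(tweet):
    """Return (kind, value) for the most precise location field present."""
    coords = tweet.get('coordinates')
    if coords:
        # Exact point, stored as [longitude, latitude]
        return 'coordinates', coords['coordinates']
    place = tweet.get('place')
    if place:
        # Named place described by a bounding box
        return 'place', place['full_name']
    user_loc = (tweet.get('user') or {}).get('location')
    if user_loc:
        # Free text typed by the user in their profile
        return 'user_location', user_loc
    return 'none', None


# Hypothetical tweet with only a profile location
tweet = {'coordinates': None, 'place': None, 'user': {'location': 'Bay Area'}}
print(best_location(tweet))
```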
Let's practice!
Geographical Data in Twitter JSON
Alex Hanna
Computational Social Scientist
Locations in Twitter text
User-defined location
> print(tweet['user']['location'])
Bay Area
place JSON
> print(tweet['place'])
{'attributes': {},
'bounding_box': {'coordinates':
[[[-80.47611, 37.185195],
[-80.47611, 37.273387],
[-80.381618, 37.273387],
[-80.381618, 37.185195]]],
'type': 'Polygon'},
'country': 'United States',
'country_code': 'US',
'full_name': 'Blacksburg, VA',
'name': 'Blacksburg',
'place_type': 'city',
...}
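The `place` fields can also support state-level aggregation. The sketch below counts tweets per state, under the assumption that US places with `place_type` equal to `'city'` have a `full_name` of the form `'City, ST'`, as in the Blacksburg example. The helper names `state_of` and `count_by_state` and all sample places other than Blacksburg are hypothetical.

```python
from collections import Counter


def state_of(place):
    """Return the state abbreviation for a US city place, else None."""
    if not place:
        return None
    if place.get('country_code') != 'US' or place.get('place_type') != 'city':
        return None
    # Assumes 'full_name' looks like 'City, ST'
    parts = place.get('full_name', '').rsplit(', ', 1)
    return parts[1] if len(parts) == 2 else None


def count_by_state(tweets):
    """Count tweets per state, skipping tweets without a usable place."""
    states = (state_of(t.get('place')) for t in tweets)
    return Counter(s for s in states if s)


# Hypothetical sample; only the Blacksburg place comes from the slide
sample = [
    {'place': {'full_name': 'Blacksburg, VA', 'place_type': 'city', 'country_code': 'US'}},
    {'place': {'full_name': 'Richmond, VA', 'place_type': 'city', 'country_code': 'US'}},
    {'place': {'full_name': 'Oakland, CA', 'place_type': 'city', 'country_code': 'US'}},
    {'place': {'full_name': 'Virginia, USA', 'place_type': 'admin', 'country_code': 'US'}},
    {'place': None},
]

print(count_by_state(sample))
```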
Calculating the centroid
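import numpy as np

# Bounding box corners as [longitude, latitude] pairs
coordinates = [
    [-80.47611, 37.185195],
    [-80.47611, 37.273387],
    [-80.381618, 37.273387],
    [-80.381618, 37.185195]]
# A rectangular box has exactly two unique longitudes and two unique latitudes
longs = np.unique([x[0] for x in coordinates])
lats = np.unique([x[1] for x in coordinates])
# The centroid is the midpoint of each pair
central_long = np.sum(longs) / 2
central_lat = np.sum(lats) / 2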
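The same calculation can be wrapped in a function that reads the bounding box directly from `tweet['place']`. This is a sketch rather than the course's own implementation: it takes the midpoint of the minimum and maximum values instead of using `np.unique`, so it also returns a sensible result when all four corners coincide. The function name `bounding_box_centroid` is hypothetical.

```python
def bounding_box_centroid(place):
    """Return (longitude, latitude) of the center of a place's bounding box."""
    # The polygon is a list of rings; the first ring holds the corners
    ring = place['bounding_box']['coordinates'][0]
    longs = [pt[0] for pt in ring]
    lats = [pt[1] for pt in ring]
    central_long = (min(longs) + max(longs)) / 2
    central_lat = (min(lats) + max(lats)) / 2
    return central_long, central_lat


# Bounding box for Blacksburg, VA, as printed on the place JSON slide
place = {
    'bounding_box': {
        'coordinates': [[[-80.47611, 37.185195],
                         [-80.47611, 37.273387],
                         [-80.381618, 37.273387],
                         [-80.381618, 37.185195]]],
        'type': 'Polygon',
    },
    'full_name': 'Blacksburg, VA',
}

print(bounding_box_centroid(place))
```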
coordinates JSON
> print(tweet['coordinates'])
{'type': 'Point',
'coordinates': [-72.2833, 21.7833]}
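Note that the point is stored as `[longitude, latitude]`, the reverse of the common "lat, long" speaking order. A small helper can make the order explicit and handle tweets without coordinates. This is a sketch under the assumption that the tweet is a parsed dictionary; the function name `point_lon_lat` is hypothetical.

```python
def point_lon_lat(tweet):
    """Return (longitude, latitude) from a tweet's 'coordinates' field, or None."""
    coords = tweet.get('coordinates')
    if not coords or coords.get('type') != 'Point':
        return None
    # The pair is ordered longitude first, then latitude
    lon, lat = coords['coordinates']
    return lon, lat


# Point taken from the slide above
tweet = {'coordinates': {'type': 'Point', 'coordinates': [-72.2833, 21.7833]}}
print(point_lon_lat(tweet))
```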
Let's practice!
Creating Twitter maps
Alex Hanna
Computational Social Scientist
Introducing Basemap
Library for plotting two-dimensional maps
Built on top of matplotlib
Converts coordinates into map projections
Beginning with Basemap
from mpl_toolkits.basemap import Basemap

# Mercator map bounded by its lower-left and upper-right corners
m = Basemap(projection='merc',
            llcrnrlat=-35.62,
            llcrnrlon=-17.29,
            urcrnrlat=37.73,
            urcrnrlon=51.39)
# Draw the land, then coastlines and country borders
m.fillcontinents(color='white')
m.drawcoastlines(color='gray')
m.drawcountries(color='gray')
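The four corner arguments do not have to be typed by hand; they can be derived from the points to be plotted. The sketch below computes them with plain Python, adding a margin around the data. The function name `map_corners`, the `pad` parameter, and the sample values are hypothetical; only the keyword names `llcrnrlon`, `llcrnrlat`, `urcrnrlon`, and `urcrnrlat` come from the Basemap call above.

```python
def map_corners(longs, lats, pad=1.0):
    """Return Basemap corner keyword arguments that enclose the given points."""
    return {
        # Lower-left corner, clamped to valid longitude and latitude ranges
        'llcrnrlon': max(min(longs) - pad, -180.0),
        'llcrnrlat': max(min(lats) - pad, -90.0),
        # Upper-right corner, clamped the same way
        'urcrnrlon': min(max(longs) + pad, 180.0),
        'urcrnrlat': min(max(lats) + pad, 90.0),
    }


# Hypothetical points roughly spanning a continent
corners = map_corners([-17.0, 51.0], [-35.0, 37.0], pad=0.5)
print(corners)

# The result could then be unpacked into the constructor:
# m = Basemap(projection='merc', **corners)
```

Because a Mercator projection cannot extend to the poles, data near ±90 degrees latitude would need tighter limits than the clamping shown here.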
Plotting points
import pandas as pd

africa = pd.read_csv('africa.csv')
longs = africa['CapitalLongtiude']
lats = africa['CapitalLatitude']
m = Basemap(...)
# zorder=0 draws the continents beneath the points
m.fillcontinents(color='white', zorder=0)
m.drawcoastlines(color='gray')
m.drawcountries(color='gray')
# latlon=True tells Basemap the inputs are in degrees, not map coordinates
m.scatter(longs.values, lats.values,
          latlon=True, alpha=0.7)
Using color
import pandas as pd

africa = pd.read_csv('africa.csv')
longs = africa['CapitalLongtiude']
lats = africa['CapitalLatitude']
arabic = africa['Arabic']
m = Basemap(...)
m.fillcontinents(color='white', zorder=0)
m.drawcoastlines(color='gray')
m.drawcountries(color='gray')
# c sets each point's color from the Arabic column; cmap picks the palette
m.scatter(longs.values, lats.values,
          latlon=True,
          c=arabic.values,
          cmap='Paired',
          alpha=1)
Let's practice!
Congratulations!
Alex Hanna
Computational Social Scientist