Commit aea09f4

Added scraper and updated readme
1 parent 4bcc841 commit aea09f4

3 files changed: +8 -7 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -31,4 +31,5 @@
 28. Ecommerce Scraper: Scrapes product data from ecommerce websites and displays it to user in CLI.
 29. Lyrics Scraper: Scrape lyrics from atozlyrics website by specifying artist name.
 30. Walmart Scraper: Scrape data from walmart website and store it in database using MySQLdb.
+31. Twitter Scraper: Scrapes tweets from popular hashtags and saves them to csv file

twitter-scraper/myfile.csv

2.16 MB
Binary file not shown.

twitter-scraper/twitter_scraper.py

Lines changed: 7 additions & 7 deletions
@@ -9,19 +9,19 @@
 #This code is using AppAuthHandler, not OAuthHandler to get higher limits, 2.5 times.
 auth = tweepy.AppAuthHandler('j2UAZfXuk6iitAjnLjbFcmn0y', 'Q9X7g4eAhyElO8u5VI183QwRCUF1sXrZs8m9poGt6Q1pmN4cOw')
 api = tweepy.API(auth, wait_on_rate_limit=True,
-    wait_on_rate_limit_notify=True)
+    wait_on_rate_limit_notify=True)


 if (not api):
     print ("Can't Authenticate")
     sys.exit(-1)
 def clean(val):
-    clean = ""
-    if val:
-        clean = val.encode('utf-8')
-    return clean
+    clean = ""
+    if val:
+        clean = val.encode('utf-8')
+    return clean

-searchQuery = '' #This is for your hasthag(s), separate by comma
+searchQuery = '#techsytalk' #This is for your hasthag(s), separate by comma
 maxTweets = 80000 # Large max nr
 tweetsPerQry = 100 # the max the API permits
 fName = 'myfile.csv' #The CSV file where your tweets will be stored
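
Review note on the `tweepy.AppAuthHandler(...)` line in this hunk: the consumer key and secret are committed as string literals. A minimal sketch of one alternative, assuming the two values are supplied through environment variables; the variable names and the `load_credentials` helper are hypothetical and not part of this commit.

```python
import os


def load_credentials(env=os.environ):
    """Return (consumer_key, consumer_secret) read from a mapping.

    The variable names below are hypothetical; any names would do.
    """
    try:
        return env["TWITTER_CONSUMER_KEY"], env["TWITTER_CONSUMER_SECRET"]
    except KeyError as missing:
        # Stop early with a readable message instead of a traceback.
        raise SystemExit(f"Missing environment variable: {missing}")


# The pair could then be passed on the way the script already does:
# auth = tweepy.AppAuthHandler(*load_credentials())
```

Taking the mapping as a parameter keeps the helper easy to exercise with a plain dict, without touching the real environment.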
@@ -62,7 +62,7 @@ def clean(val):
     print("No more tweets found")
     break
 for tweet in new_tweets:
-    csvwriter.writerow([tweet.created_at, clean(tweet.user.screen_name), clean(tweet.text), tweet.user.created_at, tweet.user.followers_count, tweet.user.friends_count, tweet.user.statuses_count, clean(tweet.user.location), tweet.user.geo_enabled, tweet.user.lang, clean(tweet.user.time_zone), tweet.retweet_count]);
+    csvwriter.writerow([tweet.created_at, clean(tweet.user.screen_name), clean(tweet.text), tweet.user.created_at, tweet.user.followers_count, tweet.user.friends_count, tweet.user.statuses_count, clean(tweet.user.location), tweet.user.geo_enabled, tweet.user.lang, clean(tweet.user.time_zone), tweet.retweet_count]);

 tweetCount += len(new_tweets)
 #print("Downloaded {0} tweets".format(tweetCount))
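
Review note on the `clean(...)` calls in the `writerow` line: if the script is run under Python 3, `val.encode('utf-8')` returns `bytes`, and `csv.writer` converts non-string values with `str()`, so the cell would hold the literal `b'...'` form. A minimal sketch of a Python 3 variant follows; `clean_text` and the in-memory buffer are hypothetical names used for illustration, not part of this commit.

```python
import csv
import io


def clean_text(val):
    """Return val as a str, or an empty string for None or empty values."""
    return str(val) if val else ""


# Write one row to an in-memory buffer to show what lands in the CSV.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([clean_text("héllo"), clean_text(None)])
row = buffer.getvalue()
```

When writing to a real file, opening it with `open(fName, 'w', newline='', encoding='utf-8')` would keep non-ASCII text intact without encoding each field by hand.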
