Retrieve Twitter posts and comments using Python
By: Queency
Here's some sample Python code that uses the Tweepy library and the Twitter API to collect tweets and their replies, then saves them to a CSV file with pandas:
import tweepy
import pandas as pd

# set up the Twitter API credentials
consumer_key = 'your_consumer_key'
consumer_secret = 'your_consumer_secret'
access_token = 'your_access_token'
access_secret = 'your_access_secret'

# authenticate with the Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
# wait_on_rate_limit makes Tweepy sleep until the rate limit resets instead of failing
api = tweepy.API(auth, wait_on_rate_limit=True)

# search for tweets containing a particular keyword or hashtag
query = 'python'
tweets = tweepy.Cursor(api.search_tweets, q=query, tweet_mode='extended').items(1000)

# create a list to store the tweet and reply data
data = []

# loop through each tweet and get its replies
for tweet in tweets:
    tweet_data = {
        'text': tweet.full_text,
        'user': tweet.user.screen_name,
        'created_at': tweet.created_at
    }
    # 'to:' finds tweets addressed to the author; since_id keeps only newer ones
    replies = tweepy.Cursor(api.search_tweets, q='to:' + tweet.user.screen_name,
                            since_id=tweet.id, tweet_mode='extended').items()
    for reply in replies:
        # skip tweets addressed to the same user that answer a different tweet
        if reply.in_reply_to_status_id != tweet.id:
            continue
        reply_data = {
            'text': reply.full_text,
            'user': reply.user.screen_name,
            'created_at': reply.created_at,
            # stored as a string so pandas does not round the ID to a float
            'in_reply_to': str(tweet.id)
        }
        data.append(reply_data)
    data.append(tweet_data)

# convert the list of tweet and reply data into a Pandas DataFrame
df = pd.DataFrame(data)

# save the DataFrame as a CSV file
df.to_csv('twitter_data.csv', index=False)
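In this code, we first set up the Twitter API credentials: replace the placeholder values of consumer_key, consumer_secret, access_token, and access_secret with the keys from your Twitter developer account. We then use Tweepy to authenticate with the Twitter API and search for tweets containing a particular keyword or hashtag (query). The Cursor pages through the search results, and items(1000) stops after at most 1,000 tweets.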
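Keeping the four credentials in the script makes them easy to leak when the file is shared. As a minimal sketch, assuming the keys are exported as environment variables, they could be loaded like this. The variable names (TWITTER_CONSUMER_KEY and so on) and the helper load_credentials are hypothetical and not part of Tweepy:

```python
import os

# Hypothetical environment variable names; use whatever names you export.
REQUIRED_VARS = {
    'consumer_key': 'TWITTER_CONSUMER_KEY',
    'consumer_secret': 'TWITTER_CONSUMER_SECRET',
    'access_token': 'TWITTER_ACCESS_TOKEN',
    'access_secret': 'TWITTER_ACCESS_SECRET',
}


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError('missing environment variables: ' + ', '.join(missing))
    return {name: env[var] for name, var in REQUIRED_VARS.items()}
```

The returned values can then be passed to tweepy.OAuthHandler and set_access_token in place of the placeholder strings.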
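We loop through each tweet that we find and use Tweepy to look for its replies by searching for tweets that are directed to the same user (to:username) and have a higher ID than the current tweet (since_id). That search matches every newer tweet addressed to the user, so only results whose in_reply_to_status_id equals the current tweet's ID are direct replies to it. We store the tweet and reply data in a list called data, which we then convert into a Pandas DataFrame and save as a CSV file.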
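The reply matching can be sketched on its own, without calling the API. This assumes each search result exposes an in_reply_to_status_id attribute, as Tweepy's v1.1 status objects do. The helper direct_replies is hypothetical, and the SimpleNamespace objects are stand-ins for real search results:

```python
from types import SimpleNamespace


def direct_replies(candidates, tweet_id):
    """Keep only the candidates that reply directly to tweet_id."""
    return [c for c in candidates
            if getattr(c, 'in_reply_to_status_id', None) == tweet_id]


# Stand-in objects that mimic the attributes used here; real results
# would come from api.search_tweets.
candidates = [
    SimpleNamespace(id=11, in_reply_to_status_id=10),    # reply to tweet 10
    SimpleNamespace(id=12, in_reply_to_status_id=None),  # plain mention
    SimpleNamespace(id=13, in_reply_to_status_id=7),     # reply to another tweet
]

print([c.id for c in direct_replies(candidates, 10)])  # prints [11]
```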
The resulting CSV file contains four columns: text (the text of the tweet or reply), user (the screen name of the account that posted it), created_at (the timestamp when it was created), and in_reply_to (the ID of the tweet that a reply responds to). Rows for the original tweets leave in_reply_to empty.
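To show how that column layout can be used, here is a small sketch that separates original tweets from replies. It uses only the standard library csv module and made-up sample rows, so it runs without the real file. The helper split_rows is hypothetical:

```python
import csv
import io

# Made-up rows in the same column layout as twitter_data.csv.
sample = io.StringIO(
    'text,user,created_at,in_reply_to\n'
    'Nice post!,bob,2023-01-02 10:00:00,10\n'
    'Learning python,alice,2023-01-01 09:00:00,\n'
)


def split_rows(csv_file):
    """Return (tweets, replies) as two lists of row dicts."""
    tweets, replies = [], []
    for row in csv.DictReader(csv_file):
        # Replies carry the parent tweet's ID; original tweets leave it empty.
        if row['in_reply_to']:
            replies.append(row)
        else:
            tweets.append(row)
    return tweets, replies


tweets, replies = split_rows(sample)
print(len(tweets), len(replies))  # prints 1 1
```

For the real file, an open file object can be passed instead, for example open('twitter_data.csv', newline='', encoding='utf-8').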