Retrieve Twitter posts and comments using Python

By: Queency in Python Tutorials on 2023-03-17  

The following sample Python code uses Tweepy to collect tweets and their replies through the Twitter API and saves them to a CSV file:

import tweepy
import pandas as pd

# set up the Twitter API credentials
consumer_key = 'your_consumer_key'
consumer_secret = 'your_consumer_secret'
access_token = 'your_access_token'
access_secret = 'your_access_secret'

# authenticate with the Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
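# wait_on_rate_limit makes Tweepy pause until the rate limit resets
# instead of raising an error partway through the collection
api = tweepy.API(auth, wait_on_rate_limit=True)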

# search for tweets containing a particular keyword or hashtag
query = 'python'
tweets = tweepy.Cursor(api.search_tweets, q=query, tweet_mode='extended').items(1000)

# create a list to store the tweet and reply data
data = []

# loop through each tweet and get its replies
for tweet in tweets:
    tweet_data = {
        'text': tweet.full_text,
        'user': tweet.user.screen_name,
        'created_at': tweet.created_at
    }
    data.append(tweet_data)

    # 'to:<user>' matches every newer tweet addressed to this user,
    # including mentions and replies to the user's other tweets
    replies = tweepy.Cursor(
        api.search_tweets,
        q='to:' + tweet.user.screen_name,
        since_id=tweet.id,
        tweet_mode='extended'
    ).items()
    for reply in replies:
        # keep only direct replies to the current tweet
        if reply.in_reply_to_status_id != tweet.id:
            continue
        reply_data = {
            'text': reply.full_text,
            'user': reply.user.screen_name,
            'created_at': reply.created_at,
            'in_reply_to': tweet.id
        }
        data.append(reply_data)

# convert the list of tweet and reply data into a Pandas DataFrame
df = pd.DataFrame(data)

# save the DataFrame as a CSV file
df.to_csv('twitter_data.csv', index=False)

In this code, we first set up the Twitter API credentials by replacing the placeholder values for consumer_key, consumer_secret, access_token, and access_secret. We then use Tweepy to authenticate with the Twitter API and search for up to 1,000 tweets containing a particular keyword or hashtag (query). Setting tweet_mode='extended' makes the API return the untruncated text of each tweet in full_text.
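Hard-coding the four credential values is fine for a quick test, but keeping them out of the script is safer. Here is a minimal sketch, assuming the keys are stored in environment variables. The variable names (TWITTER_CONSUMER_KEY and so on) and the load_credentials helper are hypothetical choices for this example, not something Tweepy requires:

```python
import os

# Hypothetical environment variable names; pick whatever suits your setup.
CREDENTIAL_VARS = {
    'consumer_key': 'TWITTER_CONSUMER_KEY',
    'consumer_secret': 'TWITTER_CONSUMER_SECRET',
    'access_token': 'TWITTER_ACCESS_TOKEN',
    'access_secret': 'TWITTER_ACCESS_SECRET',
}


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [var for var in CREDENTIAL_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError('missing environment variables: ' + ', '.join(missing))
    return {name: env[var] for name, var in CREDENTIAL_VARS.items()}
```

Calling load_credentials() returns a dictionary whose values can be passed to the authentication calls in place of the hard-coded strings.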

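We loop through each tweet that we find and use Tweepy to look for its replies by searching for tweets that are directed to the same user (to:screen_name) and have a higher ID than the current tweet (since_id). This search matches every newer tweet addressed to that user, so only results whose in_reply_to_status_id equals the current tweet's ID are replies to that specific tweet. We store the tweet and reply data in a list called data, which we then convert into a Pandas DataFrame and save as a CSV file.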
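The reply check can be tried on its own, without calling the API. Here is a minimal sketch, assuming each search result exposes an in_reply_to_status_id attribute as Tweepy's status objects do. The replies_to helper and the sample objects are made up for illustration:

```python
from types import SimpleNamespace


def replies_to(tweet_id, candidates):
    """Keep only the candidates that are direct replies to tweet_id."""
    return [
        c for c in candidates
        if getattr(c, 'in_reply_to_status_id', None) == tweet_id
    ]


# Made-up stand-ins for Tweepy status objects (illustration only).
candidates = [
    SimpleNamespace(id=11, in_reply_to_status_id=10),    # reply to tweet 10
    SimpleNamespace(id=12, in_reply_to_status_id=None),  # a mention, not a reply
    SimpleNamespace(id=13, in_reply_to_status_id=7),     # reply to another tweet
]

direct = replies_to(10, candidates)
print([r.id for r in direct])  # prints [11]
```

Only the first candidate survives the filter. The other two are addressed to the same user but do not answer tweet 10.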

The resulting CSV file contains four columns: text (the text of the tweet or reply), user (the screen name of the account that posted it), created_at (the timestamp for when it was created), and in_reply_to (the ID of the tweet that a reply is in response to). For the original tweets, in_reply_to is empty.
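Since only reply rows carry a value in in_reply_to, that column is enough to tell tweets and replies apart. Here is a minimal sketch that counts replies per parent tweet. It assumes it works on the data list before it is converted to a DataFrame, and the count_replies helper and the sample rows are made up for illustration:

```python
from collections import Counter


def count_replies(rows):
    """Map each parent tweet ID to the number of reply rows pointing at it."""
    return Counter(
        row['in_reply_to']
        for row in rows
        if row.get('in_reply_to') is not None
    )


# Made-up rows in the same shape as the entries of the data list.
rows = [
    {'text': 'Original tweet', 'user': 'alice', 'created_at': '2023-03-01'},
    {'text': 'Nice post', 'user': 'bob', 'created_at': '2023-03-01', 'in_reply_to': 100},
    {'text': 'Agreed', 'user': 'carol', 'created_at': '2023-03-02', 'in_reply_to': 100},
    {'text': 'Thanks', 'user': 'dave', 'created_at': '2023-03-02', 'in_reply_to': 200},
]

reply_counts = count_replies(rows)
print(reply_counts[100], reply_counts[200])  # prints 2 1
```

Rows without an in_reply_to value are treated as original tweets and skipped.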





