Twitter mining with R - part 1

Since the US presidential election campaign at the latest, the microblogging service Twitter has moved further into the public eye. If simply browsing the social network is not enough for you, a few tricks let you tap into the wealth of data that Twitter holds.

Data access

Like many other online services, Twitter offers APIs (application programming interfaces). These let you connect the platform's database to your own analysis tool (e.g. R or Python).

Accessing Twitter data requires an active Twitter account, which in turn lets you create your own apps. Below we walk through every step that follows account creation, using an example.

With an active Twitter account in place, we can now create an app. To do this, we visit apps.twitter.com and create a new app.

When filling in the form, we must make sure that the chosen name is unique. The description can be generic. If you don't have your own website, you can enter a generic placeholder instead. The callback URL can be left empty for the time being.

In order to be able to access the API, we now need the following information:

  • Consumer Key (API Key)
  • Consumer Secret (API Secret)
  • Access Token
  • Access Token Secret

The consumer key and consumer secret are shown on the "Keys and Access Tokens" tab.

The access token and the access token secret have to be generated first.

The app is now created, and we have all the information needed to connect our analysis program to the Twitter API.
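The four values are secrets, so it helps to keep them out of the analysis script itself. A minimal sketch, assuming they are stored in environment variables (the variable names below are hypothetical and not prescribed by Twitter):

```r
# Hypothetical environment variable names; any names would work.
credential_vars <- c(
  consumer_key    = "TWITTER_CONSUMER_KEY",
  consumer_secret = "TWITTER_CONSUMER_SECRET",
  access_token    = "TWITTER_ACCESS_TOKEN",
  access_secret   = "TWITTER_ACCESS_SECRET"
)

# Read the four values and report which ones are still missing.
read_credentials <- function(vars = credential_vars) {
  values <- Sys.getenv(vars, unset = "")
  names(values) <- names(vars)
  missing <- names(values)[!nzchar(values)]
  list(values = as.list(values), missing = missing)
}

creds <- read_credentials()
if (length(creds$missing) > 0) {
  message("Not yet set: ", paste(creds$missing, collapse = ", "))
}
```

The variables can be set once, for example in a user-level .Renviron file, so that the keys never appear in code that is shared.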

API query with RStudio

Twitter offers two different APIs: the so-called Search or REST API on the one hand, and the Streaming API on the other. The REST API lets us query tweets that have already been written as well as profile information. The Streaming API lets us extract tweets continuously as they are posted. For the rest of this blog post, we focus on the REST API.

RStudio and the REST API

To query tweets and profiles directly from RStudio via the REST API, we need only a few lines of code. The consumer key, consumer secret, access token and access token secret must be defined beforehand.
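As a minimal sketch of such a connection, assuming the twitteR package and its setup_twitter_oauth() function (the package choice and the environment variable names are assumptions, not something this post confirms):

```r
# Assumption: the twitteR package is used and the four credentials are stored
# in environment variables with these (hypothetical) names.
connect_to_twitter <- function() {
  keys <- Sys.getenv(c("TWITTER_CONSUMER_KEY", "TWITTER_CONSUMER_SECRET",
                       "TWITTER_ACCESS_TOKEN", "TWITTER_ACCESS_SECRET"))
  if (!requireNamespace("twitteR", quietly = TRUE) || !all(nzchar(keys))) {
    message("twitteR is not installed or credentials are missing; skipping.")
    return(invisible(FALSE))
  }
  # Registers the OAuth credentials for later twitteR calls in this session.
  twitteR::setup_twitter_oauth(
    consumer_key    = keys[[1]],
    consumer_secret = keys[[2]],
    access_token    = keys[[3]],
    access_secret   = keys[[4]]
  )
  invisible(TRUE)
}

connected <- connect_to_twitter()
```

The function does nothing unless the package is installed and all four values are set, so the script can be run safely before the app has been fully configured.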

The link between RStudio and the Twitter API is now established. In the next step we can request our first Twitter data.

Brief example

Assuming we are interested in tweets on the subject of data science, we can query them through the REST API. In this example we pull the two most recent tweets with the hashtag #DataScience.
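A hedged sketch of such a query, again assuming the twitteR package with its searchTwitter() and twListToDF() functions. The helper names and the dry_run switch are hypothetical and exist only so that the snippet can be run without an authenticated session:

```r
# Normalise a tag so that it always carries a leading "#".
hashtag_query <- function(tag) {
  stopifnot(is.character(tag), length(tag) == 1, nzchar(tag))
  if (startsWith(tag, "#")) tag else paste0("#", tag)
}

# With dry_run = TRUE nothing is requested. Set dry_run = FALSE only after
# twitteR::setup_twitter_oauth() has been called in this session.
fetch_hashtag_tweets <- function(tag, n = 2, dry_run = TRUE) {
  query <- hashtag_query(tag)
  if (dry_run) {
    message("Would request ", n, " tweets matching ", query)
    return(invisible(NULL))
  }
  tweets <- twitteR::searchTwitter(query, n = n)  # list of status objects
  twitteR::twListToDF(tweets)                     # one row per tweet
}

fetch_hashtag_tweets("DataScience", n = 2)
```

Converting the result to a data frame is a design choice made here for convenience; the raw list of status objects could be kept instead.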

The REST API is not limited to tweets, however. We can also mine specific profiles, for example the profile data of the STATWORX account.
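A profile lookup could be sketched as follows, assuming the twitteR package and its getUser() function. The screen name "statworx" is an assumed handle used for illustration, and the dry_run switch is hypothetical:

```r
# With dry_run = TRUE nothing is requested. Set dry_run = FALSE only after
# twitteR::setup_twitter_oauth() has been called in this session.
fetch_profile <- function(screen_name, dry_run = TRUE) {
  stopifnot(is.character(screen_name), length(screen_name) == 1)
  screen_name <- sub("^@", "", screen_name)  # accept "@name" as well as "name"
  if (dry_run) {
    message("Would look up the profile of @", screen_name)
    return(invisible(NULL))
  }
  user <- twitteR::getUser(screen_name)  # user object with the profile fields
  user$toDataFrame()                     # one-row data frame of those fields
}

fetch_profile("statworx")  # assumed handle, for illustration only
```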

Outlook

In the next part we will look at the structure of the stored objects. The package's functions allow us to extract a large amount of information. Next time we will use them to structure, prepare and analyze that information.

About the author

Lukas Strömsdörfer

I am a data scientist at STATWORX. Apart from automating my job, I take my vintage bike for a spin and am building an ML tool that lets me become a below-average gardener.