Using Digital Trace Data in the Social Sciences: Session 2 Set Up and Introduction to Collecting Data on Twitter

In our second session, we focus on setting up our data collection and getting acquainted with Twitter’s APIs. We start by discussing the research process with digital trace data. We then prepare your machine for working with Python and get you access to Twitter’s API. Finally, we use a couple of example scripts provided in our tutorial to get some first practice in collecting data on Twitter.

To prepare for this session, or to build on the issues we discussed, have a look at Twitter’s API documentation. Take an extended look at the data fields listed for Twitter’s API objects. They show which information Twitter provides through its API and is therefore available for your analyses.
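To get a feel for what such an API object looks like once it arrives on your machine, here is a minimal sketch in Python. The tweet below is a hand-written, made-up example and not real data. The field names (created_at, id_str, text, user.screen_name, entities.hashtags) follow the tweet object as I understand it from Twitter’s documentation, so please check them against the current documentation. The helper summarize_tweet is hypothetical and not part of the tutorial’s code.

```python
import json
from datetime import datetime

# Hand-written, made-up sample in the style of a Twitter API tweet object.
# It is NOT real data; field names should be checked against Twitter's docs.
SAMPLE_TWEET_JSON = """
{
  "created_at": "Wed Oct 10 20:19:24 +0000 2018",
  "id_str": "1050118621198921728",
  "text": "An example tweet about #python and #data",
  "user": {"screen_name": "example_user", "followers_count": 42},
  "entities": {"hashtags": [{"text": "python"}, {"text": "data"}]}
}
"""

# Timestamp layout used in the sample above.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarize_tweet(tweet):
    """Pull a few commonly used fields out of a tweet dictionary.

    Hypothetical helper for illustration only.
    """
    hashtags = tweet.get("entities", {}).get("hashtags", [])
    return {
        "id": tweet["id_str"],
        "author": tweet["user"]["screen_name"],
        "created_at": datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT),
        "hashtags": [tag["text"] for tag in hashtags],
        "text": tweet["text"],
    }


tweet = json.loads(SAMPLE_TWEET_JSON)
summary = summarize_tweet(tweet)
print(summary["author"], summary["created_at"].isoformat(), summary["hashtags"])
```

Reading the documentation with a small script like this next to it makes it easier to decide which fields you actually need to store for your analysis.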

For more information on the APIs of various services, have a look at Matthew A. Russell’s book Mining the Social Web (2014, 2nd ed.).

In the course we will work quite heavily with the command line of your system. So make sure you (re-)acquaint yourself with some basic commands (e.g. navigating to a specific directory).

Also, keep in mind that the command line may have trouble accessing a file that has spaces in its name or in its path. In these cases, either rename the file or put straight quotation marks around the complete file path (e.g. cd "/Users/(…)/twitterresearch").
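If you build command lines from within Python, the same issue applies. The following is a small sketch using shlex.quote from the standard library, which wraps a path in quotes when a POSIX shell (Mac, Linux) would otherwise split it at the spaces. Both paths below are made up for illustration, and the helper quoted_for_shell is hypothetical. The Windows command prompt follows different quoting rules, so treat this as a Mac/Linux example.

```python
import shlex


def quoted_for_shell(path):
    """Return the path in a form that is safe to paste into a POSIX shell.

    Hypothetical helper for illustration only.
    """
    return shlex.quote(path)


# Made-up example paths, one with spaces and one without.
with_spaces = "/Users/example/my twitter project"
without_spaces = "/Users/example/twitterresearch"

print("cd " + quoted_for_shell(with_spaces))
# prints: cd '/Users/example/my twitter project'

print("cd " + quoted_for_shell(without_spaces))
# prints: cd /Users/example/twitterresearch
```

Paths without spaces or other special characters are returned unchanged, which is one more reason to keep your project directory names simple.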

Also, please prepare your machine for the following sessions. First, make sure you have a Python distribution up and running. For the purposes of this course, I recommend Continuum Analytics’ Anaconda.

Now, make sure your system is prepared to work with Python. If you are using a Mac, make sure you have Apple’s Xcode installed. If you are using a PC, please install a current version of Microsoft’s Visual Studio and make sure your version includes both “Visual C++” and “Common Tools for Visual C++ 2015”. This is important because these tools are needed to run specific Python modules. If you run into trouble, have a look at this comment thread.

As a final step, please follow the procedure described in Jürgens & Jungherr (2016), p. 18.

Now your machine should be ready for the purposes of this course. You can test this by following the examples provided in this session’s code example.
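Before turning to the session’s examples, a quick preliminary check can tell you whether Python itself is working and whether the modules you expect are installed. The sketch below uses only the standard library. The list REQUIRED_MODULES is an assumption made for illustration; replace it with the modules named in the tutorial. The helper missing_modules is likewise hypothetical.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return those top-level module names that Python cannot find.

    Hypothetical helper for illustration only.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: placeholder list, replace with the modules the tutorial requires.
REQUIRED_MODULES = ["json", "requests"]

print("Python version:", sys.version.split()[0])

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

Save this as a file, navigate to its directory on the command line, and run it with python followed by the file name. If a module is reported as missing, install it before continuing.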

Mandatory Readings:

Pascal Jürgens and Andreas Jungherr (2016) A Tutorial for Using Twitter Data in the Social Sciences: Data Collection, Preparation, and Analysis. Social Science Research Network (SSRN). doi: 10.2139/ssrn.2710146, pp. 15-20.

Course Material:

