During this session, we will learn how to load data downloaded from Twitter into a database. Loading tweets into a database creates a little extra work at the beginning, but trust me, it will more than pay off further down the line when you are working with large data sets.
In this session, we will discuss the workings of the file “database.py”, provided in the script set accompanying Jürgens & Jungherr (2016). The script loads downloaded tweets into a SQLite database and sets up a predefined database structure that supports a set of typical analytical approaches to Twitter data. The script uses peewee, a small object-relational mapper, to interact with SQLite from Python.
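To give a first feel for what "loading tweets into a predefined structure" means, here is a minimal sketch. It is not the code from “database.py”: to stay dependency-free it uses Python's built-in sqlite3 module instead of peewee, and the table layout, the sample tweets, and the function name load_tweets are assumptions made up for illustration.

```python
import json
import sqlite3

# Hypothetical sample input: one JSON-encoded tweet per line, reduced to a
# few fields. The last line is a duplicate, as often happens in collections.
RAW_TWEETS = [
    '{"id": 1, "text": "Hello #world", "user": {"id": 10, "screen_name": "alice"}}',
    '{"id": 2, "text": "Second tweet", "user": {"id": 10, "screen_name": "alice"}}',
    '{"id": 3, "text": "Hi @alice", "user": {"id": 20, "screen_name": "bob"}}',
    '{"id": 3, "text": "Hi @alice", "user": {"id": 20, "screen_name": "bob"}}',
]

# Assumed, simplified structure: one table for users, one for tweets.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    screen_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS tweets (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    text TEXT NOT NULL
);
"""


def load_tweets(conn, lines):
    """Parse JSON tweets and store them, skipping rows already present."""
    conn.executescript(SCHEMA)
    with conn:  # commit all inserts in a single transaction
        for line in lines:
            tweet = json.loads(line)
            user = tweet["user"]
            conn.execute(
                "INSERT OR IGNORE INTO users (id, screen_name) VALUES (?, ?)",
                (user["id"], user["screen_name"]),
            )
            conn.execute(
                "INSERT OR IGNORE INTO tweets (id, user_id, text) VALUES (?, ?, ?)",
                (tweet["id"], user["id"], tweet["text"]),
            )


conn = sqlite3.connect(":memory:")  # use a file name to keep the data on disk
load_tweets(conn, RAW_TWEETS)

# A typical analytical query: how many tweets did each user post?
tweets_per_user = dict(
    conn.execute(
        "SELECT u.screen_name, COUNT(*) FROM tweets t "
        "JOIN users u ON u.id = t.user_id GROUP BY u.screen_name"
    )
)
print(tweets_per_user)
```

Because tweet and user IDs serve as primary keys, loading the same file twice does not create duplicates. This is part of the payoff of the initial overhead: once the data sits in tables, counts like the one above become a single query.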
To retrace your steps after the session, have a look at Jürgens & Jungherr (2016), pp. 29-41. Also have a look at the example code provided at the end of the post.
Mandatory Readings:
- Pascal Jürgens and Andreas Jungherr (2016) A Tutorial for Using Twitter Data in the Social Sciences: Data Collection, Preparation, and Analysis. Social Science Research Network (SSRN). doi: 10.2139/ssrn.2710146, pp. 29-41.
Background Readings:
- Grant Allen and Mike Owens. The Definitive Guide to SQLite. 2nd ed. New York, NY: Apress, 2010.
- Jay A. Kreibich. Using SQLite: Small. Fast. Reliable. Choose Any Three. Sebastopol, CA: O’Reilly Media, 2010.