Welcome to the course! You have interesting sessions to look forward to. By the end, I hope you are at least as excited about working with digital trace data as you are now, and much better able to translate that excitement into actual scientific projects.
In our first session, we will discuss the background of working with digital trace data. We will start with some of the expectations connected with these new data sources. Here, we will discuss the terms Computational Social Science, Digital Methods, Big Data, and Digital Trace Data.
We will then focus on two prominent fallacies in the work with digital trace data:
- The n = all fallacy
- The mirror hypothesis
Both fallacies can be found explicitly or implicitly in prominent works based on digital trace data. They are central to limiting the value of research based on digital trace data and to raising false expectations about the types of insight these data can actually deliver.
Central to avoiding these fallacies are three often neglected steps:
- Start by clearly thinking about research design in working with digital trace data.
- Keep in mind the data generating process that led to the production of specific data sets. Doing so will help you decide, and justify, for which social or political phenomena specific sets of digital trace data might hold promising insights.
- Explicitly establish and test a theoretical link between the data you collect online and your phenomenon of interest. Without such a link, you run the risk of falling for spurious correlations instead of offering insights.
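The risk of spurious correlation named in the last step can be illustrated with a small simulation. The following is a minimal sketch using made-up data, not material from the course: the names `mentions` and `support` are hypothetical stand-ins for an online metric and an offline quantity. Both series are generated independently but share an upward trend, which is enough to produce a strong correlation in levels that largely disappears once the trend is removed by differencing.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def trending_series(rng, n, slope, noise):
    """Linear trend plus independent Gaussian noise (simulated data)."""
    return [slope * t + rng.gauss(0, noise) for t in range(n)]


def differences(xs):
    """Period-to-period changes, which remove a linear trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


rng = random.Random(42)  # fixed seed so the run is reproducible

# Hypothetical series: generated independently of each other,
# related only through the fact that both grow over time.
mentions = trending_series(rng, 200, slope=1.0, noise=5.0)
support = trending_series(rng, 200, slope=0.5, noise=5.0)

r_levels = pearson(mentions, support)
r_changes = pearson(differences(mentions), differences(support))

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

The high correlation in levels here reflects nothing but the shared trend. Without a theoretical account of why the two quantities should move together, a result like this would be easy to mistake for a finding.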
In this context, we will briefly discuss the value of interpreting digital trace data as mediated traces of user behavior and, therefore, mediated reflections of social or political phenomena of interest.
After this, we will close by discussing a series of interesting questions in political science closely related to the data generating process leading to the publication of tweets and, therefore, closely connected with digital trace data.
- Andreas Jungherr (2017). Normalizing Digital Trace Data. In Digital Discussions: How Big Data Informs Political Communication, eds. Natalie Jomini Stroud and Shannon McGregor. New York, NY: Routledge. (Forthcoming). [Preprint]
- Pascal Jürgens and Andreas Jungherr (2016) A Tutorial for Using Twitter Data in the Social Sciences: Data Collection, Preparation, and Analysis. Social Science Research Network (SSRN). doi: 10.2139/ssrn.2710146, pp. 7-14.
- David Donoho. 50 Years of Data Science. Paper presented at the Tukey Centennial workshop, Princeton, NJ. Sept. 18 (2015).
- Bradley Efron and Trevor Hastie. Computer Age Statistical Inference: Algorithms, Evidence and Data Science. Cambridge: Cambridge University Press, 2016.
- Deen Freelon. “On the interpretation of digital trace data in communication and social computing research”. In: Journal of Broadcasting & Electronic Media 58.1 (2014), pp. 59–75. doi: 10.1080/08838151.2013.875018.
- Scott A. Golder and Michael W. Macy. “Digital Footprints: Opportunities and Challenges for Online Social Research”. In: Annual Review of Sociology 40 (2014), pp. 129–152. doi: 10.1146/annurev-soc-071913-043145.
- James Howison, Andrea Wiggins, and Kevin Crowston. “Validity issues in the use of social network analysis with digital trace data”. In: Journal of the Association for Information Systems 12.12 (2011), pp. 767–797.
- Andreas Jungherr and Pascal Jürgens. “Forecasting the pulse: How deviations from regular patterns in online data can identify offline phenomena”. In: Internet Research 23.5 (2013), pp. 589–607. doi: 10.1108/IntR-06-2012-0115.
- Andreas Jungherr, Harald Schoen, and Pascal Jürgens. “The mediation of politics through Twitter: An analysis of messages posted during the campaign for the German federal election 2013”. In: Journal of Computer-Mediated Communication 21.1 (2016), pp. 50–68. doi: 10.1111/jcc4.12143.
- Andreas Jungherr, Harald Schoen, Oliver Posegga, and Pascal Jürgens. “Digital Trace Data in the Study of Public Opinion: An Indicator of Attention Toward Politics Rather Than Political Support”. In: Social Science Computer Review 35.3 (2017), pp. 336–356. doi: 10.1177/0894439316631043.
- David Lazer et al. “Computational social science”. In: Science 323.5915 (2009), pp. 721–723. doi: 10.1126/science.1167742.
- David Lazer et al. “The Parable of Google Flu: Traps in Big Data Analysis”. In: Science 343.6176 (2014), pp. 1203–1205. doi: 10.1126/science.1248506.
- Viktor Mayer-Schönberger and Kenneth Cukier. Big Data: A Revolution that Will Transform How We Live, Work, and Think. New York, NY: Houghton Mifflin, 2013.
- Richard Rogers. Digital Methods. Cambridge, MA: The MIT Press, 2013.
- Derek Ruths and Jürgen Pfeffer. “Social media for large studies of behavior”. In: Science 346.6213 (2014), pp. 1063–1064. doi: 10.1126/science.346.6213.1063.
- Matthew Salganik. Bit by Bit: Social Research in the Digital Age. (Forthcoming).
- Markus Strohmaier and Claudia Wagner. “Computational Social Science for the World Wide Web”. In: IEEE Intelligent Systems 29.5 (2014), pp. 84–88. doi: 10.1109/MIS.2014.80.