This semester, I will take my course Psychological Mechanisms of Political Communication out for a second spin at the University of Mannheim. For this version, I moved somewhat farther away from a standard political communication course by dropping the sections on the spiral of silence and opinion leaders and adding sections on information processing, heuristics, and political knowledge. This should be fun.
CfP: The Empiricist’s Challenge: Asking Meaningful Questions in Political Science in the Age of Big Data
How can big data be used to answer meaningful questions in political science?
Yannis Theocharis and I are organising a conference addressing this question. The conference will be held at the MZES at the University of Mannheim on October 23-24, 2015.
We aim to create a forum where leading practitioners, challengers and up-and-coming social scientists who work in the area of digital trace data meet and engage in debate. We are particularly interested in bringing together scholars from different scientific disciplines (such as political science, sociology, media and communication and computer science) who, although their work increasingly converges around similar questions, often find it difficult to establish productive lines of communication and collaboration.
Keynote Speakers:
We are pleased to have secured the participation of four prolific scholars whose work has contributed to shaping the use of digital trace data in the social sciences.
W. Lance Bennett, University of Washington
Sandra González-Bailón, University of Pennsylvania
Jonathan Nagler, New York University
Richard Rogers, University of Amsterdam
If you want to participate, here is our call for papers. For more information please visit our website, email us at bigdatapolitics [at] uni-mannheim.de, or follow us on Twitter @bigdatapolitics.
Objective
The continuously growing use of digital services has provided social scientists with an expanding reservoir of data, potentially holding valuable insights into human behavior and social systems. The potential of digital trace data for social science research has famously given rise to the terms “big data” and “computational social science”. Using such data, social scientists have argued, will enable us to better understand social, political and economic life through the generation of large datasets that are composed not of questions asked of citizens concerning their attitudes and behaviours, but of the digital traces of their actual behaviour as they navigate the online world.
While the potential of digital trace data has been a continuous focus in public debate, scientific contributions using such trace data in political science usually come in the form of research manifestos or isolated proofs-of-concept, only marginally contributing to current debates in the social sciences. Examples abound of descriptive analyses, maps and visualizations of citizens’ or candidates’ social media use during electoral campaigns, or of activists during social movement mobilisations. Indeed, at present, most work using digital trace data in the analysis of political phenomena falls into two categories: (1) using digital trace data to illustrate online components of political events, such as protests, televised debates or election campaigns; or (2) demonstrating that in specific cases, specific selections of digital trace data collected on specific services somewhat resemble routinely used metrics in political science such as opinion polls, election results and ideological placement of MPs based on roll-call data.
Even though there are many interesting and valuable contributions among these studies, the field has to mature before it can move into the mainstream of political research. This includes: developing standards for data collection, preparation, analysis and reporting; establishing more systematic links to the established body of research in the social sciences; and moving away from proofs-of-concept towards the systematic development and testing of hypotheses.
The aim of this conference is to contribute to this development. We aim to create a forum where leading practitioners, challengers and up-and-coming social scientists who work in the area of digital trace data meet and engage in debate. Any such endeavor needs to take interdisciplinary considerations into account. We are thus particularly interested in bringing together scholars from different scientific disciplines (such as political science, sociology, media and communication and computer science) who, although their work increasingly converges around similar questions, often find it difficult to establish productive lines of communication and collaboration.
The papers can address, but should not be limited to, the following topics:
– Which are the areas of political science where the use of digital trace data holds potential and how can this be illustrated systematically?
– Which current theories offer researchers using digital trace data a valuable context to frame their research question and develop and test hypotheses?
– What are the main challenges to be resolved before the use of digital trace data can enter the mainstream of political science?
We invite papers that pursue these questions by specifically discussing the above-mentioned challenges, as well as analytical empirical studies that can serve as exemplars. In particular, we encourage papers discussing theoretical challenges of the use of digital trace data in the social sciences, linking the analysis of digital trace data with established research questions and topics in political science, and discussing how to address the methodological and procedural challenges of establishing digital trace data as standard elements of social science research.
Submissions
Proposals including a paper title, abstract of up to 500 words, 3-5 keywords and the names and affiliations of all authors should be submitted to the following e-mail address no later than July 13, 2015.
bigdatapolitics [at] uni-mannheim.de
Authors of selected high-quality contributions will be invited to submit their papers for consideration in a special issue of the Journal of Information Technology & Politics. These submissions will undergo the journal’s regular peer-review process.
Funding
Travel and accommodation support of up to 300€ for PhD students and postdoctoral fellows will be offered to one author for each accepted paper. By providing resources for more junior scholars we hope to further encourage their participation. In the case of co-authored work, the authors themselves should decide who will receive this support if their paper is accepted. Conference meals, including a conference dinner, will also be covered.
Questions?
Please direct any and all inquiries concerning the workshop, paper submissions and/or funding to:
bigdatapolitics@uni-mannheim.de
Please visit our website for more details about the conference, and follow the official Twitter account @bigdatapolitics to stay updated on the latest developments.
Computational Social Science, Digital Methods and Big Data in Political Science: Week 8
In week eight, we focus on using Python to extract data on interaction networks between Twitter users from JSON files collected through Twitter’s API. In a second step, we produce some simple descriptions of the resulting network using the R package igraph. Kolaczyk & Csárdi (2014) offer an excellent introduction to the statistical description of networks, while McKinney (2012) offers a very helpful primer on how to use Python for data manipulation.
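The extraction step can be sketched roughly as follows. This is a minimal illustration and not the session's actual code: it assumes one tweet per line of JSON and the field names of Twitter's REST API v1.1 tweet objects (user.screen_name, entities.user_mentions). The function names and the sample records are made up for this example.

```python
import csv
import io
import json
from collections import Counter


def iter_tweets(lines):
    """Yield one tweet dict per non-empty line of line-delimited JSON."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed records


def mention_edges(tweets):
    """Count directed (author, mentioned user) pairs across all tweets."""
    edges = Counter()
    for tweet in tweets:
        author = (tweet.get("user") or {}).get("screen_name")
        if not author:
            continue  # records without an author, e.g. delete notices
        mentions = (tweet.get("entities") or {}).get("user_mentions") or []
        for mention in mentions:
            target = mention.get("screen_name")
            if target:
                # Screen names are lower-cased so "Bob" and "bob" are one node.
                edges[(author.lower(), target.lower())] += 1
    return edges


def write_edge_list(edges, handle):
    """Write a weighted edge list as CSV with a header row."""
    writer = csv.writer(handle)
    writer.writerow(["source", "target", "weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])


# Made-up sample records standing in for a collected JSON file.
SAMPLE = [
    '{"user": {"screen_name": "Alice"}, "entities": {"user_mentions": [{"screen_name": "bob"}]}}',
    '{"user": {"screen_name": "alice"}, "entities": {"user_mentions": [{"screen_name": "Bob"}, {"screen_name": "carol"}]}}',
    '{"delete": {"status": {"id": 1}}}',
    'not json',
]

edges = mention_edges(iter_tweets(SAMPLE))
buffer = io.StringIO()
write_edge_list(edges, buffer)
```

In practice the edge list would be written to a file rather than to a string buffer. A CSV of this shape should be readable in R as a data frame and convertible to a graph with igraph, for instance via graph_from_data_frame, although the exact import route is a matter of preference.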
Code Example Session 8
Mandatory Readings
Eric D. Kolaczyk & Gábor Csárdi (2014). “Descriptive Analysis of Network Graph Characteristics” Statistical Analysis of Network Data with R. Springer. (pp. 43-67)
Background Readings
Wes McKinney. [@wesmckinn]. (2012). Python for Data Analysis. O’Reilly.
Computational Social Science, Digital Methods and Big Data in Political Science: Week 7
In week seven of the course, we focus on the introduction of some foundations of the analysis of social networks. The basis of our discussion is The structure and function of complex networks by Mark Newman. The introduction given in this session will form the basis of the hands-on analyses of Twitter interaction networks during the final sessions of the course.
Mandatory Readings
Newman, Mark E.J. (2003). The Structure and Function of Complex Networks. SIAM Review 45(2): 167–256.
Background Readings
Hennig, Marina, Ulrik Brandes, Jürgen Pfeffer & Ines Mergel (2012). Studying Social Networks: A Guide to Empirical Research. Campus.
Easley, David, Jon Kleinberg (2010). Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press.
Eric D. Kolaczyk (2009). Statistical Analysis of Network Data: Methods and Models. Springer.
Newman, Mark E.J. (2010). Networks: An Introduction. Oxford University Press.
Scott, John (2012). Social Network Analysis (3rd Edition). Sage.
Wasserman, Stanley & Katherine Faust (1994). Social Network Analysis: Methods and Applications. Cambridge University Press.
Computational Social Science, Digital Methods and Big Data in Political Science: Week 6
This week’s session is blocked for students to work on self-chosen projects using the analytical methods introduced in the previous sessions.
Twitter in the Analysis of Social Phenomena: Poster
Next week I will participate in the Herrenhäuser Konferenz on Big Data in a Transdisciplinary Perspective organized by the Volkswagen Stiftung. The conference aims to address issues raised by “big data” in science, business, and politics:
Large amounts of data, a variety of sources, high speed production, but also high speed processing – these are the basic characteristics of Big Data. The amount of data that is generated and collected in each second grows exponentially. The management of Big Data, the intelligent use of large, heterogeneous data sets, is becoming increasingly important for competition. It is affecting all sectors – industry and academia but also the public sector. While the economy is exploring Big Data as a new gold mine, politicians are fighting over the problem of data capitalism, whereas science tackles the question of cross-disciplinary benefits, as well as on the challenges and the likely consequences for technology, innovation, and society.
I will contribute a poster focusing on some of my dissertation’s findings on the use of Twitter in the analysis of social and political phenomena. More on that topic later this year once the dissertation is published.
Twitter in the Analysis of Social Phenomena: Mediated Reflections of Social Life in Digital Trace Data
Most research using digital trace data in the analysis of social phenomena implicitly assumes these data to provide true mirrors to phenomena of interest (mirror hypothesis). Given specific data generating processes producing the data used in these analyses, this hypothesis seems highly unlikely. Instead, it is far more likely that these data offer a skewed reflection of social phenomena mediated through interests, attention, and intentions of users and service-specific affordances (mediation hypothesis). This can be illustrated by the analysis of political phenomena with digital trace data collected on the microblogging service Twitter.
Traces of political phenomena in Twitter data are produced in a multi-step mediation process. Based on an underlying political or social phenomenon (A), a stimulus emerges (such as an event) which has to grab the attention of a Twitter user for her to consider referring to politics in a tweet (B). Following this, users have to encode their initial responses to elements of political reality within the technological limitations of the microblogging service to create a digital artifact, the tweet (C). These tweets can in turn be aggregated (D). Based on these aggregates, various metrics can be calculated (e.g. mention counts of political actors or the structure of networks based on their interactions), potentially allowing inferences on the political phenomena giving rise to the data (4).
The relationship between tweets and political reality is thus filtered by various mediating steps, potentially introducing various biases leading the picture of political reality emerging from aggregates of tweets to be blurred or skewed. Three types of influences are likely to affect this mediation process. Some are based on characteristics of events in political reality (1), some on the individual characteristics of users (2), and some on the specific technological design of Twitter and respective usage conventions (3). These mediating factors have to be examined much more closely to understand which parts of political reality are likely to be emphasized by this process and which are likely to be neglected.
Political reality as found in aggregates of Twitter messages diverges from political reality in general. This is true for political events, popular topics of discussion, and attention towards political actors. This suggests some caution in expecting Twitter data to provide a true image of political phenomena. Instead, digital trace data appear to provide a selection of political reality determined by various mediation processes associated with the use of various digital services. Put differently, we have to take the dynamics and mechanisms of mediation of political reality through digital services seriously if we want to use digital trace data in the analysis of political and social phenomena.
Further Reading:
Jungherr, A. (2015). Analyzing Political Communication with Digital Trace Data: The Role of Twitter Messages in Social Science Research. Springer, Heidelberg. In Press.
Computational Social Science, Digital Methods and Big Data in Political Science: Week 5
In the fifth week of the course, we focus on how to load previously downloaded tweets into a MongoDB database and how to run some descriptive analyses on these data with the PyMongo package in Python. The code examples for this week are based on parts of the MongoDB tutorial Getting Started and “Chapter 6: Mining Mailboxes: Analyzing Who’s Talking to Whom About What, How Often, and More” in Matthew A. Russell’s (2013) Mining the Social Web (2nd edition). (pp. 225-278).
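As a rough sketch of what loading and describing tweets might look like, the helpers below insert tweets in batches and build an aggregation pipeline that counts hashtags. They are illustrative and not the session's code. The helper names, the batch size, and the database, collection and file names in the comments are assumptions, as is the v1.1 field layout (entities.hashtags[].text). The helpers only call insert_many on whatever collection object they are given, so PyMongo itself is only needed for the commented-out usage at the end.

```python
import json


def read_tweets(lines):
    """Parse one tweet per non-empty line of line-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def load_tweets(collection, tweets, batch_size=1000):
    """Insert tweets in batches and return how many were inserted."""
    batch, inserted = [], 0
    for tweet in tweets:
        batch.append(tweet)
        if len(batch) >= batch_size:
            collection.insert_many(batch)
            inserted += len(batch)
            batch = []
    if batch:  # insert_many rejects an empty list, so guard the remainder
        collection.insert_many(batch)
        inserted += len(batch)
    return inserted


def top_hashtags_pipeline(limit=10):
    """Aggregation pipeline: most frequent hashtags, case-insensitive."""
    return [
        {"$unwind": "$entities.hashtags"},
        {"$group": {
            "_id": {"$toLower": "$entities.hashtags.text"},
            "count": {"$sum": 1},
        }},
        {"$sort": {"count": -1}},
        {"$limit": limit},
    ]


# Usage against a local MongoDB server (all names below are hypothetical):
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   collection = client["twitter"]["tweets"]
#   with open("tweets.json", encoding="utf-8") as handle:
#       load_tweets(collection, read_tweets(handle))
#   for row in collection.aggregate(top_hashtags_pipeline()):
#       print(row["_id"], row["count"])
```

Batching keeps memory use bounded for large files and avoids one round trip per tweet. Doing the counting in an aggregation pipeline leaves the work to the database instead of pulling every document into Python.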
Background Readings
Kyle Banker, Peter Bakkum, Shaun Verch, Douglas Garrett, and Tim Hawkins (2015). MongoDB in Action (2nd edition). Manning Publications.
Kristina Chodorow (2013). MongoDB: The Definitive Guide (2nd edition). O’Reilly.
Amol Nayak (2014). MongoDB Cookbook. Packt Publishing.
Matthew A. Russell (2013). “Chapter 6: Mining Mailboxes: Analyzing Who’s Talking to Whom About What, How Often, and More” in Mining the Social Web (2nd edition). pp. 225-278. O’Reilly.
Computational Social Science, Digital Methods and Big Data in Political Science: Week 4
In the fourth week of the course, we focus on how to get all available messages by a specific user from the Twitter API and on how to convert date fields into a data format appropriate for import into MongoDB. The code examples for this week are based on work by Pascal Jürgens (University of Mainz) and parts of “Chapter 6: Mining Mailboxes: Analyzing Who’s Talking to Whom About What, How Often, and More” in Matthew A. Russell’s (2013) Mining the Social Web (2nd edition). (pp. 225-278).
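Both steps can be sketched in a few lines. This is an illustration under stated assumptions and not the code used in the session. The paging loop follows the max_id pattern for walking back through a timeline, with the actual API request hidden behind a hypothetical fetch_page callable. The date helper assumes the v1.1 created_at layout (for example "Mon Sep 24 03:35:21 +0000 2012") and an English locale for the day and month names.

```python
from datetime import datetime, timezone

# Layout of the "created_at" string, e.g. "Mon Sep 24 03:35:21 +0000 2012".
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def collect_timeline(fetch_page):
    """Yield every tweet that fetch_page can reach, newest first.

    fetch_page is a hypothetical callable wrapping the API request: it takes
    max_id (None for the first request) and returns a list of tweet dicts,
    or an empty list once nothing older is available.
    """
    max_id = None
    while True:
        page = fetch_page(max_id)
        if not page:
            break
        yield from page
        # Next request: only tweets older than the oldest one seen so far.
        max_id = min(tweet["id"] for tweet in page) - 1


def parse_created_at(value):
    """Turn Twitter's created_at string into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(value, TWITTER_DATE_FORMAT)
    return parsed.astimezone(timezone.utc)


def with_parsed_date(tweet):
    """Return a copy of the tweet whose created_at is a datetime object."""
    converted = dict(tweet)
    converted["created_at"] = parse_created_at(tweet["created_at"])
    return converted


example = {"id": 1, "created_at": "Mon Sep 24 03:35:21 +0000 2012"}
converted = with_parsed_date(example)
```

PyMongo stores Python datetime objects as BSON dates, so converting the field before import makes range queries and sorting by date possible, which a plain string field does not support reliably.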
Background Readings
Matthew A. Russell (2013). “Chapter 6: Mining Mailboxes: Analyzing Who’s Talking to Whom About What, How Often, and More” in Mining the Social Web (2nd edition). pp. 225-278. O’Reilly.
Computational Social Science, Digital Methods and Big Data in Political Science: Week 3
In the third week of the course, we focus on how to collect data from Twitter’s REST APIs. In this session we closely follow the script provided by Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort, and Abhijit Dasgupta in the first part of Chapter 10 of their book Practical Data Science Cookbook (pp. 307-327). Lutz (2013) serves as background reading for the Python code used in this example.
Mandatory Readings
Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort, Abhijit Dasgupta. (2014). Chapter 10: Harvesting and Geolocating Twitter Data (Python). Practical Data Science Cookbook. Packt Publishing.
Background Readings
Mark Lutz. (2013). Learning Python. 5th Edition. O’Reilly Media, Inc.
Computational Social Science, Digital Methods and Big Data in Political Science: Week 2
In the second week, the course focuses on introducing students to the concepts of Computational Social Science, Digital Methods, and Big Data. For this, we use articles by Cioffi-Revilla (2010) and Rogers (2010). In addition to these surveys of the field, two articles by Conover et al. (2012) and Hanna et al. (2013) serve as examples of how one can approach research questions in political science by using Twitter data, either through network analysis or time series analysis. Cioffi-Revilla (2014), Lazer et al. (2009), and Rogers (2013) serve as background reading.
Mandatory Readings
Cioffi-Revilla, C. 2010. “Computational social science”. Wiley Interdisciplinary Reviews: Computational Statistics 2(3): 259–271.
Conover, M. D., Gonçalves, B., Flammini, A., & Menczer, F. 2012. Partisan asymmetries in online political activity. EPJ Data Science 1(6), 1–19. doi: 10.1140/epjds6
Hanna, A., Wells, C., Maurer, P., Shah, D.V., Friedland, L., & Mattes, J. 2013. Partisan alignments and political polarization online: A computational approach to understanding the French and US presidential elections. In I. Weber, A.M. Popescu, & M. Pennacchiotti (Eds.), PLEAD 2013: Proceedings of the 2nd workshop politics, elections and data (pp. 15–21). New York, NY: ACM.
Rogers, R. 2010. “Internet Research: The Question of Method”. Journal of Information Technology and Politics 7(2-3): 241-260.
Optional readings
Cioffi-Revilla, C. 2014. Introduction to Computational Social Science: Principles and Applications. Heidelberg, DE et al.: Springer.
Lazer D., Pentland A., Adamic L., Aral S., Barabási A.L., Brewer D., Christakis N., Contractor N., Fowler J., Gutmann M., Jebara T., King G., Macy M.W., Roy D., Alstyne M.V. 2009. Computational social science. Science 323(5915): 721–723.
Rogers, R. 2013. Digital Methods. Cambridge, MA: MIT Press.