Current Research

I work at the intersection of Social Science and Computer Science. Computer Science has a lot to offer in its algorithmic approach to analysing data. At the same time, it often lacks an understanding of inference or data utility, perspectives that are well established in Social Science.

Recent advances in collecting and analysing human data have revolutionised the way in which we can do research about societies, and I am really excited about these opportunities.

This is what is currently going on in the lab.

1 Text-as-Data

Scaling Lower Court Decisions: Together with Benjamin Engst and Thomas Gschwend, we propose a model that measures legal decisions of lower courts using widely available citations of legal sources. We show that some German Landgerichte indeed have a bias and use it to engage in ‘forum selling’. [paper]

How Presidents Answer the Call of International Capital: In this work with David Doyle and Nina Wiesehomeier, we show how Latin American presidents use their state-of-the-union speeches to communicate strategically with international capital markets. [paper]

New Research Methods for the Analysis of Party Pledges

In the MiMac project, I collaborate with ten colleagues from six international universities to investigate parties’ electoral pledges during election campaigns. We are an interdisciplinary team of Computational Linguists and Social Scientists, led by Elin Naurin (Gothenburg) and Robert Thomson (Monash) and funded by the Swedish Riksbankens Jubileumsfond with €1.15 million. We are developing AI-powered tools that will enable researchers to examine parties’ campaign promises in large amounts of text and speech.

Together with Jac Larner and Fraser McMillan, we are collecting and annotating the electoral pledges of parties in Wales. Using and refining these new NLP tools, we are testing theories about devolved politics and single-party systems.

2 Synthetic Data

Really Useful Synthetic Data — Promises and Challenges of Releasing Sensitive Information With Differentially Private Data Synthesizers: Marcel Neunhoeffer and I develop a framework to measure the utility of differentially private synthetic data. [paper]

3 Images-as-Data

Detecting Election Irregularities with Machine Learning for Visual Data

This is a larger project with Michelle Brown (NDI), J. Andrew Harris (NYU AD) and Zach Warner (Purdue), in which we use computer vision to detect irregularities in election results. Funded by the ESRC, we are collaborating with the National Democratic Institute to improve electoral integrity in fragile democracies around the world.

Hidden in Plain Sight? Detecting Electoral Irregularities Using Statutory Results: In this first paper, we argue that the literature on electoral fraud suffers from its reliance on an ideal election as a baseline case, and we delineate an alternative data-generating process for the irregularities often seen in developing democracies: benign human error. Using computer vision and deep learning tools, we identify statutory irregularities for each of 30,000 polling stations in Kenya’s 2013 presidential election. [paper]