Political Expression of Academics on Twitter. Joint with Thiemo Fetzer.
Nature Human Behaviour, 2025. Access: Paper 🔓, Website, Twitter Thread. Paper's Backstory
Coverage: Marginal Revolution, Matthew Yglesias, Noahpinion.blog, VoxEU, Times Higher Education (article; Op-ed), American Saga
Academics have traditionally played a vital role in both the generation and dissemination of knowledge, ideas and narratives. Social media, relative to traditional media, provides new and more direct ways of communicating science. Yet, since not all academics may engage with social media, the sample that does so may have an outsize influence on shaping public perceptions of academia more broadly through at least two channels: the set of topics they engage with and the style and tone of their communication. This paper describes patterns in academics' expression online found in a newly constructed global dataset covering over 100,000 scholars that links their social media content to their academic records. We document large and systematic variation in politically salient academic expression concerning climate action, cultural, and economic concepts. We show that these often appear to diverge from general public opinion in both topic focus and style.
Local Decline and Populism. Joint with Thiemo Fetzer and Jacob Edenhofer.
Economics Letters, 2025. Access: Paper 🔓, Twitter Thread.
Coverage: FAZ (German), The Conversation, VoxEU, Uni of Warwick, CAGE (1, 2)
Support for right-wing populist parties is characterised by considerable regional heterogeneity and especially concentrated in regions that have experienced economic decline. It remains unclear, however, whether the spatial externalities of local decline, including homelessness and crime, boost support for populist parties, even among those not directly affected by such decline. In this paper, we contribute to filling this gap in two ways. First, we gather novel data on a particularly visible form of local decline, high-street vacancies, covering 83,000 premises in England and Wales. Second, we investigate the influence of local decline on support for the right-wing populist UK Independence Party (UKIP) between 2009 and 2019. We find a significant positive association between high-street vacancy rates and UKIP support. These results enhance our understanding of how changes in the lived environment shape political preferences and behaviour, particularly in relation to right-wing populism.
Network Determinants of Cross-Border Media Coverage of Natural Disasters. Joint with Thiemo Fetzer
R&R at Nature Human Behaviour. Access: Paper
Climate change is increasing the frequency and severity of natural disasters worldwide. Media coverage of these events may be vital to generate empathy and mobilize global populations to address the common threat posed by climate change. Using a dataset of 466 news sources from 123 countries, covering 135 million news articles since 2016, we apply an event study framework to measure cross-border media activity following natural disasters. Our results show that while media attention rises after disasters, it is heavily skewed towards certain events, notably earthquakes, accidents, and wildfires. In contrast, climatologically salient events such as floods, droughts, or extreme temperatures receive less coverage. This cross-border disaster reporting is strongly related to the number of deaths associated with the event, especially when the affected populations share strong social ties or genetic similarities with those in the reporting country. Achieving more balanced media coverage across different types of natural disasters may be essential to counteract skewed perceptions. Further, fostering closer social connections between countries may enhance empathy and mobilize the resources necessary to confront the global threat of climate change.
Causal Claims in Economics. Joint with Thiemo Fetzer.
View Paper. Open-Access Data. Twitter Thread (v1, v2), Summary and Method Guide.
Coverage: The Economist, Marginal Revolution (v1, v2), Noahpinion, World Bank, VoxEU (1, 2), VoxDev, Australian Treasury, Nada es Gratis, Correio Braziliense, causalpython.io, econometriafacil, Phenomenal World
Interactive Website (www.causal.claims) includes open data on claims from 45K papers, an interactive tool to search the knowledge graph of your papers, and CClARA (a Causal Claim Research Assistant for graph-driven literature review).
We analyze over 44,000 NBER and CEPR working papers from 1980 to 2023 using a custom language model to construct knowledge graphs that map economic concepts and their relationships. We distinguish between general claims and those documented via causal inference methods (e.g., DiD, IV, RDD, RCTs). We document a substantial rise in the share of causal claims, from roughly 4% in 1990 to nearly 28% in 2020, reflecting the growing influence of the "credibility revolution." We find that causal narrative complexity (e.g., the depth of causal chains) strongly predicts both publication in top-5 journals and higher citation counts, whereas non-causal complexity tends to be uncorrelated or negatively associated with these outcomes. Novelty is also pivotal for top-5 publication, but only when grounded in credible causal methods: introducing genuinely new causal edges or paths markedly increases both the likelihood of acceptance at leading outlets and long-run citations, while non-causal novelty exhibits weak or even negative effects. Papers engaging with central, widely recognized concepts tend to attract more citations, highlighting a divergence between factors driving publication success and long-term academic impact. Finally, bridging underexplored concept pairs is rewarded primarily when grounded in causal methods, yet such gap filling exhibits no consistent link with future citations. Overall, our findings suggest that methodological rigor and causal innovation are key drivers of academic recognition, but sustained impact may require balancing novel contributions with conceptual integration into established economic discourse.
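As a rough illustration of two quantities mentioned in this abstract, the share of causal claims and the depth of causal chains, here is a minimal sketch. It assumes a paper's knowledge graph is given as (source, target, is_causal) triples and that the causal subgraph is acyclic; the edge format, function names and example concepts are hypothetical and are not taken from the paper's pipeline.

```python
from functools import lru_cache


def causal_share(edges):
    """Fraction of edges flagged as causal (0.0 for an empty graph)."""
    if not edges:
        return 0.0
    return sum(1 for _, _, is_causal in edges if is_causal) / len(edges)


def longest_causal_chain(edges):
    """Number of edges on the longest path using causal edges only.

    Assumption: the causal subgraph has no cycles.
    """
    graph = {}
    for src, dst, is_causal in edges:
        if is_causal:
            graph.setdefault(src, []).append(dst)

    @lru_cache(maxsize=None)
    def depth(node):
        # A node with no outgoing causal edge ends the chain.
        return max((1 + depth(nxt) for nxt in graph.get(node, [])), default=0)

    return max((depth(node) for node in graph), default=0)


# Hypothetical toy graph for one paper.
example_edges = [
    ("minimum wage", "employment", True),
    ("employment", "consumption", True),
    ("inflation", "interest rates", False),
]
share = causal_share(example_edges)
chain = longest_causal_chain(example_edges)
```

In this toy graph two of the three edges are causal and the longest causal chain has two links.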
AI-Generated Production Networks: Measurement and Applications to Global Trade. Joint with Thiemo Fetzer, Peter John Lambert, Bennet Feld.
Access: Paper. Website. Twitter Thread.
Coverage: VoxEU, Interview by SCMP, The Ecologist
Interactive Website (aipnet.io) includes open data on input-output links between 5000 HS products.
This paper leverages generative AI to build a network structure over 5,000 product nodes, where directed edges represent input-output relationships in production. We lay out a two-step ‘build-prune’ approach using an ensemble of prompt-tuned generative AI classifications. The ‘build’ step provides an initial distribution of edge predictions; the ‘prune’ step then re-evaluates all edges. With our AI-generated Production Network (AIPNET) in tow, we document a host of shifts in the network position of products and countries during the 21st century. Finally, we study production network spillovers using the natural experiment presented by the 2017 blockade of Qatar. We find strong evidence of such spillovers, suggestive of on-shoring of critical production. This descriptive and causal evidence demonstrates some of the many research possibilities opened up by our granular measurement of product linkages, including studies of on-shoring, industrial policy, and other recent shifts in global trade.
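The two-step logic described above can be sketched in a few lines. This is only an illustration under stated assumptions: each step is represented by a dictionary of 0/1 votes per candidate edge (standing in for ensemble classifications), and an edge survives a step when the share of positive votes reaches a threshold. The function name, the vote format and the 0.5 threshold are hypothetical, not the paper's actual procedure.

```python
def build_prune(candidate_edges, build_votes, prune_votes, threshold=0.5):
    """Keep edges that pass a 'build' vote and then a 'prune' re-evaluation.

    build_votes and prune_votes map an (input, output) edge to a list of
    0/1 classifications; both are hypothetical stand-ins for model calls.
    """
    def share(votes):
        return sum(votes) / len(votes) if votes else 0.0

    # Build: initial edge predictions from the first ensemble.
    built = [e for e in candidate_edges
             if share(build_votes.get(e, [])) >= threshold]
    # Prune: re-evaluate every built edge with a second round of votes.
    return [e for e in built if share(prune_votes.get(e, [])) >= threshold]


# Hypothetical toy example with three candidate edges.
edges = [("iron ore", "steel"), ("steel", "cars"), ("cars", "iron ore")]
build = {edges[0]: [1, 1, 0], edges[1]: [1, 1, 1], edges[2]: [0, 0, 1]}
prune = {edges[0]: [1, 1, 1], edges[1]: [0, 0, 1]}
kept = build_prune(edges, build, prune)
```

Here only the edge from iron ore to steel survives both steps.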
Politicized Scientists: Credibility Cost of Political Expression on Twitter. Joint with Eleonora Alabrese and Francesco Capozza
Access: Paper, Twitter Thread.
Coverage: Times Higher Education (article, op-ed), TheAmericanSaga, University of Bath, Italian media (Nadaesgratis, A Fuoco, iL Post)
The study measures scientists’ polarization on social media and its impact on public perceptions of their credibility. Analyzing 98,000 scientists on Twitter from 2016 to 2022 reveals significant divergence in expressed political opinions. An experiment assesses the impact of online political expression on a representative sample of 1,700 U.S. respondents, who rated vignettes with synthetic academic profiles varying scientists’ political affiliations based on real tweets. Politically neutral scientists are viewed as the most credible. Strikingly, on both the ‘left’ and the ‘right’ of the politically neutral position, there is a monotonic penalty for scientists displaying political affiliations: the stronger their posts, the less credible their profile and research are perceived to be, and the lower the public’s willingness to read their content. The penalty varies with respondents’ political leanings.
Why Academics Are Leaving Twitter for Bluesky. Joint with Dorian Quelle, Frederic Denker and Alexandre Bovet
Access: Paper. Best Student Paper at NetSciSci2025
Coverage: Aporia Magazine
We analyse the migration of 300,000 academic users from Twitter/X to Bluesky between 2023 and early 2025, combining rich bibliometric data, longitudinal social-media activity, and a novel cross-platform identity-matching pipeline. We show that 18% of scholars in our sample transitioned, with transition rates varying sharply by discipline, political expression, and Twitter engagement but not by traditional academic metrics. Using time-varying Cox models and a matched-pairs design, we isolate genuine peer influence from homophily. We uncover a striking asymmetry whereby information sources drive migration far more powerfully than audience, with this influence decaying exponentially within a week. We further develop an ego-level contagion classifier, revealing that simple contagion drives two-thirds of all exits, shock-driven bursts account for 16%, and complex contagion plays a marginal role. Finally, we show that scholars who rebuild a higher fraction of their former Twitter networks on Bluesky remain significantly more active and engaged. Our findings provide new insights into theories of network externalities, directional influence, and platform migration, highlighting information sources’ central role in overcoming switching costs.
AI health advice accuracy varies across languages and contexts. Joint with Thiemo Fetzer
Access: Paper
Using basic health statements authorized by UK and EU registers and ~9,100 journalist-vetted public-health assertions on topics such as abortion, COVID-19 and politics from sources ranging from peer-reviewed journals and government advisories to social media and news across the political spectrum, we benchmark seven leading large language models in 21 languages. We find that, despite high accuracy on English-centric textbook claims, performance falls in multiple non-European languages and fluctuates by topic and source. This highlights the urgency of comprehensive multilingual, domain-aware validation before deploying AI in global health communication.
On Bob Dylan: A Computational Perspective.
Commissioned by Aeon/Psyche. Access: Paper, Thread on Twitter or Bluesky
Cass Sunstein's essay 'On Bob Dylan' describes Dylan's 'dishabituating' style -- a constant refusal to conform to expectation and a penchant for reinventing his musical and lyrical identity. In this paper, I extend Sunstein's observations through a large-scale computational analysis of Dylan's lyrics from 1962 to 2012. Using o3-mini-high (a large language model), I extract concept-to-concept relationships from the lyrics and construct directed knowledge graphs that capture Dylan's thematic structure. I then quantify shifts in sentiment, metaphorical expression, thematic diversity, and network complexity over time. The results indicate that Dylan's lyrics increasingly rely on metaphor, display an evolving sentiment profile, and exhibit heightened dishabituation -- measured here as a growing variance in the network centrality of key concepts. I also find that references to movement, protest, and mythic imagery fluctuate in ways that align with well-known phases of Dylan's career, reflecting the dynamic and unpredictable quality of his art. These findings not only deepen our empirical understanding of Sunstein's thesis but also introduce a novel computational method for analyzing an artist's evolution, offering broader applicability to the study of cultural and creative change.
This open‑source notebook collection and slides demonstrate two complementary LLM paradigms, retrieval and generation, for turning raw text into structured, research‑ready data.
Retrieval notebooks show how to mine large document corpora to extract causal edges, stance labels, demographic attributes and other key fields (e.g., the pipeline powering www.causal.claims).
Generation notebooks start from minimal seed prompts and leverage the model’s prior to build production networks, innovation profiles and context‑aware keyword dictionaries (see aipnet.io and www.academicexpression.online).
Across both strands you will find hands‑on modules for prompt engineering, JSON‑schema enforcement, cost‑efficient batch calling, embedding‑based code mapping (HS6 / JEL) and validation routines such as modal voting and cosine sanity checks. By the end, users can scale or adapt each workflow—whether analysing messy policy PDFs or constructing supply‑chain graphs—while keeping costs predictable and outputs auditable.
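To give a flavour of the validation routines named above, here is a minimal, standard-library sketch of modal voting and a cosine sanity check. It is an illustration only, not code from the notebooks: the function names, the example labels and the idea of comparing two embedding vectors are assumptions.

```python
import math
from collections import Counter


def modal_vote(labels):
    """Return the most frequent label and its share of all votes.

    Ties are resolved in favour of the label seen first.
    """
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical usage: three repeated model calls on the same passage,
# then a similarity check between two made-up embedding vectors.
winner, agreement = modal_vote(["causal", "causal", "non-causal"])
same_direction = cosine([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine([1.0, 0.0], [0.0, 3.0])
```

A low agreement share or a low cosine score would flag an output for manual review; any cut-off is a choice for the user.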
Cross-Border Regulatory Enforcement and Innovation Spillovers (Job Market Paper)
<Preliminary and Incomplete>
Global supply chains now provide most U.S. drugs, devices, and high-risk foods, yet fewer than 2 percent of foreign plants are inspected annually. I merge FDA audit records from 2014 to 2025 with 53 million import-shipment lines covering 2.4 million global facilities. A staggered difference-in-differences analysis reveals that even a clean audit induces substantial product exits and suppresses new introductions, thereby narrowing product portfolios and shifting output toward closely related product lines. I model a plant’s first regulatory inspection as a belief shock to its perceived regulatory burden, sharply increasing its perceived probability of future inspections. Subsequent formal regulatory violations add little beyond this initial effect. These outcomes illustrate the inherent balancing act policymakers face: tightening enforcement enhances quality but reallocates innovation and increases vulnerability within global supply networks.
Conspiracy Theories. Joint with Thomas Graeber and Christopher Roth
<Preliminary; Draft available on request>
We analyze all posts from major conspiracy forums on a large social media platform between 2008 and 2025 to study the content and origins of conspiracy theories. Using recent advances in large language models, we construct structured graphical representations that capture the events, actors, and interrelations featured in each theory. We first compare conspiratorial narratives to political narratives on the same platform and establish three stylized facts: conspiracies contain fewer truthful claims, are more complex, draw on more central actors and feature more novel content. Popularity in political narratives is positively linked to truthfulness, penalized by complexity, and increased by novelty. In conspiratorial narratives, these patterns are offset: popularity is less strongly tied to truthfulness, while the negative associations with complexity and novelty are attenuated or even reversed. Finally, leveraging complete posting histories of users, we recover a rich database of their personal experiences. Using a matched weekly difference-in-differences design, we show that adverse life experiences, such as job loss or financial shocks, differentially increase the likelihood of conspiratorial posts.
Geography of Medical Knowledge. Joint with Hongyu Zhang and Thiemo Fetzer
<Preliminary>
Medical evidence is overwhelmingly produced in high‑income settings, raising concerns that global research priorities may not align with global health needs. We built a geography‑aware knowledge graph that links clinical articles from the top 500 medical journals to the diseases they study, the countries whose data they analyze and the institutional homes of their authors. Large‑language‑model extraction and funder classification allow direct comparison between publication output, national disease burden and source of support across 204 countries and 22 major disease groups. The elasticity of research output with respect to domestic Disability‑Adjusted Life Years has more than doubled since 1990, signalling a steady shift toward needs‑driven science. Yet lower‑income countries still provide <1% of authorship, and maternal–neonatal disorders, nutritional deficiencies and several infectious diseases remain under‑studied. Philanthropic funders preferentially target neglected burdens, whereas corporate R&D focuses on profitable chronic conditions. Quasi‑experimental evidence from over 3,000 WHO disease outbreak alerts confirms that sudden health shocks trigger rapid, durable increases in both domestic and global research. Together, these results reveal a scientific system that is becoming more inclusive but remains uneven, and offer a scalable framework for tracking progress in real time.
Chatting about Innovation. Joint with Ralf Martin
<Preliminary and Incomplete>
Patent records are the workhorse indicator of corporate innovation, but they miss large swathes of inventive activity, especially in lower-income economies and environmentally oriented technologies. We extract innovation signals for 4 million firms using a large language model (LLM), starting from only firm names and locations. At the country–industry level, LLM-based indicators closely track official patent counts. We define the innovation gap as the share of firms introducing novel products or processes minus the share with any patent, and the green innovation gap as the analogous difference for green introductions. Both gaps are positive, with the green gap substantially larger on average. The overall gap declines with income per capita. Composition differs: patenters concentrate in codifiable technical families (imaging, sensors, batteries) in manufacturing and utilities, while non-patenters drive adoption in environmental and service categories across consumer-facing sectors. We further classify process innovations on the labor margin into labor-saving (automation/substitution) and labor-augmenting (complementarity/assistance). Patenting firms are much more likely to introduce labor-saving processes (47.6% vs. 27.4% among non-patenters), while non-patenters are somewhat more likely to introduce labor-augmenting processes (9.7% vs. 6.7%). A Net Labor Effect index summarizing this balance correlates positively with patenting. Together, the results show that patent statistics are informative but systematically incomplete, and that LLM-based text extraction offers a scalable complement for tracking any-introduction and frontier-novel (including green) activity across countries and industries.
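The innovation gap defined in this abstract is a simple difference of shares, which the following sketch makes concrete. The record fields (`introduces_novel`, `has_patent`) and the toy data are hypothetical; the green innovation gap would be computed the same way, using an analogous flag for green introductions.

```python
def innovation_gap(firms):
    """Share of firms introducing novel products or processes
    minus the share of firms with any patent."""
    n = len(firms)
    if n == 0:
        raise ValueError("need at least one firm")
    share_innovating = sum(1 for f in firms if f["introduces_novel"]) / n
    share_patenting = sum(1 for f in firms if f["has_patent"]) / n
    return share_innovating - share_patenting


# Hypothetical country-industry cell with four firms.
cell = [
    {"introduces_novel": True, "has_patent": True},
    {"introduces_novel": True, "has_patent": False},
    {"introduces_novel": True, "has_patent": False},
    {"introduces_novel": False, "has_patent": False},
]
gap = innovation_gap(cell)
```

With three of four firms innovating and one of four holding a patent, the gap in this toy cell is 0.5.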
Self-Censorship of US Academics. Joint with Patrick Warren and Vitor Melo