

This paper studies causal inference with observational network data. A challenging aspect of this setting is the possibility of interference in both potential outcomes and selection into treatment, for example due to peer effects in either stage. We therefore consider a nonparametric setup in which both stages are reduced forms of simultaneous-equations models. This results in high-dimensional network confounding, where the network and covariates of all units constitute sources of selection bias. The literature predominantly assumes that confounding can be summarized by a known, low-dimensional function of these objects, and it is unclear what selection models justify common choices of functions. We show that graph neural networks (GNNs) are well suited to adjust for high-dimensional network confounding. We establish a network analog of approximate sparsity under primitive conditions on interference. This demonstrates that the model has low-dimensional structure that makes estimation feasible and justifies the use of shallow GNN architectures.

Media Coverage: Link 1, Link 2, Link 3, Link 4, Link 5, Link 6, Link 7

Digital platforms have revolutionized how illegal drug trafficking takes place. Modern drug dealers use social network platforms, such as Instagram and TikTok, as direct-to-consumer marketing tools. Beyond marketing, drug dealers also use fintech payment apps to conduct financial transactions with their clients. In this work, we leverage a large dataset from Venmo to investigate the digital money trail of drug dealers and the social networks they create. Using text and social network analytics, we identify two types of illicit users, mixed-activity participants and heavy drug traffickers, and build a random forest classifier that accurately predicts both types of illicit nodes. We then investigate the social network structure of drug dealers on Venmo and find that heavy drug traffickers exhibit network characteristics similar to those reported in the prior literature on drug trafficking networks. However, mixed-activity participants exhibit different network structure characteristics, including a higher clustering coefficient, suggesting that they may be accessing multiple networks and bridging those networks through their illicit activities. Our findings highlight the importance of distinguishing between these two types of illicit users and provide law enforcement agencies with valuable insights that can aid in combating illegal drug transactions in digital payment apps.
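As a rough illustration of the clustering coefficient mentioned above, the sketch below computes the standard local clustering coefficient of a node in an undirected graph: the share of pairs of a node's neighbors that are themselves connected. The abstract does not say which definition the study uses, so this choice is an assumption, and the toy graph and the function name `local_clustering` are hypothetical, not taken from the Venmo data or the paper's code.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Local clustering coefficient of `node` in an undirected graph.

    `adj` maps each node to the set of its neighbors (assumed symmetric).
    Returns the fraction of neighbor pairs that are directly connected.
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        # With fewer than two neighbors there are no pairs to close.
        return 0.0
    # Count edges that exist among the neighbors of `node`.
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy graph: a triangle a-b-c plus a pendant node d attached to a.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

coefficients = {node: local_clustering(toy_graph, node) for node in toy_graph}
```

In this toy graph, node `a` has three neighbors but only one connected pair among them, so its coefficient is lower than that of `b` or `c`, whose two neighbors are linked to each other.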

Media Coverage: Link 1, Link 2, Link 3, Link 4, Link 5, Link 6

We empirically investigate the harbinger of failure phenomenon in the motion picture industry by analyzing pre-release movie reviews written by film critics. We find that harbingers of failure do exist. Their positive pre-release movie reviews provide a strong predictive signal that the movie will turn out to be a flop. This signal persists even for the top critic category, which usually consists of professional reviewers, indicating that having expertise in a professional domain does not necessarily lead to correct predictions. Our findings challenge the current belief that positive reviews always help enhance box office revenue. Moreover, they shed new light on the influencer reviewer hypothesis, which asks whether critics actually influence the popularity of a movie or are merely able to predict it. We observe that, at least in a pre-release setting, harbinger critics are not influencing the outcome but rather mispredicting it, since if the opposite were true, harbingers' reviews could turn a flop into a success. We further analyze the writing style of harbingers and provide new insights into their personality traits and cognitive biases.

In a digital era where personal data is as valuable as currency, the management of privacy in peer-to-peer (P2P) platforms has become an ever-pressing issue. Our research delves into the dichotomy of data utility and privacy within Venmo, a leading P2P payment platform. We analyze a novel dataset comprising 200,000 users to discern the predictive capabilities of social network metrics on individual user behavior and privacy. We find that social metrics serve as a crucial predictive tool, particularly for new users lacking historical data, thus providing a solution to the cold-start problem common in digital services. The findings reveal that while user-specific behavior grows in predictive strength over time, the social network's structure retains substantial predictive accuracy throughout a user's lifecycle. This emphasizes the durability of social metrics as an analytical tool, alongside the privacy risks that accompany the indirect exposure of individual behavior. The study provides a detailed account of how social connections inform user activity and the challenges of maintaining privacy when one's network chooses to share data publicly. We further distinguish between structural and transactional social metrics, noting the nuanced differences in their predictive performance. Structural metrics, which include centrality and cohesion, demonstrate consistent predictive power, while transactional metrics, though weaker, remain significant. Our research has important managerial implications, suggesting that P2P platforms should dynamically adjust privacy safeguards as the relevance of user data evolves. The insights also guide the development of engagement strategies that capitalize on network analytics to cultivate active user bases, while underscoring the need for privacy measures that mitigate indirect data exposure risks.

This work investigates the structure and evolution of Venmo, a peer-to-peer payment application. Venmo is a unique social network in the sense that the edges among nodes represent financial transactions among individuals who shared an offline social interaction. We report two main findings. First, the degree distributions do not follow a power law, consistent with previous studies showing that real-world social networks are rarely scale-free. Second, we examine the "topological" version of the small-world hypothesis and find that Venmo users are separated by a mean of 5.9 steps and a median of 6 steps, in line with Milgram's hypothesis.
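To make the separation statistic concrete, here is a minimal sketch of how a mean and median number of steps between users could be computed with breadth-first search on an unweighted, undirected graph. It is an illustration under stated assumptions, not the study's actual pipeline: the toy graph, the function names `bfs_distances` and `separation_stats`, and the choice to consider only reachable pairs are all hypothetical.

```python
from collections import deque
from statistics import mean, median


def bfs_distances(adj, source):
    """Shortest-path length (in steps) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def separation_stats(adj):
    """Mean and median shortest-path length over all reachable node pairs.

    Each unordered pair is visited once from each endpoint; counting every
    pair twice leaves both the mean and the median unchanged.
    """
    lengths = []
    for source in adj:
        for target, steps in bfs_distances(adj, source).items():
            if target != source:
                lengths.append(steps)
    return mean(lengths), median(lengths)


# Hypothetical toy network: a simple chain a - b - c - d.
toy_graph = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
}

avg_steps, median_steps = separation_stats(toy_graph)
```

Running an exact all-pairs search like this scales poorly with network size, so on a graph with millions of users one would more plausibly sample source nodes and estimate the statistics from the sample.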

This paper is the outcome of our group's participation in the 11th Triennial Choice Symposium. Our group discussed the emerging topic of data ethics and the choices we have as individuals. The paper explores how data policies drive technological, organizational, and economic decisions made by digital platforms and consumers, and highlights the need for education on digital and data interactions.
