Data for: Improving Linkage of Datasets on Organizations Using Half-a-Billion Open-Collaborated Records
Description: Repository contains millions of training examples of positive and negative organizational name matches derived from the LinkedIn corpus. It also contains two distinct network representations of the relationships between those names. Data are useful for fine-tuning LLMs or building large-scale organizational name-match models.
[Dataverse] [Preprint]

Data for: The Composition of Descriptive Representation
Description: Repository contains information on the gender, ethnicity, religion, and language of political leaders across the globe. Other leader features include age, educational attainment, and birthplace. Merged population-level statistics are also provided for the different group types.
[Dataverse] [Article]

Data for: Image-based Treatment-effect Heterogeneity
Description: Repository contains experimental data for an anti-poverty intervention in Uganda, along with geo-referenced satellite imagery for each experimental unit.
[Dataverse] [Article]

Data for: Integrating Earth Observation Data into Causal Inference: Challenges and Opportunities
Description: Repository contains observational data for aid interventions in Nigeria, along with geo-referenced satellite imagery for each observational unit.
[Dataverse] [Preprint]

Data for: An Improved Method of Automated Nonparametric Content Analysis for Social Science
Description: Repository contains data for 74 text analysis tasks related to politics and public opinion.
[Dataverse] [Article]

Data for: The Impact of a Transportation Intervention on Electoral Politics: Evidence from E-ZPass
Description: Repository contains data on a transportation intervention in New Jersey and Pennsylvania, along with changes in housing values, electoral outcomes, and individual-level campaign contribution behavior.
[Dataverse] [Article]

