Small-Area Global Elections (SAGE) Dataset

by Noah Dasanaike

SAGE collects lower-house parliamentary and presidential election returns for 110 countries at the smallest unit each electoral authority publishes, which for most countries is a polling station with fewer than a thousand voters. It geocodes the units and harmonizes party and candidate identifiers across languages. In total the data contain roughly ten billion votes cast across more than eight million units.

Formats & access

The data come in two formats. Tabular files (Parquet), partitioned by country and year and including coordinates, total about 1 GB; this is what most analyses use. Polygon files (GeoParquet), one per country, support choropleth mapping. Companion R and Python packages load a country-and-year slice in one line:

R:      remotes::install_github("noahdasanaike/sage", subdir = "sage_R")
        sage_load("Germany", years = 2021)

Python: pip install "git+https://github.com/noahdasanaike/sage.git#subdirectory=sage_python"
        import sage; sage.sage_load("Germany", years=2021)

The codebook documents per-country coverage, sources, and every column. R users who want polygon geometry joined to vote rows inline can load the native sf objects; see the code repository.
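
Once a slice is loaded, a common first step is to aggregate unit-level counts into party vote shares. The following is a minimal sketch in plain Python, not the packages' own method: the rows are made up, and the column names unit_id, party, and votes are hypothetical stand-ins (the codebook lists the real ones).

```python
from collections import defaultdict

# Made-up rows standing in for a loaded slice. The column names
# (unit_id, party, votes) are hypothetical, not SAGE's actual schema.
rows = [
    {"unit_id": "A", "party": "X", "votes": 90},
    {"unit_id": "A", "party": "Y", "votes": 10},
    {"unit_id": "B", "party": "X", "votes": 300},
    {"unit_id": "B", "party": "Y", "votes": 600},
]


def vote_shares(rows):
    """Sum votes per party across all units, then divide by the grand total."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["party"]] += row["votes"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No votes at all: shares are undefined, so return an empty mapping.
        return {}
    return {party: count / grand_total for party, count in totals.items()}


shares = vote_shares(rows)
print(shares)
```

Summing counts before dividing weights each unit by its size. Averaging the per-unit shares instead would treat the 100-voter unit A and the 900-voter unit B equally and give a different answer for the same rows.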

Releases happen periodically as countries, election years, and corrections are added.

Download by country

Each link downloads that country's results as a Parquet bundle (votes, coordinates, and party identifiers, partitioned by year). Polygon boundaries are large and distributed separately; load them through the R and Python packages, the polygons/ tree, or the interactive map. The canonical, citable copy of the entire dataset is the Harvard Dataverse deposit.

Release v1.0

June 30, 2026

[codebook] [country-level coverage]

v1.0 (June 30, 2026): Public release; conditionally accepted at Scientific Data
v0.895 (September 30th, 2025): Fixed Croatian vote-share aggregation
v0.89 (August 25th, 2025): Fixed party names in Chile and Slovakia; minor party name adjustments elsewhere
v0.88 (July 14th, 2025): Added 2025 South Korean presidential election; fixed Danish eligible voter and turnout calculation
v0.875 (July 9th, 2025): Added Portugal parish boundaries back to 1976; fixed independent candidate reference
v0.87 (May 30th, 2025): Added elections for the Kyrgyz Republic, Liechtenstein, Mauritania, and Uganda, bringing the total to 110 countries
v0.86 (May 6th, 2025): Added registered voter turnout in the United States for 2012 to 2020
v0.85 (April 5th, 2025): Redid all of the United States; presidential results only from 2008 to 2024
v0.81 (March 24th, 2025): Redid all of Greece from scratch, and added 2023 elections
v0.80 (March 17th, 2025): Added registered/eligible voter counts to elections in Canada, Denmark, Finland, and Hungary
v0.78 (March 12th, 2025): Changed party columns corresponding to candidates in Myanmar to candidate columns
v0.75 (February 4th, 2025): Added turnout for most Russian elections
v0.7 (January 24th, 2025): Fixed Bosnia and Herzegovina Thiessen polygons
v0.6 (January 15th, 2025): Fixed bug in Thiessen generation code affecting polygon edges
v0.5 (January 9th, 2025): Added 2024 election results for multiple countries
v0.4 (January 7th, 2025): Added 2014 and 2018 Hungarian elections
v0.3 (December 20th, 2024): Added Icelandic presidential elections
v0.2 (December 8th, 2024): Fixed overseas France geometry (meridian Thiessen issues)
v0.1 (September 5th, 2024): Fixed vote count totals for Uruguay and the Solomon Islands
v0.0 (August 6th, 2024): Initial completion of the data

Citation

If you use SAGE, please cite both the dataset and the accompanying paper:

Dasanaike, Noah. The Small-Area Global Elections (SAGE) Dataset. Harvard Dataverse, https://doi.org/10.7910/DVN/YGJR1L (2026).
Dasanaike, Noah. “The Small-Area Global Elections (SAGE) Dataset.” Scientific Data (forthcoming).