SAGE collects lower-house parliamentary and presidential election returns for 110 countries at the smallest unit each electoral authority publishes, which for most countries is a polling station with fewer than a thousand voters. It geocodes the units and harmonizes party and candidate identifiers across languages. In total the data contain roughly ten billion votes cast across more than eight million units.
[download the dataset] [explore on the map] [paper] [R/Python packages]
The data come in two forms. Tabular files (Parquet) are partitioned by country and year and include coordinates (about 1 GB); this is what most analyses use. Polygon files (GeoParquet), one per country, support choropleth mapping. Companion R and Python packages load a country-and-year slice in one line:
R:
    remotes::install_github("noahdasanaike/sage", subdir = "sage_R")
    sage_load("Germany", years = 2021)
Python:
    pip install "git+https://github.com/noahdasanaike/sage.git#subdirectory=sage_python"
    import sage
    sage.sage_load("Germany", years=2021)
The codebook documents per-country coverage, sources, and every column. R users who want polygon geometry joined to vote rows inline can load the native sf objects; see the code repository.
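As a rough illustration of the kind of analysis the tabular files support, the sketch below aggregates unit-level rows into party vote shares. The field names (unit_id, party, votes) and all numbers are invented for this example and are not the SAGE schema; consult the codebook for the actual columns.

```python
from collections import defaultdict

# Toy rows standing in for unit-level results. Field names and numbers are
# hypothetical, chosen only for illustration; they are not the SAGE schema.
rows = [
    {"unit_id": "A", "party": "X", "votes": 120},
    {"unit_id": "A", "party": "Y", "votes": 80},
    {"unit_id": "B", "party": "X", "votes": 50},
    {"unit_id": "B", "party": "Y", "votes": 150},
]


def vote_shares(rows):
    """Sum votes by party across all units and return each party's share."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["party"]] += row["votes"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No votes recorded: report zero shares rather than dividing by zero.
        return {party: 0.0 for party in totals}
    return {party: votes / grand_total for party, votes in totals.items()}


shares = vote_shares(rows)
print(shares)
```

With real data, the same aggregation would typically run on the table returned by the loader, grouped by whichever unit or district identifier the codebook specifies.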
Releases happen periodically as countries, election years, and corrections are added.
Each link downloads that country's results as a Parquet bundle (votes, coordinates, and party identifiers, partitioned by year). Polygon boundaries are large and distributed separately; load them through the R and Python packages, the polygons/ tree, or the interactive map. The canonical, citable copy of the entire dataset is the Harvard Dataverse deposit.
[codebook] [country-level coverage]
v1.0 (June 30th, 2026): Public release; conditionally accepted at Scientific Data
v0.895 (September 30th, 2025): Fixed Croatian vote-share aggregation
v0.89 (August 25th, 2025): Fixed party names in Chile and Slovakia; minor party name adjustments elsewhere
v0.88 (July 14th, 2025): Added 2025 South Korean presidential election; fixed Danish eligible voter and turnout calculation
v0.875 (July 9th, 2025): Added Portugal parish boundaries back to 1976; fixed independent candidate reference
v0.87 (May 30th, 2025): Added elections for the Kyrgyz Republic, Liechtenstein, Mauritania, and Uganda, bringing the total to 110 countries
v0.86 (May 6th, 2025): Added registered voter turnout in the United States for 2012 to 2020
v0.85 (April 5th, 2025): Redid all of the United States; presidential results only from 2008 to 2024
v0.81 (March 24th, 2025): Redid all of Greece from scratch and added 2023 elections
v0.80 (March 17th, 2025): Added registered/eligible voter counts to elections in Canada, Denmark, Finland, and Hungary
v0.78 (March 12th, 2025): Changed party columns corresponding to candidates in Myanmar to candidate columns
v0.75 (February 4th, 2025): Added turnout for most Russian elections
v0.7 (January 24th, 2025): Fixed Bosnia and Herzegovina Thiessen polygons
v0.6 (January 15th, 2025): Fixed bug in Thiessen generation code affecting polygon edges
v0.5 (January 9th, 2025): Added 2024 election results for multiple countries
v0.4 (January 7th, 2025): Added 2014 and 2018 Hungarian elections
v0.3 (December 20th, 2024): Added Icelandic presidential elections
v0.2 (December 8th, 2024): Fixed overseas France geometry (meridian Thiessen issues)
v0.1 (September 5th, 2024): Fixed vote count totals for Uruguay and the Solomon Islands
v0.0 (August 6th, 2024): Initial completion of the data
If you use SAGE, please cite both the dataset and the accompanying paper:
Dasanaike, Noah. The Small-Area Global Elections (SAGE) Dataset. Harvard Dataverse, https://doi.org/10.7910/DVN/YGJR1L (2026).
Dasanaike, Noah. “The Small-Area Global Elections (SAGE) Dataset.” Scientific Data (forthcoming).