Fuzzy Search Over 42,010 Philippine Barangays

Fuzzy-search the complete Philippine PSGC dataset of 42,010 barangays with the barangay package. The search is built on rapidfuzz and tolerates misspellings, abbreviations, and unstandardized addresses.
Author: bendlikeabamboo
Published: November 17, 2025

Exact matching fails the moment a user types “Tongmagen” instead of “Tongmageng”, or omits the municipality. Fuzzy search over the full Philippine Standard Geographic Code (PSGC) dataset handles both cases. The barangay package ships fuzzy search built on rapidfuzz and runs entirely offline.

Install

pip install barangay

How scoring works

search_fuzzy() scores a query against several patterns: barangay only, province+barangay, municipality+barangay, and province+municipality+barangay. It reports the maximum of those scores. A query that includes province or municipality context can match one of the longer patterns closely, which raises the score for the intended barangay.
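The max-over-patterns idea can be sketched in a few lines. This is an illustration only, not the package's implementation: it uses the standard library's difflib.SequenceMatcher as a stand-in for a rapidfuzz scorer so it runs with no dependencies. The helper names, the word order inside each pattern, and the sample record are all assumptions.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> float:
    """Case-insensitive similarity on a 0-100 scale (stand-in for a rapidfuzz scorer)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def max_pattern_score(query: str, province: str, municipality: str, barangay: str) -> float:
    """Score the query against four patterns and keep the best score."""
    patterns = [
        barangay,                                   # barangay only
        f"{province} {barangay}",                   # province + barangay
        f"{municipality} {barangay}",               # municipality + barangay
        f"{province} {municipality} {barangay}",    # province + municipality + barangay
    ]
    return max(ratio(query, p) for p in patterns)


# Illustrative record, typed by hand rather than read from the package data.
record = {"province": "Tawi-Tawi", "municipality": "Sitangkai", "barangay": "Tongmageng"}

typo_score = max_pattern_score("Tongmagen", **record)
context_score = max_pattern_score("Tawi-Tawi Tongmageng", **record)
print(round(typo_score, 1), round(context_score, 1))
```

A query with province context matches the province+barangay pattern exactly here, while the same query scored against the barangay-only pattern would be penalized for its extra words. Taking the maximum lets either style of query reach a high score.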

Tune it

search_fuzzy("San Jose", level=None, threshold=70.0, limit=10)
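The call above sets threshold and limit higher than their defaults:
  • threshold: minimum score a result needs to be kept (default 60.0).
  • limit: maximum number of results returned (default 5).
  • level: restrict results to a specific administrative level.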
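To make the effect of the three parameters concrete, here is a minimal sketch that applies them to an already-scored candidate list. The function apply_filters, the tuple layout, and the scores are hypothetical and made up for illustration. The defaults mirror the ones listed above, and treating the threshold as inclusive is an assumption.

```python
from typing import List, Optional, Tuple

# Hypothetical scored candidates: (name, admin level, score on a 0-100 scale).
# The scores are invented for illustration, not produced by the package.
SCORED: List[Tuple[str, str, float]] = [
    ("San Jose", "barangay", 100.0),
    ("San Jose", "municipality", 100.0),
    ("San Josef", "barangay", 94.1),
    ("San Juan", "barangay", 75.0),
    ("Santa Josefa", "municipality", 66.7),
    ("San Roque", "barangay", 58.8),
]


def apply_filters(
    scored: List[Tuple[str, str, float]],
    level: Optional[str] = None,
    threshold: float = 60.0,
    limit: int = 5,
) -> List[Tuple[str, str, float]]:
    """Keep rows at the requested level with score >= threshold, best first, capped at limit."""
    rows = [r for r in scored if level is None or r[1] == level]
    rows = [r for r in rows if r[2] >= threshold]
    rows.sort(key=lambda r: r[2], reverse=True)
    return rows[:limit]


for row in apply_filters(SCORED, threshold=70.0, limit=10):
    print(row)
```

Raising the threshold trims weak matches before the limit is applied, so a high limit with a high threshold can still return only a handful of rows.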

From the command line

barangay search "Tongmageng, Tawi-Tawi"

The CLI prints a ranked table including the maximum score across all matching patterns.

Use cases

  • Address autocomplete — return the top-N barangays for a partial query.
  • Data cleaning — match messy spreadsheet entries to canonical PSGC names.
  • Deduplication — group near-duplicate addresses to a single PSGC code.
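The data-cleaning case can be sketched as follows. This is not the package's API: it uses the standard library's difflib.get_close_matches as a stand-in matcher, and the three-name canonical list, the clean helper, and the 0.8 cutoff are assumptions chosen for illustration. In real use, the canonical side would come from the bundled PSGC data.

```python
from difflib import get_close_matches
from typing import Optional

# Tiny illustrative canonical list; the real dataset has 42,010 barangays.
CANONICAL = ["Tongmageng", "San Jose", "Poblacion"]
_BY_KEY = {name.lower(): name for name in CANONICAL}


def clean(entry: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a messy entry to its closest canonical name, or None if nothing is close enough."""
    key = " ".join(entry.lower().split())  # lowercase and collapse whitespace
    hits = get_close_matches(key, list(_BY_KEY), n=1, cutoff=cutoff)
    return _BY_KEY[hits[0]] if hits else None


messy = ["Tongmagen", "  san  hose ", "Xyz"]
cleaned = {entry: clean(entry) for entry in messy}
print(cleaned)
```

Entries that map to None are the ones worth reviewing by hand. For deduplication, grouping rows by the cleaned value collapses near-duplicate spellings into a single bucket.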

Because the data is bundled with the package and searched in-process, every query runs locally: you can fuzzy-search the entire country with no network latency and no rate limits.


See the fuzzy search reference and bulk lookup tutorial.