Methods

How this atlas is built

What the numbers are, where they come from, and how to read them honestly. Everything here is derived from a single cross-section of 32,409 joined ZIP/ZCTA areas, of which 32,263 are present in the current PMTiles geometry.

What the numbers are

Health outcomes are model-based small-area estimates from the CDC's PLACES ZCTA release, prepared in the local zcta_atlas.parquet file with ACS demographics and ADI context. A value is estimated prevalence for an area, not a direct count of diagnosed people.

This is area-level data, not individual medical guidance. A ZIP's estimate describes a place, not any person who lives there. Nothing here is a diagnosis, a risk score, or health advice.

ZIP codes vs. ZCTAs

The map's geography is the U.S. Census ZIP Code Tabulation Area (ZCTA). ZCTAs are generalized areal representations of USPS ZIP Code service areas; they are not official USPS delivery boundaries, and a mailing ZIP does not always correspond one-to-one to a ZCTA. The interface says “ZIP/ZCTA” for readability; if you need official mailing geography, use USPS sources instead.

Backfill and provenance

The CDC ZCTA release has source-side gaps for Pennsylvania and Kentucky. Where a native ZCTA estimate is absent and tract-level PLACES data are available, the prep pipeline uses a population-weighted tract-to-ZCTA aggregate. Each ZIP profile carries its provenance: direct estimates, mixed direct-plus-backfill, aggregate-only, or no health source.
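As a rough illustration of the population-weighted tract-to-ZCTA aggregate described above, here is a minimal sketch. It is not the pipeline's actual code: the function name backfill_zcta, the row layout, and the toy numbers are all assumptions made for this example, and the real crosswalk weights may be defined differently.

```python
from collections import defaultdict


def backfill_zcta(tract_rows):
    """Population-weighted mean of tract prevalence for each ZCTA.

    tract_rows: iterable of (zcta, prevalence_pct, weight) tuples, where
    weight is the tract population allocated to that ZCTA.
    This layout is hypothetical, not the pipeline's real schema.
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for zcta, prevalence, weight in tract_rows:
        if prevalence is None or weight <= 0:
            continue  # skip tracts with no estimate or no population
        weighted_sum[zcta] += prevalence * weight
        total_weight[zcta] += weight
    return {z: weighted_sum[z] / total_weight[z] for z in total_weight}


# Toy rows with made-up ZCTA labels and values.
rows = [
    ("A", 10.0, 3000),
    ("A", 14.0, 1000),
    ("B", 12.5, 2000),
    ("B", None, 500),  # tract without an estimate is ignored
]
estimates = backfill_zcta(rows)
print(estimates)
```

In the toy data, area A is pulled toward its larger tract (10.0) rather than sitting at the simple midpoint of 12.0, which is the point of weighting by population.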

Across the current file, 113,571 health cells are marked as backfilled. Backfilled values are useful for coverage, but they are not native CDC ZCTA estimates and do not carry native ZCTA confidence intervals.

The benchmark and the three view modes

Every measure is compared against a single national benchmark: the population-weighted national mean. The map offers three ways to read a value:

  • Rate — the estimated prevalence itself, on a luminance-ordered sequential ramp.
  • Gap vs. U.S. — the difference from the national mean, on a diverging ramp whose neutral midpoint is the benchmark (cooler = better than average, warmer = worse).
  • Percentile — the national percentile rank among ZIP/ZCTA areas with an estimate. Percentile rank is within available data only.
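The three readings can be sketched in a few lines. This is an illustrative sketch only: the helper names and toy values are invented here, and the mid-rank percentile definition is an assumption, since the text does not say how ties are handled.

```python
from bisect import bisect_left, bisect_right


def national_mean(rates, populations):
    """Population-weighted mean over areas that have an estimate."""
    pairs = [(r, p) for r, p in zip(rates, populations) if r is not None]
    total = sum(p for _, p in pairs)
    return sum(r * p for r, p in pairs) / total


def percentile_ranks(rates):
    """Mid-rank percentile (0-100) among non-missing values; None stays None."""
    observed = sorted(r for r in rates if r is not None)
    n = len(observed)
    out = []
    for r in rates:
        if r is None:
            out.append(None)  # missing areas are excluded from ranking
            continue
        below = bisect_left(observed, r)
        ties = bisect_right(observed, r) - below
        out.append(100.0 * (below + 0.5 * ties) / n)
    return out


# Toy values: the third area has no estimate.
rates = [8.0, 12.0, None, 10.0]
pops = [1000, 3000, 500, 1000]

benchmark = national_mean(rates, pops)                         # Rate benchmark
gaps = [None if r is None else r - benchmark for r in rates]   # Gap vs. U.S.
pct = percentile_ranks(rates)                                  # Percentile
print(benchmark, gaps, pct)
```

Note that the area without an estimate contributes to neither the benchmark nor the ranking, which matches "within available data only".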

Switching mode swaps only the map's color expression; switching measure re-pushes values to the existing geometry. The same color scale drives the map, the legend, and the charts, so a color means the same thing everywhere.

The deprivation gradient

The headline analytical panel groups ZIP/ZCTA areas into deciles of the Area Deprivation Index (ADI) national rank and plots the population-weighted average of the selected measure per decile, with a 95% confidence band. The “most − least deprived” figure is the most-deprived decile's average minus the least-deprived decile's. Companion panels show the distribution, the highest/lowest-burden ZIPs, and a scatter against ADI with a LOESS trend and Spearman correlations.
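The decile gradient can be sketched as follows. The function name, the equal-count binning by sorted position, and the toy data are assumptions for illustration. The atlas may instead cut deciles on fixed ADI rank ranges, and the 95% confidence band is not reproduced here because the text does not say how it is computed.

```python
def decile_gradient(adi_rank, rate, population):
    """Population-weighted mean rate per ADI decile (index 0 = least deprived)."""
    rows = [
        (a, r, p)
        for a, r, p in zip(adi_rank, rate, population)
        if a is not None and r is not None and p > 0
    ]
    rows.sort(key=lambda row: row[0])  # ascending ADI: least deprived first
    n = len(rows)
    sums = [0.0] * 10
    weights = [0.0] * 10
    for i, (_, r, p) in enumerate(rows):
        d = min(i * 10 // n, 9)  # equal-count deciles by sorted position
        sums[d] += r * p
        weights[d] += p
    return [s / w if w else None for s, w in zip(sums, weights)]


# Toy data: 20 areas, prevalence rising linearly with ADI rank.
adi = [5 * (i + 1) for i in range(20)]
rate = [5 + 0.1 * a for a in adi]
pop = [1000] * 20

means = decile_gradient(adi, rate, pop)
gap = means[9] - means[0]  # most deprived minus least deprived
print(means, gap)
```

With equal populations the weighting has no effect in this toy case; with real data, large areas dominate each decile's mean.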

Ecological, not causal. These are relationships between places, computed across areas. They do not describe individuals (assuming they do is the ecological fallacy), and a correlation (Spearman ρ here) is not evidence of cause.

Missing data

ZIP/ZCTA areas without an estimate for the selected measure are drawn in a neutral grey and excluded from percentile and ranking computations. Some rows are valid analytical ZIP/ZCTA records but do not have geometry in the current PMTiles; they appear in search and tables but cannot be clicked on the map. 105 rows have no usable health measures after source cleanup. The table below is generated from the live data manifest.

Per-measure coverage across 32,409 joined ZIP/ZCTA areas.
Measure | Denominator | Covered | Missing | Native | Mixed/backfilled | Direction
--- | --- | --- | --- | --- | --- | ---
Diabetes · Health outcomes | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
High blood pressure · Health outcomes | adults 18+ | 32,302 | 107 | 25,007 | 7,295 | Lower is better
Coronary heart disease · Health outcomes | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Stroke · Health outcomes | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
COPD · Health outcomes | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Cancer · Health outcomes | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
All teeth lost (65+) · Health outcomes | adults 65+ | 32,300 | 109 | 25,003 | 7,297 | Lower is better
Depression · Mental & functional health | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Frequent poor mental health · Mental & functional health | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Frequent poor physical health · Mental & functional health | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Fair or poor health · Mental & functional health | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Any disability · Mental & functional health | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Obesity · Health behaviors | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Smoking · Health behaviors | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Physical inactivity · Health behaviors | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Short sleep · Health behaviors | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Binge drinking · Health behaviors | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Uninsured · Access & prevention | adults 18-64 | 32,304 | 105 | 25,007 | 7,297 | Lower is better
No recent dental visit · Access & prevention | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
No annual checkup · Access & prevention | adults 18+ | 32,304 | 105 | 25,007 | 7,297 | Lower is better
Food insecurity · Health-related needs | adults 18+ | 29,238 | 3,171 | 23,771 | 5,467 | Lower is better
Housing insecurity · Health-related needs | adults 18+ | 29,238 | 3,171 | 23,771 | 5,467 | Lower is better
Transportation barriers · Health-related needs | adults 18+ | 29,238 | 3,171 | 23,771 | 5,467 | Lower is better
Low social support · Health-related needs | adults 18+ | 29,238 | 3,171 | 23,771 | 5,467 | Lower is better
Loneliness · Health-related needs | adults 18+ | 23,822 | 8,587 | 23,771 | 51 | Lower is better
Utility shutoff threat · Health-related needs | adults 18+ | 29,238 | 3,171 | 23,771 | 5,467 | Lower is better
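One way to read the coverage table is through two identities that its figures appear to satisfy: Covered + Missing equals the 32,409 joined areas, and Native + Mixed/backfilled equals Covered. The short check below is only a reader's sketch, not part of the data manifest. It uses a handful of rows copied from the table, and the function name is invented for this example.

```python
TOTAL_AREAS = 32_409  # joined ZIP/ZCTA areas, from the table caption

# (measure, covered, missing, native, mixed_or_backfilled), copied from the table
rows = [
    ("Diabetes", 32_304, 105, 25_007, 7_297),
    ("High blood pressure", 32_302, 107, 25_007, 7_295),
    ("All teeth lost (65+)", 32_300, 109, 25_003, 7_297),
    ("Food insecurity", 29_238, 3_171, 23_771, 5_467),
    ("Loneliness", 23_822, 8_587, 23_771, 51),
]


def check_row(measure, covered, missing, native, mixed):
    """Return a list of human-readable problems for one table row."""
    problems = []
    if covered + missing != TOTAL_AREAS:
        problems.append(f"{measure}: covered + missing != {TOTAL_AREAS}")
    if native + mixed != covered:
        problems.append(f"{measure}: native + mixed/backfilled != covered")
    return problems


issues = [p for row in rows for p in check_row(*row)]
print(issues or "all sampled rows are internally consistent")
```

The Loneliness row stands out: it has almost no backfilled cells (51), so its coverage is essentially the native count.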

The stories: PCA, clustering, and gradients

The stories section is precomputed by data-prep/analytics_v3.py from the same parquet source, over the ~23,800 ZIP/ZCTA areas with complete data on all 26 measures (coverage is limited mainly by the newer social-needs measures, especially loneliness, so the smallest rural areas are under-represented; each story states this). Each area is one unweighted observation; all measures are standardized to z-scores first.

  • Correlations are Spearman rank correlations, ordered by average-linkage hierarchical clustering on 1 − ρ with optimal leaf ordering.
  • Principal components come from PCA on the standardized matrix; PC1 is sign-oriented so that higher always means more burden.
  • Archetypes are k-means clusters on the same matrix. k = 4 was chosen by silhouette score over k = 3…8; clusters are ordered by mean PC1 and hand-labeled. Demographic variables are not used in clustering; they are only described afterwards. Cluster boundaries are soft; treat the labels as portraits, not categories of nature.
  • Deprivation gradients are population-weighted means by ADI national-rank decile, shown relative to the least-deprived decile.
  • The wealth gradient (data-prep/analytics_wealth.py) ranks complete-case ZIP/ZCTA areas with at least 500 residents on a composite socioeconomic advantage score — the mean rank percentile of income, college attainment, home value, reversed ADI, reversed poverty, and reversed unemployment — and compares the top and bottom deciles of that score using population-weighted mean prevalence.
  • Outcome stories use every ZCTA that has the relevant measures, not only the complete-case subset. The smoking story fits a quadratic of smoking on ADI and maps the residual; the mental-health story uses the ratio of diagnosed depression to frequent mental distress as a crude diagnosis-access index. Both are descriptive decompositions, not causal models.
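To make the wealth gradient's composite score concrete, here is a minimal sketch of a "mean rank percentile" over the six components, with the three reversed ones flipped so that higher always means more advantage. It is not the code in data-prep/analytics_wealth.py: the field names, the mid-rank percentile formula, and the toy areas are assumptions, and the complete-case filter is omitted for brevity.

```python
# (field name, reversed?) pairs; field names are hypothetical.
COMPONENTS = [
    ("income", False),
    ("college", False),
    ("home_value", False),
    ("adi", True),
    ("poverty", True),
    ("unemployment", True),
]
MIN_RESIDENTS = 500  # threshold stated in the text


def rank_percentile(values, reverse=False):
    """Percentile rank (0-100) of each value; ties share the average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    pct = [100.0 * (r - 0.5) / n for r in ranks]
    return [100.0 - p for p in pct] if reverse else pct


def advantage_scores(areas):
    """Mean rank percentile across components for areas above the size floor."""
    eligible = [a for a in areas if a["population"] >= MIN_RESIDENTS]
    columns = {
        name: rank_percentile([a[name] for a in eligible], reverse)
        for name, reverse in COMPONENTS
    }
    return {
        a["id"]: sum(columns[name][i] for name, _ in COMPONENTS) / len(COMPONENTS)
        for i, a in enumerate(eligible)
    }


# Toy areas with made-up values; D is dropped for having under 500 residents.
areas = [
    {"id": "A", "population": 4000, "income": 95000, "college": 60,
     "home_value": 500000, "adi": 10, "poverty": 5, "unemployment": 3},
    {"id": "B", "population": 2500, "income": 60000, "college": 35,
     "home_value": 250000, "adi": 50, "poverty": 12, "unemployment": 5},
    {"id": "C", "population": 1200, "income": 35000, "college": 15,
     "home_value": 90000, "adi": 90, "poverty": 28, "unemployment": 9},
    {"id": "D", "population": 300, "income": 50000, "college": 30,
     "home_value": 150000, "adi": 60, "poverty": 15, "unemployment": 6},
]
scores = advantage_scores(areas)
print(scores)
```

Averaging rank percentiles rather than raw values keeps any single component, such as home value with its long right tail, from dominating the score.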

The community type shown on each ZIP snapshot is that ZIP's archetype assignment. Clusters are fit on complete-case areas only; any ZCTA observing at least 18 of the 26 measures is then assigned to its nearest centroid using the dimensions it observes (for complete rows this equals the k-means label). The same rule fills the story dot maps, with PC1 scores for partially observed rows computed by available-dimension projection. ZCTAs with fewer than 18 measures have no assignment and show no type.
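The assignment rule for partially observed rows can be sketched like this. It is a hedged illustration, not the pipeline's implementation: the function name assign_type, the use of Euclidean distance, and the toy centroids are assumptions. Only the 18-of-26 threshold comes from the text.

```python
import math

N_MEASURES = 26
MIN_OBSERVED = 18  # threshold stated in the text


def assign_type(z_scores, centroids, min_observed=MIN_OBSERVED):
    """Index of the nearest centroid over observed dimensions, or None.

    z_scores: list of standardized values, with None where a measure is missing.
    centroids: list of full-length centroid vectors from the complete-case fit.
    """
    observed = [i for i, z in enumerate(z_scores) if z is not None]
    if len(observed) < min_observed:
        return None  # too sparse: no community type is shown

    def distance(centroid):
        return math.sqrt(sum((z_scores[i] - centroid[i]) ** 2 for i in observed))

    return min(range(len(centroids)), key=lambda k: distance(centroids[k]))


# Toy centroids: four flat profiles, ordered from low to high burden.
centroids = [[level] * N_MEASURES for level in (-1.0, 0.0, 1.0, 2.0)]

partial = [0.9] * 20 + [None] * 6   # 20 observed measures: assignable
sparse = [0.9] * 10 + [None] * 16   # 10 observed measures: not assignable
complete = [1.8] * N_MEASURES       # fully observed row

print(assign_type(partial, centroids))
print(assign_type(sparse, centroids))
print(assign_type(complete, centroids))
```

For a complete row the observed dimensions are all 26, so the rule reduces to the ordinary nearest-centroid step of k-means, consistent with the statement that it equals the k-means label.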

Color & accessibility

Ramps are chosen for a clear luminance progression (so they remain legible in grayscale) and for color-vision-deficiency resilience (warm vs. cool, never red/green alone). Selection is encoded by weight, halo, and a direct label — never by color alone. The map is not the only path to the data: every chart ships an accessible table fallback, values are visible without hover, controls are keyboard-operable with visible focus, and prefers-reduced-motion is honored.

What this is not

  • Not a direct count or a registry — estimates are modeled.
  • Not individual-level — it describes areas, not people.
  • Not causal — associations are ecological.
  • Not official mailing geography — ZCTAs approximate ZIP service areas.

For the underlying files and per-measure provenance, see Sources & provenance. To explore the data, open the interactive atlas.