The structure of place-based health

Most of ZIP-code health is one axis

Run a principal component analysis on all 26 measures and a single dimension — tracking deprivation and income, not any one disease — explains the majority of the variation between America's ZIP codes.

Twenty-six measures sound like twenty-six different problems: diabetes here, smoking there, food insecurity somewhere else. They aren't. Standardize all 26 measures and ask the data for its principal axes of variation, and the answer is blunt: a single component explains 57% of the variance between ZIP codes. A second adds 20%. Together, two numbers per ZIP code carry 77% of everything these 26 measures can say about how places differ.
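
For readers who want to see the mechanics, here is a minimal sketch of the "standardize, then ask for the principal axes" step. It runs on synthetic stand-in data, not the article's dataset, and the function name `explained_variance_ratio` is a hypothetical helper chosen for illustration; the article's own pipeline may differ in detail.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance carried by each principal component.

    X is an (areas x measures) array. Each column is z-standardized first,
    so every measure contributes exactly one unit of variance.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Singular values of Z give the correlation-matrix eigenvalues:
    # eigenvalue = s**2 / n_rows.
    s = np.linalg.svd(Z, compute_uv=False)
    eigenvalues = s**2 / Z.shape[0]
    return eigenvalues / eigenvalues.sum()

# Synthetic stand-in data (an assumption, NOT the article's data):
# one shared latent factor plus independent noise on 26 made-up measures.
rng = np.random.default_rng(0)
n_areas, n_measures = 2000, 26
latent = rng.normal(size=(n_areas, 1))
factor_weights = rng.uniform(0.5, 1.0, size=(1, n_measures))
X = latent @ factor_weights + rng.normal(size=(n_areas, n_measures))

ratios = explained_variance_ratio(X)
print(ratios[:3].round(2))  # shares for the first three components
```

The shares always sum to one, so "57% plus 20%" is simply the first two entries of this vector added together. A sharp drop after the leading entries is what a chart of these shares shows as an elbow.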

Two components carry most of the signal

Share of total variance explained by each principal component

PCA on z-standardized measures across 23,818 ZIP/ZCTA areas. The sharp elbow after PC2 means the remaining components are mostly local texture.

The first axis is deprivation wearing a hospital gown

What is this dominant axis? Look at how every measure “loads” onto it: nearly everything points the same way. ZIP codes high on PC1 have more diabetes and more smoking and more disability and more food insecurity, all at once. And the axis is barely about health care at all — across ZIP codes it correlates at ρ = -0.78 with median household income and ρ = +0.72 with the Area Deprivation Index. If you know how poor a neighborhood is, you already know most of what this axis knows.

The exceptions are the interesting part. Binge drinking loads negative — it is the one behavior that rises with affluence. Cancer prevalence and skipped checkups barely load at all, because they answer to a different master: age.
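
A rough sketch of how loadings and the income correlation can be computed follows. It makes several assumptions: the data are synthetic, the helper names `first_component` and `spearman` are hypothetical, and ρ is read as a Spearman rank correlation, which the text does not confirm. The external variable (income here) is never passed to the PCA; it is only compared with the scores afterward.

```python
import numpy as np

def first_component(X):
    """Return (loadings, scores) for PC1 of the z-standardized columns of X."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    direction = vt[0]
    # The sign of a principal component is arbitrary; orient it so that
    # most measures load positively ("higher = more burden").
    if direction.sum() < 0:
        direction = -direction
    scores = Z @ direction
    # Loading = correlation between each standardized measure and the PC1 score.
    loadings = direction * s[0] / np.sqrt(Z.shape[0])
    return loadings, scores

def spearman(a, b):
    """Spearman rank correlation (no tie correction; fine for continuous data)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return np.corrcoef(rank_a, rank_b)[0, 1]

# Synthetic stand-in data (an assumption, NOT the article's data).
rng = np.random.default_rng(1)
n = 3000
deprivation = rng.normal(size=n)
# Five measures rise with the latent factor; the last one falls with it,
# mimicking a behavior that rises with affluence.
weights = np.array([0.9, 0.8, 0.8, 0.7, 0.7, -0.5])
X = deprivation[:, None] * weights + rng.normal(scale=0.6, size=(n, weights.size))
income = -deprivation + rng.normal(scale=0.7, size=n)  # external, not used in the PCA

loadings, pc1 = first_component(X)
rho_income = spearman(pc1, income)
print(loadings.round(2), round(float(rho_income), 2))
```

In this toy setup the last measure gets a negative loading while the rest load positively, which is the same pattern the text describes for binge drinking against the other measures.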

How each measure loads on the two axes

PC1 = overall burden · PC2 = the age-and-place axis

PC2 separates older, sparser places (high cancer, heart disease, blood pressure — top of the list) from younger, denser ones (high loneliness, skipped checkups, housing insecurity). It correlates ρ = +0.66 with the share of residents 65+, and ρ = -0.55 with population density.

Every ZIP code, on two axes

Plot every analyzed ZIP code in this two-dimensional space and color it by income, and the gradient is unmistakable — blue (higher-income) places pile up on the left of the burden axis, red (lower-income) places stretch right. The vertical spread at any income level is the age-and-place axis doing its separate work.

The health plane of America's ZIP codes

Each dot is one ZIP/ZCTA area, positioned by its two principal component scores

A sample of 4,500 of the 23,818 analyzed areas. Income was not used to compute the axes — the color gradient emerges on its own.

The burden axis, on the map

ZIP centroids colored by PC1 percentile (deeper red = higher combined burden)

The familiar geography appears without being asked for: the Deep South, Appalachia, the Rio Grande border, and pockets of every large metro at the high end; the affluent suburban rings at the low end.

Read this carefully. Estimates are CDC PLACES-style model-based small-area estimates, not direct measurements. Every association here is ecological: it describes places, not people, and implies nothing about causation. Cross-measure models are fit on the ~23,800 ZIP/ZCTA areas with complete data on all 26 measures (coverage is limited mainly by the newer social-needs measures); maps and community-type assignments extend to areas with estimates for at least 18 of the 26 measures. Full details are on the methods page.
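
The two coverage rules in the note above can be sketched as a pair of row filters. This is an illustration only: it assumes missing estimates are stored as NaN, and the function name `coverage_masks` and the constant `MIN_FOR_MAPS` are hypothetical names, not taken from the methods page.

```python
import numpy as np

MIN_FOR_MAPS = 18  # threshold stated in the text (18 of the 26 measures)

def coverage_masks(X):
    """X: (areas x measures) array with np.nan marking a missing estimate.

    Returns two boolean arrays, one entry per area:
      complete -> all measures present (eligible for cross-measure models)
      mappable -> at least MIN_FOR_MAPS measures present (maps, community types)
    """
    observed = np.sum(~np.isnan(X), axis=1)
    complete = observed == X.shape[1]
    mappable = observed >= MIN_FOR_MAPS
    return complete, mappable

# Four made-up areas with 26, 20, 18, and 17 observed measures.
X = np.ones((4, 26))
X[1, :6] = np.nan
X[2, :8] = np.nan
X[3, :9] = np.nan

complete, mappable = coverage_masks(X)
print(complete.tolist(), mappable.tolist())
```

Only the first area would enter the cross-measure models, while the first three would still appear on the maps.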