Methodologies for Estimating Data to the ZCTA Level

Two organizations have provided imputed data to the UDS Mapper. Both organizations' imputation methodologies are described here.

JSI, Inc. provided imputed data for the comparison of health center patients to underlying populations by insurance types. The following data were developed using the JSI Methodology:

  • Percent Population Uninsured, 2015
  • Percent Population with Medicaid or Other Needs-Based Public Insurance, 2015
  • Percent Population with Medicare and Private Insurance, 2015
  • Penetration of Uninsured Population
  • Penetration of Population with Medicaid or Other Needs-Based Public Insurance
  • Penetration of Population with Medicare or Private Insurance
  • Uninsured Not Served by Health Centers
  • Medicaid or Other Needs-Based Public Insurance Not Served by Health Centers
  • Medicare or Private Insurance Not Served by Health Centers
  • Uninsured Not Served by Health Centers (Dot Density)
  • Medicaid or Other Needs-Based Public Insurance Not Served by Health Centers (Dot Density)
  • Medicare or Private Insurance Not Served by Health Centers (Dot Density)

JSI Methodology

The 2015 ZCTA-level estimates of the uninsured for this project were developed by JSI using the 2015 ACS 1-year table B27016: Health Insurance Coverage Status and Type by Ratio of Income to Poverty Level by Age. The initial data were obtained at the Public Use Microdata Area (PUMA) level to provide uniform coverage over all areas of the country. The uninsured in each age and poverty-level category within each PUMA were allocated to the Census Tracts within that PUMA in proportion to each tract's share of that population segment within the PUMA overall. The population segments within the tracts were based on 5-year (2011-2015) tract-level data from table B17024: Age By Ratio Of Income To Poverty Level In The Past 12 Months. The age and income categories were then collapsed to insurance status only.

Medicaid totals were then adjusted to classify dually eligible residents (termed "Medicaid/Means Tested & Medicare") as Medicare, matching how these users would be counted in the UDS. The estimates of percent dually eligible were developed at the PUMA level using 2015 1-year ACS table B27010: Types Of Health Insurance Coverage By Age. The adjusted tract estimates were then allocated down to the census block level by population proportion within the tract to permit reaggregation to the ZIP Code Tabulation Area (ZCTA) level. The final ZCTA-level insurance estimates cover three populations: Uninsured, Medicaid, and Medicare/Private Insurance combined.
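The proportional allocation from PUMA to Census Tract described above can be illustrated with a minimal Python sketch. This is not JSI's code: the function name, the tract labels, and all counts are hypothetical, and the handling of a segment with zero population is an assumption.

```python
def allocate_to_tracts(puma_uninsured, tract_segment_pop):
    """Split a PUMA-level uninsured count for one age/poverty segment
    across tracts, in proportion to each tract's share of that segment."""
    total = sum(tract_segment_pop.values())
    if total == 0:
        # Assumed handling: no residents in this segment, so nothing to allocate.
        return {tract: 0.0 for tract in tract_segment_pop}
    return {
        tract: puma_uninsured * pop / total
        for tract, pop in tract_segment_pop.items()
    }


# Hypothetical inputs: 1,200 uninsured people in one segment of one PUMA,
# and the tract-level population of that same segment.
tract_pop = {"tract_A": 500, "tract_B": 1500, "tract_C": 2000}
allocated = allocate_to_tracts(1200, tract_pop)
```

The allocated counts always sum back to the PUMA total, so no uninsured residents are gained or lost in the step. The same pattern would apply to the later tract-to-block allocation, with total block population as the weight.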

Robert Graham Center provided imputed data for the Population Indicators tool.

Graham Center Methodology 1

The following data were developed using the Graham Center Methodology 1:

  • Low Birth Weight Rate
  • Age-Adjusted Mortality Rate (per 100,000)

Base data sets were obtained for (1) Low Birth Weight Rate, from the Area Health Resource File, and (2) Age-Adjusted Mortality Rate (per 100,000), from the Centers for Disease Control and Prevention WONDER system. County-level estimates were computed for each of these population indicators. For counties with a population of fewer than 20,000 people, or for which the estimate was unavailable, state-level estimates were used for imputation.
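The county-level fallback rule can be sketched as follows. This is an illustrative Python snippet only: the function name and the example rates are hypothetical, and only the 20,000-person threshold comes from the description above.

```python
MIN_COUNTY_POP = 20_000  # threshold stated in the methodology


def imputed_county_rate(county_rate, county_pop, state_rate):
    """Return the county rate, or the state rate when the county has
    fewer than 20,000 residents or has no estimate (None)."""
    if county_rate is None or county_pop < MIN_COUNTY_POP:
        return state_rate
    return county_rate


# Hypothetical low birth weight rates (percent).
large_county = imputed_county_rate(7.9, 150_000, state_rate=8.2)   # keeps county rate
small_county = imputed_county_rate(6.5, 12_000, state_rate=8.2)    # falls back to state
missing_rate = imputed_county_rate(None, 150_000, state_rate=8.2)  # falls back to state
```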

  1. Imputed block group estimates: For each block group, the rate for each population indicator (low birth weight rate, age-adjusted mortality rate, health indicators, and uninsured) at the imputed county level was applied using the block group population percentages by race and ethnicity.
  2. Imputed block estimates: For each block, the rate for each population indicator (low birth weight rate, age-adjusted mortality rate, health indicators, and uninsured) at the block group level was applied to the census block's population count to determine an estimated number of individuals for each variable.
  3. Assign census blocks to ZCTAs: Census block 2010 and ZCTA 2010 shapefiles were obtained from the Census Bureau. A block-to-ZCTA crosswalk file was created using a spatial join in Geographic Information Systems (GIS) software to code census 2010 block centroids with 2010 ZCTA FIPS ID numbers.
  4. Aggregate census block data to ZCTA level: Using the block-to-ZCTA crosswalk file, the census block count data from Step 2 were aggregated to the ZCTA level for all the population indicators.
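The final aggregation step can be sketched in Python as below. It assumes the crosswalk has already been built by the GIS spatial join and is available as a simple block-to-ZCTA mapping. The block IDs, ZCTA codes, and counts are hypothetical, and skipping blocks with no matching ZCTA is an assumption rather than a documented rule.

```python
from collections import defaultdict


def aggregate_to_zcta(block_counts, block_to_zcta):
    """Sum estimated block-level counts into ZCTA-level totals
    using a block-to-ZCTA crosswalk."""
    zcta_totals = defaultdict(float)
    for block_id, count in block_counts.items():
        zcta = block_to_zcta.get(block_id)
        if zcta is None:
            # Assumed handling: the block centroid fell in no ZCTA, so skip it.
            continue
        zcta_totals[zcta] += count
    return dict(zcta_totals)


# Hypothetical estimated counts of individuals per census block.
block_counts = {"b1": 2.5, "b2": 1.0, "b3": 4.0, "b4": 0.5}
# Hypothetical crosswalk; block "b4" has no ZCTA assignment.
crosswalk = {"b1": "00001", "b2": "00001", "b3": "00002"}
zcta_counts = aggregate_to_zcta(block_counts, crosswalk)
```

Dividing each ZCTA total by the corresponding ZCTA population would convert the aggregated counts back into a rate.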

Graham Center Methodology 2

The following data were developed using the Graham Center Methodology 2:

  • Percent of Adults Ever Told Have Diabetes
  • Percent of Adults Ever Told Have High Blood Pressure
  • Percent of Adults Who Are Obese
  • Percent of Adults with No Dental Visit in the Past Year
  • Percent of Adults Who Have Delayed or Not Sought Care Due to High Cost
  • Percent of Adults with No Usual Source of Care

Health indicators from BRFSS were estimated at the ZIP Code Tabulation Area (ZCTA) level using a Small Area Estimation (SAE) methodology developed and validated by CDC researchers [1]. SAE is used to estimate health indicators for areas where such data do not currently exist. SAE techniques identify the relationship between the health outcome and a set of explanatory variables (individual-level: age, sex, and race; contextual-level: county poverty) using multilevel regression models. The coefficients thus obtained were applied to a small-area geography for which the same set of data (age, race, sex) exists, and the probability and prevalence of the targeted health outcomes were computed.

  1. Part of the analysis was carried out at the CDC Research Data Center (RDC), National Center for Health Statistics. Analysts at the RDC merged 2014 BRFSS age-by-sex-by-race data with the corresponding age-by-sex-by-race-by-poverty (<150% FPL) data from the 2014 ACS using state and county FIPS codes.
  2. Multilevel logistic regressions were performed on each of the health indicators at the RDC.
  3. Individual level and county level weighted estimates were merged with counts of Census block population to compute the probability of risk of a health outcome.
  4. Estimates for non-sampled counties were developed using Bayesian spatial smoothing techniques, using adjacent counties with random effects.
  5. Prevalence of health outcome at the Census block level was obtained by totaling the individual risks of health outcomes over 208 categories in a Census block weighted by the population sizes in that census block.
  6. Simulation models were drawn across 1,000 repetitions of model parameters to generate a point estimate and a 95% confidence interval for the prevalence of each health outcome by age, sex, and race at the Census block level.
  7. Census block-level prevalence estimates were then aggregated to obtain estimates of health outcomes at higher-level geographies. SAEs were developed for the targeted health outcomes and behaviors: hypertension, diabetes, obesity, access to care, usual source of care, and dental visits.
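Steps 5 and 6 can be illustrated with a small Python sketch. It is not the Graham Center or CDC implementation. It assumes the per-category risks have already been predicted by the regression model, uses three demographic categories instead of 208, and stands in for the simulated parameter draws with random noise around hypothetical risks. Using the mean of the draws as the point estimate and percentiles as the interval is also an assumption.

```python
import random


def block_prevalence(risks, pops):
    """Population-weighted average of category-level risks in one block."""
    total = sum(pops)
    if total == 0:
        return 0.0  # assumed handling for unpopulated blocks
    return sum(r * n for r, n in zip(risks, pops)) / total


def simulate_interval(risk_draws, pops):
    """Point estimate (mean) and 95% percentile interval across draws.

    risk_draws holds one vector of category risks per simulated draw.
    """
    prevalences = sorted(block_prevalence(r, pops) for r in risk_draws)
    n = len(prevalences)
    point = sum(prevalences) / n
    lower = prevalences[int(0.025 * n)]
    upper = prevalences[min(n - 1, int(0.975 * n))]
    return point, lower, upper


# Hypothetical block: three age/sex/race categories and their population sizes.
base_risks = [0.10, 0.20, 0.30]
pops = [10, 20, 70]
baseline = block_prevalence(base_risks, pops)

# Stand-in for 1,000 draws of model parameters: jitter each risk slightly.
rng = random.Random(0)
draws = [
    [min(1.0, max(0.0, r + rng.gauss(0, 0.02))) for r in base_risks]
    for _ in range(1000)
]
point, lower, upper = simulate_interval(draws, pops)
```

In a real application the draws would come from the fitted multilevel model's parameter distribution rather than from added noise.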

[1] Zhang, X., Holt, J.B., Lu, H., Wheaton, A.G., Ford, E.S., Greenlund, K.J., Croft, J.B. Multilevel Regression and Poststratification for Small-Area Estimation of Population Health Outcomes: A Case Study of Chronic Obstructive Pulmonary Disease Prevalence Using the Behavioral Risk Factor Surveillance System. American Journal of Epidemiology. 2014. 179(8):1025-1033.