Introduction and Background
Surgical site infection risk following colon procedures in California hospitals (2024)
Surgical site infections (SSIs) are complications of surgery that account for 20–31% of all hospital-acquired infections (HAIs) and substantially increase patient morbidity, mortality, and hospitalization costs (CDC, 2026). An estimated 4.2% of patients develop an SSI after colorectal surgery, resulting in longer hospital stays and higher costs (Gantz et al., 2019).
Many hospitals report zero or only a few infections, so observed facility-level SSI rates are particularly noisy. The goal of this project is to use 2024 California SSI data (CDPH, 2025) to estimate facility-level SSI risk for colon procedures, accounting for facility type and county-level contextual factors. I address the following primary questions:
- What are the posterior estimates of SSI risk for each facility, and how do these differ from the observed rates once uncertainty and partial pooling are considered?
- How much do SSI risks vary across counties after adjusting for facility type, as quantified by the posterior distribution of county-level random effects?
To answer these questions, I fit a Bayesian hierarchical binomial model, which allows facilities within the same county to share information while capturing meaningful differences across facilities and counties. The model produces partially pooled estimates of SSI risk with full posterior uncertainty, providing a more stable and interpretable representation than raw observed rates, particularly for low-volume facilities.
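The intuition behind partial pooling can be sketched with a much simpler stand-in than the hierarchical model described above: a conjugate beta-binomial shrinkage estimator. This is not the model fit in this project. The facility names, the counts, and the `PRIOR_STRENGTH` value are hypothetical, chosen only to show how a low-volume facility with zero observed infections is pulled toward the pooled rate while a high-volume facility barely moves.

```python
# Illustrative sketch of partial pooling via beta-binomial shrinkage.
# All counts below are HYPOTHETICAL and are not taken from the CDPH data.
facilities = {
    "A": (0, 12),    # (infections, procedures): low volume, zero events
    "B": (1, 15),    # low volume
    "C": (9, 300),   # moderate volume
    "D": (30, 650),  # high volume
}


def shrunken_rate(infections, procedures, prior_mean, prior_strength):
    """Posterior mean of the risk under a Beta prior.

    The prior is Beta(prior_mean * k, (1 - prior_mean) * k), where
    k = prior_strength acts like a number of pseudo-procedures.
    """
    alpha = prior_mean * prior_strength
    beta = (1.0 - prior_mean) * prior_strength
    return (infections + alpha) / (procedures + alpha + beta)


# Pooled rate across all facilities serves as the prior mean (an assumption;
# a full hierarchical model would estimate this jointly with the variance).
total_infections = sum(y for y, _ in facilities.values())
total_procedures = sum(n for _, n in facilities.values())
pooled_rate = total_infections / total_procedures

PRIOR_STRENGTH = 50.0  # assumed; larger values shrink harder toward the pool

raw_rates = {name: y / n for name, (y, n) in facilities.items()}
estimates = {
    name: shrunken_rate(y, n, pooled_rate, PRIOR_STRENGTH)
    for name, (y, n) in facilities.items()
}

for name in facilities:
    print(f"{name}: raw={raw_rates[name]:.4f}  shrunken={estimates[name]:.4f}")
```

Each shrunken estimate is a weighted average of the facility's raw rate and the pooled rate, with weight n / (n + `PRIOR_STRENGTH`) on the raw rate, so low-volume facilities borrow more from the group. The hierarchical binomial model plays the same role, except that the degree of shrinkage is learned from the data and reported with full posterior uncertainty.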
Along the way, I also fit and evaluate several simpler models, including a standard logistic regression, a non-hierarchical Bayesian binomial model, and a generalized linear mixed model (GLMM). These models serve two purposes: to provide baseline comparisons for inference and prediction, and to illustrate the limitations of approaches that do not fully account for multilevel structure and small-sample variability. Together, these comparisons motivate the use of hierarchical Bayesian modeling as a principled framework for facility-level risk estimation in sparse and heterogeneous healthcare data.
- CDC. (2026). Chapter 9: Surgical Site Infection (SSI) Event. National Healthcare Safety Network (NHSN), Centers for Disease Control and Prevention. https://www.cdc.gov/nhsn/pdfs/pscmanual/9pscssicurrent.pdf
- Gantz, O., Zagadailov, P., & Merchant, A. M. (2019). The Cost of Surgical Site Infections after Colorectal Surgery in the United States from 2001 to 2012: A Longitudinal Analysis. American Surgeon, 85(2), 142–149. https://doi.org/10.1177/000313481908500219
- CDPH. (2025). Surgical Site Infections (SSIs) for Operative Procedures in California Hospitals. California Department of Public Health. https://data.chhs.ca.gov/dataset/surgical-site-infections-ssis-for-28-operative-procedures-in-california-hospitals