Applied Statistician & Data Scientist
I have over 10 years of experience in applied statistics, data science, and machine learning.
My past work includes supporting research and analytical efforts at Harvard University and the CDC.
Contact: rdanielsstat@gmail.com
LinkedIn: linkedin.com/in/robcdaniels/
See my public projects on GitHub: github.com/rdanielsstat
A Reveal.js and Quarto presentation explaining the overlap and distinctions between data science, statistics, and AI/machine learning, emphasizing the difference between inference and prediction.
Tools: Reveal.js, Quarto, R, Markdown
View presentation: ds-stats-aiml-presentation
A MyST-based analysis of 2024 California hospital colon surgery data, estimating facility-level surgical site infection (SSI) risk. The project compares logistic regression, a generalized linear mixed model (GLMM), and both non-hierarchical and hierarchical Bayesian binomial models, illustrating partial pooling for low-volume hospitals and multilevel effects across facilities and counties. A manuscript-style summary in LaTeX, with abstract, introduction, methods, results, and conclusions, is included as a publication-ready presentation of the analysis.
Tools: R, JAGS, ggplot2, plotly, MyST notebooks, Markdown, LaTeX
Repository: bayesian-ssi-analysis
Live analysis: bayesian-ssi-analysis
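To give a feel for what partial pooling does for low-volume hospitals, here is a minimal sketch in plain Python. It is not the project's JAGS model: it uses a simple beta-binomial shrinkage with a fixed prior strength, and the hospital counts and the `prior_strength` value are made-up assumptions for illustration only.

```python
def partial_pool(events, cases, prior_strength=50.0):
    """Shrink each hospital's raw infection rate toward the pooled rate.

    Uses a Beta(a, b) prior centered on the pooled rate, where
    a + b = prior_strength (an assumed value, not estimated from data).
    The posterior mean for a hospital is (a + y) / (a + b + n).
    """
    pooled = sum(events) / sum(cases)
    a = pooled * prior_strength
    b = (1.0 - pooled) * prior_strength
    shrunk = [(a + y) / (a + b + n) for y, n in zip(events, cases)]
    return pooled, shrunk


# Hypothetical data: two low-volume hospitals and one high-volume hospital.
events = [0, 2, 12]    # surgical site infections
cases = [5, 10, 400]   # colon surgeries performed

raw = [y / n for y, n in zip(events, cases)]
pooled, shrunk = partial_pool(events, cases)

for n, r, s in zip(cases, raw, shrunk):
    print(f"n={n:4d}  raw={r:.3f}  partially pooled={s:.3f}")
```

The low-volume hospitals move much further toward the pooled rate than the high-volume one, which is the behavior a hierarchical model produces when it estimates the amount of shrinkage from the data rather than fixing it.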
An end-to-end machine learning engineering project that classifies breast ultrasound images as benign or malignant. The project demonstrates the full ML lifecycle: exploratory data analysis, neural network architecture experimentation (fully connected and convolutional models with dropout, learning-rate scheduling, and data augmentation), training and saving predictive models, containerization with Docker, serving predictions via FastAPI, and optional cloud deployment on Fly.io. Reproducible notebooks and scripts show how a model moves from research to production, with CPU-only configurations to reduce deployment size.
Tools: Python, PyTorch, FastAPI, Docker, Fly.io, Jupyter, NumPy, PIL, Poetry
Repository: ml-breast-cancer-prediction
Live deployment: https://breast-cancer-prediction.fly.dev/docs
A Quarto-based analysis demonstrating recurrent-events survival analysis using the bladder cancer dataset. Includes multiple modeling approaches, dynamic visualizations, and modern workflow tools such as GitHub, Markdown, and integrated bibliography management.
Tools: R, Quarto, ggplot2, plotly, Markdown
Live analysis: bladder-cancer-recurrence-analysis
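Recurrent-events models are often fit to data in counting-process format, with one row per at-risk interval. The sketch below shows that layout in plain Python. The project does not state which data layouts or models it uses, so the function name, the field order, and the example times here are all illustrative assumptions.

```python
def to_counting_process(subject_id, recurrence_times, followup_end):
    """Split one subject's follow-up into (id, start, stop, event) intervals.

    Each recurrence closes an interval with event=1. Any remaining
    follow-up after the last recurrence becomes a censored interval
    with event=0.
    """
    rows = []
    start = 0
    for t in sorted(recurrence_times):
        if t <= start or t > followup_end:
            raise ValueError("recurrence times must be distinct and fall within follow-up")
        rows.append((subject_id, start, t, 1))
        start = t
    if start < followup_end:
        rows.append((subject_id, start, followup_end, 0))
    return rows


# Hypothetical subjects: one with two recurrences, one with none.
rows_a = to_counting_process(1, [6, 10], followup_end=18)
rows_b = to_counting_process(2, [], followup_end=12)

for row in rows_a + rows_b:
    print(row)
```

A subject with two recurrences contributes three rows and a subject with none contributes a single censored row, so every subject stays in the risk set for the full follow-up period.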
An applied analysis of a cluster randomized trial demonstrating reproducible data processing, descriptive statistics, treatment-group comparisons, and a permutation test. The project emphasizes statistical methods that account for clustering in the data, organized in a clean, reproducible workflow.
Tools: R, Quarto, Markdown, ggplot2, plotly
Repository: schisto-cluster-project
Live analysis: schisto-cluster-analysis
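As a rough illustration of a permutation test that respects cluster randomization, the sketch below permutes treatment labels across whole clusters and compares cluster-level means. It is a generic example in plain Python, not the project's R code; the cluster names, the outcome values, and the choice of test statistic are assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def cluster_permutation_test(cluster_means, treated):
    """Exact two-sided permutation test on cluster-level means.

    cluster_means: dict mapping cluster id -> mean outcome in that cluster
    treated: ids of clusters assigned to treatment
    Returns the observed difference (treated minus control) and the
    p-value over all possible assignments of the same size.
    """
    ids = sorted(cluster_means)

    def diff(treated_ids):
        t = [cluster_means[c] for c in ids if c in treated_ids]
        u = [cluster_means[c] for c in ids if c not in treated_ids]
        return mean(t) - mean(u)

    observed = diff(set(treated))
    diffs = [diff(set(combo)) for combo in combinations(ids, len(treated))]
    # Small tolerance so floating-point ties count as "at least as extreme".
    extreme = sum(abs(d) >= abs(observed) - 1e-12 for d in diffs)
    return observed, extreme / len(diffs)


# Hypothetical cluster-level outcome proportions for six clusters.
cluster_means = {"A": 0.10, "B": 0.12, "C": 0.08,
                 "D": 0.30, "E": 0.28, "F": 0.35}
observed, p_value = cluster_permutation_test(cluster_means, treated=["A", "B", "C"])
print(f"observed difference = {observed:.2f}, p = {p_value:.2f}")
```

Because labels are shuffled at the cluster level, the reference distribution matches how the trial was randomized. With only six clusters there are 20 possible assignments, so the smallest achievable two-sided p-value is 0.10.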