Portfolio

This portfolio highlights a range of applied projects that demonstrate my expertise in statistical programming and modeling, machine learning, Bayesian inference, natural language processing, and Monte Carlo simulation. While my academic research focused on quantitative political science, my work has since expanded to include diverse domains and applied challenges. These examples reflect the versatility of my skill set and its broad relevance to research, analytics, and data science roles across sectors. Each work sample emphasizes rigorous methods, thoughtful interpretation, and clear communication of complex findings.

Explaining the 2024 Election: The analyses presented here leverage the 2024 American National Election Studies (ANES) and Cooperative Election Study (CES) to shed light on the forces that shaped voter behavior. I provide insight into not only which factors mattered most, but also how their impact varied across states.

State Public Opinion and Bayesian MRP: Here, I present a series of maps that display state public opinion estimates derived from the 2024 Cooperative Election Study. Estimates were developed using Bayesian multilevel regression and poststratification (MRP), a technique that pools information across demographic and geographic groups to generate stable, reliable state-level estimates even from national survey data.
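
As a rough illustration of the poststratification half of MRP, the sketch below weights cell-level estimates by population counts to produce one estimate per state. It assumes the cell estimates have already come out of a fitted multilevel model. The function name, the dictionary layout, and the numbers in the usage note are hypothetical and are not taken from the project.

```python
from collections import defaultdict

def poststratify(cell_estimates, cell_counts):
    """Population-weighted average of cell estimates within each state.

    cell_estimates: {(state, cell): estimated opinion share in that cell}
    cell_counts:    {(state, cell): population count for that cell}
    Both layouts are assumptions made for this sketch.
    """
    weighted = defaultdict(float)
    totals = defaultdict(float)
    for (state, cell), estimate in cell_estimates.items():
        n = cell_counts[(state, cell)]
        weighted[state] += n * estimate
        totals[state] += n
    return {state: weighted[state] / totals[state] for state in weighted}
```

For example, hypothetical cells with estimates of 0.6 and 0.4 and populations of 300 and 700 would combine to a state estimate of 0.46, because the larger cell carries more weight.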

Mapping Housing Market Pressure across NYC: This analysis constructs a composite index of housing market stress for New York City neighborhoods using principal components analysis (PCA). Inputs include rental and for-sale vacancy rates, rent levels, home values, and recent development activity. The index is mapped using interactive choropleths to visualize pressure patterns spatially. This project showcases skills in dimensionality reduction, spatial analysis, and applied urban data science.
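
The idea of a PCA-based composite index can be sketched as follows: standardize each input, find the first principal component of the correlation matrix, and score each neighborhood along it. This is a minimal standard-library sketch that uses power iteration in place of a full eigendecomposition; it is not the project's code, and the function names are hypothetical.

```python
import math

def standardize(rows):
    """Z-score each column using the sample standard deviation.
    Assumes no column is constant (a zero standard deviation would divide by zero)."""
    n, k = len(rows), len(rows[0])
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(row[j] - means[j]) / sds[j] for j in range(k)] for row in rows]

def first_component(z, n_iter=500):
    """Leading eigenvector of the correlation matrix via power iteration.
    Assumes the start vector (all equal weights) is not orthogonal to it."""
    n, k = len(z), len(z[0])
    corr = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(k)]
            for a in range(k)]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(n_iter):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def stress_index(rows):
    """Return (loadings, scores): one loading per input, one score per row."""
    z = standardize(rows)
    loadings = first_component(z)
    scores = [sum(l * x for l, x in zip(loadings, r)) for r in z]
    return loadings, scores
```

The sign of a principal component is arbitrary, so in practice the loadings should be inspected and the index flipped if higher scores do not correspond to higher stress.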

Credit Risk Classification with Logistic & Random Forest Models: This project analyzes predictors of consumer creditworthiness using both logistic regression and random forest classification. The logistic model estimates the odds of good credit based on financial indicators and employment status, while the random forest model identifies the most influential features using permutation-based variable importance. The work combines interpretable modeling with machine learning to explore credit risk from multiple angles.
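
Permutation-based variable importance, mentioned above, can be sketched in a few lines: shuffle one feature at a time and record how much the model's accuracy drops. The sketch below is generic and uses only the standard library; `model` is assumed to be any callable that maps a feature row (a list) to a predicted label, standing in for a fitted random forest. All names here are hypothetical.

```python
import random

def accuracy(model, X, y):
    """Share of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled.

    X is a list of feature rows (lists); the model is not refit.
    """
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model never uses gets an importance of exactly zero, while a feature the predictions depend on shows a positive drop.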

Predicting Wages with Bayesian Regression: This analysis models wages using Bayesian regression, incorporating predictors such as education, experience, occupation, and race. Posterior distributions are used to quantify uncertainty, and marginal effects are plotted to interpret each factor’s contribution. The project illustrates how Bayesian inference can incorporate prior knowledge and report full posterior uncertainty, rather than point estimates alone, in applied research settings.

Simulating Portfolio Growth using Monte Carlo Methods: This project models the long-term growth of an investment portfolio using Monte Carlo simulation with historical S&P 500 annual returns. Thousands of future trajectories are simulated to capture the range of possible outcomes given annual contributions and market volatility. An interactive fan chart visualizes shifts in central tendency and the spread of risk over time. The project demonstrates expertise in probabilistic modeling and uncertainty communication.
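
The simulation described above can be sketched as a bootstrap: each simulated year draws one return at random from a list of historical annual returns, applies it to the balance, and repeats across many paths. This is a minimal sketch, not the project's code. It assumes contributions are made at the start of each year, and every name and parameter is hypothetical. The caller supplies the return series; none is embedded here.

```python
import random

def simulate_portfolio(returns, start_balance, annual_contribution,
                       years, n_sims, seed=0):
    """Simulate balance paths by resampling annual returns with replacement."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_sims):
        balance = start_balance
        path = []
        for _ in range(years):
            r = rng.choice(returns)  # one historical year drawn at random
            # Assumption: the contribution is added before the year's return.
            balance = (balance + annual_contribution) * (1 + r)
            path.append(balance)
        paths.append(path)
    return paths

def fan_bands(paths, probs=(0.1, 0.5, 0.9)):
    """Per-year quantiles across paths, as (low, median, high) tuples."""
    bands = []
    for year_values in zip(*paths):
        ordered = sorted(year_values)
        bands.append(tuple(ordered[round(p * (len(ordered) - 1))]
                           for p in probs))
    return bands
```

The tuples returned by `fan_bands` are the kind of input a fan chart needs: the median traces the central tendency, and the gap between the outer quantiles shows how the spread of outcomes widens over time.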

Sentiment Analysis of Hacker News Posts: This project applies natural language processing to analyze sentiment in Hacker News post titles and its relationship to post engagement. Sentiment scores are generated using the AFINN lexicon and linked to each post's comment count and score. An interactive scatterplot with loess smoothing illustrates a modest curvilinear relationship between sentiment and engagement. The analysis draws on skills in text mining, sentiment scoring, and exploratory data visualization.
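
Lexicon-based scoring of the kind described above can be sketched briefly: split a title into words, look each word up in a lexicon of integer valences, and sum the matches. AFINN assigns each listed word an integer from -5 to +5. The tiny lexicon below is a hypothetical stand-in with illustrative values, not the actual AFINN list, and the function name is also an assumption.

```python
import re

# Hypothetical toy lexicon; the real AFINN list is far larger and its
# values may differ from these illustrative ones.
TOY_LEXICON = {"great": 3, "love": 3, "broken": -1, "fails": -2, "terrible": -3}

def title_sentiment(title, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of the words in a title (unknown words score 0)."""
    tokens = re.findall(r"[a-z']+", title.lower())
    return sum(lexicon.get(token, 0) for token in tokens)
```

Under this scheme a title with no lexicon words scores zero, and positive and negative words in the same title offset one another.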