Christian Caron, PhD
Predicting Wages with Bayesian Regression
To explore the factors influencing wages, I estimated a Bayesian multilevel regression model using the wagepan panel dataset from the wooldridge package in R. The outcome variable is the natural log of wages, modeled as a function of education, experience and its squared term, union status, race, ethnicity, hours worked, and occupation. In this case, log-transforming the dependent variable helps reduce skewness. The model includes a random intercept for each individual, capturing unobserved personal traits that persist over time.
​
Bayesian regression offers a probabilistic framework for estimating coefficients. Instead of producing a single "best guess" like traditional (frequentist) regression, it generates a distribution of plausible values for each parameter based on prior beliefs and observed data. These distributions reflect our uncertainty about the true effect of each variable.
​
The plot below displays the posterior means and 95% credible intervals for selected predictors in the model. Each point represents the average estimated effect of a variable on log wages, while the horizontal bar shows the range in which the true effect likely falls with 95% probability. Coefficients can be interpreted approximately as percentage differences in wages, relative to either a one-unit change (for continuous variables) or a reference group (for categorical variables). When the estimated effect is small, this approximation is very close. For larger effects, the actual percentage change in wages will be slightly higher or lower than the coefficient value due to the way log transformations work.
​
For example, the posterior mean for the union coefficient is 0.11, suggesting that, all else equal, union members earn approximately 11% more than non-union workers. The coefficient for education is also positive, indicating that each additional year of schooling is associated with higher wages, on average.
​
In contrast, the coefficient for Black is negative, implying that Black workers earn less than their non-Black counterparts, even after adjusting for education, occupation, experience, and other factors.
​
Both Professional/Technical and Managers/Admin occupations are associated with higher wages relative to the reference group, which is Unskilled Blue Collar workers.
​
Overall, the plot provides an intuitive summary of both the magnitude and certainty of each predictor’s estimated relationship with wages.
Bayesian Coefficient Estimates (95% Credible Intervals)
