The linear varying coefficient (VC) model generalizes the conventional linear model by allowing the additive effect of each covariate on the outcome to vary as a function of additional effect modifiers. While there are many existing procedures for VC modeling with a single scalar effect modifier (often assumed to be time), there has, until recently, been comparatively less development for settings with multivariate modifiers. Unfortunately, existing state-of-the-art procedures that can accommodate multivariate modifiers typically make restrictive structural assumptions about the covariate effect functions or require intensive problem-specific hand-tuning that scales poorly to large datasets. In response, we propose VC-BART, which estimates the covariate effect functions in a VC model using Bayesian Additive Regression Trees (BART).
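For concreteness, the model described above can be written out as follows. The notation is an assumption made here for illustration and is not taken from the abstract: $y_i$ is the outcome, $x_{i1}, \dots, x_{ip}$ are the covariates, $z_i$ is the (possibly multivariate) vector of effect modifiers, and $\beta_0, \dots, \beta_p$ are the covariate effect functions. The second line sketches the usual BART-style sum-of-trees form, with $M$ regression trees $T_m^{(j)}$ and leaf values $\mu_m^{(j)}$ for the $j$-th effect function.

```latex
\begin{align*}
  y_i &= \beta_0(z_i) + \sum_{j=1}^{p} \beta_j(z_i)\, x_{ij} + \varepsilon_i, \\
  \beta_j(z) &\approx \sum_{m=1}^{M} g\bigl(z;\, T_m^{(j)}, \mu_m^{(j)}\bigr),
  \qquad j = 0, 1, \dots, p.
\end{align*}
```

Setting every $\beta_j$ to a constant recovers the conventional linear model, which is the sense in which the VC model generalizes it.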
On several synthetic and real-world datasets, we demonstrate that, with simple default hyperparameter settings, VC-BART displays covariate effect recovery performance superior to state-of-the-art VC modeling techniques and predictive performance on par with more flexible but less interpretable nonparametric regression procedures. We further demonstrate the theoretical near-optimality of VC-BART by synthesizing recent theoretical results about the VC model and BART to derive posterior concentration rates in settings with independent and correlated errors.
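To make the notion of "covariate effect recovery" concrete, here is a minimal sketch of how synthetic data with known effect functions could be generated from a linear VC model with two effect modifiers. It is not the simulation design used in the paper: the function name `simulate_vc_data`, the three effect functions, and the noise level are all hypothetical choices made for illustration. Because the true effect functions are returned alongside the data, any fitted estimate of them can be compared directly against the truth.

```python
import numpy as np


def simulate_vc_data(n=500, sigma=0.5, seed=0):
    """Draw n observations from an illustrative linear VC model.

    Hypothetical setup: two covariates, two effect modifiers, and
    three hand-picked effect functions (intercept plus two slopes).
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))    # covariates x_1, x_2
    Z = rng.uniform(size=(n, 2))   # effect modifiers z_1, z_2

    # True covariate effect functions evaluated at each z_i
    # (arbitrary illustrative choices, not taken from the paper).
    beta = np.column_stack([
        np.sin(2 * np.pi * Z[:, 0]),       # beta_0(z): smooth in z_1
        Z[:, 0] * Z[:, 1],                 # beta_1(z): interaction of z_1, z_2
        (Z[:, 1] > 0.5).astype(float),     # beta_2(z): step function in z_2
    ])

    # y_i = beta_0(z_i) + beta_1(z_i) x_i1 + beta_2(z_i) x_i2 + noise
    design = np.column_stack([np.ones(n), X])
    y = np.sum(design * beta, axis=1) + sigma * rng.normal(size=n)
    return X, Z, beta, y


if __name__ == "__main__":
    X, Z, beta, y = simulate_vc_data()
    print(X.shape, Z.shape, beta.shape, y.shape)
```

The step-function effect is included because discontinuous effects of this kind are a natural fit for tree-based estimators, whereas the smooth and interaction effects exercise dependence on more than one modifier.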
A pre-print is available at arXiv:2003.06416.
An R package implementing VC-BART, together with scripts and data for reproducing all examples, figures, and tables in the paper, is available at this GitHub repo.