High-dimensional regression with outcomes of mixed-type using the multivariate spike-and-slab LASSO
Abstract: We consider a high-dimensional multi-outcome regression in which $q$ possibly dependent binary and continuous outcomes are regressed onto $p$ covariates. We model the observed outcome vector as a partially observed latent realization from a multivariate linear regression model. Our goal is to simultaneously estimate a sparse matrix ($B$) of latent regression coefficients (i.e., partial covariate effects) and a sparse latent residual precision matrix ($\Omega$), which induces partial correlations between the observed outcomes. To this end, we specify continuous spike-and-slab priors on all entries of $B$ and on the off-diagonal elements of $\Omega$, and we introduce a Monte Carlo Expectation-Conditional Maximization algorithm to compute the maximum a posteriori estimate of the model parameters. Under a set of mild assumptions, we derive the posterior contraction rate for our model in high-dimensional regimes where both $p$ and $q$ diverge with the sample size $n$, and we establish a sure screening property, which implies that, as $n$ increases, we can recover all truly non-zero elements of $B$ with probability tending to one. We demonstrate the excellent finite-sample properties of our proposed method, which we call mixed-mSSL, using extensive simulation studies and three applications spanning medicine to ecology.
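To make the latent-variable setup in the abstract concrete, the following is a minimal Python sketch of how mixed-type data of this kind could be simulated. It is not the authors' code: the function name `simulate_mixed_outcomes`, the sparsity level of $B$, the tridiagonal form of $\Omega$, and in particular the rule that a binary outcome equals one when its latent coordinate is positive (a probit-style threshold) are all assumptions made for illustration. The abstract states only that the observed outcomes are a partially observed realization of a latent multivariate linear regression.

```python
import numpy as np


def simulate_mixed_outcomes(n=200, p=10, q_cont=2, q_bin=2, seed=0):
    """Simulate mixed continuous/binary outcomes from a latent linear model.

    Hypothetical illustration only: the thresholding rule and all
    parameter choices below are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    q = q_cont + q_bin

    # Covariate matrix X (n x p).
    X = rng.standard_normal((n, p))

    # Sparse latent coefficient matrix B (p x q): roughly 20% non-zero entries.
    B = rng.standard_normal((p, q)) * (rng.random((p, q)) < 0.2)

    # Sparse latent residual precision matrix Omega (q x q).
    # A tridiagonal matrix with unit diagonal and 0.4 off-diagonals is
    # strictly diagonally dominant, hence positive definite.
    Omega = np.eye(q) + 0.4 * (np.eye(q, k=1) + np.eye(q, k=-1))
    Sigma = np.linalg.inv(Omega)

    # Latent outcomes: Z = X B + E, with rows of E drawn from N(0, Omega^{-1}).
    E = rng.multivariate_normal(np.zeros(q), Sigma, size=n)
    Z = X @ B + E

    # Observed outcomes: continuous columns are seen directly; binary
    # columns are seen only through the sign of the latent value (assumed).
    Y = Z.copy()
    Y[:, q_cont:] = (Z[:, q_cont:] > 0).astype(float)

    return X, Y, Z, B, Omega


if __name__ == "__main__":
    X, Y, Z, B, Omega = simulate_mixed_outcomes()
    print("X:", X.shape, "Y:", Y.shape)
    print("non-zero entries of B:", int(np.count_nonzero(B)))
```

In this sketch the zeros of `Omega` correspond to pairs of latent outcomes that are conditionally independent given the others, which is the sense in which a sparse precision matrix encodes partial correlations.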
A pre-print is available here.
Recommended citation: Ghosh, S., and Deshpande, S.K. (2025+). "High-dimensional regression with outcomes of mixed-type using the multivariate spike-and-slab LASSO."