Summary:
Continues a discussion with @avehtari from here. Distilled down: full-rank ADVI is constrained by memory, since its covariance grows quadratically with the number of parameters, while the mean-field approximation can be problematic for certain models. A sensible intermediate, a low-rank approximation, would be very helpful for such models. Ong et al. (2017) described one possible implementation.
Description:
I'll briefly outline the mathematical approach of Ong et al., and leave the Stan-specific implementation details (most of which were kindly outlined in the preceding discussion) for the pull request. Let n be the dimension of the parameters and r the desired rank of the approximation. To generate the parameters of the model, we draw eta = (z, eps) from the (r + n)-dimensional standard Gaussian (zero mean, identity covariance), where z is r-dimensional and eps is n-dimensional. We then apply the reparameterization trick with the formula zeta = mu + Bz + d * eps, where mu and d are n-dimensional, B is n x r and constrained to be lower-triangular, and * is the elementwise product. The resulting zeta is distributed according to N(mu, BB^T + diag(d^2)). zeta is then transformed to the model parameters in the usual ADVI way.
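To make the reparameterization concrete, here is a minimal sketch in plain Python (standard library only). It is not the Stan implementation: the function names `sample_low_rank` and `implied_covariance` and the values of mu, B, and d are hypothetical, chosen only to illustrate the formula zeta = mu + Bz + d * eps for n = 3 and r = 2.

```python
import random


def sample_low_rank(mu, B, d, rng):
    """Draw one zeta via zeta = mu + B z + d * eps (elementwise product)."""
    n, r = len(mu), len(B[0])
    z = [rng.gauss(0.0, 1.0) for _ in range(r)]    # r-dimensional factor noise
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]  # n-dimensional diagonal noise
    return [
        mu[i] + sum(B[i][j] * z[j] for j in range(r)) + d[i] * eps[i]
        for i in range(n)
    ]


def implied_covariance(B, d):
    """Return the covariance B B^T + diag(d^2) as a nested list."""
    n, r = len(B), len(B[0])
    return [
        [
            sum(B[i][k] * B[j][k] for k in range(r))
            + (d[i] ** 2 if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical values for illustration: n = 3 parameters, rank r = 2.
mu = [0.5, -1.0, 2.0]
B = [
    [1.0, 0.0],   # the entry above the main diagonal is zero (lower-triangular)
    [0.3, 0.8],
    [-0.2, 0.5],
]
d = [0.1, 0.2, 0.3]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = [sample_low_rank(mu, B, d, rng) for _ in range(20000)]
sample_mean = [sum(x[i] for x in draws) / len(draws) for i in range(len(mu))]
cov = implied_covariance(B, d)
```

The covariance matrix is never needed for sampling; `implied_covariance` builds it here only to show what the draws are distributed according to. Each draw costs on the order of n * r operations, which is the appeal over a full-rank factor.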
Additional info:
I've started working on an implementation and will open a PR now.
Current Version:
v2.19.1