Markov chain Monte Carlo updating schemes for hidden Gaussian Markov random field models

Steinsland, Ingelin

Doctoral thesis

Permanent link

http://hdl.handle.net/11250/228796

Publication date

2003

Collections

Institutt for matematiske fag

Abstract

Part I discusses how to construct approximations to the posterior distribution π(x|y, θ) of a latent Gaussian Markov random field on a graph of dimension n when data are considered conditionally mutually independent and π(x|y, θ) is unimodal. It is demonstrated that a class of non-Gaussian approximations can be constructed for a wide range of likelihood models. They have the appealing properties that exact samples can be drawn from them, the normalisation constant is computable, and the computational complexity is at most O(n²) in the spatial case. The non-Gaussian approximations are refined versions of a Gaussian approximation. The Gaussian approximation serves well if the likelihood is near-Gaussian, but it is not sufficiently accurate when the likelihood is not near-Gaussian or if n is large. The accuracy of the approximations can be tuned by intuitive parameters to nearly any precision. The approximations are applied as proposals for x in one-block updating scheme Metropolis-Hastings samplers. These samplers are used for spatial disease mapping and model-based geostatistics problems involving different likelihoods. One-block updating scheme Metropolis-Hastings samplers and Metropolized independence samplers for such models are also presented. These sampling schemes are major improvements compared to the single-site schemes commonly used.
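
To illustrate the Gaussian approximation that the non-Gaussian approximations refine, here is a minimal sketch. It is not the construction used in the thesis, which the abstract does not spell out. It assumes a chain-graph GMRF prior x ~ N(0, Q⁻¹) with tridiagonal precision matrix Q, a Poisson likelihood y_i | x_i ~ Poisson(exp(x_i)), and a second-order Taylor expansion of the log-likelihood that is iterated until it is centred at the posterior mode. All function names and the example data are hypothetical.

```python
import math

def solve_tridiagonal_spd(diag, off, rhs):
    """Solve A x = rhs for a symmetric positive definite tridiagonal A
    (diagonal `diag`, sub-diagonal `off`) via the Cholesky factor A = L L^T."""
    n = len(diag)
    l_diag = [0.0] * n
    l_sub = [0.0] * (n - 1)
    l_diag[0] = math.sqrt(diag[0])
    for i in range(n - 1):
        l_sub[i] = off[i] / l_diag[i]
        l_diag[i + 1] = math.sqrt(diag[i + 1] - l_sub[i] ** 2)
    # Forward substitution: L w = rhs
    w = [0.0] * n
    w[0] = rhs[0] / l_diag[0]
    for i in range(1, n):
        w[i] = (rhs[i] - l_sub[i - 1] * w[i - 1]) / l_diag[i]
    # Back substitution: L^T x = w
    x = [0.0] * n
    x[n - 1] = w[n - 1] / l_diag[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (w[i] - l_sub[i] * x[i + 1]) / l_diag[i]
    return x

def gaussian_approximation(q_diag, q_off, y, iterations=50, tol=1e-10):
    """Return (mode, post_diag, post_off): the mean and the tridiagonal
    precision of a Gaussian approximation to pi(x | y) under the assumed
    Poisson likelihood with log link."""
    x = [0.0] * len(y)
    for _ in range(iterations):
        # Expand y_i*x_i - exp(x_i) around the current x:
        # approximately b_i*x_i - 0.5*c_i*x_i^2 (up to a constant).
        c = [math.exp(xi) for xi in x]
        b = [yi - ci + ci * xi for yi, ci, xi in zip(y, c, x)]
        post_diag = [qd + ci for qd, ci in zip(q_diag, c)]
        new_x = solve_tridiagonal_spd(post_diag, q_off, b)
        step = max(abs(u - v) for u, v in zip(new_x, x))
        x = new_x
        if step < tol:
            break
    c = [math.exp(xi) for xi in x]
    return x, [qd + ci for qd, ci in zip(q_diag, c)], list(q_off)

# Hypothetical example: six nodes on a chain, small Poisson counts.
q_diag = [2.5] * 6
q_off = [-1.0] * 5
y = [0, 1, 3, 2, 0, 1]
mode, post_diag, post_off = gaussian_approximation(q_diag, q_off, y)
```

The returned precision is again tridiagonal, so the approximation is itself a GMRF on the same graph and can be sampled and evaluated at the same cost as the prior.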

The main focuses of part II are geostatistical GMRF models and parallel exact sampling of GMRFs. There are also brief overviews of parallel computing and MCMC, and a literature review of parallel MCMC. The geostatistical GMRF models are constructed by discretising the domain region using a lattice. Instead of giving this lattice a Gaussian random field (GRF) prior, which corresponds to a Gaussian process, a GMRF that is an approximation to the GRF is chosen. Further computational benefits are achieved through the good parallelisation possibilities of GMRF sampling. The computationally expensive part of GMRF sampling is the Cholesky decomposition of the precision matrix. Parallelisation is done with parallel algorithms from linear algebra for sparse symmetric positive definite matrices. The parallel GMRF sampler is tested for graphs and lattices, and gives both good speed-up and good scalability.
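
The role of the Cholesky decomposition in exact GMRF sampling can be sketched as follows. This is a serial toy example under stated assumptions, not the parallel sparse solver used in the thesis: the graph is a chain, so the precision matrix Q is tridiagonal, and the function names and parameter values are hypothetical. With Q = L Lᵀ and z ~ N(0, I), solving Lᵀ x = z gives x ~ N(0, Q⁻¹), and the log-determinant needed for the normalisation constant comes from the diagonal of L.

```python
import math
import random

def cholesky_tridiagonal(diag, off):
    """Cholesky factor L (Q = L L^T) of a symmetric positive definite
    tridiagonal matrix. Returns the diagonal and sub-diagonal of L."""
    n = len(diag)
    l_diag = [0.0] * n
    l_sub = [0.0] * (n - 1)
    l_diag[0] = math.sqrt(diag[0])
    for i in range(n - 1):
        l_sub[i] = off[i] / l_diag[i]
        l_diag[i + 1] = math.sqrt(diag[i + 1] - l_sub[i] ** 2)
    return l_diag, l_sub

def sample_gmrf(l_diag, l_sub, rng):
    """Draw x ~ N(0, Q^{-1}) by back substitution in L^T x = z, z ~ N(0, I)."""
    n = len(l_diag)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [0.0] * n
    x[n - 1] = z[n - 1] / l_diag[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (z[i] - l_sub[i] * x[i + 1]) / l_diag[i]
    return x

def log_det_precision(l_diag):
    """log det(Q) = 2 * sum(log L_ii), used in the Gaussian log-density."""
    return 2.0 * sum(math.log(v) for v in l_diag)

# Hypothetical example: a chain of five nodes with a diagonally dominant Q.
diag = [2.1] * 5
off = [-1.0] * 4
l_diag, l_sub = cholesky_tridiagonal(diag, off)
x = sample_gmrf(l_diag, l_sub, random.Random(0))
log_det = log_det_precision(l_diag)
```

On a chain the factorisation costs O(n). For lattices and general graphs the factor fills in within the bandwidth, which is why the factorisation dominates the cost and is the natural target for parallelisation.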

A parallel one-block updating scheme sampler for latent GMRF models is constructed using a GMRF approximation to π(x|y, θ) as proposal for the latent field. It is used for a geostatistical GMRF model with binomial likelihood, and shows good mixing for both the latent field and the hyper-parameters, as well as good speed-up from the parallelisation.

In part III joint proposal distributions for the posterior of latent Gaussian Markov random fields π(x|y, θ) are constructed. We can both sample from and evaluate these proposals without working directly with an n-dimensional distribution. The key idea in the construction of the proposals is to combine samples from overlapping blocks of the latent field. Each block is sampled from its conditional distribution or an approximation to its conditional distribution. The overlapping block proposals for x are used together with proposals for the hyper-parameters θ and an opposite reverse acceptance probability in one-block updating scheme Metropolis-Hastings algorithms.

Through examples the method proves to work well both when each block is sampled exactly and when an approximation is necessary. Overlapping block proposals are successfully applied for a latent GMRF sampling problem of dimension 100000. For some of the problems hyper-parameters with a Gaussian prior are also included in the overlapping blocking scheme.

Part IV is a short note describing a way of parallelising the samplers introduced in part I and part III. These methods establish a connection between part II on the one hand and parts I and III on the other. The suggested parallelisation methods are based on the probabilistic interpretations of parallel sampling of GMRFs.

Publisher

Fakultet for informasjonsteknologi, matematikk og elektroteknikk

Series

Dr. ingeniøravhandling, 0809-103X; 2003:89