Time-Series Models for Microbiome Data


Within the biomedical community there is increasing recognition of the role that host-associated microbes play in both human health and disease. There has also been much excitement over the insights that can be obtained from longitudinal measurements of these microbial communities; however, because of the statistical challenges such data pose, appropriate models have been lacking. Host microbiota are typically measured using high-throughput DNA sequencing, which yields counts for different species. Relative abundances, assumed to be compositional, are then estimated from these counts. In addition, due to technological limitations, the total number of counts per sample is often small compared to the distribution of species' relative abundances. This leads to datasets with many zero or small counts. With such data, maximum likelihood estimates of sample proportions are biased, and models that incorporate the sampling variability are essential. In this seminar I will explore these and other statistical challenges that arise in modeling microbiota time-series. I will also discuss a novel framework that addresses these challenges by combining multinomial-normal-on-the-simplex dynamic linear models with recent advances in Markov chain Monte Carlo simulation to enable scalable Bayesian inference for microbiome time-series. Using real datasets, I will explore some of the insights that these models have provided into the forces that drive microbial dynamics.
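
As a rough illustration of the zero-count problem described above, the following sketch simulates multinomial sequencing counts from a fixed set of relative abundances. The number of taxa, the sequencing depth, and the geometric abundance profile are all assumptions chosen for illustration; they are not taken from the datasets or the model discussed in the seminar.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed, illustrative settings (not from the seminar's data).
n_taxa, depth, n_samples = 100, 1000, 200

# Hypothetical "true" relative abundances with a long tail of rare taxa.
true_p = 0.9 ** np.arange(n_taxa)
true_p /= true_p.sum()

# Each row is one sample: `depth` reads distributed over the taxa.
counts = rng.multinomial(depth, true_p, size=n_samples)

# Naive per-sample proportion estimates (counts divided by depth).
mle = counts / depth

# Share of all count entries that are exactly zero.
zero_fraction = (counts == 0).mean()

# How often the rarest taxon, which is present by construction, is unobserved.
rarest_zero_rate = (counts[:, -1] == 0).mean()

print(f"fraction of zero counts: {zero_fraction:.2f}")
print(f"rarest taxon unobserved in {rarest_zero_rate:.0%} of samples")
```

In this toy setting a taxon that is truly present is frequently estimated to have a proportion of exactly zero, which is one reason to model the counting step explicitly instead of treating the observed proportions as the true composition.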

Abbadia San Salvatore, Siena (Italy)