Exploring optimal log-ratio representations for high-dimensional compositional data with applications to regression analysis in metabolomics
High-throughput technologies are used in biological research to obtain a comprehensive account of the molecules in a sample. Because of chemical and physical noise and technical limitations, the raw data require intensive pre-processing, which often results in a normalised data set carrying only relative information. Compositional data analysis methods, which exploit the structure of relative variation in the data, are therefore well suited to this context. We review some recent developments for dealing with high-dimensional compositional data, in the context of an investigation of the association between rumen metabolite spectral profiles and greenhouse gas emissions in ruminants. In particular, we consider alternative ways of determining optimal log-ratio representations of the metabolomic profiles that facilitate regression analysis and the identification of the most relevant signals.
Keywords: compositional data, high-dimensional data, log-ratio analysis, metabolomics.
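
As a rough illustration of what "carrying only relative information" means in practice, the sketch below applies the centred log-ratio (clr) transform, one standard log-ratio representation, to toy data. The abstract does not say which representations the authors compare, so the choice of clr, the function names and the numbers are all assumptions made for illustration only.

```python
import numpy as np


def closure(x):
    """Rescale each row to sum to 1, keeping only relative information."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)


def clr(x):
    """Centred log-ratio transform: log of each part minus the row mean of the logs.

    Assumes all parts are strictly positive (zeros would need prior treatment).
    """
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)


# Hypothetical toy intensities: two samples that differ only by overall scale.
raw = np.array([[10.0, 20.0, 70.0],
                [1.0, 2.0, 7.0]])

comp = closure(raw)   # normalised profiles (rows sum to 1)
z = clr(comp)         # log-ratio coordinates, usable as regression covariates
```

Both toy samples map to the same clr coordinates, since they differ only in total intensity, and each row of `z` sums to zero. That zero-sum constraint is one reason why alternative log-ratio representations are worth considering before regression.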