Null diffusion-based enrichment for metabolomics data

Published on Dec 6, 2017 in PLOS ONE · DOI: 10.1371/JOURNAL.PONE.0189012
Sergio Picart-Armada (UPC: Polytechnic University of Catalonia), estimated H-index: 5
Francesc Fernandez-Albert (UPC: Polytechnic University of Catalonia), estimated H-index: 7
+ 5 authors
Alexandre Perera-Lluna (UPC: Polytechnic University of Catalonia), estimated H-index: 12
Metabolomics experiments identify metabolites whose abundance varies as the conditions under study change. Pathway enrichment tools help in the identification of key metabolic processes and in building a plausible biological explanation for these variations. Although several methods are available for pathway enrichment using experimental evidence, metabolomics does not yet have a comprehensive overview in a network layout at multiple molecular levels. We propose a novel pathway enrichment procedure for analysing summary metabolomics data based on sub-network analysis in a graph representation of a reference database. Relevant entries are extracted from the database according to statistical measures over a null diffusive process that accounts for network topology and pathway crosstalk. Entries are reported as a sub-pathway network, including not only pathways, but also modules, enzymes, reactions and possibly other compound candidates for further analyses. This provides a richer biological context, suitable for generating new study hypotheses and potential enzymatic targets. Using this method, we report results from cells depleted for an uncharacterised mitochondrial gene using GC and LC-MS data and employing KEGG as a knowledge base. Partial validation is provided with NMR-based tracking of 13C glucose labelling of these cells.
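The idea of scoring database entries against a null diffusive process can be illustrated with a minimal sketch. It is not the authors' implementation. The toy graph, the regularised Laplacian kernel used as a simple stand-in for the diffusion step, the function names `diffusion_kernel` and `null_z_scores`, and the number of permutations are all assumptions made for illustration. The sketch diffuses a binary indicator of affected metabolites over the graph, then compares each node's score with the scores obtained when the same number of labels is reshuffled among the compound nodes.

```python
import numpy as np


def diffusion_kernel(adjacency):
    """Regularised Laplacian kernel K = (I + L)^-1 of an undirected graph."""
    adjacency = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return np.linalg.inv(np.eye(len(adjacency)) + laplacian)


def null_z_scores(adjacency, compound_idx, affected_idx, n_perm=2000, seed=0):
    """Z-score of each node's diffusion score against a permutation null.

    The null keeps the number of affected compounds fixed and reshuffles
    them among the compound nodes only (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    compound_idx = list(compound_idx)
    affected = set(affected_idx)
    # Keep only the kernel columns of compounds: only they carry input labels.
    kernel = diffusion_kernel(adjacency)[:, compound_idx]
    labels = np.array([1.0 if c in affected else 0.0 for c in compound_idx])
    observed = kernel @ labels
    # Each row of `null` holds the scores of all nodes under one random relabelling.
    null = np.stack([kernel @ rng.permutation(labels) for _ in range(n_perm)])
    return (observed - null.mean(axis=0)) / null.std(axis=0)


# Hypothetical toy graph: nodes 0 and 1 are pathways, nodes 2-7 are compounds.
edges = [(0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7), (4, 5)]
adjacency = np.zeros((8, 8))
for a, b in edges:
    adjacency[a, b] = adjacency[b, a] = 1.0

z = null_z_scores(adjacency, compound_idx=range(2, 8), affected_idx=[2, 3])
print(np.round(z, 2))  # pathway 0 scores above its null mean, pathway 1 below
```

Normalising against the permutation null, rather than ranking raw diffusion scores, is what lets the topology be accounted for: a highly connected node receives a high raw score under almost any input, so only its excess over the null is informative. Nodes with high z-scores would then be the candidates for the reported sub-network.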
#1 Maria Vinaixa, H-Index: 24
#2 Emma L. Schymanski (Eawag: Swiss Federal Institute of Aquatic Science and Technology), H-Index: 32
Last: Oscar Yanes, H-Index: 29
(6 authors in total)
At present, mass spectrometry (MS)-based metabolomics has been widely used to obtain new insights into human, plant, and microbial biochemistry; drug and biomarker discovery; nutrition research; and food control. Despite the high research interest, identifying and characterizing the structure of metabolites has become a major drawback for converting raw MS data into biological knowledge. Comprehensive and well-annotated MS-based spectral databases play a key role in serving this purpose...
243 Citations
#1 Xianbin Li (Wenzhou University), H-Index: 2
#2 Liangzhong Shen (Wenzhou University), H-Index: 5
Last: Wenbin Liu (Wenzhou University), H-Index: 8
(4 authors in total)
Pathway analysis is a common approach to gain insight from biological experiments. Signaling-pathway impact analysis (SPIA) is one such method and combines both the classical enrichment analysis and the actual perturbation on a given pathway. Because this method focuses on a single pathway, its resolution generally is not very high because the differentially expressed genes may be enriched in a local region of the pathway. In the present work, to identify cancer-related pathways, we incorporated...
37 Citations
#1 Jianguo Xia (McGill University), H-Index: 41
#2 Igor Sinelnikov (U of A: University of Alberta), H-Index: 12
Last: David S. Wishart (National Institute for Nanotechnology), H-Index: 116
(4 authors in total)
MetaboAnalyst is a web server designed to permit comprehensive metabolomic data analysis, visualization and interpretation. It supports a wide range of complex statistical calculations and high quality graphical rendering functions that require significant computational resources. First introduced in 2009, MetaboAnalyst has experienced more than a 50X growth in user traffic (>50 000 jobs processed each month). In order to keep up with the rapidly increasing computational d...
1,927 Citations
#1 Arnald Alonso (UPC: Polytechnic University of Catalonia), H-Index: 15
#2 Sara Marsal, H-Index: 32
Last: Antonio Julià, H-Index: 32
(3 authors in total)
Metabolomics comprises the methods and techniques that are used to measure the small molecule composition of biofluids and tissues, and is actually one of the most rapidly evolving research fields. The determination of the metabolomic profile (the metabolome) has multiple applications in many biological sciences, including the development of new diagnostic tools in medicine. Recent technological advances in nuclear magnetic resonance (NMR) and mass spectrometry (MS) are significantly improving ou...
319 Citations
#1 Francesc Fernandez-Albert (UPC: Polytechnic University of Catalonia), H-Index: 7
#2 Rafael Llorach (UPC: Polytechnic University of Catalonia), H-Index: 3
Last: Alexandre Perera (UPC: Polytechnic University of Catalonia), H-Index: 16
(4 authors in total)
Summary: Current tools for liquid chromatography and mass spectrometry for metabolomic data cover a limited number of processing steps, whereas online tools are hard to use in a programmable fashion. This article introduces the Metabolite Automatic Identification Toolkit (MAIT) package, which makes it possible for users to perform metabolomic end-to-end liquid chromatography and mass spectrometry data analysis. MAIT is focused on improving the peak annotation stage and provides essential tools t...
49 Citations
#1 Antonio Fabregat (EMBL-EBI: European Bioinformatics Institute), H-Index: 14
#2 Konstantinos Sidiropoulos (EMBL-EBI: European Bioinformatics Institute), H-Index: 9
Last: Peter D'Eustachio (NYU: New York University), H-Index: 54
(22 authors in total)
Reactome is a manually curated open-source open-data resource of human pathways and reactions. The current version 46 describes 7088 human proteins (34% of the predicted human proteome), participating in 6744 reactions based on data extracted from 15 107 research publications with PubMed links. The Reactome Web site and analysis tool set have been completely redesigned to increase speed, flexibility and user friendliness. The data model has been extended to support anno...
3,633 Citations
#1 Monica Chagoyen (CSIC: Spanish National Research Council), H-Index: 20
#2 Florencio Pazos, H-Index: 34
The so-called ‘omics’ approaches used in modern biology aim at massively characterizing the molecular repertories of living systems at different levels. Metabolomics is one of the last additions to the ‘omics’ family and it deals with the characterization of the set of metabolites in a given biological system. As metabolomic techniques become more massive and allow characterizing larger sets of metabolites, automatic methods for analyzing these sets in order to obtain meaningful biological infor...
47 Citations
#1 Evan O. Paull (UCSC: University of California, Santa Cruz), H-Index: 14
#2 Daniel E. Carlin (UCSC: University of California, Santa Cruz), H-Index: 13
Last: Joshua M. Stuart (UCSC: University of California, Santa Cruz), H-Index: 64
(6 authors in total)
Motivation: Identifying the cellular wiring that connects genomic perturbations to transcriptional changes in cancer is essential to gain a mechanistic understanding of disease initiation, progression and ultimately to predict drug response. We have developed a method called Tied Diffusion Through Interacting Events (TieDIE) that uses a network diffusion approach to connect genomic perturbations to gene expression changes characteristic of cancer subtypes. The method computes a subnetwork of pro...
135 Citations
#1 Nikolas Kessler (Bielefeld University), H-Index: 7
#2 Heiko Neuweger (Bielefeld University), H-Index: 19
Last: Alexander Goesmann (Bielefeld University), H-Index: 76
(7 authors in total)
Motivation: The research area metabolomics achieved tremendous popularity and development in the last couple of years. Owing to its unique interdisciplinarity, it requires to combine knowledge from various scientific disciplines. Advances in the high-throughput technology and the consequently growing quality and quantity of data put new demands on applied analytical and computational methods. Exploration of finally generated and analyzed datasets furthermore relies on powerful tools for data min...
57 Citations
#1 Koyel Mitra (UCSD: University of California, San Diego), H-Index: 4
Last: Trey Ideker (UCSD: University of California, San Diego), H-Index: 98
(4 authors in total)
The recent proliferation of omics data has required a toolbox of integrative systems biology bioinformatics approaches to elucidate functional relationships between molecules. Here the authors explain the principles behind these approaches and discuss their applications.
399 Citations
Cited By: 14
#1 Jun Wang (SDU: Shandong University)
#2 Mingzhi Gong (SDU: Shandong University), H-Index: 4
Last: Deguo Xing (SDU: Shandong University), H-Index: 4
(5 authors in total)
BACKGROUND This study hoped to explore the potential biomarkers and associated metabolites during osteosarcoma (OS) progression based on bioinformatics integrated analysis. METHODS Gene expression profiles of GSE28424, including 19 human OS cell lines (OS group) and 4 human normal long bone tissue samples (control group), were downloaded. The differentially expressed genes (DEGs) in OS vs. control were investigated. The enrichment investigation was performed based on DEGs, followed by protein-pr...
#1 Sergio Picart-Armada (UPC: Polytechnic University of Catalonia), H-Index: 5
#2 Wesley K. Thompson, H-Index: 68
Last: Alexandre Perera-Lluna (UPC: Polytechnic University of Catalonia), H-Index: 12
(4 authors in total)
MOTIVATION Network diffusion and label propagation are fundamental tools in computational biology, with applications like gene-disease association, protein function prediction and module discovery. More recently, several publications have introduced a permutation analysis after the propagation process, due to concerns that network topology can bias diffusion scores. This opens the question of the statistical properties and the presence of bias of such diffusion processes in each of its applicati...
1 Citation
Disease-related gene prioritization is one of the most well-established pharmaceutical techniques used to identify genes that are important to a biological process relevant to a disease. In identifying these essential genes, the network diffusion (ND) approach is a widely used technique applied in gene prioritization. However, there is still a large number of candidate genes that need to be evaluated experimentally. Therefore, it would be of great value to develop a new strategy to improve the p...
Climate change has caused serious problems related to the productivity of agricultural crops directly affecting human well-being. Plants have evolved to produce molecular mechanisms in response to environmental stresses, such as transcription factors (TFs), to cope with abiotic stress. The NAC proteins constitute a plant-specific family of TFs involved in plant development processes and tolerance to biotic and abiotic stress. Sugarcane is a perennial grass that accumulates a large amount of sucr...
2 Citations
#2 Chen Liao (MSK: Memorial Sloan Kettering Cancer Center), H-Index: 8
Last: Joao B. Xavier (MSK: Memorial Sloan Kettering Cancer Center), H-Index: 46
(7 authors in total)
Many bacteria have an incredible ability to swarm cooperatively over surfaces. But swarming phenotypes can be quite different even between strains of the same species. What drives this diversity? We compared the metabolomes of 29 clinical Pseudomonas aeruginosa isolates with a range of swarming phenotypes. We identified that isolates incapable of secreting rhamnolipids (a surfactant needed for swarming) had perturbed tricarboxylic acid (TCA) cycle and amino acid pathways and grew exponentially s...
#1 Josep Marín-Llaó (Fraunhofer Society), H-Index: 5
#2 Sarah Mubeen (Fraunhofer Society), H-Index: 5
Last: Daniel Domingo-Fernández (Fraunhofer Society), H-Index: 9
(6 authors in total)
High-throughput screening yields vast amounts of biological data which can be highly challenging to interpret. In response, knowledge-driven approaches emerged as possible solutions to analyze large datasets by leveraging prior knowledge of biomolecular interactions represented in the form of biological networks. Nonetheless, given their size and complexity, their manual investigation quickly becomes impractical. Thus, computational approaches, such as diffusion algorithms, are often employed to...
#1 Asiful Islam (Universiti Sains Malaysia), H-Index: 6
#2 Shahad Saif Khandker (JU: Jahangirnagar University), H-Index: 6
Last: Rosline Hassan (Universiti Sains Malaysia), H-Index: 13
(4 authors in total)
Systemic lupus erythematosus (SLE) is an autoimmune disease characterised by multiple organ involvement, including the skin, joints, kidneys, lungs, central nervous system and the haematopoietic system, with a large number of complications. Despite years of study, the aetiology of SLE remains unclear; thus, safe and specifically targeted therapies are lacking. In the last 20 years, researchers have explored the potential of nutritional factors on SLE and have suggested complementary treatment op...
2 Citations
#2 Nilson Nunes Morais Junior (International Foundation for Electoral Systems)
Last: Celia Raquel Quirino, H-Index: 13
(7 authors in total)
Twenty-eight pluriparous and non-lactating Santa Ines sheep were synchronized with vaginal sponge and an intramuscular (IM) injection of 37.5 μg of cloprostenol on random days of the estrous cycle (D0); day 6 (D6), at 7:00 am, the devices were removed, and after 24 h (D7), GnRH analog (25 μg of lecirelin) was administrated. Fixed-time artificial insemination (FTAI) with cervical traction by the transcervical route was performed 52 to 58 h after sponge removal. Doppler velocimetry of both uterine...
#1 Sergio Picart-Armada (UPC: Polytechnic University of Catalonia), H-Index: 5
#2 Wesley K. Thompson (UCSD: University of California, San Diego), H-Index: 68
Last: Alexandre Perera-Lluna (UPC: Polytechnic University of Catalonia), H-Index: 12
(4 authors in total)
Motivation: Network diffusion and label propagation are fundamental tools in computational biology, with applications like gene-disease association, protein function prediction and module discovery. More recently, several publications have introduced a permutation analysis after the propagation process, due to concerns that network topology can bias diffusion scores. This opens the question of the statistical properties and the presence of bias of such diffusion processes in each of its applicat...
#1 Jan Stanstrup (UCPH: University of Copenhagen), H-Index: 13
#2 Corey D. Broeckling (CSU: Colorado State University), H-Index: 31
Last: Steffen Neumann, H-Index: 39
(19 authors in total)
Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of ope...
29 Citations