Developing a practical toxicogenomics data analysis system utilizing open-source software.

Published on Jan 1, 2013 in Methods in Molecular Biology
DOI: 10.1007/978-1-62703-059-5_16
Takehiro Hirai (Daiichi Sankyo), Naoki Kiyosawa (Daiichi Sankyo)
Abstract
Comprehensive gene expression analysis has been applied to investigate the molecular mechanisms of toxicity, an approach generally known as toxicogenomics (TGx). When large-scale gene expression data obtained by microarray analysis are analyzed, the typical multivariate methods available in commercial software, such as hierarchical clustering or principal component analysis, usually do not provide conclusive outputs by themselves. To make the best use of TGx data for toxicity evaluation in the drug development process, the analytical algorithm must be customized to fit its purpose, with a user-friendly interface and intuitive outputs that address toxicologists' practical demands. However, commercial software usually offers little flexibility for customizing its functions or outputs. Owing to the recent advancement and accumulation of open-source software contributed by bioinformaticians all over the world, it has become easier to develop practical, fit-for-purpose analytical software in-house at fairly low cost and effort. The aim of this article is to present an example of developing an automated TGx data processing system (ATP system), which implements gene set-level toxicogenomic profiling by the D-score method and generates straightforward output that makes the biological and toxicological significance of the TGx data easy to interpret. Our example provides basic clues for readers who wish to develop and customize their own TGx data analysis system to complement the functions of existing commercial software.
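To make the idea of gene set-level profiling concrete, the following is a minimal Python sketch. It is not the D-score method, which the abstract names but does not define, and it is not the ATP system itself. It assumes a simple stand-in score: the mean absolute log2 expression ratio (treated versus control) over the genes in each set. The function names, gene set names, and intensity values are all hypothetical and chosen only for illustration.

```python
import math
from statistics import mean


def log2_ratios(treated, control):
    """Per-gene log2(treated / control) for genes measured in both samples."""
    return {
        gene: math.log2(treated[gene] / control[gene])
        for gene in treated
        if gene in control and treated[gene] > 0 and control[gene] > 0
    }


def gene_set_scores(ratios, gene_sets):
    """Mean absolute log2 ratio per gene set (a stand-in score, NOT the D-score).

    Gene sets with no measured member genes are skipped.
    """
    scores = {}
    for name, genes in gene_sets.items():
        values = [abs(ratios[g]) for g in genes if g in ratios]
        if values:
            scores[name] = mean(values)
    return scores


# Hypothetical signal intensities; the values are made up for illustration.
control = {"Cyp1a1": 100.0, "Cyp1a2": 200.0, "Gsta2": 50.0, "Actb": 1000.0}
treated = {"Cyp1a1": 800.0, "Cyp1a2": 400.0, "Gsta2": 50.0, "Actb": 1000.0}

# Hypothetical gene sets; real toxicity-associated gene sets would go here.
gene_sets = {
    "hypothetical_set_A": ["Cyp1a1", "Cyp1a2"],
    "hypothetical_set_B": ["Gsta2", "Actb"],
}

ratios = log2_ratios(treated, control)
scores = gene_set_scores(ratios, gene_sets)

# Print gene sets from most to least perturbed.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

Summarizing each gene set as one number and ranking the sets is one way to produce the kind of straightforward, set-level output the abstract describes. A production system would substitute its own scoring method and curated gene sets.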