Functional site plasticity in domain superfamilies

Published on May 1, 2013 in Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics
DOI: 10.1016/J.BBAPAP.2013.02.042
Benoit H. Dessailly, Natalie L. Dawson (UCL: University College London), + 1 author, Christine A. Orengo (UCL: University College London)
Abstract
We present, to our knowledge, the first quantitative analysis of functional site diversity in homologous domain superfamilies. Different types of functional sites are considered separately. Our results show that most diverse superfamilies are very plastic in terms of the spatial location of their functional sites. This is especially true for protein–protein interfaces. In contrast, we confirm that catalytic sites typically occupy only a very small number of topological locations. Small-ligand binding sites are more diverse than expected, although in a more limited manner than protein–protein interfaces. In spite of the observed diversity, our results also confirm the previously reported preferential location of functional sites. We identify a subset of homologous domain superfamilies where diversity is particularly extreme, and discuss possible reasons for such plasticity, i.e. structural diversity. Our results do not contradict previous reports of preferential co-location of sites among homologues, but rather point to the importance of not ignoring other sites, especially in large and diverse superfamilies. Data on sites exploited by different relatives, within each well-annotated domain superfamily, have been made accessible from the CATH website in order to highlight versatile superfamilies or superfamilies with highly preferential sites. This information is valuable for systems biology, and knowledge of any constraints on protein interactions could help in understanding the dynamic control of networks in which these proteins participate.
The novelty of our work lies in the comprehensive nature of the analysis – we have used a significantly larger dataset than previous studies – and the fact that in many superfamilies we show that different parts of the domain surface are exploited by different relatives for ligand/protein interactions, particularly in superfamilies which are diverse in sequence and structure, an observation not previously reported on such a large scale. This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly.