Please use this identifier to cite or link to this item: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7612
Title: Computational prediction of SUMOylation sites in proteins
Authors: MADHUSUDHAN, M.S.
RAMTIRTHA, YOGENDRA
Dept. of Biology
20143311
Keywords: computational biology and bioinformatics
Research Subject Categories::NATURAL SCIENCES::Biology::Cell and molecular biology::Molecular biology
Issue Date: Aug-2022
Citation: 101
Abstract: SUMOylation is a post-translational modification in which a covalent bond forms between lysine residues in substrate proteins and SUMO. SUMO (Small Ubiquitin-related MOdifier) is a protein that belongs to the ubiquitin-like family of proteins. Mutations of SUMOylated lysines have been linked to neurodegenerative diseases and cancer. Experimental determination of SUMOylated lysines is difficult and expensive. Hence, computational methods for SUMOylation site prediction are important. This thesis presents two computational approaches for predicting modified lysines. The first method uses protein sequence information to predict SUMOylated lysines, combining information from mass spectrometry-based proteomics experiments with evolutionary conservation. This method helped predict SUMOylation sites in > 8600 proteins encoded by > 4600 genes from the fruit fly Drosophila melanogaster. Analysis done in this study revealed that ψ-K-x-E/D (where ψ = I / L / V, K = SUMOylated lysine and x = any amino acid) is a commonly occurring consensus motif involving SUMOylated lysines. Protein targets of SUMOylation are involved in many cellular activities such as transcription regulation, cell division, DNA repair and signal transduction. Proteins localizing to the nucleus have a higher tendency to be SUMOylated than their cytoplasmic counterparts. Our homology-based method is capable of predicting SUMOylated lysines that are missed by existing popular prediction tools. The consensus motif ψ-K-x-E/D accounts for only half of all the known SUMOylated lysines. Many modified lysines cannot be accounted for by sequence motifs alone. Hence, we believe that candidate lysines are chosen by Ubc9 (the SUMO E2 conjugating enzyme) after it interacts with target / substrate proteins. Therefore, studying 3-D structures of Ubc9-target complexes is very important.
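As an illustration of the consensus motif described above, the following minimal sketch scans a protein sequence for ψ-K-x-E/D and reports the position of the lysine in each match. It is not the thesis's prediction method, which also draws on proteomics data and evolutionary conservation. The function name `find_consensus_lysines` and the example sequence are hypothetical, and "x = any amino acid" is approximated here as any uppercase letter.

```python
import re

# psi-K-x-E/D, with psi = I / L / V. The lookahead lets overlapping
# motif instances be reported. Assumption: "x" is matched loosely as any
# uppercase letter rather than a strict amino-acid alphabet.
CONSENSUS_MOTIF = re.compile(r"(?=[ILV]K[A-Z][ED])")


def find_consensus_lysines(sequence):
    """Return 1-based positions of lysines that sit inside a psi-K-x-E/D match."""
    seq = sequence.upper()
    # m.start() is the 0-based index of psi; the lysine is the next residue,
    # so its 1-based position is m.start() + 2.
    return [m.start() + 2 for m in CONSENSUS_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence, for illustration only.
    print(find_consensus_lysines("LKDEIKQD"))
```

As the abstract notes, a motif scan of this kind would be expected to miss roughly half of the known SUMOylated lysines, which is why it serves only as a baseline.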
The second method proposed in this thesis uses protein 3-D structures to predict SUMOylated lysines. Presented here is a proof-of-concept study of a novel structure-based SUMOylation site prediction tool. The dataset used in this study consists of 1841 human proteins with > 7400 SUMOylated lysines. At present, the Protein Data Bank has the 3-D structure of only one Ubc9-target complex. A specialized docking tool was developed to dock lysine residues into the active site of Ubc9 using the 3-D structures of both proteins. This tool is referred to as the “sampling method”. The sampling method was applied to every lysine of all proteins in the dataset. A scoring method was designed to distinguish between conformational poses of SUMOylated and non-SUMOylated lysines. The scoring method uses residue contacts between Ubc9 and target proteins at the interface between the two proteins. Our method achieved a sensitivity of 25%, a specificity of 98%, an accuracy of 81% and a Matthews correlation coefficient of 0.4.
URI: http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/7612
Appears in Collections: PhD THESES

Files in This Item:
File: 20143311_YOGENDRA_RAMTIRTHA.pdf
Description: Ph.D Thesis
Size: 1.45 MB
Format: Adobe PDF

