dc.description.abstract |
Protein interaction networks are ubiquitous in the functioning of organisms. Inspired by the work of Leskovec et al. on changes in the resilience of such networks, we examine how quantitative characteristics of protein interaction networks change over the evolutionary scale. We find that the spectrum of the network Laplacian has similar features for similar species, and that this correlation can be used to infer the biological genus of a species from its protein interaction network alone. We then generate a clustering of species using a metric for comparing different networks. We are currently investigating how much the resulting tree differs from the tree of life generated using sequence data. The thesis is organized as follows. In Chapter 1, we introduce protein interaction networks and discuss why their study is important. We then give the motivation for our study, describing the work of Leskovec et al. on network resilience and how it inspired our work. Finally, we briefly describe the aim of our study. Chapter 2 covers the necessary background theory. We discuss three broad topics: the study of networks, statistical methods for working with datasets, and phylogenetic trees. In this chapter, we also develop our problem in detail and discuss the ideas we used to study it. In Chapter 3, we discuss some existing results that we reproduce, in particular the calculation of the spectral entropy of some synthetic networks and of the divergence between real data. We then compute the spectral entropy for our data, discuss our exploration of the spectrum of the Laplacian, and finally construct a hierarchical clustering to assess whether our method can be extended to generate trees similar to the existing phylogenetic tree. In Chapter 4, as a conclusion, we summarize all the methods and results. 
We then discuss the limitations of our study and its potential. |
en_US |