Dr. Linda Dib

Co-evolution in genes dictated by functional and structural constraints.

Co-evolving networks of residues have been shown to mediate allosteric communication in proteins involved in cellular signaling, the process by which signals originating at one site in a protein propagate reliably to affect distant functional sites. The general principles of protein structure that underlie this process remain unknown. In a seminal paper, Ranganathan described a sequence-based statistical method for quantitatively mapping the global network of amino acid interactions in a protein.

To investigate Ranganathan's approach further, Baussand developed a new method, based on a fine combinatorial analysis of a protein family, to reconstruct networks of co-evolved residues from sequence analysis. The approach was used to detect motifs of co-evolved residues, which will in turn be used to detect distantly related protein pairs. Our new aim is first to understand the differences between the statistical methodologies and the combinatorial one (A. Carbone and L. Dib, Co-evolution and information signals in biological sequences, Theoretical Computer Science, 2011). Second, we defined blocks of residues that co-evolve (L. Dib and A. Carbone, Co-evolution of fragments in biological sequences, PLoS ONE, 2012), and finally we clustered those blocks to extract networks of co-evolving blocks (L. Dib and A. Carbone, CLAG: a non supervised non hierarchical clustering algorithm, BMC Bioinformatics, 2012).

Co-evolving blocks and positions are the result of a long process of evolution. We propose a Markov model, Coev, that describes the evolutionary process of coevolving positions along DNA sequences (L. Dib, D. Silvestro, N. Salamin, Co-evolving profiles in genes, submitted, 2013). Because correlated positions within nucleotide and protein sequences are the result of an evolutionary process, a better understanding of the coevolving profile should be obtained by incorporating the underlying evolutionary history of the sequences, i.e. their phylogenetic relationships and the associated profile (Dutheil et al. 2010). We thus developed a dependent model of nucleotide substitutions based on a 16 × 16 instantaneous rate matrix (one state per ordered pair of nucleotides, 4 × 4 = 16) that includes four substitution rates (s, d, w1, w2) and a fifth parameter representing the profile of coevolution. Given the aligned sequences and a phylogenetic tree, the Coev model can predict coevolving positions and estimate the profile associated with a pair of coevolving positions by exhaustively iterating over the 192 possible profiles defined for nucleotide sequences (for example: {AA, CC}, {AA, CC, GG}, {AA, CC, GG, TT}, ...). The model is implemented in both a maximum likelihood and a Bayesian framework.
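The count of 192 profiles can be reproduced with a short enumeration. The sketch below is illustrative only and is not taken from the Coev implementation: it assumes that a profile is a set of two to four nucleotide pairs in which no nucleotide is reused at the first position and none is reused at the second position, which is consistent with the examples given above. The function name enumerate_profiles is hypothetical.

```python
from itertools import combinations, permutations

NUCLEOTIDES = "ACGT"


def enumerate_profiles():
    """List candidate coevolution profiles for a pair of nucleotide positions.

    Assumption (not stated in the text above): a profile is a set of 2 to 4
    nucleotide pairs such that each nucleotide occurs at most once at
    position 1 and at most once at position 2.
    """
    profiles = []
    for size in range(2, 5):
        # Choose which nucleotides appear at position 1 (order irrelevant).
        for first in combinations(NUCLEOTIDES, size):
            # Assign a distinct nucleotide at position 2 to each of them.
            for second in permutations(NUCLEOTIDES, size):
                pairs = frozenset(a + b for a, b in zip(first, second))
                profiles.append(pairs)
    return profiles


if __name__ == "__main__":
    all_profiles = enumerate_profiles()
    by_size = {k: sum(1 for p in all_profiles if len(p) == k) for k in (2, 3, 4)}
    print(len(all_profiles), by_size)
```

Under this assumption there are 72 profiles of size two, 96 of size three and 24 of size four, which sums to the 192 profiles mentioned above. An exhaustive search of the kind described would fit the model once per candidate profile and keep the best-scoring one.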
