A Dual Sampling Algorithm of RNA Sequences with Hamming Distance Filtration
- 2017-12-11 16:00-15:00
- Christian Reidys
Professor, Virginia Tech University
Motivation: Recently, a framework that treats RNA sequences and their secondary structures as pairs has led to information-theoretic perspectives on how the semantics encoded in RNA sequences can be inferred. In this context, the pairing arises naturally from the energy model of RNA secondary structures. Fixing the sequence in the pairing yields the RNA energy landscape, whose partition function was introduced by McCaskill. Dually, fixing the structure induces the energy landscape of sequences. The latter was originally considered for designing more efficient inverse folding algorithms and was subsequently enhanced by facilitating the sampling of sequences.
Results: We present a Hamming-filtered dual partition function, together with a Boltzmann sampler that uses novel dynamic programming routines for the loop-based energy model. The time complexity of the algorithm is O(h^2 n), where h and n are the Hamming distance and the sequence length, respectively; this reduces the time complexity of samplers reported in the literature by a factor of O(n^2). We then present two applications in the context of the evolution of natural sequence-structure pairs of microRNAs. The first is the inverse fold rate (IFR) of sequence-structure pairs, filtered by Hamming distance; we observe that such pairs evolve towards higher levels of robustness, i.e., increasing IFR. The second is the construction of neutral paths: given two sequences in a neutral network, we employ our sampler to construct short paths connecting them that consist entirely of sequences contained in the neutral network.
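To make the idea of a Hamming-filtered partition function with a Boltzmann sampler concrete, the following is a minimal toy sketch, not the algorithm of the talk. It assumes a hypothetical position-independent energy function `energy(i, c)` in place of the loop-based energy model, so it ignores base pairing entirely and runs in O(h n) time rather than O(h^2 n). All names (`filtered_partition`, `sample`, `energy`, `kT`) are illustrative assumptions.

```python
import math
import random

ALPHABET = "ACGU"


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def filtered_partition(ref, energy, h, kT=1.0):
    """Toy Hamming-filtered partition function.

    Z[i][d] is the total Boltzmann weight of all prefixes of length i
    that differ from ref[:i] in exactly d positions. `energy(i, c)` is a
    hypothetical per-position energy (an assumption of this sketch).
    """
    n = len(ref)
    Z = [[0.0] * (h + 1) for _ in range(n + 1)]
    Z[0][0] = 1.0
    for i in range(1, n + 1):
        for d in range(h + 1):
            total = 0.0
            for c in ALPHABET:
                prev = d - (c != ref[i - 1])  # distance left for the prefix
                if prev >= 0:
                    total += math.exp(-energy(i - 1, c) / kT) * Z[i - 1][prev]
            Z[i][d] = total
    return Z


def sample(ref, energy, h, kT=1.0, rng=random):
    """Draw one sequence at Hamming distance exactly h from ref
    by stochastic traceback through the table Z."""
    Z = filtered_partition(ref, energy, h, kT)
    n = len(ref)
    if Z[n][h] == 0.0:
        raise ValueError("no sequence at the requested Hamming distance")
    seq, d = [], h
    for i in range(n, 0, -1):
        choices, weights = [], []
        for c in ALPHABET:
            prev = d - (c != ref[i - 1])
            if prev >= 0:
                choices.append(c)
                weights.append(math.exp(-energy(i - 1, c) / kT) * Z[i - 1][prev])
        c = rng.choices(choices, weights=weights)[0]
        seq.append(c)
        d -= (c != ref[i - 1])
    return "".join(reversed(seq))
```

With a flat energy (`lambda i, c: 0.0`), `Z[n][h]` simply counts the sequences at distance h from the reference, and `sample` draws uniformly among them. An IFR-style estimate could then be obtained by folding many such samples with an external folding routine and counting how often the target structure is recovered; that step is not shown here.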