Modelling DNA and RNA secondary structures using matrix insertion–deletion systems

Open access

Abstract

Insertion and deletion are operations that occur commonly in DNA processing and RNA editing. Since biological macromolecules can be viewed as symbols, gene sequences can be represented as strings and their structures can be interpreted as languages. This suggests that the bio-molecular structures occurring at different levels can be studied theoretically using formal language theory. In the literature, no single grammar formalism captures the various bio-molecular structures. To overcome this deficiency, in this paper we introduce a simple grammar model called the matrix insertion–deletion system and use it to model several bio-molecular structures that occur at the intramolecular, intermolecular and RNA secondary levels.
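
As a rough illustration of the kind of rewriting involved, the following Python sketch applies context-sensitive insertion and deletion rules grouped into a matrix, where every rule of the matrix must be applied in order. It follows the usual textbook reading of such rules (a triple of left context, inserted or deleted string, right context) and is not taken from the paper itself; the names `Rule`, `apply_rule` and `apply_matrix`, and the toy rules in the usage note, are assumptions made for this example.

```python
from typing import List, Set, Tuple

# Hypothetical encoding (not from the paper): (left context, string, right context, kind),
# where kind is "ins" for insertion or "del" for deletion.
Rule = Tuple[str, str, str, str]


def apply_rule(word: str, rule: Rule) -> Set[str]:
    """Return every string obtainable by applying the rule once to word."""
    u, x, v, kind = rule
    results: Set[str] = set()
    if kind == "ins":
        # Insert x at each position where u is immediately followed by v.
        context = u + v
        for i in range(len(word) - len(context) + 1):
            if word[i:i + len(context)] == context:
                cut = i + len(u)
                results.add(word[:cut] + x + word[cut:])
    elif kind == "del":
        # Delete x at each position where it occurs between u and v.
        target = u + x + v
        for i in range(len(word) - len(target) + 1):
            if word[i:i + len(target)] == target:
                start = i + len(u)
                results.add(word[:start] + word[start + len(x):])
    else:
        raise ValueError("kind must be 'ins' or 'del'")
    return results


def apply_matrix(word: str, matrix: List[Rule]) -> Set[str]:
    """Apply all rules of a matrix in sequence; an empty result means the matrix is not applicable."""
    current = {word}
    for rule in matrix:
        current = {w2 for w in current for w2 in apply_rule(w, rule)}
    return current


# Toy matrix: insert "a" between "g" and "c", then insert "u" between "a" and "c".
toy_matrix = [("g", "a", "c", "ins"), ("a", "u", "c", "ins")]
print(apply_matrix("gc", toy_matrix))
```

Grouping the two insertions into one matrix means that neither can be applied without the other, which is the sort of coupling one would want when two positions of a sequence depend on each other. The toy rules here are purely illustrative and do not correspond to any specific structure modelled in the paper.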


