In this review, I have presented several topics relevant to the present and future state of the scientific field that I propose to call sequence biology (SB). Some pertinent publications have called this field DNA linguistics. At the heart of SB lies the concept of a sequence code. I discussed three concepts: SB itself, an encyclopaedia of genetic codes, and corpus DNA linguistics.