Open Access

Application of Statistical Methods in the Analysis of Sentence Structure



The goal of this research is to explore sentence structures expressed in terms of parts of speech. Because only a small amount of data was available, a sparse-data problem arose. It was solved by recording the annotated sentences and considering only a “framework” of each sentence, made up of verbs and nouns, which is conditionally called a code. The code of a sentence is created by replacing each word of the sentence with a symbol (a letter or a number) that encodes one or another property of that word as a constituent of the sentence. Zipf’s law describes sentences encoded in this way rather well. If we ‘learn’ to identify and analyze (annotate, translate, etc.) sentences of the simplest structure well, we can automatically process quite a large share of the sentences in a text. At least 17% of sentences can be identified as having the simplest structure.
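
The encoding and the Zipf check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tag names, the symbol map `SYMBOLS`, and the helper names `sentence_code`, `rank_frequency`, and `zipf_slope` are all assumptions made for illustration, and the abstract does not state which symbols or fitting procedure were actually used.

```python
import math
from collections import Counter

# Hypothetical symbol map: the abstract only says that the "framework"
# is made up of verbs and nouns; the letters chosen here are assumptions.
SYMBOLS = {"NOUN": "N", "VERB": "V"}


def sentence_code(tags, symbols=SYMBOLS):
    """Replace each retained part-of-speech tag with its symbol.

    Tags that are not in `symbols` are dropped, so only the
    noun/verb "framework" of the sentence remains.
    """
    return "".join(symbols[tag] for tag in tags if tag in symbols)


def rank_frequency(codes):
    """Return (rank, code, frequency) triples, most frequent code first."""
    counts = Counter(codes)
    return [
        (rank, code, freq)
        for rank, (code, freq) in enumerate(counts.most_common(), start=1)
    ]


def zipf_slope(table):
    """Least-squares slope of log(frequency) against log(rank).

    A slope close to -1 is what a Zipf-like distribution would give.
    Requires at least two distinct codes in the table.
    """
    xs = [math.log(rank) for rank, _, _ in table]
    ys = [math.log(freq) for _, _, freq in table]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Invented toy input: each sentence is a list of part-of-speech tags.
tagged_sentences = [
    ["DET", "NOUN", "VERB"],
    ["NOUN", "VERB", "ADV"],
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
]
codes = [sentence_code(tags) for tags in tagged_sentences]
table = rank_frequency(codes)
slope = zipf_slope(table)
```

On a real annotated corpus, the share of sentences carrying the most frequent codes would give a figure comparable to the percentage reported above; the toy input here is far too small to say anything about the actual distribution.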

eISSN:
2256-0939
Language:
English
Publication timeframe:
2 times per year
Journal Subjects:
Life Sciences, Biotechnology, Plant Science, Ecology