Word pattern prediction using Big Data frameworks

[1] G. Erin. Processing time of TFIDF and Naive Bayes on Spark 2.0, Hadoop 2.6 and Hadoop 2.7: Which Tool Is More Efficient?, MSc Thesis, National College of Ireland, Dublin, 2016.

[2] K. Rattanaopas, S. Kaewkeeree. Improving Hadoop MapReduce performance with data compression: A study using wordcount job, 2017 14th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON). IEEE, 2017, pp. 564-567. DOI: 10.1109/ECTICon.2017.8096300

[3] K. M. Lee, C. S. Han, K. I. Kim, S. H. Lee, Word recommendation for English composition using big corpus data processing, Cluster Computing, (2019), pp. 1911-1924.

[4] M. Kontagora, H. Gonzalez-Velez, Benchmarking a MapReduce Environment on a Full Virtualisation Platform, The 4th International Conference on Complex, Intelligent and Software Intensive Systems, pp. 433-438. DOI: 10.1109/CISIS.2010.45

[5] M. Bartík, S. Ubik, P. Kubalík. LZ4 compression algorithm on FPGA, 2015 IEEE International Conference on Electronics, Circuits, and Systems (ICECS). IEEE, 2015. DOI: 10.1109/ICECS.2015.7440278

[6] R. Y. Rubinstein, D. P. Kroese, Simulation and the Monte Carlo method. Vol. 10. John Wiley & Sons, 2016. DOI: 10.1002/9781118631980

[7] R. Lenhardt, J. Alakuijala, Gipfeli: high speed compression algorithm. 2012 Data Compression Conference. IEEE, 2012, pp. 109-118. DOI: 10.1109/DCC.2012.19

[8] H. Karloff, S. Suri, S. Vassilvitskii, A model of computation for MapReduce. Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 2010. DOI: 10.1137/1.9781611973075.76

[9] Apache Hadoop, Apache, https://hadoop.apache.org/

[10] Apache Spark, Apache, https://spark.apache.org/

[11] E. Brill, A simple rule-based part of speech tagger, Proceedings of the third conference on Applied natural language processing. Association for Computational Linguistics, 1992. DOI: 10.3115/974499.974526

[12] Apache YARN, Apache, https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html

[13] Apache HDFS docs, https://hadoop.apache.org/docs/r1.2.1/

[14] Hadoop Native Library, https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html

[15] Project repository, https://gitlab.com/thelfter/word-prediction

[16] Spark SQL, https://spark.apache.org/docs/latest/sql-programming-guide.html

[17] Stanford part-of-speech tagger, https://nlp.stanford.edu/software/tagger.html

[18] Wikipedia dumps, https://dumps.wikimedia.org/

eISSN: 2066-7760
Language: English

Publication frequency: 2 times per year
Journal subjects: Computer Sciences, other

Published: 16 Jul 2020

Pages: 51-69

Received: 31 Jan 2020

Accepted: 28 Feb 2020

DOI: https://doi.org/10.2478/ausi-2020-0004

Keywords: word-pattern, word-prediction, big data, hadoop, spark, nlp, map reduce, snappy, lz4, data compression

© 2020 Bence Szabari et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.