The amount of data that organizations must store, organize and manage is very large and grows every day, a fact well known to companies such as Facebook, Google or SAS. At the current growth rate, technologies must adapt to the amount of available data, and a new approach to information processing is required. Big Data technologies are receiving increasing attention, which is one reason for the wider adoption of NoSQL database models. The purpose of this article is to validate existing migration methods that are already in use and to adapt them, in order to identify the most efficient method for migrating a relational database to a NoSQL database. We present the methodology, the steps followed in the implementation, and the configuration of the environment used during the tests. The results show that the most efficient method for this migration process is the one referred to as automatic offline migration. However, it requires a longer window of unavailability than online migration, which in turn requires more operating system resources to carry out the migration. Therefore, the most efficient method for migrating a database depends on the availability requirements of the application and on the computational resources available for the migration. We hope to make an important contribution by helping in the choice of a migration method and by identifying the metrics that can be collected to better evaluate the performance of a migration.