Efficiency and Agility for a Modern Solution of Deterministic Multiple Source Prioritization and Validation Tasks

Open access

Abstract

This article focuses on a multiple source prioritization and validation service. We describe a modern rule-based, loosely coupled solution whose application design follows the principles of generalization, efficiency and agility. We show the benefits and stumbling blocks of the micro-service architectural style and of rule-based solutions, in which even the selection task is solved through selection rules that encapsulate the calls to Entity Services and thereby give access to the input sources. We examine the efficiency of the rule-based service, as well as local and remote input data selection scenarios for the validation Statistical Service. In particular, data virtualization technologies enable architects to use remote sourcing and further increase agility in data selection. Through a wide range of experimental results, we show how much attention process implementation, data architectures and resource usage require. Agility and efficiency emerge as drivers that may sustain the flexibility impetus of Modernization: flexible services can potentially serve multiple scenarios and domains.

Journal of Official Statistics

The Journal of Statistics Sweden
