An Approach to Parallelization of Remote Data Integration Tasks
Data integration from autonomous, remote data sources is complicated by the heterogeneity of the data sources and by the lack of both methodological support and appropriate data integration systems. To address this problem, the On-demand Remote Data Integration Architecture (ORDIA) is defined, which eases maintenance and allows data integration time to be minimized. The key part of this architecture is an algorithm for parallelizing data integration tasks. The algorithm is described in detail, and its performance is evaluated by experimental comparison with other data integration solutions.