Partitioning a system of equations is an important step when solving it on a parallel computer. This paper presents criteria that lead to more efficient parallelization and that should be taken into consideration when partitioning. New criteria are proposed that extend a preconditioning process based on reducing the average bandwidth. These criteria combine the preconditioning and the partitioning of the system of equations, so two distinct algorithms or processes are no longer needed. In the proposed methods, where preconditioning is done by reducing the average bandwidth, two directions were followed with respect to partitioning: the first determines the best partitioning (or one close to it) for a given preconditioned system; the second achieves an adequate preconditioning for a given or desired partitioning. A mixed method is also proposed. Experimental results, conclusions, and recommendations are also presented, obtained from a parallel implementation of the conjugate gradient method on an IBM Blue Gene/P supercomputer, based on a synchronous model of parallelization.