Code Description

A. General description:

AMG2013 is a parallel algebraic multigrid solver for linear systems arising from
problems on unstructured grids.

See the following papers for details on the algorithm and its parallel
implementation/performance:

Van Emden Henson and Ulrike Meier Yang, "BoomerAMG: A Parallel Algebraic
Multigrid Solver and Preconditioner", Appl. Num. Math. 41 (2002),
pp. 155-177. Also available as LLNL technical report UCRL-JC-141495.

Hans De Sterck, Ulrike Meier Yang and Jeffrey Heys, "Reducing Complexity in
Parallel Algebraic Multigrid Preconditioners", SIAM Journal on Matrix Analysis
and Applications 27 (2006), pp. 1019-1039. Also available as LLNL technical
report UCRL-JRNL-206780.

Hans De Sterck, Robert D. Falgout, Josh W. Nolting and Ulrike Meier Yang,
"Distance-Two Interpolation for Parallel Algebraic Multigrid", Numerical
Linear Algebra with Applications 15 (2008), pp. 115-139. Also available as
LLNL technical report UCRL-JRNL-230844.

U. M. Yang, "On Long Range Interpolation Operators for Aggressive Coarsening",
Numer. Linear Algebra Appl., 17 (2010), pp. 453-472. LLNL-JRNL-417371.

A. H. Baker, R. D. Falgout, T. V. Kolev, and U. M. Yang, "Multigrid Smoothers
for Ultraparallel Computing", SIAM J. Sci. Comput., 33 (2011), pp. 2864-2887.
LLNL-JRNL-473191.

The driver provided with AMG2013 builds linear systems for various
3-dimensional problems, which are described in Section D.

To determine when the solver has converged, the driver uses the
relative-residual stopping criterion

 ||r_k||_2 / ||b||_2 < tol

with tol = 10^-6.

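As a minimal sketch (not the actual hypre/AMG2013 routine, whose test lives
inside the Krylov solvers in the 'krylov' directory), the criterion amounts to
comparing the 2-norm of the current residual r_k against the 2-norm of the
right-hand side b:

   /* Hypothetical sketch of the stopping test ||r_k||_2 / ||b||_2 < tol. */
   #include <math.h>
   #include <stddef.h>

   static double norm2(const double *v, size_t n)
   {
      double s = 0.0;
      for (size_t i = 0; i < n; i++)
         s += v[i] * v[i];
      return sqrt(s);
   }

   static int converged(const double *r, const double *b, size_t n)
   {
      const double tol = 1.0e-6;
      double bnorm = norm2(b, n);
      if (bnorm == 0.0)            /* guard against a zero right-hand side */
         bnorm = 1.0;
      return norm2(r, n) / bnorm < tol;
   }
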
B. Coding:

AMG2013 is written in ISO-C. It is an SPMD code which uses MPI. Parallelism is
achieved by data decomposition. The driver provided with AMG2013 achieves this
decomposition by simply subdividing the grid into logical P x Q x R (in 3D)
chunks of equal size.

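For illustration only (this is not the driver's code; the rank-to-position
mapping and the example sizes below are assumptions), splitting a global
nx x ny x nz grid into equal chunks over a logical P x Q x R process topology
might look as follows:

   #include <stdio.h>

   /* Hypothetical sketch: place each MPI rank in a P x Q x R topology and
      compute the equal-size chunk of the global grid it would own.
      Assumes nx, ny, nz are exact multiples of P, Q, R. */
   int main(void)
   {
      int P = 2, Q = 2, R = 2;         /* logical process topology (example) */
      int nx = 80, ny = 80, nz = 80;   /* global grid size (example)         */
      int lx = nx / P, ly = ny / Q, lz = nz / R;

      for (int rank = 0; rank < P * Q * R; rank++) {
         int p = rank % P;
         int q = (rank / P) % Q;
         int r = rank / (P * Q);
         printf("rank %2d owns [%3d,%3d) x [%3d,%3d) x [%3d,%3d)\n", rank,
                p * lx, (p + 1) * lx,
                q * ly, (q + 1) * ly,
                r * lz, (r + 1) * lz);
      }
      return 0;
   }
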
C. Parallelism:

AMG2013 is a highly synchronous code. The communication and computation
patterns exhibit the surface-to-volume relationship common to many parallel
scientific codes. Hence, parallel efficiency is largely determined by the size
of the data "chunks" mentioned above, and the speed of communication and
computation on the machine. AMG2013 is also memory-access bound, doing only
about 1-2 computations per memory access, so memory-access speed will also have
a large impact on performance.

D. Test problems

Problem 1 (default): The default problem is a Laplace-type problem on an
unstructured domain with an anisotropy in one part. A 2-dimensional projection
of the grid with the corresponding 2-dimensional stencils is illustrated in the
file 'mg_grid_labels.pdf'. The problem is made 3-dimensional by extending the
domain uniformly in the z-direction. The default problem size is 384 unknowns,
but this is easily refined on the amg2013 command line (see "Running the Code"
for details). Suggestions for test runs are given in the section "Suggested
Test Runs".

Problem 2 (-laplace): Solves

 - cx u_xx - cy u_yy - cz u_zz = (1/h)^2

with Dirichlet boundary conditions of u = 0, where h is the mesh spacing in each
direction on the unit cube. Standard finite differences are used to discretize
the equations, yielding 7-pt stencils in 3D. This problem can also be used to
generate 2D or 1D problems by setting the length in one or two of the directions
(<nx>, <ny> or <nz>) to 1.

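As a minimal sketch of the 7-pt discretization mentioned above (not the routine
used by the code), multiplying the equation by h^2 gives a stencil with a unit
right-hand side at every interior grid point:

   /* Hypothetical sketch of the interior 7-point stencil obtained from
      central differences for  - cx u_xx - cy u_yy - cz u_zz = (1/h)^2,
      after scaling the whole equation by h^2 (so the rhs becomes 1). */
   void stencil_7pt(double cx, double cy, double cz, double s[7])
   {
      s[0] = 2.0 * (cx + cy + cz);   /* center      (i,   j,   k  )           */
      s[1] = -cx;  s[2] = -cx;       /* x-neighbors (i-1, j, k), (i+1, j, k)  */
      s[3] = -cy;  s[4] = -cy;       /* y-neighbors (i, j-1, k), (i, j+1, k)  */
      s[5] = -cz;  s[6] = -cz;       /* z-neighbors (i, j, k-1), (i, j, k+1)  */
   }
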
Problem 3 (-27pt): Solves a Laplace-type problem using a 27-point stencil.

Problem 4 (-jumps): Solves the PDE

 - a(x,y,z)(u_xx + u_yy + u_zz) = (1/h)^2

with Dirichlet boundary conditions of u = 0 on the unit cube, and

 a(x,y,z) = 1000 on [0.1,0.9] x [0.1,0.9] x [0.1,0.9]
          = 0.01 on the 8 corner cubes of size 0.1 x 0.1 x 0.1
          = 1    elsewhere

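A small sketch of this piecewise-constant coefficient (illustrative only; the
function name and the treatment of region boundaries are assumptions, not code
from the distribution):

   /* Hypothetical sketch of the jump coefficient a(x,y,z) on the unit cube:
      1000 in the interior box [0.1,0.9]^3, 0.01 in the eight corner cubes of
      size 0.1 x 0.1 x 0.1, and 1 everywhere else. */
   double jump_coefficient(double x, double y, double z)
   {
      if (x >= 0.1 && x <= 0.9 && y >= 0.1 && y <= 0.9 &&
          z >= 0.1 && z <= 0.9)
         return 1000.0;
      if ((x <= 0.1 || x >= 0.9) && (y <= 0.1 || y >= 0.9) &&
          (z <= 0.1 || z >= 0.9))
         return 0.01;                /* one of the 8 corner cubes */
      return 1.0;
   }
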
%==========================================================================
%==========================================================================

Important Kernels in this Distribution

Listed here are the important files used by the linear solver, both
preconditioner and solver. Not included are files that take little time,
such as the wrappers (files starting with HYPRE_), files that are not used
during the suggested runs, and files that are only used to generate the
problems. A complete listing of all directories and files, together with a
short description of each directory, can be found in the next section.


In the 'krylov' directory:

pcg.c                     functions for the conjugate gradient algorithm
gmres.c                   functions for the GMRES algorithm

In the 'parcsr_ls' directory:

par_amg.c                 Setup phase of the AMG preconditioner
par_amg_setup.c           Setup phase of the AMG preconditioner
par_coarsen.c             various coarsening algorithms
par_strength.c            computes a strength matrix for coarsening and
                          interpolation
par_indepset.c            independent set function needed for coarsening
par_interp.c              interpolation algorithms for solvers 0 and 3
par_lr_interp.c           interpolation algorithms for solvers 1 and 4
aux_interp.c              auxiliary functions needed in par_lr_interp.c
par_multi_interp.c        interpolation algorithm for fine level
                          in solvers 1 and 4
par_rap.c                 generates coarse grid operator
par_rap_communication.c   sets up communication in par_rap.c
par_amg_solve.c           Solve phase of the AMG preconditioner
par_cycle.c               AMG cycle
par_relax.c               AMG smoothers

In the 'parcsr_mv' directory:

par_csr_communication.c   communication routines for global partitioning
new_commpkg.c             communication routines for assumed partitioning
par_csr_assumed_part.c    communication routines for assumed partitioning
par_csr_matrix.c          basic parallel matrix operations
par_csr_matvec.c          parallel matrix vector multiplication
par_csr_matop.c           additional parallel matrix operations
par_vector.c              basic vector operations

In the 'seq_mv' directory:

big_csr_matrix.c          basic sequential matrix operations
csr_matrix.c              basic sequential matrix operations
csr_matvec.c              sequential matrix vector multiplications
csr_matop.c               additional sequential matrix operations
vector.c                  basic sequential vector operations

%==========================================================================
%==========================================================================

Files in this Distribution

NOTE: The AMG2013 code is derived directly from the hypre library, a large
linear solver library that is being developed in the Center for Applied
Scientific Computing (CASC) at LLNL.

In the amg2013 directory the following files are included:

COPYING_LESSER
COPYRIGHT
HYPRE.h
Makefile
Makefile.include

The following subdirectories are also included:

docs          Documentation
IJ_mv         Linear algebraic interface routines
krylov        Krylov solvers, such as PCG and GMRES
parcsr_ls     routines needed to generate solvers and preconditioners
              as well as Problems 2-4
parcsr_mv     parallel matrix and vector routines
              (ParCSR data structure)
seq_mv        sequential matrix and vector routines
sstruct_mv    semistructured matrix and vector routines - included
              to generate Problem 1
struct_mv     structured matrix and vector routines - included to
              generate Problem 1
test          driver and input file for Problem 1
utilities     functions for memory allocation, timing, error codes,
              sorting, searching, etc.

In the 'docs' directory the following files are included:

amg2013.readme
mg_grid_labels.pdf

In the 'IJ_mv' directory the following files are included:

aux_parcsr_matrix.c
aux_parcsr_matrix.h
aux_par_vector.c
aux_par_vector.h
headers.h
HYPRE_IJMatrix.c
HYPRE_IJ_mv.h
HYPRE_IJVector.c
IJMatrix.c
IJ_matrix.h
IJMatrix_parcsr.c
IJ_mv.h
IJVector.c
IJ_vector.h
IJVector_parcsr.c
Makefile

In the 'krylov' directory the following files are included:

all_krylov.h
gmres.c
gmres.h
HYPRE_gmres.c
HYPRE_MatvecFunctions.h
HYPRE_pcg.c
krylov.h
Makefile
pcg.c
pcg.h

In the 'parcsr_ls' directory the following files are included:

aux_interp.c
aux_interp.h
gen_redcs_mat.c
headers.h
HYPRE_parcsr_amg.c
HYPRE_parcsr_gmres.c
HYPRE_parcsr_ls.h
HYPRE_parcsr_pcg.c
Makefile
par_amg.c
par_amg.h
par_amg_setup.c
par_amg_solve.c
par_cg_relax_wt.c
par_coarsen.c
par_coarse_parms.c
parcsr_ls.h
par_cycle.c
par_difconv.c
par_indepset.c
par_interp.c
par_jacobi_interp.c
par_laplace_27pt.c
par_laplace.c
par_lr_interp.c
par_multi_interp.c
par_nodal_systems.c
par_rap.c
par_rap_communication.c
par_relax.c
par_relax_interface.c
par_relax_more.c
par_scaled_matnorm.c
par_stats.c
par_strength.c
partial.c
par_vardifconv.c
pcg_par.c

In the 'parcsr_mv' directory the following files are included:

headers.h
HYPRE_parcsr_matrix.c
HYPRE_parcsr_mv.h
HYPRE_parcsr_vector.c
Makefile
new_commpkg.c
new_commpkg.h
par_csr_assumed_part.c
par_csr_assumed_part.h
par_csr_communication.c
par_csr_communication.h
par_csr_matop.c
par_csr_matop_marked.c
par_csr_matrix.c
par_csr_matrix.h
par_csr_matvec.c
parcsr_mv.h
par_vector.c
par_vector.h

In the 'seq_mv' directory the following files are included:

big_csr_matrix.c
csr_matop.c
csr_matrix.c
csr_matrix.h
csr_matvec.c
genpart.c
headers.h
HYPRE_csr_matrix.c
HYPRE_seq_mv.h
HYPRE_vector.c
Makefile
seq_mv.h
vector.c
vector.h

In the 'sstruct_mv' directory the following files are included:

box_map.c
box_map.h
headers.h
HYPRE_sstruct_graph.c
HYPRE_sstruct_grid.c
HYPRE_sstruct_matrix.c
HYPRE_sstruct_mv.h
HYPRE_sstruct_stencil.c
HYPRE_sstruct_vector.c
Makefile
sstruct_axpy.c
sstruct_copy.c
sstruct_graph.c
sstruct_graph.h
sstruct_grid.c
sstruct_grid.h
sstruct_innerprod.c
sstruct_matrix.c
sstruct_matrix.h
sstruct_matvec.c
sstruct_mv.h
sstruct_overlap_innerprod.c
sstruct_scale.c
sstruct_stencil.c
sstruct_stencil.h
sstruct_vector.c
sstruct_vector.h

In the 'struct_mv' directory the following files are included:

assumed_part.c
assumed_part.h
box_algebra.c
box_alloc.c
box_boundary.c
box.c
box.h
box_manager.c
box_manager.h
box_neighbors.c
box_neighbors.h
box_pthreads.h
communication_info.c
computation.c
computation.h
grow.c
headers.h
HYPRE_struct_grid.c
HYPRE_struct_matrix.c
HYPRE_struct_mv.h
HYPRE_struct_stencil.c
HYPRE_struct_vector.c
Makefile
new_assemble.c
new_box_neighbors.c
project.c
struct_axpy.c
struct_communication.c
struct_communication.h
struct_copy.c
struct_grid.c
struct_grid.h
struct_innerprod.c
struct_io.c
struct_matrix.c
struct_matrix.h
struct_matrix_mask.c
struct_matvec.c
struct_mv.h
struct_overlap_innerprod.c
struct_scale.c
struct_stencil.c
struct_stencil.h
struct_vector.c
struct_vector.h

In the 'test' directory the following files are included:

amg2013.c
Makefile
sstruct.in.MG.FD

In the 'utilities' directory the following files are included:

amg_linklist.c
amg_linklist.h
binsearch.c
exchange_data.c
exchange_data.h
exchange_data.README
general.h
hypre_error.c
hypre_error.h
hypre_memory.c
hypre_memory.h
hypre_qsort.c
hypre_smp_forloop.h
HYPRE_utilities.h
Makefile
memory_dmalloc.c
mpistubs.c
mpistubs.h
qsplit.c
random.c
threading.c
threading.h
thread_mpistubs.c
thread_mpistubs.h
timer.c
timing.c
timing.h
umalloc_local.c
umalloc_local.h
utilities.h

%==========================================================================
%==========================================================================

Building the Code

AMG2013 uses a simple Makefile system for building the code. All compiler and
link options are set by modifying the file 'amg2013/Makefile.include'
appropriately. This file is then included in each of the following makefiles:

 krylov/Makefile
 IJ_mv/Makefile
 parcsr_ls/Makefile
 parcsr_mv/Makefile
 seq_mv/Makefile
 sstruct_mv/Makefile
 struct_mv/Makefile
 test/Makefile
 utilities/Makefile

To build the code, first modify the 'Makefile.include' file appropriately, then
type (in the amg2013 directory)

 make

Other available targets are

 make clean      (deletes .o files)
 make veryclean  (deletes .o files, libraries, and executables)

To configure the code (an illustrative 'Makefile.include' example follows
this list):

1 - to run with MPI only, add '-DTIMER_USE_MPI' to the 'INCLUDE_CFLAGS' line
    in the 'Makefile.include' file and use a valid MPI.
2 - to run OpenMP with MPI, also add the vendor-dependent compilation flag
    for OpenMP.
3 - to use the assumed partition (recommended for several thousand
    processors or more), add '-DHYPRE_NO_GLOBAL_PARTITION'.
4 - to be able to solve problems that are larger than 2^31-1,
    add '-DHYPRE_LONG_LONG'.

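For example, an MPI-only build that also uses the assumed partition and 64-bit
integers might contain a line like the following in 'Makefile.include' (the
optimization level and the placement of all flags on this one line are only an
illustration, not a shipped default):

 INCLUDE_CFLAGS = -O2 -DTIMER_USE_MPI -DHYPRE_NO_GLOBAL_PARTITION -DHYPRE_LONG_LONG

For a hybrid MPI/OpenMP build, '-DHYPRE_USING_OPENMP' and the vendor-dependent
OpenMP flag (for example, -fopenmp with GCC) would be added as well, as in the
run shown under "Expected Results" below.
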
%==========================================================================
%==========================================================================

Optimization and Improvement Challenges

This code is memory-access bound. We believe it would be very difficult to
obtain "good" cache reuse with an optimized version of the code.

%==========================================================================
%==========================================================================

Parallelism and Scalability Expectations

AMG2013 has been run on the following platforms:

 BG/Q   - up to more than 1,000,000 MPI processes
 BG/P   - up to 125,000 MPI processes
 Sierra - up to 13,824 MPI processes
 and more

Consider increasing both problem size and number of processors in tandem.
On scalable architectures, time-to-solution for AMG2013 will initially
increase, then level off at a modest number of processors, remaining
roughly constant for larger numbers of processors. Iteration counts will
also increase slightly for small- to modest-sized problems, then level off
at a roughly constant number for larger problem sizes.

For example, we get the following timing results (in seconds) for a 3D Laplace
problem with cx = cy = cz = 1.0, distributed on a logical P x Q x R processor
topology, with fixed local problem size per process given as 40 x 40 x 40:

 P x Q x R      procs     solver similar to solver 0
 ---------------------------------------------------------------
 16x16x16        4096        5.75
 20x20x20        8000        6.88
 32x32x32       32768        8.11
 44x44x44       91125       10.48
 50x50x50      125000       10.54

These results were obtained on BG/P using the assumed partition option
-DHYPRE_NO_GLOBAL_PARTITION and -DHYPRE_LONG_LONG.

%==========================================================================
%==========================================================================

Running the Code

The driver for AMG2013 is called `amg2013', and is located in the amg2013/test
subdirectory. Type

 mpirun -np 1 amg2013 -help

to get usage information. This prints out the following:

Usage: amg2013 [<options>]

 -in <filename>     : input file (default is `sstruct.in.AMG.FD')

 -P <Px> <Py> <Pz>  : define processor topology per part
                      Note that for test problem 1, which has 8 parts,
                      this leads to 8*Px*Py*Pz MPI processes!
                      For all other test problems, the total number of
                      MPI processes is Px*Py*Pz.

 -pooldist <p>      : pool distribution to use

 -r <rx> <ry> <rz>  : refine part(s) for default problem
 -b <bx> <by> <bz>  : refine and block part(s) for default problem

 -n <nx> <ny> <nz>  : define size per processor for problems on cube
 -c <cx> <cy> <cz>  : define anisotropies for Laplace problem

 -laplace           : 3D Laplace problem on a cube
 -27pt              : Problem with 27-point stencil on a cube
 -jumps             : PDE with jumps on a cube

 -solver <ID>       : solver ID (default = 0)
                      0 - PCG with AMG precond
                      1 - PCG with diagonal scaling
                      2 - GMRES(10) with AMG precond
                      3 - GMRES(10) with diagonal scaling

 -printstats        : print out detailed info on AMG preconditioner

 -printsystem       : print out the system

 -rhsfromcosine     : solution is cosine function (default); can be used
                      for default problem only
 -rhsone            : rhs is vector with unit components

548compact application is the `-P' option. It specifies the MPI process topology
549on which to run.
550
551For the default problem, there are two possible pool distributions, which
552lead to different partitionings of the problem. Pool distribution 0 will
553give each process a portion of one of the 8 parts of the test problem, thus
554assigning disjoint subdomains to each process. Pool distribution 1 uses a
555more natural partitioning, assigning each process a subdomain in one of
556the 8 parts, and therefore requires the total number of processes to be a
557multiple of 8, i.e. it needs to be run as follows:
558mpirun -np <N> amg2013 -pooldist 1 -P <Px> <Py> <Pz> ...
559with <N> = 8*<Px>*<Py>*<Pz>.
560Both partitionings lead to a load balanced distribution of the original problem.
561The problem size per MPI process can be increased using the `-r' option,
562which defines the refinement factor for the grid on each process in each
563direction, or the '-b' option, which increases the number of blocks per process.
564
565
566For the other three problems (laplace, 27pt and jumps) the `-n' option allows
567one to specify the local problem size per MPI process, leading to a global
568problem size of <Px>*<nx> by <Py>*<ny> by <Pz>*<nz>.
569
%==========================================================================
%==========================================================================

Timing Issues

If using MPI, the whole code is timed using the MPI timers. If not using MPI,
standard system timers are used. Timing results are printed to standard out,
and are divided into "Setup Phase" times and "Solve Phase" times. Timings for a
few individual routines are also printed out.

%==========================================================================
%==========================================================================

Memory Needed

AMG2013's memory needs are somewhat complicated to describe. They depend
strongly on the type of problem solved and the AMG options used. In general,
solvers 1 and 4 will need less memory than solvers 0 and 3. When turning on
the '-printstats' option, operator complexities <oc> are displayed, which are
defined as the sum of the numbers of nonzeros of the original matrix and all
coarse grid matrices, divided by the number of nonzeros of the original
matrix; i.e., the original matrix and the coarse grid operators together need
about <oc> times as much space as the original matrix alone. However, this
does not include memory needed for interpolation operators, communication,
etc.

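As an illustration, the operator complexity reported for the sample run in the
"Expected Results" section below can be recomputed from the per-level nonzero
counts in that output:

 <oc> = (648936 + 159896 + 72864 + 21100 + 8354 + 171) / 648936
      = 911321 / 648936
      = 1.404331

so the matrix hierarchy alone occupies roughly 1.4 times the storage of the
fine-grid matrix.
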
%==========================================================================
%==========================================================================

About the Data

AMG2013 requires one input file to generate the default problem; this file is
located in the test directory. Apart from this, all control is on the command
line.

%==========================================================================
%==========================================================================

Expected Results

Consider the following run, where the code was compiled with the options
-DTIMER_USE_MPI and -DHYPRE_USING_OPENMP, linked with OpenMP, and run with
OMP_NUM_THREADS set to 2:

 mpirun -np 8 amg2013 -pooldist 1 -P 1 1 1 -r 4 4 4 -printstats

This is what AMG2013 prints out:

=============================================
SStruct Interface:
=============================================
SStruct Interface:
SStruct Interface wall clock time = 0.014211 seconds
SStruct Interface cpu clock time = 0.010000 seconds

Number of MPI processes: 8 , Number of OpenMP threads: 2

BoomerAMG SETUP PARAMETERS:

 Max levels = 25
 Num levels = 6

 Strength Threshold = 0.250000
 Interpolation Truncation Factor = 0.000000
 Maximum Row Sum Threshold for Dependency Weakening = 0.900000

 Coarsening Type = HMIS
 Hybrid Coarsening (switch to CLJP when coarsening slows)
 measures are determined locally

 no. of levels of aggressive coarsening: 1

 Interpolation = extended+i interpolation

Operator Matrix Information:

                  nonzero entries per row                    row sums
lev    rows   entries   sparse   min  max   avg        min          max
=========================================================================
 0    82944    648936    0.000     4    9   7.8   -4.274e-15   3.000e+02
 1     8985    159896    0.002     4   45  17.8   -2.069e-13   9.293e+02
 2     2763     72864    0.010     6  121  26.4   -2.487e-14   1.668e+03
 3     1001     21100    0.021     3  167  21.1    8.298e-02   3.147e+03
 4      320      8354    0.082     2   79  26.1    1.938e-01   1.098e+02
 5       21       171    0.388     4   12   8.1    5.854e+00   6.784e+00


Interpolation Matrix Information:

                      entries/row      min         max            row sums
lev   rows x cols      min  max      weight      weight        min         max
===============================================================================
 0   82944 x 8985        1   10    1.488e-02   9.980e-01    1.759e-01  1.000e+00
 1    8985 x 2763        1    4    8.769e-03   1.000e+00    1.624e-01  1.000e+00
 2    2763 x 1001        0    4    2.076e-03   1.000e+00    0.000e+00  1.000e+00
 3    1001 x  320        0    4   -4.281e-01   1.452e+00   -7.104e-03  1.000e+00
 4     320 x   21        0    4    2.627e-03   5.150e-02    0.000e+00  1.000e+00


 Complexity:     grid = 1.157817
             operator = 1.404331


BoomerAMG SOLVER PARAMETERS:

 Maximum number of cycles: 1
 Stopping Tolerance: 0.000000e+00
 Cycle type (1 = V, 2 = W, etc.): 1

 Relaxation Parameters:
   Visiting Grid:                     down   up  coarse
   Number of partial sweeps:            1    1      1
   Type 0=Jac, 3=hGS, 6=hSGS, 9=GE:      8    8      8
   Point types, partial sweeps (1=C, -1=F):
     Pre-CG relaxation (down):   0
     Post-CG relaxation (up):    0
     Coarsest grid:              0

=============================================
Setup phase times:
=============================================
PCG Setup:
PCG Setup wall clock time = 0.066036 seconds
PCG Setup cpu clock time = 0.090000 seconds

System Size / Setup Phase Time: 1.674723e+06

=============================================
Solve phase times:
=============================================
PCG Solve:
PCG Solve wall clock time = 0.103601 seconds
PCG Solve cpu clock time = 0.140000 seconds

AMG2013 Benchmark version 1.0
Iterations = 8
Final Relative Residual Norm = 6.945422e-07

System Size * Iterations / Solve Phase Time: 8.539842e+06

%==========================================================================
%==========================================================================

Suggested Test Runs

1. For the default problem:

 mpirun -np <8*px*py*pz> amg2013 -pooldist 1 -r 12 12 12 -P px py pz

This will generate a problem with 82,944 variables per MPI process, leading to
a total system size of 663,552*px*py*pz.

 mpirun -np <8*px*py*pz> amg2013 -pooldist 1 -r 24 24 24 -P px py pz

This will generate a problem with 663,552 variables per process, leading to
a total system size of 5,308,416*px*py*pz, and solve it using conjugate
gradient preconditioned with AMG. To use AMG-GMRES(10) instead, append
'-solver 2'.

The domain (for a 2-dimensional projection of the domain see
mg_grid_labels.pdf) can be scaled up by increasing the values of px, py
and pz.

2. For the 7pt 3D Laplace problem:

 mpirun -np <px*py*pz> amg2013 -laplace -n 40 40 40 -P px py pz

This will generate a problem with 64,000 grid points per MPI process,
with a domain of size 40*px x 40*py x 40*pz.

 mpirun -np <px*py*pz> amg2013 -laplace -n 80 80 80 -P px py pz

This will generate a problem with 512,000 grid points per MPI process,
with a domain of size 80*px x 80*py x 80*pz.

%==========================================================================
%==========================================================================

For further information on AMG2013, contact:

Ulrike Yang
ph: (925) 422-2850
email: umyang@llnl.gov

%==========================================================================
%==========================================================================

Release and Modification Record

LLNL code release number: UCRL-CODE-222953.

See the files COPYRIGHT and COPYING.LESSER for a complete copyright notice,
additional contact information, disclaimer and license.