Code Description

A. General description:

AMG2013 is a parallel algebraic multigrid solver for linear systems arising from
problems on unstructured grids.

See the following papers for details on the algorithm and its parallel
implementation/performance:

Van Emden Henson and Ulrike Meier Yang, "BoomerAMG: A Parallel Algebraic
Multigrid Solver and Preconditioner", Appl. Num. Math. 41 (2002),
pp. 155-177. Also available as LLNL technical report UCRL-JC-141495.

Hans De Sterck, Ulrike Meier Yang and Jeffrey Heys, "Reducing Complexity in
Parallel Algebraic Multigrid Preconditioners", SIAM Journal on Matrix Analysis
and Applications 27 (2006), pp. 1019-1039. Also available as LLNL technical
report UCRL-JRNL-206780.

Hans De Sterck, Robert D. Falgout, Josh W. Nolting and Ulrike Meier Yang,
"Distance-Two Interpolation for Parallel Algebraic Multigrid", Numerical
Linear Algebra with Applications 15 (2008), pp. 115-139. Also available as
LLNL technical report UCRL-JRNL-230844.

U. M. Yang, "On Long Range Interpolation Operators for Aggressive Coarsening",
Numer. Linear Algebra Appl., 17 (2010), pp. 453-472. LLNL-JRNL-417371.

A. H. Baker, R. D. Falgout, T. V. Kolev, and U. M. Yang, "Multigrid Smoothers
for Ultraparallel Computing", SIAM J. Sci. Comput., 33 (2011), pp. 2864-2887.
LLNL-JRNL-473191.

The driver provided with AMG2013 builds linear systems for various
3-dimensional problems, which are described in Section D.

To determine when the solver has converged, the driver uses the
relative-residual stopping criterion

 ||r_k||_2 / ||b||_2 < tol

with tol = 10^-6.

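As a minimal sketch (not the actual hypre/AMG2013 routine, whose test lives
inside the Krylov solvers in the 'krylov' directory), the criterion amounts to
comparing the 2-norm of the current residual r_k against the 2-norm of the
right-hand side b:

   /* Hypothetical sketch of the stopping test ||r_k||_2 / ||b||_2 < tol. */
   #include <math.h>
   #include <stddef.h>

   static double norm2(const double *v, size_t n)
   {
      double s = 0.0;
      for (size_t i = 0; i < n; i++)
         s += v[i] * v[i];
      return sqrt(s);
   }

   static int converged(const double *r, const double *b, size_t n)
   {
      const double tol = 1.0e-6;
      double bnorm = norm2(b, n);
      if (bnorm == 0.0)            /* guard against a zero right-hand side */
         bnorm = 1.0;
      return norm2(r, n) / bnorm < tol;
   }
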
B. Coding:

AMG2013 is written in ISO-C. It is an SPMD code which uses MPI. Parallelism is
achieved by data decomposition. The driver provided with AMG2013 achieves this
decomposition by simply subdividing the grid into logical P x Q x R (in 3D)
chunks of equal size.

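For illustration only (this is not the driver's code; the rank-to-position
mapping and the example sizes below are assumptions), splitting a global
nx x ny x nz grid into equal chunks over a logical P x Q x R process topology
might look as follows:

   #include <stdio.h>

   /* Hypothetical sketch: place each MPI rank in a P x Q x R topology and
      compute the equal-size chunk of the global grid it would own.
      Assumes nx, ny, nz are exact multiples of P, Q, R. */
   int main(void)
   {
      int P = 2, Q = 2, R = 2;         /* logical process topology (example) */
      int nx = 80, ny = 80, nz = 80;   /* global grid size (example)         */
      int lx = nx / P, ly = ny / Q, lz = nz / R;

      for (int rank = 0; rank < P * Q * R; rank++) {
         int p = rank % P;
         int q = (rank / P) % Q;
         int r = rank / (P * Q);
         printf("rank %2d owns [%3d,%3d) x [%3d,%3d) x [%3d,%3d)\n", rank,
                p * lx, (p + 1) * lx,
                q * ly, (q + 1) * ly,
                r * lz, (r + 1) * lz);
      }
      return 0;
   }
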
C. Parallelism:

AMG2013 is a highly synchronous code. The communication and computation
patterns exhibit the surface-to-volume relationship common to many parallel
scientific codes. Hence, parallel efficiency is largely determined by the size
of the data "chunks" mentioned above, and the speed of communication and
computation on the machine. AMG2013 is also memory-access bound, doing only
about 1-2 computations per memory access, so memory-access speed will also have
a large impact on performance.

D. Test problems

Problem 1 (default): The default problem is a Laplace-type problem on an
unstructured domain with an anisotropy in one part. A 2-dimensional projection
of the grid with the corresponding 2-dimensional stencils is illustrated in the
file 'mg_grid_labels.pdf'. The problem is made 3-dimensional by extending the
domain uniformly in the z-direction. The default problem size is 384 unknowns,
but this is easily refined on the amg2013 command line (see "Running the Code"
for details). Suggestions for test runs are given in the section "Suggested
Test Runs".

Problem 2 (-laplace): Solves

 - cx u_xx - cy u_yy - cz u_zz = (1/h)^2

with Dirichlet boundary conditions of u = 0, where h is the mesh spacing in each
direction on the unit cube. Standard finite differences are used to discretize
the equations, yielding 7-pt stencils in 3D. This problem can also be used to
generate 2D or 1D problems by setting the length in one or two of the directions
(<nx>, <ny> or <nz>) to 1.

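As a minimal sketch of the 7-pt discretization mentioned above (not the routine
used by the code), multiplying the equation by h^2 gives a stencil with a unit
right-hand side at every interior grid point:

   /* Hypothetical sketch of the interior 7-point stencil obtained from
      central differences for  - cx u_xx - cy u_yy - cz u_zz = (1/h)^2,
      after scaling the whole equation by h^2 (so the rhs becomes 1). */
   void stencil_7pt(double cx, double cy, double cz, double s[7])
   {
      s[0] = 2.0 * (cx + cy + cz);   /* center      (i,   j,   k  )           */
      s[1] = -cx;  s[2] = -cx;       /* x-neighbors (i-1, j, k), (i+1, j, k)  */
      s[3] = -cy;  s[4] = -cy;       /* y-neighbors (i, j-1, k), (i, j+1, k)  */
      s[5] = -cz;  s[6] = -cz;       /* z-neighbors (i, j, k-1), (i, j, k+1)  */
   }
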
Problem 3 (-27pt): Solves a Laplace-type problem using a 27-point stencil.

Problem 4 (-jumps): Solves the PDE

 - a(x,y,z)(u_xx + u_yy + u_zz) = (1/h)^2

with Dirichlet boundary conditions of u = 0 on the unit cube, and

 a(x,y,z) = 1000 on [0.1,0.9] x [0.1,0.9] x [0.1,0.9]
          = 0.01 on the 8 corner cubes of size 0.1 x 0.1 x 0.1
          = 1    elsewhere

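A small sketch of this piecewise-constant coefficient (illustrative only; the
function name and the treatment of region boundaries are assumptions, not code
from the distribution):

   /* Hypothetical sketch of the jump coefficient a(x,y,z) on the unit cube:
      1000 in the interior box [0.1,0.9]^3, 0.01 in the eight corner cubes of
      size 0.1 x 0.1 x 0.1, and 1 everywhere else. */
   double jump_coefficient(double x, double y, double z)
   {
      if (x >= 0.1 && x <= 0.9 && y >= 0.1 && y <= 0.9 &&
          z >= 0.1 && z <= 0.9)
         return 1000.0;
      if ((x <= 0.1 || x >= 0.9) && (y <= 0.1 || y >= 0.9) &&
          (z <= 0.1 || z >= 0.9))
         return 0.01;                /* one of the 8 corner cubes */
      return 1.0;
   }
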
%==========================================================================
%==========================================================================

Important Kernels in this Distribution

Listed here are the important files used by the linear solver, both
preconditioner and solver. Not included are files that take little time,
such as the wrappers (files starting with HYPRE_), files that are not used
during the suggested runs, and files that are only used to generate the
problems. A complete listing of all directories and files, together with a
short description of each directory, can be found in the next section.


In the 'krylov' directory:

pcg.c                     functions for the conjugate gradient algorithm
gmres.c                   functions for the GMRES algorithm

In the 'parcsr_ls' directory:

par_amg.c                 Setup phase of the AMG preconditioner
par_amg_setup.c           Setup phase of the AMG preconditioner
par_coarsen.c             various coarsening algorithms
par_strength.c            computes a strength matrix for coarsening and
                          interpolation
par_indepset.c            independent set function needed for coarsening
par_interp.c              interpolation algorithms for solvers 0 and 3
par_lr_interp.c           interpolation algorithms for solvers 1 and 4
aux_interp.c              auxiliary functions needed in par_lr_interp.c
par_multi_interp.c        interpolation algorithm for fine level
                          in solvers 1 and 4
par_rap.c                 generates coarse grid operator
par_rap_communication.c   sets up communication in par_rap.c
par_amg_solve.c           Solve phase of the AMG preconditioner
par_cycle.c               AMG cycle
par_relax.c               AMG smoothers

In the 'parcsr_mv' directory:

par_csr_communication.c   communication routines for global partitioning
new_commpkg.c             communication routines for assumed partitioning
par_csr_assumed_part.c    communication routines for assumed partitioning
par_csr_matrix.c          basic parallel matrix operations
par_csr_matvec.c          parallel matrix vector multiplication
par_csr_matop.c           additional parallel matrix operations
par_vector.c              basic vector operations

In the 'seq_mv' directory:

big_csr_matrix.c          basic sequential matrix operations
csr_matrix.c              basic sequential matrix operations
csr_matvec.c              sequential matrix vector multiplications
csr_matop.c               additional sequential matrix operations
vector.c                  basic sequential vector operations

%==========================================================================
%==========================================================================

Files in this Distribution

NOTE: The AMG2013 code is derived directly from the hypre library, a large
linear solver library that is being developed in the Center for Applied
Scientific Computing (CASC) at LLNL.

In the amg2013 directory the following files are included:

COPYING_LESSER
COPYRIGHT
HYPRE.h
Makefile
Makefile.include

The following subdirectories are also included:

docs          Documentation
IJ_mv         Linear algebraic interface routines
krylov        Krylov solvers, such as PCG and GMRES
parcsr_ls     routines needed to generate solvers and preconditioners
              as well as Problems 2-4
parcsr_mv     parallel matrix and vector routines
              (ParCSR data structure)
seq_mv        sequential matrix and vector routines
sstruct_mv    semistructured matrix and vector routines - included
              to generate Problem 1
struct_mv     structured matrix and vector routines - included to
              generate Problem 1
test          driver and input file for Problem 1
utilities     functions for memory allocation, timing, error codes,
              sorting, searching, etc.

In the 'docs' directory the following files are included:

amg2013.readme
mg_grid_labels.pdf

In the 'IJ_mv' directory the following files are included:

aux_parcsr_matrix.c
aux_parcsr_matrix.h
aux_par_vector.c
aux_par_vector.h
headers.h
HYPRE_IJMatrix.c
HYPRE_IJ_mv.h
HYPRE_IJVector.c
IJMatrix.c
IJ_matrix.h
IJMatrix_parcsr.c
IJ_mv.h
IJVector.c
IJ_vector.h
IJVector_parcsr.c
Makefile

In the 'krylov' directory the following files are included:

all_krylov.h
gmres.c
gmres.h
HYPRE_gmres.c
HYPRE_MatvecFunctions.h
HYPRE_pcg.c
krylov.h
Makefile
pcg.c
pcg.h

In the 'parcsr_ls' directory the following files are included:

aux_interp.c
aux_interp.h
gen_redcs_mat.c
headers.h
HYPRE_parcsr_amg.c
HYPRE_parcsr_gmres.c
HYPRE_parcsr_ls.h
HYPRE_parcsr_pcg.c
Makefile
par_amg.c
par_amg.h
par_amg_setup.c
par_amg_solve.c
par_cg_relax_wt.c
par_coarsen.c
par_coarse_parms.c
parcsr_ls.h
par_cycle.c
par_difconv.c
par_indepset.c
par_interp.c
par_jacobi_interp.c
par_laplace_27pt.c
par_laplace.c
par_lr_interp.c
par_multi_interp.c
par_nodal_systems.c
par_rap.c
par_rap_communication.c
par_relax.c
par_relax_interface.c
par_relax_more.c
par_scaled_matnorm.c
par_stats.c
par_strength.c
partial.c
par_vardifconv.c
pcg_par.c

In the 'parcsr_mv' directory the following files are included:

headers.h
HYPRE_parcsr_matrix.c
HYPRE_parcsr_mv.h
HYPRE_parcsr_vector.c
Makefile
new_commpkg.c
new_commpkg.h
par_csr_assumed_part.c
par_csr_assumed_part.h
par_csr_communication.c
par_csr_communication.h
par_csr_matop.c
par_csr_matop_marked.c
par_csr_matrix.c
par_csr_matrix.h
par_csr_matvec.c
parcsr_mv.h
par_vector.c
par_vector.h

In the 'seq_mv' directory the following files are included:

big_csr_matrix.c
csr_matop.c
csr_matrix.c
csr_matrix.h
csr_matvec.c
genpart.c
headers.h
HYPRE_csr_matrix.c
HYPRE_seq_mv.h
HYPRE_vector.c
Makefile
seq_mv.h
vector.c
vector.h

In the 'sstruct_mv' directory the following files are included:

box_map.c
box_map.h
headers.h
HYPRE_sstruct_graph.c
HYPRE_sstruct_grid.c
HYPRE_sstruct_matrix.c
HYPRE_sstruct_mv.h
HYPRE_sstruct_stencil.c
HYPRE_sstruct_vector.c
Makefile
sstruct_axpy.c
sstruct_copy.c
sstruct_graph.c
sstruct_graph.h
sstruct_grid.c
sstruct_grid.h
sstruct_innerprod.c
sstruct_matrix.c
sstruct_matrix.h
sstruct_matvec.c
sstruct_mv.h
sstruct_overlap_innerprod.c
sstruct_scale.c
sstruct_stencil.c
sstruct_stencil.h
sstruct_vector.c
sstruct_vector.h

In the 'struct_mv' directory the following files are included:

assumed_part.c
assumed_part.h
box_algebra.c
box_alloc.c
box_boundary.c
box.c
box.h
box_manager.c
box_manager.h
box_neighbors.c
box_neighbors.h
box_pthreads.h
communication_info.c
computation.c
computation.h
grow.c
headers.h
HYPRE_struct_grid.c
HYPRE_struct_matrix.c
HYPRE_struct_mv.h
HYPRE_struct_stencil.c
HYPRE_struct_vector.c
Makefile
new_assemble.c
new_box_neighbors.c
project.c
struct_axpy.c
struct_communication.c
struct_communication.h
struct_copy.c
struct_grid.c
struct_grid.h
struct_innerprod.c
struct_io.c
struct_matrix.c
struct_matrix.h
struct_matrix_mask.c
struct_matvec.c
struct_mv.h
struct_overlap_innerprod.c
struct_scale.c
struct_stencil.c
struct_stencil.h
struct_vector.c
struct_vector.h

In the 'test' directory the following files are included:

amg2013.c
Makefile
sstruct.in.MG.FD

In the 'utilities' directory the following files are included:

amg_linklist.c
amg_linklist.h
binsearch.c
exchange_data.c
exchange_data.h
exchange_data.README
general.h
hypre_error.c
hypre_error.h
hypre_memory.c
hypre_memory.h
hypre_qsort.c
hypre_smp_forloop.h
HYPRE_utilities.h
Makefile
memory_dmalloc.c
mpistubs.c
mpistubs.h
qsplit.c
random.c
threading.c
threading.h
thread_mpistubs.c
thread_mpistubs.h
timer.c
timing.c
timing.h
umalloc_local.c
umalloc_local.h
utilities.h

%==========================================================================
%==========================================================================

Building the Code

AMG2013 uses a simple Makefile system for building the code. All compiler and
link options are set by modifying the file 'amg2013/Makefile.include'
appropriately. This file is then included in each of the following makefiles:

 krylov/Makefile
 IJ_mv/Makefile
 parcsr_ls/Makefile
 parcsr_mv/Makefile
 seq_mv/Makefile
 sstruct_mv/Makefile
 struct_mv/Makefile
 test/Makefile
 utilities/Makefile

To build the code, first modify the 'Makefile.include' file appropriately, then
type (in the amg2013 directory)

 make

Other available targets are

 make clean      (deletes .o files)
 make veryclean  (deletes .o files, libraries, and executables)

To configure the code (an illustrative 'Makefile.include' example follows
this list):

1 - to run with MPI only, add '-DTIMER_USE_MPI' to the 'INCLUDE_CFLAGS' line
    in the 'Makefile.include' file and use a valid MPI.
2 - to run OpenMP with MPI, also add the vendor-dependent compilation flag
    for OpenMP.
3 - to use the assumed partition (recommended for several thousand
    processors or more), add '-DHYPRE_NO_GLOBAL_PARTITION'.
4 - to be able to solve problems that are larger than 2^31-1,
    add '-DHYPRE_LONG_LONG'.

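For example, an MPI-only build that also uses the assumed partition and 64-bit
integers might contain a line like the following in 'Makefile.include' (the
optimization level and the placement of all flags on this one line are only an
illustration, not a shipped default):

 INCLUDE_CFLAGS = -O2 -DTIMER_USE_MPI -DHYPRE_NO_GLOBAL_PARTITION -DHYPRE_LONG_LONG

For a hybrid MPI/OpenMP build, '-DHYPRE_USING_OPENMP' and the vendor-dependent
OpenMP flag (for example, -fopenmp with GCC) would be added as well, as in the
run shown under "Expected Results" below.
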
%==========================================================================
%==========================================================================

Optimization and Improvement Challenges

This code is memory-access bound. We believe it would be very difficult to
obtain "good" cache reuse with an optimized version of the code.

%==========================================================================
%==========================================================================

Parallelism and Scalability Expectations

AMG2013 has been run on the following platforms:

 BG/Q   - up to more than 1,000,000 MPI processes
 BG/P   - up to 125,000 MPI processes
 Sierra - up to 13,824 MPI processes
 and more

Consider increasing both problem size and number of processors in tandem.
On scalable architectures, time-to-solution for AMG2013 will initially
increase, then level off at a modest number of processors, remaining
roughly constant for larger numbers of processors. Iteration counts will
also increase slightly for small- to modest-sized problems, then level off
at a roughly constant number for larger problem sizes.

For example, we get the following timing results (in seconds) for a 3D Laplace
problem with cx = cy = cz = 1.0, distributed on a logical P x Q x R processor
topology, with fixed local problem size per process given as 40 x 40 x 40:

 P x Q x R      procs     solver similar to solver 0
 ---------------------------------------------------------------
 16x16x16        4096        5.75
 20x20x20        8000        6.88
 32x32x32       32768        8.11
 44x44x44       91125       10.48
 50x50x50      125000       10.54

These results were obtained on BG/P using the assumed partition option
-DHYPRE_NO_GLOBAL_PARTITION and -DHYPRE_LONG_LONG.

%==========================================================================
%==========================================================================

Running the Code

The driver for AMG2013 is called `amg2013', and is located in the amg2013/test
subdirectory. Type

 mpirun -np 1 amg2013 -help

to get usage information. This prints out the following:

Usage: amg2013 [<options>]

 -in <filename>     : input file (default is `sstruct.in.AMG.FD')

 -P <Px> <Py> <Pz>  : define processor topology per part
                      Note that for test problem 1, which has 8 parts,
                      this leads to 8*Px*Py*Pz MPI processes!
                      For all other test problems, the total number of
                      MPI processes is Px*Py*Pz.

 -pooldist <p>      : pool distribution to use

 -r <rx> <ry> <rz>  : refine part(s) for default problem
 -b <bx> <by> <bz>  : refine and block part(s) for default problem

 -n <nx> <ny> <nz>  : define size per processor for problems on cube
 -c <cx> <cy> <cz>  : define anisotropies for Laplace problem

 -laplace           : 3D Laplace problem on a cube
 -27pt              : Problem with 27-point stencil on a cube
 -jumps             : PDE with jumps on a cube

 -solver <ID>       : solver ID (default = 0)
                      0 - PCG with AMG precond
                      1 - PCG with diagonal scaling
                      2 - GMRES(10) with AMG precond
                      3 - GMRES(10) with diagonal scaling

 -printstats        : print out detailed info on AMG preconditioner

 -printsystem       : print out the system

 -rhsfromcosine     : solution is cosine function (default); can be used
                      for default problem only
 -rhsone            : rhs is vector with unit components

548compact application is the `-P' option. It specifies the MPI process topology
549on which to run.
550
551For the default problem, there are two possible pool distributions, which
552lead to different partitionings of the problem. Pool distribution 0 will
553give each process a portion of one of the 8 parts of the test problem, thus
554assigning disjoint subdomains to each process. Pool distribution 1 uses a
555more natural partitioning, assigning each process a subdomain in one of
556the 8 parts, and therefore requires the total number of processes to be a
557multiple of 8, i.e. it needs to be run as follows:
558mpirun -np <N> amg2013 -pooldist 1 -P <Px> <Py> <Pz> ...
559with <N> = 8*<Px>*<Py>*<Pz>.
560Both partitionings lead to a load balanced distribution of the original problem.
561The problem size per MPI process can be increased using the `-r' option,
562which defines the refinement factor for the grid on each process in each
563direction, or the '-b' option, which increases the number of blocks per process.
564
565
566For the other three problems (laplace, 27pt and jumps) the `-n' option allows
567one to specify the local problem size per MPI process, leading to a global
568problem size of <Px>*<nx> by <Py>*<ny> by <Pz>*<nz>.
569
%==========================================================================
%==========================================================================

Timing Issues

If using MPI, the whole code is timed using the MPI timers. If not using MPI,
standard system timers are used. Timing results are printed to standard out,
and are divided into "Setup Phase" times and "Solve Phase" times. Timings for a
few individual routines are also printed out.

%==========================================================================
%==========================================================================

Memory Needed

AMG2013's memory needs are somewhat complicated to describe. They depend
strongly on the type of problem solved and the AMG options used. In general,
solvers 1 and 4 will need less memory than solvers 0 and 3. When turning on
the '-printstats' option, operator complexities <oc> are displayed, which are
defined as the sum of the numbers of nonzeros of the original matrix and all
coarse grid matrices, divided by the number of nonzeros of the original
matrix; i.e., the original matrix and the coarse grid operators together need
about <oc> times as much space as the original matrix alone. However, this
does not include memory needed for interpolation operators, communication,
etc.

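As an illustration, the operator complexity reported for the sample run in the
"Expected Results" section below can be recomputed from the per-level nonzero
counts in that output:

 <oc> = (648936 + 159896 + 72864 + 21100 + 8354 + 171) / 648936
      = 911321 / 648936
      = 1.404331

so the matrix hierarchy alone occupies roughly 1.4 times the storage of the
fine-grid matrix.
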
%==========================================================================
%==========================================================================

About the Data

AMG2013 requires one input file to generate the default problem; this file is
located in the test directory. Apart from this, all control is on the command
line.

%==========================================================================
%==========================================================================

Expected Results

Consider the following run, where the code was compiled with the options
-DTIMER_USE_MPI and -DHYPRE_USING_OPENMP, linked with OpenMP, and run with
OMP_NUM_THREADS set to 2:

 mpirun -np 8 amg2013 -pooldist 1 -P 1 1 1 -r 4 4 4 -printstats

This is what AMG2013 prints out:

=============================================
SStruct Interface:
=============================================
SStruct Interface:
SStruct Interface wall clock time = 0.014211 seconds
SStruct Interface cpu clock time = 0.010000 seconds

Number of MPI processes: 8 , Number of OpenMP threads: 2

BoomerAMG SETUP PARAMETERS:

 Max levels = 25
 Num levels = 6

 Strength Threshold = 0.250000
 Interpolation Truncation Factor = 0.000000
 Maximum Row Sum Threshold for Dependency Weakening = 0.900000

 Coarsening Type = HMIS
 Hybrid Coarsening (switch to CLJP when coarsening slows)
 measures are determined locally

 no. of levels of aggressive coarsening: 1

 Interpolation = extended+i interpolation

Operator Matrix Information:

                  nonzero entries per row                    row sums
lev    rows   entries   sparse   min  max   avg        min          max
=========================================================================
 0    82944    648936    0.000     4    9   7.8   -4.274e-15   3.000e+02
 1     8985    159896    0.002     4   45  17.8   -2.069e-13   9.293e+02
 2     2763     72864    0.010     6  121  26.4   -2.487e-14   1.668e+03
 3     1001     21100    0.021     3  167  21.1    8.298e-02   3.147e+03
 4      320      8354    0.082     2   79  26.1    1.938e-01   1.098e+02
 5       21       171    0.388     4   12   8.1    5.854e+00   6.784e+00


Interpolation Matrix Information:

                      entries/row      min         max            row sums
lev   rows x cols      min  max      weight      weight        min         max
===============================================================================
 0   82944 x 8985        1   10    1.488e-02   9.980e-01    1.759e-01  1.000e+00
 1    8985 x 2763        1    4    8.769e-03   1.000e+00    1.624e-01  1.000e+00
 2    2763 x 1001        0    4    2.076e-03   1.000e+00    0.000e+00  1.000e+00
 3    1001 x  320        0    4   -4.281e-01   1.452e+00   -7.104e-03  1.000e+00
 4     320 x   21        0    4    2.627e-03   5.150e-02    0.000e+00  1.000e+00


 Complexity:     grid = 1.157817
             operator = 1.404331


BoomerAMG SOLVER PARAMETERS:

 Maximum number of cycles: 1
 Stopping Tolerance: 0.000000e+00
 Cycle type (1 = V, 2 = W, etc.): 1

 Relaxation Parameters:
   Visiting Grid:                     down   up  coarse
   Number of partial sweeps:            1    1      1
   Type 0=Jac, 3=hGS, 6=hSGS, 9=GE:      8    8      8
   Point types, partial sweeps (1=C, -1=F):
     Pre-CG relaxation (down):   0
     Post-CG relaxation (up):    0
     Coarsest grid:              0

=============================================
Setup phase times:
=============================================
PCG Setup:
PCG Setup wall clock time = 0.066036 seconds
PCG Setup cpu clock time = 0.090000 seconds

System Size / Setup Phase Time: 1.674723e+06

=============================================
Solve phase times:
=============================================
PCG Solve:
PCG Solve wall clock time = 0.103601 seconds
PCG Solve cpu clock time = 0.140000 seconds

AMG2013 Benchmark version 1.0
Iterations = 8
Final Relative Residual Norm = 6.945422e-07

System Size * Iterations / Solve Phase Time: 8.539842e+06

%==========================================================================
%==========================================================================

Suggested Test Runs

1. For the default problem:

 mpirun -np <8*px*py*pz> amg2013 -pooldist 1 -r 12 12 12 -P px py pz

This will generate a problem with 82,944 variables per MPI process, leading to
a total system size of 663,552*px*py*pz.

 mpirun -np <8*px*py*pz> amg2013 -pooldist 1 -r 24 24 24 -P px py pz

This will generate a problem with 663,552 variables per process, leading to
a total system size of 5,308,416*px*py*pz, and solve it using conjugate
gradient preconditioned with AMG. To use AMG-GMRES(10) instead, append
'-solver 2'.

The domain (for a 2-dimensional projection of the domain see
mg_grid_labels.pdf) can be scaled up by increasing the values of px, py
and pz.

2. For the 7pt 3D Laplace problem:

 mpirun -np <px*py*pz> amg2013 -laplace -n 40 40 40 -P px py pz

This will generate a problem with 64,000 grid points per MPI process,
with a domain of size 40*px x 40*py x 40*pz.

 mpirun -np <px*py*pz> amg2013 -laplace -n 80 80 80 -P px py pz

This will generate a problem with 512,000 grid points per MPI process,
with a domain of size 80*px x 80*py x 80*pz.

%==========================================================================
%==========================================================================

For further information on AMG2013, contact:

Ulrike Yang
ph: (925) 422-2850
email: umyang@llnl.gov

%==========================================================================
%==========================================================================

Release and Modification Record

LLNL code release number: UCRL-CODE-222953.

See the files COPYRIGHT and COPYING.LESSER for a complete copyright notice,
additional contact information, disclaimer and license.