Note
You can run this notebook on Google Colab.
Programming your solver¶
In this notebook, we will look at some of the more advanced capabilities Firedrake has for configuring and developing preconditioners. In particular, we will show support for geometric multigrid, as well as user-defined preconditioners.
As our prototypical example, we will consider the Stokes equations. Find \((u, p) \in V \times Q \subset (H^1)^d \times L^2\) such that
for all \((v, q) \in V \times Q\). Where \(\nu\) is the viscosity.
We will use the inf-sup stable Taylor-Hood element pair of piecewise quadratic velocities and piecewise linear pressures.
[1]:
import matplotlib.pyplot as plt
[2]:
from firedrake import *
mesh = UnitSquareMesh(8, 8)
We now build a hierarchy of regularly refined meshes with this as the coarsest mesh, and grab the finest one to define the problem.
[3]:
meshes = MeshHierarchy(mesh, refinement_levels=3)
# Grab the finest mesh
mesh = meshes[-1]
V = VectorFunctionSpace(mesh, "Lagrange", 2)
Q = FunctionSpace(mesh, "Lagrange", 1)
W = V*Q
We set up the problem in residual form (using TestFunctions but no TrialFunctions).
[4]:
v, q = TestFunctions(W)
w = Function(W)
u, p = split(w)
nu = Constant(0.0001)
F = nu*inner(grad(u), grad(v))*dx - p*div(v)*dx - div(u)*q*dx
We now need to augment the problem with a forcing term and boundary conditions. We will solve a regularised lid-driven cavity problem, and thus choose \(f = 0\) and the boundary conditions:
[5]:
x, y = SpatialCoordinate(mesh)
bc_value = as_vector([0.25 * x**2 * (2-x)**2 *y**2, 0])
bcs = [DirichletBC(W.sub(0), bc_value, 4),
DirichletBC(W.sub(0), 0, (1, 2, 3))]
This problem has a null space of constant pressures, so we’ll need to inform the solver about that too.
[6]:
nullspace = MixedVectorSpaceBasis(W, [W.sub(0), VectorSpaceBasis(constant=True, comm=mesh.comm)])
Since we’re going to look at a bunch of different solver options, let’s have a function that builds a solver with the provided options.
[7]:
def create_solver(solver_parameters, *, pmat=None, appctx=None):
p = {}
if solver_parameters is not None:
p.update(solver_parameters)
# Default to linear SNES
p.setdefault("snes_type", "ksponly")
p.setdefault("ksp_rtol", 1e-7)
problem = NonlinearVariationalProblem(F, w, bcs=bcs, Jp=pmat)
solver = NonlinearVariationalSolver(problem, nullspace=nullspace, options_prefix="",
solver_parameters=p, appctx=appctx)
return solver
First, let’s go ahead and solve the problem using a direct solver. The solver is configured with a dictionary of PETSc options. Here we select MUMPS to perform the sparse LU factorisation. (Note that these are actually the default solver parameters that Firedrake assumes.)
[8]:
solver_parameters = {
"mat_type": "aij",
"ksp_type": "preonly",
# Use MUMPS since it handles the null space
"pc_type": "lu",
"pc_factor_mat_solver_type": "mumps"
}
[9]:
# Programmatically inspect convergence of solver
def convergence(solver):
from firedrake.solving_utils import KSPReasons, SNESReasons
snes = solver.snes
print("""
SNES iterations: {snes}; SNES converged reason: {snesreason}
KSP iterations: {ksp}; KSP converged reason: {kspreason}""".format(snes=snes.getIterationNumber(),
snesreason=SNESReasons[snes.getConvergedReason()],
ksp=snes.ksp.getIterationNumber(),
kspreason=KSPReasons[snes.ksp.getConvergedReason()]))
We’re ready to solve.
[10]:
w.zero()
solver = create_solver(solver_parameters)
solver.solve()
convergence(solver)
SNES iterations: 1; SNES converged reason: CONVERGED_ITS
KSP iterations: 1; KSP converged reason: CONVERGED_ITS
We can now have a look at the solution, using some simple builtin plotting that utilises matplotlib.
[11]:
from firedrake.pyplot import streamplot
u_h, p_h = w.subfunctions
fig, axes = plt.subplots()
streamlines = streamplot(u_h, resolution=1/30, seed=0, axes=axes)
fig.colorbar(streamlines);
Configuring a better preconditioner¶
For this small problem, we can (and probably should) use a direct factorisation method. But what if the problem is too big? Then we need an iterative method, and an appropriate preconditioner.
Let’s try everyone’s favourite, ILU(0).
[12]:
solver_parameters = {
"mat_type": "aij",
"ksp_type": "gmres",
"ksp_gmres_modifiedgramschmidt": None,
"ksp_max_it": 2000,
"ksp_converged_reason": None,
"pc_type": "ilu"
}
[13]:
w.zero()
solver = create_solver(solver_parameters)
solver.solve()
convergence(solver)
Linear solve converged due to CONVERGED_RTOL iterations 862
SNES iterations: 1; SNES converged reason: CONVERGED_ITS
KSP iterations: 862; KSP converged reason: CONVERGED_RTOL
This is, unsurprisingly, bad. Fortunately, better options are available.
Block preconditioning¶
Firedrake hooks up all the necessary machinery to access PETSc’s PCFIELDSPLIT preconditioner. This provides mechanisms for building preconditioners based on block factorisations. The Stokes problem
is a block system with matrix
admitting a factorisation
with \(S = -\color{#2A52BE}{B} \color{#800020}{A}^{-1} \color{#2A52BE}{B^T}\) the Schur complement. This has an inverse
\(S\) is never formed, so it’s inverse is approximated using an iterative method.
[14]:
exact_inverse_parameters = {
"ksp_type": "fgmres",
"pc_type": "fieldsplit",
"pc_fieldsplit_type": "schur",
"fieldsplit_0": {
"ksp_type": "preonly",
"pc_type": "lu",
},
"fieldsplit_1": {
"ksp_type": "cg",
"ksp_rtol": 1e-8,
"pc_type": "none",
}
}
[15]:
w.zero()
solver = create_solver(exact_inverse_parameters)
solver.solve()
convergence(solver)
SNES iterations: 1; SNES converged reason: CONVERGED_ITS
KSP iterations: 1; KSP converged reason: CONVERGED_RTOL
This looks good, but we had to use an unpreconditioned Krylov method to invert \(S\). To do better we need to provide either an approximation to \(S\) or \(S^{-1}\).
For the Stokes equations, Silvester and Wathen (1993) show that \(S \approx -\nu^{-1} Q\) is a good approximation, where \(Q\) is the pressure mass matrix.
Problem: \(Q\) is not available as one of the blocks of \(\mathcal{A}\).
PETSc’s approach is to allow us to supply a separate matrix to the solver which will be used to construct the preconditioner. So, we just need to additionally supply
where \(Q = \int_\Omega p q \,\mathrm{d}x\).
We will construct P by symbolically computing the derivative of the residual to get \(\mathcal{A}\) and then subtracting \(\nu^{-1} Q\).
[16]:
trial = TrialFunction(W)
_, p_t = split(trial)
amat = lhs(derivative(F, w, trial))
pmat = amat - 1/nu * p_t * q*dx
We can now pass this pmat form to create_solver and can configure an appropriate preconditioner.
[17]:
pmat_parameters = {
"mat_type": "nest", # We only need the blocks
"snes_type": "ksponly",
"ksp_view": None,
"ksp_monitor_true_residual": None,
"ksp_max_it": 100,
"pc_type": "fieldsplit",
"pc_fieldsplit_type": "schur",
"fieldsplit_0": {
"ksp_type": "preonly",
"pc_type": "lu",
},
"fieldsplit_1": {
"ksp_type": "preonly",
"pc_type": "lu",
}
}
[18]:
w.zero()
solver = create_solver(pmat_parameters, pmat=pmat)
solver.solve()
convergence(solver)
0 KSP preconditioned resid norm 5.496162170027e+00 true resid norm 7.005934058591e-04 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 9.288239850613e-01 true resid norm 2.453830661658e-03 ||r(i)||/||b|| 3.502503222464e+00
2 KSP preconditioned resid norm 4.322571804189e-01 true resid norm 1.513282272808e-03 ||r(i)||/||b|| 2.160000736736e+00
3 KSP preconditioned resid norm 9.747752360344e-02 true resid norm 4.889677087768e-04 ||r(i)||/||b|| 6.979336440902e-01
4 KSP preconditioned resid norm 2.168769655062e-02 true resid norm 2.367419610194e-04 ||r(i)||/||b|| 3.379163421172e-01
5 KSP preconditioned resid norm 6.602391994159e-03 true resid norm 1.323183601900e-04 ||r(i)||/||b|| 1.888661227516e-01
6 KSP preconditioned resid norm 3.201127617308e-03 true resid norm 5.559951301980e-05 ||r(i)||/||b|| 7.936059996400e-02
7 KSP preconditioned resid norm 1.583402976548e-03 true resid norm 1.881712632922e-05 ||r(i)||/||b|| 2.685884019440e-02
8 KSP preconditioned resid norm 5.459453638348e-04 true resid norm 3.548481385286e-06 ||r(i)||/||b|| 5.064965435887e-03
9 KSP preconditioned resid norm 2.435745735484e-04 true resid norm 1.708889316423e-06 ||r(i)||/||b|| 2.439202684655e-03
10 KSP preconditioned resid norm 8.264338652367e-05 true resid norm 4.816892092487e-07 ||r(i)||/||b|| 6.875445946541e-04
11 KSP preconditioned resid norm 2.869421008933e-05 true resid norm 2.297141941031e-07 ||r(i)||/||b|| 3.278851787385e-04
12 KSP preconditioned resid norm 1.126260481221e-05 true resid norm 1.157469224325e-07 ||r(i)||/||b|| 1.652126917903e-04
13 KSP preconditioned resid norm 3.183234319281e-06 true resid norm 2.964758722361e-08 ||r(i)||/||b|| 4.231782225705e-05
14 KSP preconditioned resid norm 1.315553340592e-06 true resid norm 1.114853946331e-08 ||r(i)||/||b|| 1.591299514109e-05
15 KSP preconditioned resid norm 6.727350475971e-07 true resid norm 4.697218910484e-09 ||r(i)||/||b|| 6.704629063308e-06
16 KSP preconditioned resid norm 2.943828887797e-07 true resid norm 2.160101611332e-09 ||r(i)||/||b|| 3.083245707520e-06
KSP Object: 1 MPI process
type: gmres
restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
happy breakdown tolerance=1e-30
maximum iterations=100, initial guess is zero
tolerances: relative=1e-07, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
type: fieldsplit
FieldSplit with Schur preconditioner, factorization FULL
Preconditioner for the Schur complement formed from A11
Split info:
Split number 0 Defined by IS
Split number 1 Defined by IS
KSP solver for A00 block
KSP Object: (fieldsplit_0_) 1 MPI process
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
not checking for convergence
PC Object: (fieldsplit_0_) 1 MPI process
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: nd
factor fill ratio given 5., needed 5.95328
Factored matrix:
Mat Object: (fieldsplit_0_) 1 MPI process
type: seqbaij
rows=33282, cols=33282, bs=2
package used to perform factorization: petsc
total: nonzeros=4511180, allocated nonzeros=4511180
linear system matrix, which is also used to construct the preconditioner:
Mat Object: (fieldsplit_0_) 1 MPI process
type: seqbaij
rows=33282, cols=33282, bs=2
total: nonzeros=757764, allocated nonzeros=757764
total number of mallocs used during MatSetValues calls=0
KSP solver for S = A11 - A10 inv(A00) A01
KSP Object: (fieldsplit_1_) 1 MPI process
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
not checking for convergence
PC Object: (fieldsplit_1_) 1 MPI process
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: nd
factor fill ratio given 5., needed 7.03531
Factored matrix:
Mat Object: (fieldsplit_1_) 1 MPI process
type: seqaij
rows=4225, cols=4225
package used to perform factorization: petsc
total: nonzeros=204425, allocated nonzeros=204425
not using I-node routines
linear system matrix, followed by the matrix used to construct the preconditioner:
Mat Object: (fieldsplit_1_) 1 MPI process
type: schurcomplement
rows=4225, cols=4225
has attached null space
Schur complement A11 - A10 inv(A00) A01
A11
Mat Object: (fieldsplit_1_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=29057, allocated nonzeros=29057
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
A10
Mat Object: 1 MPI process
type: seqaij
rows=4225, cols=33282, rbs=1, cbs=2
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
not using I-node routines
KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
A01
Mat Object: 1 MPI process
type: seqaij
rows=33282, cols=4225, rbs=2, cbs=1
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 16639 nodes, limit used is 5
Mat Object: (fieldsplit_1_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=29057, allocated nonzeros=29057
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
linear system matrix, followed by the matrix used to construct the preconditioner:
Mat Object: 1 MPI process
type: nest
rows=37507, cols=37507
has attached null space
MatNest, rows=2, cols=2, structure:
(0,0) : type=seqbaij, rows=33282, cols=33282
(0,1) : type=seqaij, rows=33282, cols=4225
(1,0) : type=seqaij, rows=4225, cols=33282
(1,1) : type=seqaij, rows=4225, cols=4225
Mat Object: 1 MPI process
type: nest
rows=37507, cols=37507
has attached null space
MatNest, rows=2, cols=2, structure:
(0,0) : prefix="fieldsplit_0_", type=seqbaij, rows=33282, cols=33282
(0,1) : type=seqaij, rows=33282, cols=4225
(1,0) : type=seqaij, rows=4225, cols=33282
(1,1) : prefix="fieldsplit_1_", type=seqaij, rows=4225, cols=4225
SNES iterations: 1; SNES converged reason: CONVERGED_ITS
KSP iterations: 16; KSP converged reason: CONVERGED_RTOL
Providing auxiliary operators¶
An inconvenience here is that we must build \(\mathcal{P}\), even though we only need \(-\nu^{-1} Q\) in additional to \(\mathcal{A}\) in the preconditioner.
Firedrake offers a facilities to build Python preconditioning objects, utilising petsc4py.
In this case, we can subclass the AuxiliaryOperatorPC to provide the mass matrix.
[19]:
class MassMatrix(AuxiliaryOperatorPC):
_prefix = "mass_"
def form(self, pc, test, trial):
# Extract the original form and bcs
a, bcs = super().form(pc, test, trial)
# Grab the definition of nu from the user application context (a dict)
nu = self.get_appctx(pc)["nu"]
return (-1/nu * test*trial*dx, bcs)
Now we just need to select parameters such that this Python preconditioner is used.
[20]:
mass_parameters = {
"mat_type": "nest", # We only need the blocks
"ksp_view": None,
"pc_type": "fieldsplit",
"pc_fieldsplit_type": "schur",
"fieldsplit_0": {
"ksp_type": "preonly",
"pc_type": "lu",
},
"fieldsplit_1": {
"ksp_type": "preonly",
"pc_type": "python",
"pc_python_type": "__main__.MassMatrix",
"mass_pc_type": "lu",
}
}
[21]:
appctx = {"nu": nu} # arbitrary user data that is available inside the user PC object
w.zero()
solver = create_solver(mass_parameters, appctx=appctx)
solver.solve()
convergence(solver)
KSP Object: 1 MPI process
type: gmres
restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
happy breakdown tolerance=1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-07, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
type: fieldsplit
FieldSplit with Schur preconditioner, factorization FULL
Preconditioner for the Schur complement formed from A11
Split info:
Split number 0 Defined by IS
Split number 1 Defined by IS
KSP solver for A00 block
KSP Object: (fieldsplit_0_) 1 MPI process
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
not checking for convergence
PC Object: (fieldsplit_0_) 1 MPI process
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: nd
factor fill ratio given 5., needed 5.95328
Factored matrix:
Mat Object: (fieldsplit_0_) 1 MPI process
type: seqbaij
rows=33282, cols=33282, bs=2
package used to perform factorization: petsc
total: nonzeros=4511180, allocated nonzeros=4511180
linear system matrix, which is also used to construct the preconditioner:
Mat Object: (fieldsplit_0_) 1 MPI process
type: seqbaij
rows=33282, cols=33282, bs=2
total: nonzeros=757764, allocated nonzeros=757764
total number of mallocs used during MatSetValues calls=0
KSP solver for S = A11 - A10 inv(A00) A01
KSP Object: (fieldsplit_1_) 1 MPI process
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
not checking for convergence
PC Object: (fieldsplit_1_) 1 MPI process
type: python
Python: __main__.MassMatrix
Firedrake custom preconditioner MassMatrix
PC to apply inverse
PC Object: (fieldsplit_1_mass_) 1 MPI process
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: nd
factor fill ratio given 5., needed 7.03531
Factored matrix:
Mat Object: (fieldsplit_1_mass_) 1 MPI process
type: seqaij
rows=4225, cols=4225
package used to perform factorization: petsc
total: nonzeros=204425, allocated nonzeros=204425
not using I-node routines
linear system matrix, followed by the matrix used to construct the preconditioner:
Mat Object: (fieldsplit_1_) 1 MPI process
type: schurcomplement
rows=4225, cols=4225
has attached null space
Schur complement A11 - A10 inv(A00) A01
A11
Mat Object: (fieldsplit_1_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=4225, allocated nonzeros=4225
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
A10
Mat Object: 1 MPI process
type: seqaij
rows=4225, cols=33282, rbs=1, cbs=2
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
not using I-node routines
KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
A01
Mat Object: 1 MPI process
type: seqaij
rows=33282, cols=4225, rbs=2, cbs=1
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 16639 nodes, limit used is 5
Mat Object: (fieldsplit_1_mass_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=29057, allocated nonzeros=29057
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
linear system matrix, followed by the matrix used to construct the preconditioner:
Mat Object: (fieldsplit_1_) 1 MPI process
type: schurcomplement
rows=4225, cols=4225
has attached null space
Schur complement A11 - A10 inv(A00) A01
A11
Mat Object: (fieldsplit_1_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=4225, allocated nonzeros=4225
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
A10
Mat Object: 1 MPI process
type: seqaij
rows=4225, cols=33282, rbs=1, cbs=2
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
not using I-node routines
KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
A01
Mat Object: 1 MPI process
type: seqaij
rows=33282, cols=4225, rbs=2, cbs=1
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 16639 nodes, limit used is 5
Mat Object: (fieldsplit_1_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=4225, allocated nonzeros=4225
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
linear system matrix, which is also used to construct the preconditioner:
Mat Object: 1 MPI process
type: nest
rows=37507, cols=37507
has attached null space
MatNest, rows=2, cols=2, structure:
(0,0) : prefix="fieldsplit_0_", type=seqbaij, rows=33282, cols=33282
(0,1) : type=seqaij, rows=33282, cols=4225
(1,0) : type=seqaij, rows=4225, cols=33282
(1,1) : prefix="fieldsplit_1_", type=seqaij, rows=4225, cols=4225
SNES iterations: 1; SNES converged reason: CONVERGED_ITS
KSP iterations: 16; KSP converged reason: CONVERGED_RTOL
This performs identically to the previous approach, except that the preconditioning matrix is only built for the pressure space, and constructed “on demand”.
Multigrid preconditioners and smoothers¶
So far, we’ve only used direct solvers for the blocks. We can also use iterative methods. Here we’ll use geometric multigrid to solve
In the same way that Firedrake hooks up solvers such that PCFIELDSPLIT is enabled, if a problem was defined on a mesh from a MeshHierarchy, PCMG and SNESFAS are also available.
[22]:
fieldsplit_mg_parameters = {
"mat_type": "nest",
"ksp_view": None,
"pc_type": "fieldsplit",
"pc_fieldsplit_type": "schur",
"fieldsplit_0": {
"ksp_type": "preonly",
"pc_type": "mg",
"mg_levels": {
"ksp_type": "chebyshev",
"ksp_max_it": 2,
}
},
"fieldsplit_1": {
"ksp_type": "chebyshev",
"ksp_max_it": 2,
"pc_type": "python",
"pc_python_type": f"{__name__}.MassMatrix",
"mass_pc_type": "sor",
}
}
Now, when the solver runs, PETSc will call back in to Firedrake for restriction and prolongation, as well as rediscretising \(A\) on the coarser levels.
[23]:
appctx = {"nu": nu} # arbitrary user data that is available inside the user PC object
w.zero()
solver = create_solver(fieldsplit_mg_parameters, appctx=appctx)
solver.solve()
convergence(solver)
KSP Object: 1 MPI process
type: gmres
restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
happy breakdown tolerance=1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-07, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
type: fieldsplit
FieldSplit with Schur preconditioner, factorization FULL
Preconditioner for the Schur complement formed from A11
Split info:
Split number 0 Defined by IS
Split number 1 Defined by IS
KSP solver for A00 block
KSP Object: (fieldsplit_0_) 1 MPI process
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
not checking for convergence
PC Object: (fieldsplit_0_) 1 MPI process
type: mg
type is MULTIPLICATIVE, levels=4 cycles=v
Cycles per PCApply=1
Not using Galerkin computed coarse grid matrices
Coarse grid solver -- level 0 -------------------------------
KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI process
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
not checking for convergence
PC Object: (fieldsplit_0_mg_coarse_) 1 MPI process
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
matrix ordering: nd
factor fill ratio given 5., needed 2.493
Factored matrix:
Mat Object: (fieldsplit_0_mg_coarse_) 1 MPI process
type: seqbaij
rows=578, cols=578, bs=2
package used to perform factorization: petsc
total: nonzeros=30644, allocated nonzeros=30644
linear system matrix, which is also used to construct the preconditioner:
Mat Object: (fieldsplit_0_mg_levels_0_) 1 MPI process
type: seqbaij
rows=578, cols=578, bs=2
total: nonzeros=12292, allocated nonzeros=12292
total number of mallocs used during MatSetValues calls=0
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI process
type: chebyshev
Chebyshev polynomial of first kind
eigenvalue targets used: min 0.0993791, max 1.09317
eigenvalues estimated via gmres: min 0.026502, max 0.993791
eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI process
type: gmres
restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
happy breakdown tolerance=1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using a noisy random number generated right-hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
not checking for convergence
PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI process
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix, which is also used to construct the preconditioner:
Mat Object: (fieldsplit_0_mg_levels_1_) 1 MPI process
type: seqbaij
rows=2178, cols=2178, bs=2
total: nonzeros=48132, allocated nonzeros=48132
total number of mallocs used during MatSetValues calls=0
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
type: chebyshev
Chebyshev polynomial of first kind
eigenvalue targets used: min 0.099358, max 1.09294
eigenvalues estimated via gmres: min 0.019399, max 0.99358
eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] KSP Object: (fieldsplit_0_mg_levels_2_esteig_) 1 MPI process
type: gmres
restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
happy breakdown tolerance=1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using a noisy random number generated right-hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
not checking for convergence
PC Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix, which is also used to construct the preconditioner:
Mat Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
type: seqbaij
rows=8450, cols=8450, bs=2
total: nonzeros=190468, allocated nonzeros=190468
total number of mallocs used during MatSetValues calls=0
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (fieldsplit_0_mg_levels_3_) 1 MPI process
type: chebyshev
Chebyshev polynomial of first kind
eigenvalue targets used: min 0.0992963, max 1.09226
eigenvalues estimated via gmres: min 0.0129753, max 0.992963
eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] KSP Object: (fieldsplit_0_mg_levels_3_esteig_) 1 MPI process
type: gmres
restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
happy breakdown tolerance=1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using a noisy random number generated right-hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
not checking for convergence
PC Object: (fieldsplit_0_mg_levels_3_) 1 MPI process
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix, which is also used to construct the preconditioner:
Mat Object: (fieldsplit_0_) 1 MPI process
type: seqbaij
rows=33282, cols=33282, bs=2
total: nonzeros=757764, allocated nonzeros=757764
total number of mallocs used during MatSetValues calls=0
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix, which is also used to construct the preconditioner:
Mat Object: (fieldsplit_0_) 1 MPI process
type: seqbaij
rows=33282, cols=33282, bs=2
total: nonzeros=757764, allocated nonzeros=757764
total number of mallocs used during MatSetValues calls=0
KSP solver for S = A11 - A10 inv(A00) A01
KSP Object: (fieldsplit_1_) 1 MPI process
type: chebyshev
Chebyshev polynomial of first kind
eigenvalue targets used: min 0.09586, max 1.05446
eigenvalues estimated via gmres: min 0.111926, max 0.9586
eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] KSP Object: (fieldsplit_1_esteig_) 1 MPI process
type: gmres
restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
happy breakdown tolerance=1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using a noisy random number generated right-hand side
maximum iterations=2, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (fieldsplit_1_) 1 MPI process
type: python
Python: __main__.MassMatrix
Firedrake custom preconditioner MassMatrix
PC to apply inverse
PC Object: (fieldsplit_1_mass_) 1 MPI process
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix, followed by the matrix used to construct the preconditioner:
Mat Object: (fieldsplit_1_) 1 MPI process
type: schurcomplement
rows=4225, cols=4225
has attached null space
Schur complement A11 - A10 inv(A00) A01
A11
Mat Object: (fieldsplit_1_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=4225, allocated nonzeros=4225
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
A10
Mat Object: 1 MPI process
type: seqaij
rows=4225, cols=33282, rbs=1, cbs=2
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
not using I-node routines
KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
A01
Mat Object: 1 MPI process
type: seqaij
rows=33282, cols=4225, rbs=2, cbs=1
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 16639 nodes, limit used is 5
Mat Object: (fieldsplit_1_mass_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=29057, allocated nonzeros=29057
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
linear system matrix, followed by the matrix used to construct the preconditioner:
Mat Object: (fieldsplit_1_) 1 MPI process
type: schurcomplement
rows=4225, cols=4225
has attached null space
Schur complement A11 - A10 inv(A00) A01
A11
Mat Object: (fieldsplit_1_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=4225, allocated nonzeros=4225
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
A10
Mat Object: 1 MPI process
type: seqaij
rows=4225, cols=33282, rbs=1, cbs=2
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
not using I-node routines
KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
A01
Mat Object: 1 MPI process
type: seqaij
rows=33282, cols=4225, rbs=2, cbs=1
total: nonzeros=156930, allocated nonzeros=156930
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 16639 nodes, limit used is 5
Mat Object: (fieldsplit_1_) 1 MPI process
type: seqaij
rows=4225, cols=4225
total: nonzeros=4225, allocated nonzeros=4225
total number of mallocs used during MatSetValues calls=0
has attached null space
not using I-node routines
linear system matrix, which is also used to construct the preconditioner:
Mat Object: 1 MPI process
type: nest
rows=37507, cols=37507
has attached null space
MatNest, rows=2, cols=2, structure:
(0,0) : prefix="fieldsplit_0_", type=seqbaij, rows=33282, cols=33282
(0,1) : type=seqaij, rows=33282, cols=4225
(1,0) : type=seqaij, rows=4225, cols=33282
(1,1) : prefix="fieldsplit_1_", type=seqaij, rows=4225, cols=4225
SNES iterations: 1; SNES converged reason: CONVERGED_ITS
KSP iterations: 10; KSP converged reason: CONVERGED_RTOL
We can also do monolithic, or “all at once” multigrid. Here we’re using Vanka smoothing. This is supported by a new preconditioner in PETSc PCPATCH.
[24]:
vanka_parameters = {
"mat_type": "matfree", # We only need the action
"ksp_type": "fgmres",
"ksp_max_it": 25,
"pc_type": "mg",
"mg_levels": {
"ksp_type": "chebyshev",
"ksp_convergence_test": "skip",
"ksp_max_it": 2,
"pc_type": "python",
"pc_python_type": "firedrake.PatchPC",
"patch": {
"pc_patch_save_operators": 1,
"pc_patch_partition_of_unity": False,
"pc_patch_construct_dim": 0,
# Topological decomposition
"pc_patch_construct_type": "vanka",
# Pressure space is constraint space
"pc_patch_exclude_subspaces": 1,
# Configure the solver on each patch
"pc_patch_sub": {
"mat_type": "dense",
"ksp_type": "preonly",
"pc_type": "lu",
"pc_factor_shift_type": "nonzero",
}
}
},
"mg_coarse": {
"mat_type": "aij",
"ksp_type": "preonly",
"pc_type": "lu",
"pc_factor_mat_solver_type": "mumps",
}
}
The solver can be invoked as below, but frequently crashes Jupyter notebooks:
[25]:
#w.zero()
#solver = create_solver(vanka_parameters)
#solver.solve()
#convergence(solver)
[ ]:
