Note

You can run this notebook on Google Colab.

Programming your solver

In this notebook, we will look at some of the more advanced capabilities Firedrake has for configuring and developing preconditioners. In particular, we will show support for geometric multigrid, as well as user-defined preconditioners.

As our prototypical example, we will consider the Stokes equations. Find \((u, p) \in V \times Q \subset (H^1)^d \times L^2\) such that

\[\begin{split}\begin{align} \nu\int_\Omega \nabla u : \nabla v\,\mathrm{d}x - \int_\Omega p \nabla \cdot v\,\mathrm{d}x &= \int_\Omega f \cdot v\,\mathrm{d}x, \\ -\int_\Omega \nabla \cdot u q \,\mathrm{d}x&= 0. \end{align}\end{split}\]

for all \((v, q) \in V \times Q\). Where \(\nu\) is the viscosity.

We will use the inf-sup stable Taylor-Hood element pair of piecewise quadratic velocities and piecewise linear pressures.

[1]:
import matplotlib.pyplot as plt
[2]:
from firedrake import *
mesh = UnitSquareMesh(8, 8)

We now build a hierarchy of regularly refined meshes with this as the coarsest mesh, and grab the finest one to define the problem.

[3]:
meshes = MeshHierarchy(mesh, refinement_levels=3)
# Grab the finest mesh
mesh = meshes[-1]

V = VectorFunctionSpace(mesh, "Lagrange", 2)
Q = FunctionSpace(mesh, "Lagrange", 1)
W = V*Q

We set up the problem in residual form (using TestFunctions but no TrialFunctions).

[4]:
v, q = TestFunctions(W)
w = Function(W)
u, p = split(w)

nu = Constant(0.0001)
F = nu*inner(grad(u), grad(v))*dx - p*div(v)*dx - div(u)*q*dx

We now need to augment the problem with a forcing term and boundary conditions. We will solve a regularised lid-driven cavity problem, and thus choose \(f = 0\) and the boundary conditions:

\[\begin{split}\begin{align} u &= \begin{pmatrix}\frac{x^2 (2 - x)^2 y^2}{4} \\ 0 \end{pmatrix} & \text{ on $\Gamma_1 = \{y = 1\}$},\\ u &= 0 & \text{ otherwise.}\\ \end{align}\end{split}\]
[5]:
x, y = SpatialCoordinate(mesh)
bc_value = as_vector([0.25 * x**2 * (2-x)**2 *y**2, 0])

bcs = [DirichletBC(W.sub(0), bc_value, 4),
       DirichletBC(W.sub(0), 0, (1, 2, 3))]

This problem has a null space of constant pressures, so we’ll need to inform the solver about that too.

[6]:
nullspace = MixedVectorSpaceBasis(W, [W.sub(0), VectorSpaceBasis(constant=True, comm=mesh.comm)])

Since we’re going to look at a bunch of different solver options, let’s have a function that builds a solver with the provided options.

[7]:
def create_solver(solver_parameters, *, pmat=None, appctx=None):
    p = {}
    if solver_parameters is not None:
        p.update(solver_parameters)
    # Default to linear SNES
    p.setdefault("snes_type", "ksponly")
    p.setdefault("ksp_rtol", 1e-7)
    problem = NonlinearVariationalProblem(F, w, bcs=bcs, Jp=pmat)
    solver = NonlinearVariationalSolver(problem, nullspace=nullspace, options_prefix="",
                                        solver_parameters=p, appctx=appctx)
    return solver

First, let’s go ahead and solve the problem using a direct solver. The solver is configured with a dictionary of PETSc options. Here we select MUMPS to perform the sparse LU factorisation. (Note that these are actually the default solver parameters that Firedrake assumes.)

[8]:
solver_parameters = {
    "mat_type": "aij",
    "ksp_type": "preonly",
    # Use MUMPS since it handles the null space
    "pc_type": "lu",
    "pc_factor_mat_solver_type": "mumps"
}
[9]:
# Programmatically inspect convergence of solver
def convergence(solver):
    from firedrake.solving_utils import KSPReasons, SNESReasons
    snes = solver.snes
    print("""
SNES iterations: {snes}; SNES converged reason: {snesreason}
   KSP iterations: {ksp}; KSP converged reason: {kspreason}""".format(snes=snes.getIterationNumber(),
                                                                      snesreason=SNESReasons[snes.getConvergedReason()],
                                                                      ksp=snes.ksp.getIterationNumber(),
                                                                      kspreason=KSPReasons[snes.ksp.getConvergedReason()]))

We’re ready to solve.

[10]:
w.zero()
solver = create_solver(solver_parameters)
solver.solve()
convergence(solver)

SNES iterations: 1; SNES converged reason: CONVERGED_ITS
   KSP iterations: 1; KSP converged reason: CONVERGED_ITS

We can now have a look at the solution, using some simple builtin plotting that utilises matplotlib.

[11]:
from firedrake.pyplot import streamplot

u_h, p_h = w.subfunctions
fig, axes = plt.subplots()
streamlines = streamplot(u_h, resolution=1/30, seed=0, axes=axes)
fig.colorbar(streamlines);
../_images/notebooks_08-composable-solvers_19_0.svg

Configuring a better preconditioner

For this small problem, we can (and probably should) use a direct factorisation method. But what if the problem is too big? Then we need an iterative method, and an appropriate preconditioner.

Let’s try everyone’s favourite, ILU(0).

[12]:
solver_parameters = {
    "mat_type": "aij",
    "ksp_type": "gmres",
    "ksp_gmres_modifiedgramschmidt": None,
    "ksp_max_it": 2000,
    "ksp_converged_reason": None,
    "pc_type": "ilu"
}
[13]:
w.zero()
solver = create_solver(solver_parameters)
solver.solve()
convergence(solver)
    Linear solve converged due to CONVERGED_RTOL iterations 862

SNES iterations: 1; SNES converged reason: CONVERGED_ITS
   KSP iterations: 862; KSP converged reason: CONVERGED_RTOL

This is, unsurprisingly, bad. Fortunately, better options are available.

Block preconditioning

Firedrake hooks up all the necessary machinery to access PETSc’s PCFIELDSPLIT preconditioner. This provides mechanisms for building preconditioners based on block factorisations. The Stokes problem

\[\begin{split}\begin{align} \nu\int_\Omega \color{#800020}{\nabla u : \nabla v}\,\mathrm{d}x - \int_\Omega \color{#2A52BE}{p \nabla \cdot v}\,\mathrm{d}x &= \int_\Omega f \cdot v\,\mathrm{d}x, \\ -\int_\Omega \color{#2A52BE}{\nabla \cdot u q} \,\mathrm{d}x&= 0 \end{align}\end{split}\]

is a block system with matrix

\[\begin{split}\mathcal{A} = \begin{bmatrix} \color{#800020}{A} & \color{#2A52BE}{B^T} \\ \color{#2A52BE}{B} & 0 \end{bmatrix},\end{split}\]

admitting a factorisation

\[\begin{split}\begin{bmatrix} I & 0 \\ \color{#2A52BE}{B} \color{#800020}{A}^{-1} & I\end{bmatrix} \begin{bmatrix}\color{#800020}{A} & 0 \\ 0 & S\end{bmatrix} \begin{bmatrix} I & \color{#800020}{A}^{-1} \color{#2A52BE}{B^T} \\ 0 & I\end{bmatrix},\end{split}\]

with \(S = -\color{#2A52BE}{B} \color{#800020}{A}^{-1} \color{#2A52BE}{B^T}\) the Schur complement. This has an inverse

\[\begin{split}\begin{bmatrix} I & -\color{#800020}{A}^{-1}\color{#2A52BE}{B^T} \\ 0 & I \end{bmatrix} \begin{bmatrix} \color{#800020}{A}^{-1} & 0 \\ 0 & S^{-1}\end{bmatrix} \begin{bmatrix} I & 0 \\ -\color{#2A52BE}{B}\color{#800020}{A}^{-1} & I\end{bmatrix}.\end{split}\]

\(S\) is never formed, so it’s inverse is approximated using an iterative method.

[14]:
exact_inverse_parameters = {
    "ksp_type": "fgmres",
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "fieldsplit_0": {
        "ksp_type": "preonly",
        "pc_type": "lu",
    },
    "fieldsplit_1": {
        "ksp_type": "cg",
        "ksp_rtol": 1e-8,
        "pc_type": "none",
    }
}
[15]:
w.zero()
solver = create_solver(exact_inverse_parameters)
solver.solve()
convergence(solver)

SNES iterations: 1; SNES converged reason: CONVERGED_ITS
   KSP iterations: 1; KSP converged reason: CONVERGED_RTOL

This looks good, but we had to use an unpreconditioned Krylov method to invert \(S\). To do better we need to provide either an approximation to \(S\) or \(S^{-1}\).

For the Stokes equations, Silvester and Wathen (1993) show that \(S \approx -\nu^{-1} Q\) is a good approximation, where \(Q\) is the pressure mass matrix.

Problem: \(Q\) is not available as one of the blocks of \(\mathcal{A}\).

PETSc’s approach is to allow us to supply a separate matrix to the solver which will be used to construct the preconditioner. So, we just need to additionally supply

\[\begin{split}\mathcal{P} = \mathcal{A} + \begin{bmatrix} 0 & 0 \\ 0 & -\nu^{-1}Q\end{bmatrix} = \begin{bmatrix} \color{#800020}{A} & \color{#2A52BE}{B^T} \\ \color{#2A52BE}{B} & -\nu^{-1} Q \end{bmatrix},\end{split}\]

where \(Q = \int_\Omega p q \,\mathrm{d}x\).

We will construct P by symbolically computing the derivative of the residual to get \(\mathcal{A}\) and then subtracting \(\nu^{-1} Q\).

[16]:
trial = TrialFunction(W)
_, p_t = split(trial)

amat = lhs(derivative(F, w, trial))
pmat = amat - 1/nu * p_t * q*dx

We can now pass this pmat form to create_solver and can configure an appropriate preconditioner.

[17]:
pmat_parameters = {
    "mat_type": "nest", # We only need the blocks
    "snes_type": "ksponly",
    "ksp_view": None,
    "ksp_monitor_true_residual": None,
    "ksp_max_it": 100,
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "fieldsplit_0": {
        "ksp_type": "preonly",
        "pc_type": "lu",
    },
    "fieldsplit_1": {
        "ksp_type": "preonly",
        "pc_type": "lu",
    }
}
[18]:
w.zero()
solver = create_solver(pmat_parameters, pmat=pmat)
solver.solve()
convergence(solver)
    0 KSP preconditioned resid norm 5.496162170027e+00 true resid norm 7.005934058591e-04 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 9.288239850613e-01 true resid norm 2.453830661658e-03 ||r(i)||/||b|| 3.502503222464e+00
    2 KSP preconditioned resid norm 4.322571804189e-01 true resid norm 1.513282272808e-03 ||r(i)||/||b|| 2.160000736736e+00
    3 KSP preconditioned resid norm 9.747752360344e-02 true resid norm 4.889677087768e-04 ||r(i)||/||b|| 6.979336440902e-01
    4 KSP preconditioned resid norm 2.168769655062e-02 true resid norm 2.367419610194e-04 ||r(i)||/||b|| 3.379163421172e-01
    5 KSP preconditioned resid norm 6.602391994159e-03 true resid norm 1.323183601900e-04 ||r(i)||/||b|| 1.888661227516e-01
    6 KSP preconditioned resid norm 3.201127617308e-03 true resid norm 5.559951301980e-05 ||r(i)||/||b|| 7.936059996400e-02
    7 KSP preconditioned resid norm 1.583402976548e-03 true resid norm 1.881712632922e-05 ||r(i)||/||b|| 2.685884019440e-02
    8 KSP preconditioned resid norm 5.459453638348e-04 true resid norm 3.548481385286e-06 ||r(i)||/||b|| 5.064965435887e-03
    9 KSP preconditioned resid norm 2.435745735484e-04 true resid norm 1.708889316423e-06 ||r(i)||/||b|| 2.439202684655e-03
   10 KSP preconditioned resid norm 8.264338652367e-05 true resid norm 4.816892092487e-07 ||r(i)||/||b|| 6.875445946541e-04
   11 KSP preconditioned resid norm 2.869421008933e-05 true resid norm 2.297141941031e-07 ||r(i)||/||b|| 3.278851787385e-04
   12 KSP preconditioned resid norm 1.126260481221e-05 true resid norm 1.157469224325e-07 ||r(i)||/||b|| 1.652126917903e-04
   13 KSP preconditioned resid norm 3.183234319281e-06 true resid norm 2.964758722361e-08 ||r(i)||/||b|| 4.231782225705e-05
   14 KSP preconditioned resid norm 1.315553340592e-06 true resid norm 1.114853946331e-08 ||r(i)||/||b|| 1.591299514109e-05
   15 KSP preconditioned resid norm 6.727350475971e-07 true resid norm 4.697218910484e-09 ||r(i)||/||b|| 6.704629063308e-06
   16 KSP preconditioned resid norm 2.943828887797e-07 true resid norm 2.160101611332e-09 ||r(i)||/||b|| 3.083245707520e-06
KSP Object: 1 MPI process
  type: gmres
    restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
    happy breakdown tolerance=1e-30
  maximum iterations=100, initial guess is zero
  tolerances: relative=1e-07, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
  type: fieldsplit
    FieldSplit with Schur preconditioner, factorization FULL
    Preconditioner for the Schur complement formed from A11
    Split info:
    Split number 0 Defined by IS
    Split number 1 Defined by IS
    KSP solver for A00 block
      KSP Object: (fieldsplit_0_) 1 MPI process
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        not checking for convergence
      PC Object: (fieldsplit_0_) 1 MPI process
        type: lu
          out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          matrix ordering: nd
          factor fill ratio given 5., needed 5.95328
            Factored matrix:
              Mat Object: (fieldsplit_0_) 1 MPI process
                type: seqbaij
                rows=33282, cols=33282, bs=2
                package used to perform factorization: petsc
                total: nonzeros=4511180, allocated nonzeros=4511180
        linear system matrix, which is also used to construct the preconditioner:
        Mat Object: (fieldsplit_0_) 1 MPI process
          type: seqbaij
          rows=33282, cols=33282, bs=2
          total: nonzeros=757764, allocated nonzeros=757764
          total number of mallocs used during MatSetValues calls=0
    KSP solver for S = A11 - A10 inv(A00) A01
      KSP Object: (fieldsplit_1_) 1 MPI process
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        not checking for convergence
      PC Object: (fieldsplit_1_) 1 MPI process
        type: lu
          out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          matrix ordering: nd
          factor fill ratio given 5., needed 7.03531
            Factored matrix:
              Mat Object: (fieldsplit_1_) 1 MPI process
                type: seqaij
                rows=4225, cols=4225
                package used to perform factorization: petsc
                total: nonzeros=204425, allocated nonzeros=204425
                  not using I-node routines
        linear system matrix, followed by the matrix used to construct the preconditioner:
        Mat Object: (fieldsplit_1_) 1 MPI process
          type: schurcomplement
          rows=4225, cols=4225
            has attached null space
            Schur complement A11 - A10 inv(A00) A01
            A11
              Mat Object: (fieldsplit_1_) 1 MPI process
                type: seqaij
                rows=4225, cols=4225
                total: nonzeros=29057, allocated nonzeros=29057
                total number of mallocs used during MatSetValues calls=0
                  has attached null space
                  not using I-node routines
            A10
              Mat Object: 1 MPI process
                type: seqaij
                rows=4225, cols=33282, rbs=1, cbs=2
                total: nonzeros=156930, allocated nonzeros=156930
                total number of mallocs used during MatSetValues calls=0
                  not using I-node routines
            KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
            A01
              Mat Object: 1 MPI process
                type: seqaij
                rows=33282, cols=4225, rbs=2, cbs=1
                total: nonzeros=156930, allocated nonzeros=156930
                total number of mallocs used during MatSetValues calls=0
                  using I-node routines: found 16639 nodes, limit used is 5
        Mat Object: (fieldsplit_1_) 1 MPI process
          type: seqaij
          rows=4225, cols=4225
          total: nonzeros=29057, allocated nonzeros=29057
          total number of mallocs used during MatSetValues calls=0
            has attached null space
            not using I-node routines
  linear system matrix, followed by the matrix used to construct the preconditioner:
  Mat Object: 1 MPI process
    type: nest
    rows=37507, cols=37507
      has attached null space
        MatNest, rows=2, cols=2, structure:
        (0,0) : type=seqbaij, rows=33282, cols=33282
        (0,1) : type=seqaij, rows=33282, cols=4225
        (1,0) : type=seqaij, rows=4225, cols=33282
        (1,1) : type=seqaij, rows=4225, cols=4225
  Mat Object: 1 MPI process
    type: nest
    rows=37507, cols=37507
      has attached null space
        MatNest, rows=2, cols=2, structure:
        (0,0) : prefix="fieldsplit_0_", type=seqbaij, rows=33282, cols=33282
        (0,1) : type=seqaij, rows=33282, cols=4225
        (1,0) : type=seqaij, rows=4225, cols=33282
        (1,1) : prefix="fieldsplit_1_", type=seqaij, rows=4225, cols=4225

SNES iterations: 1; SNES converged reason: CONVERGED_ITS
   KSP iterations: 16; KSP converged reason: CONVERGED_RTOL

Providing auxiliary operators

An inconvenience here is that we must build \(\mathcal{P}\), even though we only need \(-\nu^{-1} Q\) in additional to \(\mathcal{A}\) in the preconditioner.

Firedrake offers a facilities to build Python preconditioning objects, utilising petsc4py.

In this case, we can subclass the AuxiliaryOperatorPC to provide the mass matrix.

[19]:
class MassMatrix(AuxiliaryOperatorPC):
    _prefix = "mass_"
    def form(self, pc, test, trial):
        # Extract the original form and bcs
        a, bcs = super().form(pc, test, trial)
        # Grab the definition of nu from the user application context (a dict)
        nu = self.get_appctx(pc)["nu"]
        return (-1/nu * test*trial*dx, bcs)

Now we just need to select parameters such that this Python preconditioner is used.

[20]:
mass_parameters = {
    "mat_type": "nest", # We only need the blocks
    "ksp_view": None,
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "fieldsplit_0": {
        "ksp_type": "preonly",
        "pc_type": "lu",
    },
    "fieldsplit_1": {
        "ksp_type": "preonly",
        "pc_type": "python",
        "pc_python_type": "__main__.MassMatrix",
        "mass_pc_type": "lu",
    }
}
[21]:
appctx = {"nu": nu} # arbitrary user data that is available inside the user PC object
w.zero()
solver = create_solver(mass_parameters, appctx=appctx)
solver.solve()
convergence(solver)
KSP Object: 1 MPI process
  type: gmres
    restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
    happy breakdown tolerance=1e-30
  maximum iterations=10000, initial guess is zero
  tolerances: relative=1e-07, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
  type: fieldsplit
    FieldSplit with Schur preconditioner, factorization FULL
    Preconditioner for the Schur complement formed from A11
    Split info:
    Split number 0 Defined by IS
    Split number 1 Defined by IS
    KSP solver for A00 block
      KSP Object: (fieldsplit_0_) 1 MPI process
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        not checking for convergence
      PC Object: (fieldsplit_0_) 1 MPI process
        type: lu
          out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          matrix ordering: nd
          factor fill ratio given 5., needed 5.95328
            Factored matrix:
              Mat Object: (fieldsplit_0_) 1 MPI process
                type: seqbaij
                rows=33282, cols=33282, bs=2
                package used to perform factorization: petsc
                total: nonzeros=4511180, allocated nonzeros=4511180
        linear system matrix, which is also used to construct the preconditioner:
        Mat Object: (fieldsplit_0_) 1 MPI process
          type: seqbaij
          rows=33282, cols=33282, bs=2
          total: nonzeros=757764, allocated nonzeros=757764
          total number of mallocs used during MatSetValues calls=0
    KSP solver for S = A11 - A10 inv(A00) A01
      KSP Object: (fieldsplit_1_) 1 MPI process
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        not checking for convergence
      PC Object: (fieldsplit_1_) 1 MPI process
        type: python
          Python: __main__.MassMatrix
        Firedrake custom preconditioner MassMatrix
        PC to apply inverse
        PC Object: (fieldsplit_1_mass_) 1 MPI process
          type: lu
            out-of-place factorization
            tolerance for zero pivot 2.22045e-14
            matrix ordering: nd
            factor fill ratio given 5., needed 7.03531
              Factored matrix:
                Mat Object: (fieldsplit_1_mass_) 1 MPI process
                  type: seqaij
                  rows=4225, cols=4225
                  package used to perform factorization: petsc
                  total: nonzeros=204425, allocated nonzeros=204425
                    not using I-node routines
          linear system matrix, followed by the matrix used to construct the preconditioner:
          Mat Object: (fieldsplit_1_) 1 MPI process
            type: schurcomplement
            rows=4225, cols=4225
              has attached null space
              Schur complement A11 - A10 inv(A00) A01
              A11
                Mat Object: (fieldsplit_1_) 1 MPI process
                  type: seqaij
                  rows=4225, cols=4225
                  total: nonzeros=4225, allocated nonzeros=4225
                  total number of mallocs used during MatSetValues calls=0
                    has attached null space
                    not using I-node routines
              A10
                Mat Object: 1 MPI process
                  type: seqaij
                  rows=4225, cols=33282, rbs=1, cbs=2
                  total: nonzeros=156930, allocated nonzeros=156930
                  total number of mallocs used during MatSetValues calls=0
                    not using I-node routines
              KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
              A01
                Mat Object: 1 MPI process
                  type: seqaij
                  rows=33282, cols=4225, rbs=2, cbs=1
                  total: nonzeros=156930, allocated nonzeros=156930
                  total number of mallocs used during MatSetValues calls=0
                    using I-node routines: found 16639 nodes, limit used is 5
          Mat Object: (fieldsplit_1_mass_) 1 MPI process
            type: seqaij
            rows=4225, cols=4225
            total: nonzeros=29057, allocated nonzeros=29057
            total number of mallocs used during MatSetValues calls=0
              has attached null space
              not using I-node routines
        linear system matrix, followed by the matrix used to construct the preconditioner:
        Mat Object: (fieldsplit_1_) 1 MPI process
          type: schurcomplement
          rows=4225, cols=4225
            has attached null space
            Schur complement A11 - A10 inv(A00) A01
            A11
              Mat Object: (fieldsplit_1_) 1 MPI process
                type: seqaij
                rows=4225, cols=4225
                total: nonzeros=4225, allocated nonzeros=4225
                total number of mallocs used during MatSetValues calls=0
                  has attached null space
                  not using I-node routines
            A10
              Mat Object: 1 MPI process
                type: seqaij
                rows=4225, cols=33282, rbs=1, cbs=2
                total: nonzeros=156930, allocated nonzeros=156930
                total number of mallocs used during MatSetValues calls=0
                  not using I-node routines
            KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
            A01
              Mat Object: 1 MPI process
                type: seqaij
                rows=33282, cols=4225, rbs=2, cbs=1
                total: nonzeros=156930, allocated nonzeros=156930
                total number of mallocs used during MatSetValues calls=0
                  using I-node routines: found 16639 nodes, limit used is 5
        Mat Object: (fieldsplit_1_) 1 MPI process
          type: seqaij
          rows=4225, cols=4225
          total: nonzeros=4225, allocated nonzeros=4225
          total number of mallocs used during MatSetValues calls=0
            has attached null space
            not using I-node routines
  linear system matrix, which is also used to construct the preconditioner:
  Mat Object: 1 MPI process
    type: nest
    rows=37507, cols=37507
      has attached null space
        MatNest, rows=2, cols=2, structure:
        (0,0) : prefix="fieldsplit_0_", type=seqbaij, rows=33282, cols=33282
        (0,1) : type=seqaij, rows=33282, cols=4225
        (1,0) : type=seqaij, rows=4225, cols=33282
        (1,1) : prefix="fieldsplit_1_", type=seqaij, rows=4225, cols=4225

SNES iterations: 1; SNES converged reason: CONVERGED_ITS
   KSP iterations: 16; KSP converged reason: CONVERGED_RTOL

This performs identically to the previous approach, except that the preconditioning matrix is only built for the pressure space, and constructed “on demand”.

Multigrid preconditioners and smoothers

So far, we’ve only used direct solvers for the blocks. We can also use iterative methods. Here we’ll use geometric multigrid to solve

In the same way that Firedrake hooks up solvers such that PCFIELDSPLIT is enabled, if a problem was defined on a mesh from a MeshHierarchy, PCMG and SNESFAS are also available.

[22]:
fieldsplit_mg_parameters = {
    "mat_type": "nest",
    "ksp_view": None,
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "fieldsplit_0": {
        "ksp_type": "preonly",
        "pc_type": "mg",
        "mg_levels": {
            "ksp_type": "chebyshev",
            "ksp_max_it": 2,
        }
    },
    "fieldsplit_1": {
        "ksp_type": "chebyshev",
        "ksp_max_it": 2,
        "pc_type": "python",
        "pc_python_type": f"{__name__}.MassMatrix",
        "mass_pc_type": "sor",
    }
}

Now, when the solver runs, PETSc will call back in to Firedrake for restriction and prolongation, as well as rediscretising \(A\) on the coarser levels.

[23]:
appctx = {"nu": nu} # arbitrary user data that is available inside the user PC object
w.zero()
solver = create_solver(fieldsplit_mg_parameters, appctx=appctx)
solver.solve()
convergence(solver)
KSP Object: 1 MPI process
  type: gmres
    restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
    happy breakdown tolerance=1e-30
  maximum iterations=10000, initial guess is zero
  tolerances: relative=1e-07, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
  type: fieldsplit
    FieldSplit with Schur preconditioner, factorization FULL
    Preconditioner for the Schur complement formed from A11
    Split info:
    Split number 0 Defined by IS
    Split number 1 Defined by IS
    KSP solver for A00 block
      KSP Object: (fieldsplit_0_) 1 MPI process
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        not checking for convergence
      PC Object: (fieldsplit_0_) 1 MPI process
        type: mg
          type is MULTIPLICATIVE, levels=4 cycles=v
            Cycles per PCApply=1
            Not using Galerkin computed coarse grid matrices
        Coarse grid solver -- level 0 -------------------------------
          KSP Object: (fieldsplit_0_mg_coarse_) 1 MPI process
            type: preonly
            maximum iterations=10000, initial guess is zero
            tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
            left preconditioning
            not checking for convergence
          PC Object: (fieldsplit_0_mg_coarse_) 1 MPI process
            type: lu
              out-of-place factorization
              tolerance for zero pivot 2.22045e-14
              using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
              matrix ordering: nd
              factor fill ratio given 5., needed 2.493
                Factored matrix:
                  Mat Object: (fieldsplit_0_mg_coarse_) 1 MPI process
                    type: seqbaij
                    rows=578, cols=578, bs=2
                    package used to perform factorization: petsc
                    total: nonzeros=30644, allocated nonzeros=30644
            linear system matrix, which is also used to construct the preconditioner:
            Mat Object: (fieldsplit_0_mg_levels_0_) 1 MPI process
              type: seqbaij
              rows=578, cols=578, bs=2
              total: nonzeros=12292, allocated nonzeros=12292
              total number of mallocs used during MatSetValues calls=0
        Down solver (pre-smoother) on level 1 -------------------------------
          KSP Object: (fieldsplit_0_mg_levels_1_) 1 MPI process
            type: chebyshev
              Chebyshev polynomial of first kind
              eigenvalue targets used: min 0.0993791, max 1.09317
              eigenvalues estimated via gmres: min 0.026502, max 0.993791
              eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1]              KSP Object: (fieldsplit_0_mg_levels_1_esteig_) 1 MPI process
                type: gmres
                  restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
                  happy breakdown tolerance=1e-30
                maximum iterations=10, initial guess is zero
                tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
                left preconditioning
                using PRECONDITIONED norm type for convergence test
              estimating eigenvalues using a noisy random number generated right-hand side
            maximum iterations=2, nonzero initial guess
            tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
            left preconditioning
            not checking for convergence
          PC Object: (fieldsplit_0_mg_levels_1_) 1 MPI process
            type: sor
              type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
            linear system matrix, which is also used to construct the preconditioner:
            Mat Object: (fieldsplit_0_mg_levels_1_) 1 MPI process
              type: seqbaij
              rows=2178, cols=2178, bs=2
              total: nonzeros=48132, allocated nonzeros=48132
              total number of mallocs used during MatSetValues calls=0
        Up solver (post-smoother) same as down solver (pre-smoother)
        Down solver (pre-smoother) on level 2 -------------------------------
          KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
            type: chebyshev
              Chebyshev polynomial of first kind
              eigenvalue targets used: min 0.099358, max 1.09294
              eigenvalues estimated via gmres: min 0.019399, max 0.99358
              eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1]              KSP Object: (fieldsplit_0_mg_levels_2_esteig_) 1 MPI process
                type: gmres
                  restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
                  happy breakdown tolerance=1e-30
                maximum iterations=10, initial guess is zero
                tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
                left preconditioning
                using PRECONDITIONED norm type for convergence test
              estimating eigenvalues using a noisy random number generated right-hand side
            maximum iterations=2, nonzero initial guess
            tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
            left preconditioning
            not checking for convergence
          PC Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
            type: sor
              type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
            linear system matrix, which is also used to construct the preconditioner:
            Mat Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
              type: seqbaij
              rows=8450, cols=8450, bs=2
              total: nonzeros=190468, allocated nonzeros=190468
              total number of mallocs used during MatSetValues calls=0
        Up solver (post-smoother) same as down solver (pre-smoother)
        Down solver (pre-smoother) on level 3 -------------------------------
          KSP Object: (fieldsplit_0_mg_levels_3_) 1 MPI process
            type: chebyshev
              Chebyshev polynomial of first kind
              eigenvalue targets used: min 0.0992963, max 1.09226
              eigenvalues estimated via gmres: min 0.0129753, max 0.992963
              eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1]              KSP Object: (fieldsplit_0_mg_levels_3_esteig_) 1 MPI process
                type: gmres
                  restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
                  happy breakdown tolerance=1e-30
                maximum iterations=10, initial guess is zero
                tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
                left preconditioning
                using PRECONDITIONED norm type for convergence test
              estimating eigenvalues using a noisy random number generated right-hand side
            maximum iterations=2, nonzero initial guess
            tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
            left preconditioning
            not checking for convergence
          PC Object: (fieldsplit_0_mg_levels_3_) 1 MPI process
            type: sor
              type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
            linear system matrix, which is also used to construct the preconditioner:
            Mat Object: (fieldsplit_0_) 1 MPI process
              type: seqbaij
              rows=33282, cols=33282, bs=2
              total: nonzeros=757764, allocated nonzeros=757764
              total number of mallocs used during MatSetValues calls=0
        Up solver (post-smoother) same as down solver (pre-smoother)
        linear system matrix, which is also used to construct the preconditioner:
        Mat Object: (fieldsplit_0_) 1 MPI process
          type: seqbaij
          rows=33282, cols=33282, bs=2
          total: nonzeros=757764, allocated nonzeros=757764
          total number of mallocs used during MatSetValues calls=0
    KSP solver for S = A11 - A10 inv(A00) A01
      KSP Object: (fieldsplit_1_) 1 MPI process
        type: chebyshev
          Chebyshev polynomial of first kind
          eigenvalue targets used: min 0.09586, max 1.05446
          eigenvalues estimated via gmres: min 0.111926, max 0.9586
          eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1]          KSP Object: (fieldsplit_1_esteig_) 1 MPI process
            type: gmres
              restart=30, using classical (unmodified) Gram-Schmidt orthogonalization with no iterative refinement
              happy breakdown tolerance=1e-30
            maximum iterations=10, initial guess is zero
            tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
            left preconditioning
            using PRECONDITIONED norm type for convergence test
          estimating eigenvalues using a noisy random number generated right-hand side
        maximum iterations=2, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using PRECONDITIONED norm type for convergence test
      PC Object: (fieldsplit_1_) 1 MPI process
        type: python
          Python: __main__.MassMatrix
        Firedrake custom preconditioner MassMatrix
        PC to apply inverse
        PC Object: (fieldsplit_1_mass_) 1 MPI process
          type: sor
            type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
          linear system matrix, followed by the matrix used to construct the preconditioner:
          Mat Object: (fieldsplit_1_) 1 MPI process
            type: schurcomplement
            rows=4225, cols=4225
              has attached null space
              Schur complement A11 - A10 inv(A00) A01
              A11
                Mat Object: (fieldsplit_1_) 1 MPI process
                  type: seqaij
                  rows=4225, cols=4225
                  total: nonzeros=4225, allocated nonzeros=4225
                  total number of mallocs used during MatSetValues calls=0
                    has attached null space
                    not using I-node routines
              A10
                Mat Object: 1 MPI process
                  type: seqaij
                  rows=4225, cols=33282, rbs=1, cbs=2
                  total: nonzeros=156930, allocated nonzeros=156930
                  total number of mallocs used during MatSetValues calls=0
                    not using I-node routines
              KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
              A01
                Mat Object: 1 MPI process
                  type: seqaij
                  rows=33282, cols=4225, rbs=2, cbs=1
                  total: nonzeros=156930, allocated nonzeros=156930
                  total number of mallocs used during MatSetValues calls=0
                    using I-node routines: found 16639 nodes, limit used is 5
          Mat Object: (fieldsplit_1_mass_) 1 MPI process
            type: seqaij
            rows=4225, cols=4225
            total: nonzeros=29057, allocated nonzeros=29057
            total number of mallocs used during MatSetValues calls=0
              has attached null space
              not using I-node routines
        linear system matrix, followed by the matrix used to construct the preconditioner:
        Mat Object: (fieldsplit_1_) 1 MPI process
          type: schurcomplement
          rows=4225, cols=4225
            has attached null space
            Schur complement A11 - A10 inv(A00) A01
            A11
              Mat Object: (fieldsplit_1_) 1 MPI process
                type: seqaij
                rows=4225, cols=4225
                total: nonzeros=4225, allocated nonzeros=4225
                total number of mallocs used during MatSetValues calls=0
                  has attached null space
                  not using I-node routines
            A10
              Mat Object: 1 MPI process
                type: seqaij
                rows=4225, cols=33282, rbs=1, cbs=2
                total: nonzeros=156930, allocated nonzeros=156930
                total number of mallocs used during MatSetValues calls=0
                  not using I-node routines
            KSP solver for A00 block viewable with the additional option -fieldsplit_0_ksp_view
            A01
              Mat Object: 1 MPI process
                type: seqaij
                rows=33282, cols=4225, rbs=2, cbs=1
                total: nonzeros=156930, allocated nonzeros=156930
                total number of mallocs used during MatSetValues calls=0
                  using I-node routines: found 16639 nodes, limit used is 5
        Mat Object: (fieldsplit_1_) 1 MPI process
          type: seqaij
          rows=4225, cols=4225
          total: nonzeros=4225, allocated nonzeros=4225
          total number of mallocs used during MatSetValues calls=0
            has attached null space
            not using I-node routines
  linear system matrix, which is also used to construct the preconditioner:
  Mat Object: 1 MPI process
    type: nest
    rows=37507, cols=37507
      has attached null space
        MatNest, rows=2, cols=2, structure:
        (0,0) : prefix="fieldsplit_0_", type=seqbaij, rows=33282, cols=33282
        (0,1) : type=seqaij, rows=33282, cols=4225
        (1,0) : type=seqaij, rows=4225, cols=33282
        (1,1) : prefix="fieldsplit_1_", type=seqaij, rows=4225, cols=4225

SNES iterations: 1; SNES converged reason: CONVERGED_ITS
   KSP iterations: 10; KSP converged reason: CONVERGED_RTOL

We can also do monolithic, or “all at once” multigrid. Here we’re using Vanka smoothing. This is supported by a new preconditioner in PETSc PCPATCH.

[24]:
vanka_parameters = {
    "mat_type": "matfree", # We only need the action
    "ksp_type": "fgmres",
    "ksp_max_it": 25,
    "pc_type": "mg",
    "mg_levels": {
        "ksp_type": "chebyshev",
        "ksp_convergence_test": "skip",
        "ksp_max_it": 2,
        "pc_type": "python",
        "pc_python_type": "firedrake.PatchPC",
        "patch": {
            "pc_patch_save_operators": 1,
            "pc_patch_partition_of_unity": False,
            "pc_patch_construct_dim": 0,
            # Topological decomposition
            "pc_patch_construct_type": "vanka",
            # Pressure space is constraint space
            "pc_patch_exclude_subspaces": 1,
            # Configure the solver on each patch
            "pc_patch_sub": {
                "mat_type": "dense",
                "ksp_type": "preonly",
                "pc_type": "lu",
                "pc_factor_shift_type": "nonzero",
            }
        }
    },
    "mg_coarse": {
        "mat_type": "aij",
        "ksp_type": "preonly",
        "pc_type": "lu",
        "pc_factor_mat_solver_type": "mumps",
    }
}

The solver can be invoked as below, but frequently crashes Jupyter notebooks:

[25]:
#w.zero()
#solver = create_solver(vanka_parameters)
#solver.solve()
#convergence(solver)
[ ]: