ENERGY-MINIMIZING, SYMMETRIC DISCRETIZATIONS FOR ANISOTROPIC MESHES AND ENERGY FUNCTIONAL EXTRAPOLATION

. Self-adjoint diﬀerential operators often arise from variational calculus on energy functionals. In this case, a direct discretization of the energy functional induces a discretization of the diﬀerential operator. Following this approach, the discrete equations are naturally symmetric if the energy functional was self-adjoint, a property that may be lost when using standard diﬀerence formulas on nonuniform meshes or when the diﬀerential operator has varying coeﬃcients. Low order ﬁnite diﬀerence or ﬁnite element systems can be derived by this approach in a systematic way and on logically structured meshes they become compact diﬀerence formulas. Extrapolation formulas used on the discrete energy can then lead to higher oder approximations of the diﬀerential operator. A rigorous analysis is presented for extrapolation used in combination with nonstandard integration rules for ﬁnite elements. Extrapolation can likewise be applied on matrix-free ﬁnite diﬀerence stencils. In our applications, both schemes show up to quartic order of convergence.

1. Introduction.Self-adjoint differential operators are common in many applications.When discretizing such operators it is often essential to maintain the symmetry.This may have theoretical reasons when the eigenstructure of the matrix is of interest.Often there are also practical reasons, when solvers should be applied that rely on symmetry, see, e.g., [21].In a finite element setting, symmetry is naturally preserved.However, for finite differences, a standard derivation of the discretization can lead to a nonsymmetric system matrix when the grid is irregular or when variable material parameters appear in the partial differential equation or the related energy functional J(u).
For self-adjoint model problems, the construction of finite difference schemes that preserve symmetry is a classical topic; see, e.g., [25, p. 196 ff.] or [6,1,24].In this paper, we present a natural approach to obtain symmetric finite difference stencils for anisotropic meshes by considering the energy functional corresponding to the partial differential equation similar as proposed in [25].These stencils here lead to a novel approach to obtain a matrix-free implementation of the discretized operator.This is the key to reduce the memory footprint and in consequence it also helps to reduce the amount of data that must be transferred to the processing units for computing a matrix-vector multiplication.On many modern architectures, memory access bandwidth is limiting the performance of iterative solvers.With matrix-free methods the repeated loading of the system matrix of the discretized operator can be avoided.
Our approach is applicable to Cartesian meshes consisting of rectangular hexahedra or meshes with curvilinear tensor product structure.Such tensor product struc-ture meshes appear in applications where a logically rectangular mesh is subjected to a curvilinear transformation.Typical examples are three-dimensional meshes in cylindrical coordinates or two-dimensional meshes in polar coordinates; cf. Figure 2.1.More general curvilinear coordinate systems are also possible, see, e.g., Figure 5.1.
On a family of successively finer meshes, the finite element or finite difference solution will converge to the solution of the differential equation.Under certain conditions, this limit process is governed by asymptotic error expansions, whose dominating terms can be eliminated by extrapolation [4,16] to obtain results with higher order accuracy.The combination of extrapolation with multilevel and multigrid solvers seems in many ways natural and has thus recently seen renewed interest [17,5,23].However, classical extrapolation techniques for boundary value problems depend on the regularity of the solution which limits their applicability.Therefore, so-called implicit variants of extrapolation have been developed [20,19,10].These methods are related to the so-called τ -extrapolation, a technique that has been proposed in the combination with multigrid solvers [9,3].The theoretical justification of these techniques does not depend on the global regularity of the solution so that a wider range of applicability is possible.
On particular application of our here presented approach arises in the context of Tokamak fusion plasma modeled in Gyrokinetic codes such as Gysela [8]; see also our follow-up paper on the connection and application in detail [14].
In this paper, we will present an implicit extrapolation strategy that combines nodal finite element functions or finite difference discretizations for different meshing parameters with the goal to increase the resulting order of convergence on curvilinear and anisotropic meshes.These methods can then be naturally combined with fast multigrid solvers.For finite elements, we use a nonstandard integration rule from [15,11] to show equality between an extrapolated stiffness matrix from linear nodal basis functions and the stiffness matrix from a quadratic nodal basis set.An extrapolated finite difference scheme will be derived from the underlying energy functional by extrapolating the discretized form of the functional.Here, the discretized functional leads to symmetric matrices that can be combined analogously as the stiffness matrices for finite elements in an attempt to eliminate the dominating error terms.Numerically, we will see that the stencils derived and extrapolated in this form also yield improved convergence rates when combined equivalently to their corresponding finite element counterpart.where u comes from a suitable space and where certain boundary conditions such as u = 0 on ∂Ω are imposed.Additionally, f : Ω → R and A = α 11 α 12 α 12 α 22 : R 2 → R 2 is a symmetric, continuously bounded positive definite diffusion tensor with eigenvalues

Model problem
and logical (center) representation of an annulus, with an anisotropic, tensor product structured mesh, using polar coordinates.Mesh element

in gray in both meshes; close-up view (right) with further subdivision into two triangles (bottom right). Additional regular subdivision of a mesh element, as used for our extrapolation method from section 4, indicated by dashed lines (left to right) and quadratic markers for intermediate nodes (right).
In order to derive finite difference stencils that lead to symmetric linear systems, we consider the energy expression (2.2) element by element.Here, we assume a mesh of rectangular hexahedrals in three dimensions or rectangles in two dimensions which can be extended to a mesh of tensor product structure.Prominent examples of tensor product structured domains and meshes arise from problems posed in curvilinear coordinates such as polar coordinates.They are used to describe physical domains with curved coordinate lines by a logical domain of, often, Cartesian coordinates as illustrated in Figure 2.1 (left and center).2D curvilinear coordinates can also be used to describe the cross sections of more complicated three-dimensional geometries.An example motivating this method in this article in particular, is the Tokamak geometry used in plasma fusion [2,28].
We will restrict our study to arbitrary curvilinear transformations of an anisotropic Cartesian mesh in two dimensions, but the approach can be easily extended to the three-dimensional case.In the following, our presentation will consider the case of a (deformed) annulus described by the physical description Ω := F (Ω ), given by an invertible function F : Ω L → Ω such that DF exists almost everywhere, F (r, θ) = (x, y), and a logical domain Ω = (r min , r max ) × [0, 2π), where r min > 0. As the deformed annulus is just a pleasant representation of a tensor structured mesh, we exclude r min = 0 here.Note that (r min , r max ) × 0 effectively represents interior points of the domain so that we enforce periodic boundary conditions thereon.The following study, however, remains also valid in case of Cartesian coordinates and/or Dirichlet or Neumann boundary conditions.
The resulting anisotropic mesh is given by the nodes (r i , θ j ) ∈ Ω L with We denote h := max i h i and k := max j k j and introduce the notation for any function v depending on (r, θ) and q 1 , q 2 ∈ [−1, 1].
Remark 1. Due to our anisotropic meshes, we may have h s = h s−1 and k t = k t−1 .Our notation has to be understood as and not as v (i−1)+q,j = v(r i−1 + qh i , θ j ).The subscript (i − 1) + q is thus understood as an index only, not as a computation.The same holds for the second variable.
Using the mapping F , each mesh element R i,j = [r i , r i +h i ]×[θ j , θ j +k j ] of the logical domain Ω L has its curvilinear representation in F (R i,j ) ⊂ Ω; see Figure 2.1 for a simple example.After subdivision of R i,j into two triangles T 1 i,j and T 2 i,j (see Figure 2.1; bottom right), these triangles are naturally represented by F (T m i,j ), m = 1, 2. By transformation, we obtain The functional arguments are A = A(F (r, θ)), DF −T = DF −T (F (r, θ)), and det DF = det DF (r, θ).Note that in principle we have to distinguish u(r, θ) and u(x, y).We refrain from doing so to not overload the notation, the variables should become clear from the context.In order to simplify the notation, we define and obtain the localized, reformulated energy expressions on R i,j and T m i,j , m = 1, 2, respectively, as (2.6) Note that (2.5) is symmetric and thus a rθ = 1 2 a rθ + 1 2 a θr .Additionally, note that a rθ = 0 if the diffusion tensor was diagonal and the physical domain could be described by, e.g., a simple polar coordinates transformation.
Obviously, the global energy can be written as Mesh elements R i,j (left; five and nine point stencil) or T 1 i,j and T 2 i,j (right; seven point stencil), respectively, for which u(rs, θt), for fixed but arbitrary s and t, appears in the discretized, localized energy functional.
3. Discretization of local energy expressions with finite differences.Our approach to derive discretizations of the PDE is based on first discretizing the energy, i.e. the integral J(u) represented by a sum of integrals over elements as in (2.7).Thus we now consider either the localized energy J(u) on the mesh element R i,j or on T 1 i,j and T 2 i,j .To obtain five-point or nine-point stencils, we consider rectangular elements.Triangles would have to be considered if seven-point stencils with support as illustrated in Figure 3.1 are derived.Note that for simple geometries and coefficients seven-or nine-point stencils may also degenerate to five-point stencils.
Numerical methods for integrals of the form (2.6) have been studied in [15] for the case of triangles and quadrilaterals.These methods are based on a combination of elementary finite difference formulas to approximate the derivatives and elementary numerical integration rules.Note that the discretization of these integrals is related to the process of computing the element stiffness matrices and assembling them to a global system.However, for deriving finite element discretizations, special shape functions are used.These shape functions are often polynomials so that they can be differentiated analytically, and the exact derivatives can be multiplied with the coefficients of the PDE.Numerical quadrature formulas are then applied in order to obtain the local element stiffness matrices.
The approach taken here differs from this standard finite element approach by using difference formulas to approximate the derivatives.At first sight replacing analytical derivatives by finite differences seems to introduce additional errors.As shown in [15], however, this does not necessarily lead to higher overall errors.Furthermore, as in [15], we will exploit that the combination of elementary difference and quadrature leads to approximations with asymptotic error expansions.This in turn invites the use of Richardson-type extrapolation.Note that this use of extrapolation differs from the classical one as e.g.analyzed in [16], where extrapolation is applied to numerical approximations of a PDE with different mesh sizes.In this case, extrapolation relies on the existence of expansion of the numerical PDE solution which in turn depends on the global regularity of the solution.In this article, we will use extrapolation implicitly, applied to the energy functional locally.Thus, extrapolation is used in this article to construct a higher order discretization by using higher order accuracy to compute the stiffness matrices.In practice, the improved discrete systems can be effectively constructed by combining low order difference stencils that originate from different mesh resolutions, similar as they are used in a multigrid algorithm.Different from conventional multigrid, we will use the hierarchy of different meshes here not only to accelerate the convergence of the linear solver, but additionally to raise the order of approximation.
We proceed by introducing the difference operators that are central O(h 2 ) and O(k 2 ) approximations for the derivatives v r and v θ , respectively, for all functions v who satisfy the necessary smoothness assumptions.For constructing the necessary approximations, we additionally introduce the averaging operators that constitute O(h 2 ) and O(k 2 ) accurate approximations to v as before.Note that this is in analogy to definitions (3.6) and (3.5) of [15], respectively.
The application of a midpoint rule to integrate a rr (r, θ)u 2 r (r, θ) over [r, r + h] (see (2.6)) formally requires a function evaluation at r + h/2.Then, besides the straightforward the resulting approximation by quadrature can further be approximated by Further variants combining differencing using δ with averaging using µ are possible when the second variable θ is also considered .Because of the differencing on u and averaging applied to a, all the function evaluations are shifted to r and r + h, similar as they would occur in a trapezoidal rule.Though at first sight the averaging on a rr seems to introduce additional errors, we use it here in a context that it has the effect of replacing a midpoint integration by a trapezoidal integration, both of which produce asymptotically the same error estimates.We also refer to [15] where it is shown that the averaging does not necessarily produce worse errors when developing cubature formulas for integrands containing derivatives.The averaging can also be seen as some linear interpolation.
Note that the roles of quadrature first and differencing/averaging second can also be inversed as long as the integrand is also defined accross the boundaries of the quadrilaterals.
Then, going one step further and using differencing and averaging in the θdirection, we can consider the integral over R i,j .Evaluating the numerical approximation of the integral by a tensor product midpoint rule (also called center rule in [15]) eventually only requires function values of u and a rr at the corners of the rectangle.Note that this has been introduced as Discretization 2 already in [15].Additionally, we can e.g.use [15,Theorem 3.5], showing that this approximation of the integral containing derivatives enjoys an even error expansion in powers of h, i.e. the odd error terms vanish.We also point to [15,Theorem 3.6], stating here that when a rr and u r are polynomials so that the original integrand a rr u 2 r is a polynomial of degree p, then the error expansion is finite with highest term of O(h p ) (in case of integrating over a rectangle).
Specifically, combining both theorems from [15], the approximation to the integral will be exact when u and a rr are linear polynomials on R i,j .
We next note that trivially the same techniques can be applied to integrands involving derivatives of u in θ directions, while rather straightforward modifications are necessary for the mixed terms.
Function values of u s,t = u(r s , θ t ) are only needed from the adjacent mesh elements; see the indicated elements in Figure 3.1.For a more general discussion on the how to choose these values, we refer to the guidelines (G1-G3) discussed in [15].With an appropriate choice of finite difference rule and the integration rule, all function values needed will lie in the support of the corresponding basis functions in the finite element counterpart.Eventually, defining u = (u 1,1 , u 1,2 , . . ., u 2,1 , . . ., u nr,n θ −1 ), we obtain a quadratic form to represent the discretized energy.The numerical solution is characterized as minimizing this energy.
In order to minimize (3.5), we compute the zeros of the derivative with respect to u, leading to a system of linear equations.Note again that each u s,t only appears in few of the discretized summands of (2.7) so that the resulting system is sparse, see Figure 3.1 and subsection 3.1 for exemplary details.

A symmetric five point stencil.
We expect the five point stencil to yield quadratic convergence only in the absence of mixed terms u r u θ in the energy functional.Therefore, we assume a rθ = 0 in this section.This, e.g., holds when standard polar coordinates are used and when a diagonal diffusion tensor is considered.We now explicit one approach generically described in the previous section.We use the midpoint rule in r for the first summand (with derivative in r) and the midpoint rule in ϑ for the second summand (with derivative in ϑ).For the element We then can replace the partial derivatives by using (3.1) and (3.2).If the coefficient a rr and a ϑϑ are known as continuous (and analytical) functions, one may evaluate the functions at center points.If we want to get rid of these intermediate evaluations, we use linear interpolation as given by the the averaging operators (3.3) and (3.4).Eventually, trapezoidal or midpoint rule is used to integrate the remaining integral terms.We then have Using the trapezoidal rule, we also obtain Using h i ≤ h 2 and k j ≤ k 2 , i, j ≥ 1, and the vector of function values u = (u 1,1 , u 1,2 , . . ., u 2,1 , . . ., u nr,n θ −1 ), we implicitly obtain the discrete energy operator E such that Remark 2. Although we allow for different discretizations in r and θ, we assume to have a 0 < τ < ∞ such that h = τ k.
In order to minimize (3.9), we have to compute the derivative of (3.9) with respect to u.However, we never explicitly compute E. Instead, we remark that for any interior point (r s , θ t ) the derivative ∂ ∂us,t only depends on four summands; cf Figure 3.1.Thus, Equivalently we obtain the derivatives for the expressions with (i, j) ∈ I s,t \ (s, t).The critical point condition then yields, after reordering of the summands, a five point stencil, whose entries are with right hand side

A symmetric nine point stencil.
The nine point stencil is obtained similarly to the the five-point stencil in subsection 3.1, just by acknowledging a rθ = 0. We proceed by first applying the midpoint rule to evaluate the integrals, then we use (3.1)-(3.2) to approximate the derivatives and finally the averaging operators (3.3)-(3.4)for approximating the remaining terms.We thus have function evaluations only at the corner points of the elements.We obtain By following the same lines as in subsection 3.1 and equations (3.9)-(3.11)for our modified Jlhs Ri,j (u) and the same Jrhs Ri,j (u), we obtain a nine point stencil with the entries with right hand side as given in (3.14).

Symmetric seven point stencils.
Symmetric seven point stencils with support as depicted in Figure 3.1 (right) are slightly more laborious to obtain.Instead of considering R i,j , one considers T 1 i,j and T 2 i,j (see Figure 2.1; bottom right).We refer to [15,Sec. 4.2] for further details.If A is symmetric positive definite, then the operator could also be rewritten to avoid the mixed terms as it was done for [15, Discretization 7].

Extrapolation methods based on a two-level hierarchical refinement.
In the following we will present an extrapolation method based on [11] where intermediate nodes are introduced in the 2D grid.After some preliminary notation, we first provide the finite difference stencils and finite element discretizations used with extrapolation; see subsection 4.2, subsection 4.3, and subsection 4.4.
In subsection 4.4, we provide the concrete extrapolated stiffness matrices and right hand sides; in subsection 4.4.1, we give some additional theory, provided nonstandard integration rules are used with finite element discretizations.

Two-level-hierarchical grids.
Let us consider a two-level-hierarchical grid, i.e., for any anisotropic grid as given before, we incorporate all intermediate points (r i+ 1  2 , θ j ), (r i , θ j+ 1 2 ), and (r into the set of nodes on Ω ; see Figure 2.1.The coarse nodes are then given by {x 1 , . . ., x nc }.The new (logical) nodes are {x nc+1 , . . ., x n f } and the refined mesh is given by {x 1 , . . ., x n f }.We denote the corresponding triangulations by T C and T F .

Finite difference stencils.
With the two-level-hierarchical grid given, we apply our compact five-or nine-point finite difference stencils from (3.13) or (3.16), respectively.We then obtain two stiffness matrices, generically denoted K F and K C , on the fine triangulation T F and on the coarse triangulation T C , respectively.

4.3.
Finite elements with standard and nonstandard integration.Before introducing our choice of numerical integration, we formally define the nodal linear finite element spaces which satisfy ∆ ∈ supp(ϕ * s ) ∩ supp(ϕ * t ).Then, we have where ϕ L C α and ϕ L C β , α, β ∈ {1, 2, 3}, are the corresponding functions on the reference element F −1 (∆) = T ; see Figure 4.1 and the appendix of our preprint [13].As before, we have reduced the notation by defining

5). The functional arguments are DF
Standard numerical integration of the bilinear form.A standard rule of numerical integration, which is exact for polynomials p ∈ P 2 , is given by the Newton-Cotes formula see Figure 4.1 for a definition of ξ (T,i) , i = 1, 2, 3.In this paper, when refering to standard numerical integration, we always use (4.4) for (4.2).
Nonstandard numerical integration of the bilinear form [15,11].In order to introduce the nonstandard integration rules, we modify expression (4.2).As in [15,11], we introduce the directional derivative and obtain after a short computation from the last line of (4.2), the expression Here, we have used a generic superindex • * to no further define the basis functions ϕ * α and ϕ * β ; which, in the sequel, can also be chosen from (4.11) instead of (4.1).The nonstandard integration rule from [15,11] then writes where the evaluation points ξ (T,n) are defined in Figure 4.1.
Remark 3. Geometrically and more generally spoken: we evaluate the partial derivative with respect to the direction ξ n and the coefficient b ξnξn at the center node of the triangle edge which is parallel to the direction ξ n , n ∈ {1, 2, 3}.This integration formula is only exact for constant summands in (4.8).Although b ξnξn is not constant in our examples, we will see in section 5 that this formula yields cubic convergence as predicted when combined with extrapolation.
Numerical integration of the linear form.Independently of the integration of the bilinear form, the linear form will be approximated by the Newton-Cotes rule where z i , i ∈ {1, 2, 3}, are the corner nodes as indicated in Figure 4.1.This numerical integration rule is exact for polynomials p ∈ P 1 .

Extrapolated matrices and vectors.
In this section, we present our extrapolation strategy for finite differences and finite elements.We will focus on our compact finite difference stencils as summarized in subsection 4.2 as well as on nodal linear finite elements numerically integrated by one of the rules presented in subsection 4.3.
Given a discretization scheme on T F as well as its corresponding scheme on T C , we assemble two stiffness matrices, generically denoted K F and K C , on the fine and the coarse mesh, respectively.We introduce the extrapolated stiffness matrix and corresponding right hand side by Here and in the following, the nodes, matrices, and vectors are ordered such that the coarse nodes, denoted by index • c , come first and the intermediate fine nodes, denoted by index • f , come second.
In practice, we will see that this extrapolation step can increase the convergence order from two to three.From a theoretical point of view, we can prove that the sytem (4.10) is identical to one of quadratic basis functions if nonstandard numerical integration rules are used; see subsection 4.4.1.
Remark 4. We remark that the use of quadratic basis functions versus linear ones will lead to better accuracy even if the solution exhibits regularity only locally.For example, in the case of a reentrant corner, the solution is typically smooth away from the singularity and the use of higher order elements there is well justified.Consequently, also the implicit extrapolation will be beneficial in this case, and this improvement of accuracy is achieved independently of the singularity is treated.Here, for example, methods as developed in [7,18] could be used.We point out that this is in contrast to standard explicit Richardson-style extrapolation methods applied to PDE.Their justification relies on global asymptotic expansions.These expansions are affected by the so-called pollution effect caused by singular solutions at a reentrant corner that leads to a global error term of lower order.Standard Richardson extrapolation will only eliminate the higher order error terms, the global pollution error term will remain.Only if the global asymptotic error expansion exists and its order coefficients are known the explict extrapolation can also be modified such that it eliminates the pollution effect.This would have to be done for each reentrant corner singularity.

Theoretical results for finite elements with nonstandard integration.
In order to show some theoretical results, we have to introduce some additional finite element spaces.As our work is essentially based on [11], we analogously assume that our coefficients are smooth enough to justify higher order approximations at all.Additional spaces and related numerical integration rules.The nodal quadratic (V Q F ) as well as the two-level h-(V H F ) and p-hierarchical (V P F ) finite element spaces on the fine mesh are given by For the generic macro element T 1 i,j , similar to (4.2), we consider where ϕ * s and ϕ * t are two finite element basis functions from a space defined in (4.11) with common support on T 1 i,j .In order to distinguish the different basis functions, we indicate * ∈ {Q, H, P }.

Nonstandard numerical integration of the bilinear form on V H
F .For the linear h-hierarchical basis functions from V H F , the nonstandard integration of (4.12) where ϕ H α and ϕ H β , α, β ∈ {1, . . ., 6}, are the corresponding basis functions on the subdivided reference element and where the evaluation points ξ (m,n)

Numerical integration of the linear form on V H
F .The right hand side expression will be approximated by applying the quadrature rule (4.9) on each micro element ∆ m , m ∈ {1, 2, 3, 4}.

Numerical integration of the linear form on V P
F and V Q F .The right hand side expression for p-hierarchical and quadratic basis functions will be obtained by the quadrature rule (4.4) on the entire reference element T (i.e., without subdivision).
We denote the corresponding stiffness matrices on the spaces (4.1) and (4.11) and numerically approximated by either (4.8), (4.13), or (4.14) F and proceed accordingly for the right hand sides and vectors on the corresponding spaces.
Theoretical results.First, we rephrase [11,Theorem 3.3] for the nodal bases from V L F (and V L C C ) and V Q F .We will then show a relation between the nodal and the hierarchical approach as it was presented in [10] for the case of constant coefficients.as given by (4.10).Furthermore, let the quadratic stiffness matrix K Q F be obtained by (4.14).Let the right hand sides be accordingly obtained by (4.9) for the linear case and by (4.4) for the quadratic case; and f L,ex F computed as in (4.10).Then, we have and, trivially, both systems K L,ex Proof.For the linear nodal basis, we map each triangle ∆ = ∆ m i,j ∈ T F , m = 1, . . ., 4 onto the reference element T without subdivsion.However, the nonstandard integration rules (4.8), (4.13), and (4.14) are consistent in the way that the directional derivatives are always evaluated on the triangle edge which is parallel to the direction; cf. also Remark 3. Except for the last term in (4.14), they are also always evaluated at the center node of the corresponding edge.
Since the element to element transformations are only affine transformations which keep parallel lines parallel, we can theoretically consider macro elements for the linear nodal basis functions.Instead of (4.8) for each ∆ F ⊂ ∆ C ∈ T C , we then consider (4.13) with basis functions ϕ L α and ϕ L β , α, β ∈ {1, . . ., 6}.For a compact sum presentation, we define ϕ L C γ := 0 for γ > 3 although these functions do actually not exist; they are removed from the following formulas by the indicator function anyway.
In accordance to (4.10), we obtain the stiffness matrix K L,ex F by where I ≤3 (α, β) is the indicator function Note that the second line in (4.16) accounts for the second summand in K L,ex F which is only subtracted from the coarse block; cf.(4.10).
The proof of K L,ex F = K Q F is only based on the comparison of functional values; cf.[11].For the sake of convenience, we provide the values of the derivatives in the appendix of our preprint [13].Considering (4.14) with ϕ Q α and ϕ Q β , α, β ∈ {1, . . ., 6}, and (4.16), we easily see from the appendix of our preprint [13] that the first parts are identical.The same applies to the expressions in the second lines since the directional derivatives of ϕ Q γ , γ ∈ {4, 5, 6}, are zero at ξ (T,n) , n ∈ {1, 2, 3}.The numerical integration rules (4.9)only uses node evaluations where the corresponding basis functions are either zero or one.We refrain from providing the values in an additional table.For the right hand side expression f L,ex F and α ∈ {1, . . ., 6}, we consequently have 4 3 ) to obtain f Q F yields the same values.We now show a relation between the nodal approach using K L,ex F u = f L,ex F and the hierarchical approach K H,ex F u = f H,ex F using the nonstandard quadrature rules.Le us therefore introduce the transformation between the nodal and the hierarchical basis on the (logical) domain Ω , if there exists an edge e in T C s. t. x s ∈ e and x t ∈ ∂e, 0, otherwise. (4.17) Note that edges are open sets, i.e., • e = e.The inverse transformation from nodal to hierarchical basis is Lemma 4.2.For the stiffness matrices K L F and K H F built from (4.8) and (4.13), respectively, and For the right hand sides f L F and f H F built from (4.9) and (4.4), respectively, and Remark 5. Lemma 4.2 seems obvious.However, for our nonstandard quadrature rules, it may not be clear that the coefficients b (ξiξi) are evaluated implicitly at exactly the same nodes when Proof.As in the proof of Theorem 4.1, we use the macro element for the theoretical assembly of K L F .We have In order to show the equality of the two matrix expressions, we always consider the corresponding computations on the reference element.On the reference element, the lower diagonal block of the transformation T H from the hierarchical to the nodal basis is Note that the functions ϕ * α are associated with the nodes z α , α = 1, . . ., 6; cf. Figure 4.2.For the nodal space, we have the basis functions { ϕ L α } α=1,...,6 , for the hierarchical space, we have the basis functions { ϕ L C α } α=1,...,3 ∪ { ϕ L α } α=4,..., 6 .Since the basis functions on z 4 , z 5 , and z 6 and the integration rules are identic on the macro element (cf.(4.8) and (4.13)), we trivially have The remaining part of the proof only relies on the values of the directional derivatives at the evaluation points given in Figure 4.2.
Theorem 4.3.For the stiffness matrices and right hand sides approximated by our (nonstandard) integration rules as described, we have Proof.In Lemma 4.2 we have shown that The proof is complete by using Theorem 4.1.
Remark 6 (Note on boundary conditions).Note that the implementation of Dirichlet boundary conditions and the symmetrization of the matrix can be done for the matrices on both levels (fine, F , and coarse, C) before combining the corresponding matrices and vectors to the extrapolated matrix and right hand side (4.10).This yields the same solution as in the case where the boundary conditions are implemented for the extra-polated matrix directly.However, both systems are not identical since the equations of boundary conditions on fine nodes are scaled by the factor 4  3 .where r 1 > 0; as introduced and then used in [2,28] to describe tokamak cross-sections from fusion plasma applications.According to [2,28], we use κ = 0.3 and δ = 0.2.We also provide results for κ = δ = 0 which results in a standard polar coordinate transformation.We use an anisotropic grid which is disturbed in θ-direction and refined in r-direction such that h max /h min = 8; cf. Figure 5.1.
Furthermore, we use the manufactured solution, u(x, y) = (1.3 2 − r 2 (x, y)) cos(2πx) sin(2πy).(5.3)We refrain from giving the right hand side f , which is a more than lengthy expression.We thank Edoardo Zoni for providing his Python script for the symbolic differentiation needed to compute the right hand side.The Dirichlet boundary conditions on (r, θ) ∈ {r 1 , 1.3} × [0, 2π] are given accordingly.
Note that the two-level grids yield a natural set-up to use the presented extrapolation method in terms of a multigrid algorithm as presented in [10,11].In the current contribution, we will, however, build the matrix (4.10) for the extrapolated finite elements or differences explicitly.The interpretation as a multigrid method and the design of an efficient multigrid solver is left to the future work [14].Here, all linear systems, in the standard or extrapolated approach, will be solved with a direct method.

Convergence results for tensor product structured grids.
In this section, we consider two different geometries bounded away from the artificial singularity at r 1 = 0.1.On the (deformed) disks, we use a representative anisotropic, tensor product structured grid.We see that all our developed discretizations yield quadratic convergence for the circular as well as the deformed geometry; cf.If the extrapolation techniques as presented are used, i.e., if the stiffness matrices of two levels are combined, we obtain even fourth order convergence in 2 -norm for the circular geometry; cf. Figure 5.2.For the deformed geometry, the 2 -convergence is still much better than cubic but only almost of fourth order Figure 5.3.Depending on the geometry, the convergence in inf-norm is between third and fourth order.Despite the anisotropic mesh, the coefficient distribution, and the (deformed) geometry, we see that the convergence in 2 norm is even (almost) quartic.
Remark 7. The standard FE analysis predicts only third order for quadratic elements.However, we employ meshes that can be considered as created by a piecewise smooth transformation from a uniform mesh, so that certain symmetry conditions are approximately satisfied.Therefore we observe superconvergence effects.Note that extrapolation for symmetric finite differences on a uniform mesh would be expected to lead to fourth order accuracy, since only even orders appear in the asymptotic error expansion when the problem is regular enough.A detailed analysis of these effects will be an interesting direction for future research.
In order to directly highlight the differences in the geometries considered in

Convergence results for curvilinear coordinates when approaching the origin.
In this section, we consider the additional difficulty introduced by ap-  proaching the artificial singularity, i.e., we choose r 1 = 10 −6 .We only consider the more complex, deformed geometry.
In Figure 5.4, we see that the standard integration for P 1 elements does not lead to optimal convergence.Our nonstandard integration as well as the nine point stencil, on the other hand, yield qualitatively identitic results as in the case where r 1 = 0.1.
The suboptimal convergence of standard integration P 1 elements in fact results from an insufficient approximation around the artificial singularity where the exact solution oscillates.It is possible to retrieve an optimal convergence behavior by refining the mesh towards the origin.In Table 5.1, we present the results for a mesh built according to [12], i.e. with nodes r i = r 1 + (1.3 − r 1 )(i/n r ) 3 , i = 0, .., n r , and uniform θ j , j = 1, .., n θ ; see Figure 5.1 (right).

Conclusion.
We have presented finite difference discretizations maintaining, in a natural way, the symmetry of partial differential operators on anisotropic tensor product structured grids.In the absence of mixed terms u r u θ in the considered energy functional, quadratic convergence can be achieved with the presented five point stencil.Otherwise, nine point stencils should be used.We have also presented finite element approaches using nonstandard integration rules.
Furthermore, we have presented an extrapolation method combining grids with different discretization parameters to raise the convergence order.For the presented nonstandard finite element integration, we can provide a rigorous analysis.As predicted by the theory, the approach using nonstandard integration on linear elements resulted in the same discretization as a quadratic basis approach.In the experiments, the presented finite difference schemes behave similarly although lacking a rigorous proof.We obtain cubic convergence in the inf-norm for extrapolated finite element and finite difference discretizations and up to quartic convergence in 2 -norm.
A natural way to use our presented extrapolation technique are multigrid methods where the necessary two grid operators are inherently present.In fact, we believe this combination is one of the most efficient (in terms of memory requirements and floating point operations per second) techniques to reach higher order accuracy.For the combination of multigrid with our extrapolation technique, see [14].

J
and localized energy expressions.For a given domain Ω ∈ R d , d = 2, 3, the well-known model problem of finding a solution of the partial differential equation −∇ • (A∇u) = f in Ω, (2.1) can be traced back to the energy minimizing problem of min u
1) on T F and T C , respectively.Now, let ∆ ∈ {T F , T C } and correspondingly * ∈ {L, L C }. Consider two finite element basis functions ϕ * s and ϕ * t from the corresponding space defined in (4.1)

(4. 11 )
where V L C C was defined in (4.1).For the spaces from (4.11), we use a different reference element representation.Instead of mapping each ∆ ∈ T F onto the reference element, we map macro elements ∆ ∈ T C such as T n i,j = 4 m=1 ∆ m ij,n as defined in Figure 2.1, n ∈ {1, 2}, onto a subdivided reference element T = 4 m=1 ∆ m ; see Figure 4.2.

Theorem 4 . 1 .
Let the stiffness matrices K L F and K L C F be obtained by (4.8) and combined in K L,ex

F
5, 6} on the reference element.Here, f := f | det DF || det D F | and z α are defined as in Figure 4.2.Using (4.4 [γ, β] ⊂ e T C :⇔ The straigt line [z γ , z β ] represents one half of the closure of an edge e in T C ;
Figure 5.2-5.5, we display the parameters that change in bold figure.