IEEE TRANSACTIONS ON MAGNETICS, VOL. 38, NO. 2, MARCH 2002
The Adaptive Cross-Approximation Technique for the 3-D Boundary-Element Method
Stefan Kurz, Member, IEEE, Oliver Rain, and Sergej Rjasanow
Abstract: It is well known that the classical boundary-element method (BEM) yields fully populated matrices. Their manipulation is cumbersome with respect to memory consumption and computational costs. This paper describes a novel approach where the matrices are split into collections of blocks of various sizes. Those blocks which describe remote interactions are adaptively approximated by low-rank submatrices. This procedure reduces the algorithmic complexity for matrix setup and matrix-by-vector products to approximately $O(N)$. The proposed method has been examined in a testing environment and implemented into an existing BEM-finite-element method (FEM) code for electromagnetic and electromechanical problems. The advantages of the new method are demonstrated by means of several examples.
Index Terms: Boundary-element methods, fast methods, finite-element methods.
I. INTRODUCTION
THE APPLICATION of the boundary-element method (BEM) for the solution of linear electromagnetic problems has many advantages. Only the boundaries of the considered domains need to be discretized, open boundary problems pose no additional difficulties, and problems including motion can be treated elegantly. However, application of the BEM leads to dense matrices. The storage requirements and computational costs are of $O(N^2)$, where $N$ is the number of unknowns, when a preconditioned iterative solver is applied. This means that only relatively small problems can be solved on usual PCs or workstations. One remedy could be the exploitation of the parallelism inherent to the BEM [1].
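To make the quadratic storage growth concrete, the following minimal sketch estimates the memory needed for one fully populated matrix. It assumes 8-byte double-precision entries (an assumption, the paper does not state the number format) and uses the approximate mesh sizes quoted in Section III; the helper name `dense_storage_bytes` is hypothetical.

```python
def dense_storage_bytes(n_unknowns: int, bytes_per_entry: int = 8) -> int:
    """Storage of one fully populated n x n matrix.

    bytes_per_entry = 8 assumes double precision (not stated in the paper).
    """
    return n_unknowns * n_unknowns * bytes_per_entry


# Approximate numbers of collocation points of the TEAM 10 meshes in Section III.
for n in (5_000, 20_000, 80_000):
    gib = dense_storage_bytes(n) / 2**30
    print(f"N = {n:6d}: {gib:6.2f} GiB per dense matrix")
```

Under these assumptions the three meshes need roughly 0.2, 3, and 48 GiB per matrix, which illustrates why only small problems fit on a PC or workstation without compression.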
In this paper, a different approach is presented, which reduces the algorithmic complexity for matrix setup and matrix-by-vector products to approximately $O(N)$. This approach is called adaptive cross approximation (ACA) and will be explained in detail in Section II. The first part of Section III is devoted to the solution of the Laplace equation by means of the ACA-BEM. These computations have been performed in an ACA testing environment to collect information about memory requirements, compression rates, and CPU times. In a second step, the ACA algorithm has been implemented into an existing BEM-finite-element method (FEM) code for the solution of electromagnetic and electromechanical problems. The second part of Section III reports results obtained by this code in connection with ACA. A comparison to existing fast methods for the BEM can be found in Section IV.
Manuscript received July 5, 2001; revised October 25, 2001.
S. Kurz and O. Rain are with the Robert Bosch GmbH, 70049 Stuttgart, Germany (e-mail: stefan.kurz2@de.bosch.com; oliver.rain@de.bosch.com).
S. Rjasanow is with the Universität des Saarlandes, Fachbereich Mathematik, 66041 Saarbrücken, Germany (e-mail: rjasanow@num.uni-sb.de).
Publisher Item Identifier S 0018-9464(02)02351-8.
Fig. 1. Clustering for a simple example with ten collocation points. A large distance between two collocation points results in a large difference of the respective equation numbers.
II. THE ADAPTIVE CROSS APPROXIMATION
Large dense matrices arising from integral equations have no explicit structure in general. However, it is possible to find a permutation so that the matrix with permuted rows and columns contains rather large blocks close to some low-rank matrices [2]–[5].
To find a suitable permutation, a cluster tree is constructed by recursively partitioning the collocation points according to some geometrical criterion. A simple example of such a clustering is given in Fig. 1. A large distance between two collocation points results in a large difference of the respective equation numbers. Next, cluster pairs which are geometrically well separated are identified. They will be regarded as admissible cluster pairs, e.g., the clusters {1, 2, 3, 4, 5} and {8, 9, 10} in Fig. 1. The cluster tree together with the set of admissible cluster pairs allows the matrix to be split into a collection of blocks of various sizes. The block structure for the simple example is shown in Fig. 2. Since the off-diagonal blocks which describe remote interactions are close to some low-rank matrices, it is natural to approximate them by low-rank matrices. We are thus led to the following matrix approximation problem for the individual blocks of the given matrix.
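Given a matrix $A \in \mathbb{R}^{n \times m}$ and an accuracy $\varepsilon > 0$, find an approximant $\tilde{A} \in \mathbb{R}^{n \times m}$ with $\|A - \tilde{A}\|_F \le \varepsilon \|A\|_F$ and provide the minimal possible value for $\operatorname{rank} \tilde{A}$.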
0018-9464/02$17.00 © 2002 IEEE
Fig. 2. The permuted matrix for the example depicted in Fig. 1 contains rather
large off-diagonal blocks which describe remote interactions and which are
close to some low-rank matrices.
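The clustering and the block splitting behind Figs. 1 and 2 can be sketched as follows. The paper only speaks of "some geometrical criterion", so everything specific here is an assumption made for illustration: bisection along the longest bounding-box axis, the admissibility test min(diam X, diam Y) <= eta * dist(X, Y) with a hypothetical parameter `eta`, and ten equally spaced points on a line in place of the geometry of Fig. 1.

```python
import numpy as np


def build_cluster_tree(points, indices, leaf_size=2):
    """Recursively bisect an index set along the longest bounding-box axis."""
    node = {"indices": indices, "children": []}
    if len(indices) > leaf_size:
        pts = points[indices]
        axis = np.argmax(pts.max(axis=0) - pts.min(axis=0))
        order = indices[np.argsort(pts[:, axis])]
        half = len(order) // 2
        node["children"] = [
            build_cluster_tree(points, order[:half], leaf_size),
            build_cluster_tree(points, order[half:], leaf_size),
        ]
    return node


def diameter(points, idx):
    """Diagonal of the bounding box of a cluster."""
    pts = points[idx]
    return float(np.linalg.norm(pts.max(axis=0) - pts.min(axis=0)))


def distance(points, idx_a, idx_b):
    """Smallest point-to-point distance between two clusters."""
    diff = points[idx_a][:, None, :] - points[idx_b][None, :, :]
    return float(np.linalg.norm(diff, axis=2).min())


def is_admissible(points, a, b, eta=1.0):
    """Assumed criterion: the smaller cluster is small compared to the gap."""
    d = distance(points, a, b)
    return d > 0.0 and min(diameter(points, a), diameter(points, b)) <= eta * d


def block_partition(points, row, col, eta=1.0, blocks=None):
    """Split the block (row, col) into low-rank candidates and dense blocks."""
    if blocks is None:
        blocks = []
    a, b = row["indices"], col["indices"]
    if is_admissible(points, a, b, eta):
        blocks.append((a, b, "low-rank"))
    elif not row["children"] or not col["children"]:
        blocks.append((a, b, "dense"))
    else:
        for r in row["children"]:
            for c in col["children"]:
                block_partition(points, r, c, eta, blocks)
    return blocks


# Ten collocation points on a line (a stand-in for the example of Fig. 1).
points = np.column_stack([np.arange(10.0), np.zeros(10), np.zeros(10)])
root = build_cluster_tree(points, np.arange(10))
blocks = block_partition(points, root, root)
for rows, cols, kind in blocks:
    print(kind, rows.tolist(), cols.tolist())
```

Only the blocks labeled `low-rank` are candidates for the approximation problem; the `dense` blocks near the diagonal are stored as they are.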
In the approximation problem stated above, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix. The solution of this problem is given by the singular-value decomposition (SVD) of the block

$$\tilde{A} = \sum_{i=1}^{r} u_i \sigma_i v_i^T \qquad (1)$$

where $u_i$, $\sigma_i$, and $v_i$ denote the $r$ greatest singular triples of the matrix $A$, and the rank $r$ is chosen so that the required accuracy of the approximation is fulfilled.
Since the SVD requires the computation of the whole matrix in advance and since the SVD is rather expensive with respect to numerical work ($O(n^3)$ for an $n \times n$ block), this analytical solution is not practicable.
Fig. 3. TEAM problem 10. An exciting coil is set between two steel channels, and a steel plate is inserted between the channels. The surfaces of this geometry have been discretized by linear triangular elements to obtain an input mesh for the ACA testing environment.
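As a point of reference for the SVD solution (1), the sketch below truncates the SVD of a small, fully generated block at the smallest rank that meets a relative Frobenius tolerance. The helper name `truncated_svd` and the test block (the kernel 1/|x - y| sampled on two separated intervals) are assumptions chosen for illustration, not taken from the paper. Note that the whole block must be formed first, which is exactly the cost that makes this approach impractical for large blocks.

```python
import numpy as np


def truncated_svd(block, eps):
    """Smallest-rank SVD truncation with ||block - W @ Vt||_F <= eps * ||block||_F."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    # tail[r] is the Frobenius error left after keeping the r largest triples;
    # tail[0] is the norm of the block itself and the last entry is zero.
    tail = np.sqrt(np.append(np.cumsum((s**2)[::-1])[::-1], 0.0))
    rank = int(np.argmax(tail <= eps * tail[0]))  # first rank meeting the bound
    return u[:, :rank] * s[:rank], vt[:rank, :], rank


# Smooth kernel sampled on two well-separated point sets (illustrative only).
x = np.linspace(0.0, 1.0, 60)
y = np.linspace(3.0, 4.0, 50)
block = 1.0 / np.abs(x[:, None] - y[None, :])

w, vt, rank = truncated_svd(block, eps=1e-5)
err = np.linalg.norm(block - w @ vt) / np.linalg.norm(block)
print(f"rank {rank} of {min(block.shape)}, relative error {err:.1e}")
```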
We now present the ACA algorithm, which generates only a few rows and columns of the matrix and approximates the rest of the matrix using only this information. Let $S_0 = 0$ and $R_0 = A$, choose a first row index $i_1$, and for $k = 1, 2, \ldots$ compute the six steps described in the following.
This algorithm produces a sequence of decompositions of the matrix into a sum $A = S_k + R_k$, where $S_k$ is a low-rank matrix ($\operatorname{rank} S_k \le k$) and $R_k$ denotes the error of the approximation. It is important to remark that neither the matrix $A$ nor the error $R_k$ will be computed completely. In the first step of the algorithm, the row with index $i_k$ of the matrix $A$ will be generated and the corresponding row of the error $R_{k-1}$ will be computed. During this computation the position $j_k$ and the value of the maximum element in the $i_k$-row of $R_{k-1}$ will be determined (Step 2). This element will be called the pivot element. In Step 3, the $i_k$-row of $R_{k-1}$ will be normalized and denoted by $v_k$. Since the position $j_k$ of the pivot element in the $i_k$-row of $R_{k-1}$ is known, we are able to compute the corresponding column of this matrix and denote it as $u_k$ (Step 4). During the computation the position of the next pivot element in the $j_k$-column will be fixed ($i_{k+1}$) in Step 5. The last step of the algorithm updates the approximation $S_{k-1}$ to $S_k = S_{k-1} + u_k v_k^T$. Note that the approximation $S_k$ contains the exact pivot rows $i_l$ and pivot columns $j_l$ of the matrix $A$ for all $l \le k$. An appropriate stopping criterion is given by

$$\|u_k\|_F \, \|v_k\|_F \le \varepsilon \, \|S_k\|_F \qquad (2)$$

Since the matrix $A$ will not be generated completely, only the norm of the approximation $S_k$ is available. This norm can be computed recursively in the following way:

$$\|S_k\|_F^2 = \|S_{k-1}\|_F^2 + 2 \sum_{j=1}^{k-1} (u_k^T u_j)(v_j^T v_k) + \|u_k\|_F^2 \, \|v_k\|_F^2 \qquad (3)$$

The amount of numerical work required by the ACA algorithm is $O(r^2 (n + m))$. Thus, if the numerical rank $r$ of the approximation remains constant (which is usually the case), then the total numerical work for the approximation and the memory requirements are both of the order $O(n + m)$.
III. EXAMPLES
A. Application to the Laplace Equation
Now we apply the ACA algorithm to two mesh sequences. The aim of these computations is to examine the numerical properties of the ACA algorithm rather than to solve a technical problem. The ACA testing environment deals with an exterior Dirichlet problem for the Laplace equation $\Delta u = 0$. The considered boundary surface is discretized by linear triangular elements. The potential is represented by a single- and a double-layer potential (direct method). Nodal collocation yields a linear system whose system matrices are approximated by means of the ACA. The approximated system is solved iteratively by using the generalized minimum residual method (GMRES). In all computations, we set the accuracy $\varepsilon = 10^{-5}$.
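To give a feel for what the ACA of Section II does to a single admissible block at this accuracy, here is a minimal sketch of a row-pivoted cross approximation with the stopping test (2) and the recursive norm update (3). It is not the authors' implementation. Several points are assumptions: the block is a plain point-to-point evaluation of the Laplace kernel 1/(4 pi |x - y|) between two random, well-separated point clouds (no integration over triangles), the first pivot row is row 0, rows already used as pivots are excluded when choosing the next one, and the names `aca`, `get_row`, `get_col`, and `kernel` are hypothetical.

```python
import numpy as np


def aca(get_row, get_col, n_rows, n_cols, eps=1e-5):
    """Row-pivoted ACA sketch: returns U, V with block ~ U @ V.T."""
    us, vs = [], []
    norm_sq = 0.0                        # running ||S_k||_F^2, updated as in (3)
    used = np.zeros(n_rows, dtype=bool)  # rows already taken as pivot rows
    i = 0                                # first pivot row (arbitrary choice)
    while not used.all() and len(us) < min(n_rows, n_cols):
        used[i] = True
        # Steps 1-2: row i of the residual and the position j of its pivot
        row = np.array(get_row(i), dtype=float)
        for u_l, v_l in zip(us, vs):
            row -= u_l[i] * v_l
        j = int(np.argmax(np.abs(row)))
        if row[j] == 0.0:                # residual row vanishes: try another row
            i = int(np.argmin(used))
            continue
        v = row / row[j]                 # Step 3: normalized pivot row v_k
        # Step 4: column j of the residual gives u_k
        u = np.array(get_col(j), dtype=float)
        for u_l, v_l in zip(us, vs):
            u -= v_l[j] * u_l
        # Recursive norm update (3)
        cross = sum((u @ u_l) * (v_l @ v) for u_l, v_l in zip(us, vs))
        norm_sq += 2.0 * cross + (u @ u) * (v @ v)
        us.append(u)
        vs.append(v)
        # Stopping criterion (2)
        if np.linalg.norm(u) * np.linalg.norm(v) <= eps * np.sqrt(max(norm_sq, 0.0)):
            break
        # Step 5: next pivot row = largest entry of |u_k| among unused rows
        i = int(np.argmax(np.where(used, -1.0, np.abs(u))))
    if not us:
        return np.zeros((n_rows, 0)), np.zeros((n_cols, 0))
    return np.column_stack(us), np.column_stack(vs)


def kernel(p, q):
    """Laplace kernel 1 / (4 pi |p - q|), broadcast over the leading axes."""
    return 1.0 / (4.0 * np.pi * np.linalg.norm(p - q, axis=-1))


rng = np.random.default_rng(0)
x = rng.random((300, 3))                              # first point cluster
y = rng.random((250, 3)) + np.array([5.0, 0.0, 0.0])  # well-separated cluster

U, V = aca(lambda i: kernel(x[i], y), lambda j: kernel(x, y[j]), len(x), len(y))

full = kernel(x[:, None, :], y[None, :, :])           # dense block, for checking only
rank = U.shape[1]
rel_err = np.linalg.norm(full - U @ V.T) / np.linalg.norm(full)
print(f"rank {rank}, relative error {rel_err:.1e}, "
      f"storage {rank * (len(x) + len(y))} instead of {full.size} entries")
```

A matrix-by-vector product with the compressed block is then evaluated as `U @ (V.T @ w)`, which touches only the stored factors. Because the stopping test (2) is a heuristic based on the last rank-one term, the true error is checked here against the dense block, which would not be available in a real computation.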
First, we consider the geometry of testing electromagnetic analysis methods (TEAM) problem 10 [6]. The coarsest mesh with approximately $N = 5000$ collocation points is shown in Fig. 3. We perform two mesh refinements in order to get meshes with about 20 000 and 80 000 collocation points, respectively.
This means that the memory required for a fully populated matrix, as well as the cost of a matrix-vector multiplication, would grow by a factor of 16 with each refinement step. Table I shows the memory requirements using the ACA algorithm for the three TEAM meshes. It lists the actual size of the approximants as well as their size relative to full storage. Taking into account the available resources, application of the standard BEM would be possible on the coarsest mesh only.
TABLE I
MEMORY REQUIREMENTS USING THE ACA ALGORITHM
We can observe the almost linear behavior of the memory usage. The average scaling factor of the matrix size after the first mesh refinement is equal to 6.0 and decreases to 5.2 after the second one. Thus, the growth of the matrix size gets closer to linear for large $N$ (exactly linear growth would correspond to a factor of 4 per refinement). Also, the time needed for an iteration step of GMRES and for the generation of the approximants grows almost linearly, because the cost of the corresponding matrix-vector multiplication performed in GMRES depends directly on the matrix size. These data are given in Table II.
TABLE II
COMPUTATION TIMES USING THE ACA ALGORITHM
The values refer to a 1.2-GHz AMD Athlon PC. Note that the table shows the wallclock time and not the CPU time. Therefore, it includes the swap time which the computer needed during the computation for the finest mesh. Still, even the wallclock time does not grow like $O(N^2)$.
The second mesh sequence is based on the geometry of an electromechanical relay as shown in Fig. 4 and explained in more detail in [1] and [7].
Fig. 4. Electromechanical relay. The magnetic circuit consists of a pole core, a magnetic yoke, and a movable armature. Again, the surfaces have been discretized by linear triangular elements to obtain an input mesh for the ACA testing environment.
Again, we consider three meshes and study the behavior of the memory usage and the cost of the matrix-vector multiplication. The size of the approximants and their relative size are given in Table III. The average scaling factors due to the mesh refinements are 5.7 and 4.8, respectively. Thus, we again observe the asymptotically linear behavior of the memory consumption. Analogously to the first example, we give the time spent generating the approximants as well as the cost of an iteration step in Table IV.
TABLE III
MEMORY REQUIREMENTS USING THE ACA ALGORITHM
TABLE IV
COMPUTATION TIMES USING THE ACA ALGORITHM
The values refer to a 1.2-GHz AMD Athlon PC.
The numerical examples above show that the memory usage of the BEM matrices computed by the ACA method grows almost linearly with the number of unknowns on the boundary. The same behavior is observable with respect to the time of approximant generation and the matrix-vector multiplication. Hence, by using the ACA method we are able to handle BEM problems whose solution by application of the standard BEM would be impossible with the same resources.
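The scaling factors quoted in this subsection can be translated into an effective exponent $p$ in a growth law of the form $N^p$. The short sketch below does this arithmetic. It assumes that each refinement multiplies the number of collocation points by 4, which matches the TEAM 10 mesh sizes (5000, 20 000, 80 000) and is assumed here for the relay meshes as well; the function name `effective_exponent` is hypothetical.

```python
import math


def effective_exponent(growth_factor: float, refinement_ratio: float = 4.0) -> float:
    """Exponent p for which N**p reproduces the growth factor per refinement."""
    return math.log(growth_factor) / math.log(refinement_ratio)


# Average scaling factors of the approximant size quoted in the text.
observed = {"TEAM problem 10": (6.0, 5.2), "relay": (5.7, 4.8)}

for name, factors in observed.items():
    exponents = ", ".join(f"{effective_exponent(f):.2f}" for f in factors)
    print(f"{name}: p = {exponents}")

# A fully populated matrix grows by a factor of 16 per refinement.
print(f"dense matrix: p = {effective_exponent(16.0):.2f}")
```

Under this assumption the exponents lie between roughly 1.1 and 1.3 and decrease with refinement, compared with 2 for the fully populated matrix.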
B. Application to a BEM-FEM-Code
Electromagnetic devices can be analyzed by the coupled
BEM-FEM method, where the conducting and magnetic parts
are discretized by finite elements. In contrast, the surrounding
space is described with the BEM. This discretization scheme is
well suited for problems including moving parts and has been
described in detail elsewhere [7]–[9].
In the air domain, the BEM is applied to solve the equation $\Delta \mathbf{A} = -\mu_0 \mathbf{J}$, where $\mathbf{A}$ is the Coulomb gauged magnetic vector potential and $\mathbf{J}$ an impressed source current density. This vector equation decouples into three scalar equations for the Cartesian components of $\mathbf{A}$, so that we are left with the same situation as in the ACA testing environment. We implemented the ACA algorithm into the BEM-FEM code and performed computations for the examples depicted in Figs. 3 and 4. However, quadratic six-noded triangles in connection with quadratic ten-noded tetrahedra have been employed for this analysis.
TEAM problem 10 has been treated as a magnetostatic
problem (for details, see [10]). The symmetry of the problem
has intentionally been disregarded. Some results are collected
in Table V.
The difference of the flux densities with and without ACA
(0.5%) is much smaller than the difference to the measured value
of 1.67 T (3.4%) which is due to the still relatively coarse mesh.
However, the computer resources for ACA-BEM dropped to
about half the amount needed for the standard BEM.
TABLE V
MESH AND COMPUTATIONAL DATA FOR TEAM PROBLEM 10
The values refer to a 300-MHz Sun Ultra workstation.
As a final example, the closing process of the electromechanical relay has been studied, where only half of the mesh shown in Fig. 4 was considered by taking advantage of the symmetry (for details, see [1] and [7]). Results of this computation are given in Table VI.
This example requires an enormous amount of CPU time, because there are many time steps and the BEM matrices have to be reprocessed frequently due to the motion of the armature. The ACA implementation for problems with symmetry is not yet optimized. Despite that fact, the memory requirement could still be reduced to 50% of the previous value.
TABLE VI
MESH AND COMPUTATIONAL DATA FOR THE RELAY
The values refer to a 300-MHz Sun Ultra workstation.
IV. CONCLUSION
The memory consumption of the standard BEM turns out to be the limiting factor in many practical applications. The above results show that the ACA-BEM is a feasible means to overcome these limitations.
The main advantage of the ACA method over the other fast BEM techniques (H-matrices [4], pseudoskeleton approximations [5], multipole decomposition [11]) is that only the original entries of the system matrix are used for its approximation. Thus, the already-developed procedures for generating the BEM matrices can be used after some minor modifications. The ACA algorithm is not difficult to implement, in contrast to the practical implementation of the Taylor series or spherical harmonics used in the multipole method. On the other hand, the multipole method allows the rapid computation of fields and potentials in the BEM domain once the problem has been solved [12].
The second advantage of the ACA-BEM is that any arbitrary accuracy of the approximation can easily be reached. In the worst case, the whole matrix will be generated without any error. Using a sequence of less and less accurate approximations of the same coarse discretization, we are able to fix the bound of the acceptable approximation error. Then, an obvious reduction of this bound due to the increased dimension of the matrix can be used for the final computations on the fine grid.
REFERENCES
[1] V. Rischmüller, M. Haas, S. Kurz, and W. M. Rucker, "3D transient analysis of electromechanical devices using parallel BEM coupled to FEM," IEEE Trans. Magn., vol. 36, pp. 1360–1363, July 2000.
[2] M. Bebendorf, "Approximation of boundary element matrices," Numer. Math., vol. 86, no. 4, pp. 565–589, 2000.
[3] M. Bebendorf and S. Rjasanow, "Matrix compression for the radiation heat transfer in exhaust pipes," in Multifield Problems, A.-M. Sändig, W. Schiehlen, and W. L. Wendland, Eds. Berlin, Germany: Springer-Verlag, 2000, pp. 183–191.
[4] W. Hackbusch, "A sparse matrix arithmetic based on H-matrices. Part I," Computing, vol. 62, no. 2, pp. 89–108, 1999.
[5] S. A. Goreinov, E. E. Tyrtyshnikov, and N. L. Zamarashkin, "A theory of pseudoskeleton approximations," Linear Algebra Applicat., vol. 261, pp. 1–21, 1997.
[6] T. Nakata, N. Takahashi, and K. Fujiwara, "Summary of results for benchmark problem 10 (steel plates around a coil)," COMPEL, pp. 335–344, Sept. 1992. [Online]. Available: http://ics.eclyon.fr/team.html.
[7] S. Kurz, U. Becker, and H. Maisch, "Dynamic simulation of electromechanical systems: From Maxwell's theory to common rail diesel injection," Naturwissenschaften, 2001, to be published.
[8] S. Kurz, J. Fetzer, G. Lehner, and W. M. Rucker, "A novel formulation for 3D eddy current problems with moving bodies using a Lagrangian description and BEM-FEM coupling," IEEE Trans. Magn., vol. 34, pp. 3068–3073, Sept. 1998.
[9] S. Kurz, J. Fetzer, G. Lehner, and W. M. Rucker, "Numerical analysis of 3D eddy current problems with moving bodies using BEM-FEM coupling," Surveys Math. Ind., vol. 9, pp. 131–150, 1999.
[10] K. Preis et al., "Numerical analysis of 3D magnetostatic fields," IEEE Trans. Magn., vol. 27, pp. 3798–3803, Sept. 1991.
[11] V. Rokhlin, "Rapid solution of integral equations of classical potential theory," J. Comput. Phys., vol. 60, no. 2, pp. 187–207, 1985.
[12] A. Buchau, W. Rieger, and W. M. Rucker, "Fast field computations with the fast multipole method," COMPEL, vol. 20, no. 2, pp. 547–561, 2001.
