[1] 
G. W. Zumbusch.
Adaptive Parallel Multilevel Methods for Partial Differential
Equations.
Habilitation, Universität Bonn, 2001. [ bib ] In this text we propose a spacefilling curve enumeration scheme for the load balancing problem. It is based on the principles of selfsimilarity and scaling invariance. It provides even multilevel locality, i.e. as much locality on each scale as possible. We introduce the spacefilling curve schemes and prove some of the properties of the partitions. The scheme is cheap, deterministic, incremental, can be parallelised and provides acceptable partitions. However, even more striking, it seems to be one of the few partitioning methods where quasioptimal estimates can be shown. We are able to derive sharp estimates both on the partition and on the multilevel algorithms on the partition, which is more than is known about competing graph partitioning load balancing methods so far. Furthermore, we give a survey of the three main aspects of the efficient treatment of PDEs, that is, discretisation, multilevel solution and parallelisation. We will treat the main features of each of the three distinct topics and cover the historical background and modern developments. We demonstrate how all three ingredients can be put together to give an adaptive and parallel multilevel approach for the solution of PDEs. Error estimators and adaptive grid refinement techniques for ordinary and for sparse grid discretisations are presented. Different types of additive and multiplicative multilevel solvers are discussed with respect to parallel implementation and application to adaptive refined grids. Efficiency issues are treated both for the sequential multilevel methods and for the parallel version by hash table storage techniques. Furthermore, spacefilling curve enumeration for parallel load balancing and processor cache efficiency are discussed. We will apply the method to elliptic boundary value problems. We are able to derive estimates for the quality of the partitions by spacefilling curves and the load balancing of the numerical algorithms on the grids. Even for adaptive grid refinement within certain ranges we are able to prove that the partitions are quasioptimal, i.e. the cut sizes of the dual graph are only a constant factor away from optimum independent of the mesh size. Hence we obtain asymptotic optimality of the parallel algorithms. This seems to be remarkable in comparison to graph based heuristics, where little is known about the quality. Furthermore we were able to demonstrate the performance of the method on a range of the world's largest parallel computers, namely ASCI Blue Pacific and a prototype Cray T3E (now presumably at NSA), which are each larger than any nonUS system. We complement this data by simulations run on Parnass2, which was the first nonUS selfmade cluster in the list of the world's largest 500 computers (TOP500). We also demonstrate that this cluster is able to outperform many other commercial parallel computers on a per processor base.
