Exascale system modeling

Planning the next generation of supercomputers

Planning and financing a large computer system often starts before technological details and products are available.

—Gero Dittmann, IBM scientist

Some of the most disruptive business insights and scientific breakthroughs of our time, such as supply-chain optimization, machine-failure prediction or the study of galaxy evolution and dark energy, are only possible thanks to Big Data analytics and massively parallel software on very large computer systems.

As the largest computers are about to grow by another order of magnitude, we are approaching the exascale era. Building such systems is a daunting task for any organization. Project planning requires early estimates of the system size, performance and power consumption. Our models provide these estimates and support tailoring the hardware to particular software, thus increasing efficiency and reducing costs.

Core methods

PISA

Software analysis for system performance evaluation.

ExtrAx

Predicting software models at the exascale.

ExaBounds

Using fast, analytic models to predict the performance and power consumption of computing systems.

Case study

Square Kilometre Array

Targeted to become the largest and most precise radio telescope in the world, the Square Kilometre Array (SKA) is a next-generation instrument that will need to solve a real Big Data challenge. As the name suggests, the SKA will have a collecting area of one million square meters, made up of a vast array of antennas located on two continents, in South Africa and Australia. Before being made available to astronomers, the radio signals captured by these tens of thousands of antennas will need to be processed electronically. This processing step will face many challenges, one of which is a very limited power budget.

Efficient hardware–software system design practices are necessary to explore large sets of hardware designs and find the system that best meets the performance and power constraints. Our proposed methodology is one such novel design practice. We have successfully applied our tools to analyze various algorithms of the SKA digital processing pipeline. Such algorithms include fast Fourier transforms, beamforming, correlation and sky imaging. Our results provide preliminary guidelines for the design of the SKA compute nodes.

PISA

Workload characterization plays a crucial role in system performance evaluation. A good understanding of software properties can support design decisions to match hardware to applications. To enable design-space exploration studies of a large set of hardware design points, we analyze applications in a hardware-independent manner and then combine the software signatures with fast analytic performance models.

The Platform-Independent Software Analysis (PISA) framework is the enabler of our full-system design and performance evaluation solution. The framework characterizes single-node and multi-node software implementations in a hardware-agnostic manner at application run-time.

This workload characterization tool aims to enable design-space exploration not only of existing, but also of non-existing system architectures. The framework characterizes applications per thread and process by quantifying the parallelism, the flow-control behavior, the memory access patterns and the inter-process communication behavior.
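As a rough illustration of what a hardware-independent signature can look like, the sketch below computes two such properties from a per-thread event trace: the instruction mix and the memory reuse distances. It is not PISA's implementation; the trace format (a list of `(operation, address)` pairs) and the function names are assumptions made for this example only.

```python
from collections import Counter, OrderedDict


def instruction_mix(trace):
    """Fraction of each operation type in a list of (op, address) events."""
    counts = Counter(op for op, _ in trace)
    total = sum(counts.values())
    return {op: n / total for op, n in counts.items()}


def reuse_distances(addresses):
    """Reuse distance of each access: the number of distinct addresses
    touched since the previous access to the same address.
    The first access to an address yields None."""
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in addresses:
        if addr in stack:
            keys = list(stack)
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def characterize(trace):
    """Hypothetical per-thread signature: instruction mix plus reuse distances
    of the memory operations. Neither metric depends on a cache size,
    clock rate, or any other hardware parameter."""
    mem = [addr for op, addr in trace if op in ("load", "store")]
    return {"mix": instruction_mix(trace), "reuse": reuse_distances(mem)}


# Toy trace for one thread (addresses are arbitrary integers).
trace = [("load", 16), ("add", None), ("load", 24), ("store", 16)]
signature = characterize(trace)
```

Because the reuse distances are recorded without reference to a cache, the same signature could later be evaluated against many candidate cache sizes, which is the kind of reuse across design points that the text describes.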

ExtrAx

Collecting an actual profile for exascale applications requires the availability of an exascale machine itself. This is impossible with today’s technologies. Nonetheless, such a software profile would provide valuable information to computer engineers and researchers designing exascale systems.

ExtrAx (Extrapolation of Application at Exascale) is a framework that predicts profiles of exascale applications starting from a set of actual profiles collected at a smaller scale. The latter can be obtained via measurements performed on existing computing systems.

ExtrAx couples supervised and unsupervised machine-learning methods. If preliminary knowledge of the target application is available, such as its computational, memory, or communication complexity, it can be integrated into ExtrAx to shorten the model construction phase and further improve prediction accuracy.

The framework takes as input a set of application profiles, preferably selected by using a statistically sound design of experiments. Then, it classifies the software thread profiles such that all threads in a class scale similarly. Finally, ExtrAx generates prediction models for each thread class. The resulting extrapolation models can be used to predict application profiles at any target scale.
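The three steps above (profiles at small scale, classification of threads by scaling behavior, one model per class) can be sketched in a few lines. This is a deliberately simplified stand-in and not the ExtrAx method: it assumes each profiled metric follows a power law in the scale, and it replaces the machine-learning classification with simple binning of the fitted exponents. All function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def fit_power_law(scales, values):
    """Least-squares fit of values ~ a * scale**b in log-log space; returns (a, b)."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = math.exp(my - b * mx)
    return a, b


def classify_threads(scales, profiles, step=0.5):
    """Group thread ids whose fitted exponents fall into the same bin of width
    `step` (a crude substitute for a real clustering method)."""
    classes = defaultdict(list)
    for tid, values in profiles.items():
        _, b = fit_power_law(scales, values)
        classes[round(b / step)].append(tid)
    return list(classes.values())


def class_models(scales, profiles, classes):
    """Fit one pooled power-law model per thread class."""
    models = []
    for members in classes:
        xs = [s for _ in members for s in scales]
        ys = [v for tid in members for v in profiles[tid]]
        models.append(fit_power_law(xs, ys))
    return models


def predict_profile(model, scale):
    """Extrapolate the modeled metric to an arbitrary target scale."""
    a, b = model
    return a * scale ** b


# Made-up per-thread instruction counts measured at 2, 4 and 8 processes.
scales = [2, 4, 8]
profiles = {0: [100, 100, 100], 1: [100, 50, 25], 2: [200, 100, 50]}
classes = classify_threads(scales, profiles)
models = class_models(scales, profiles, classes)
```

In this toy data, thread 0 does constant work while threads 1 and 2 halve their work each time the scale doubles, so they end up in two separate classes, each with its own extrapolation model.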

ExaBounds

Predicting the performance and power consumption of exascale supercomputing systems is a difficult problem. Conventional methodologies include simulating such systems, which is intractable for exascale systems because simulations are too slow.

ExaBounds solves this problem by using fast, analytic models to predict the performance and power consumption of computing systems. The analytic models capture the interactions of a piece of software running on a computing system and identify the bottlenecks.

This allows us to answer such questions as, “Is there sufficient memory and network bandwidth and are there sufficient processing resources available in the architecture design? Is a fat-tree network topology more suitable than a torus topology for the communication pattern of an application?”

The inputs to ExaBounds are the exascale application profiles generated by ExtrAx and a description of the architecture configuration. Using a complex mathematical model, it predicts the performance and power consumption of the system while running the applications described by the input profiles. Furthermore, it can visualize the architectural bottlenecks.
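To give a feel for how an analytic model turns a profile and an architecture description into a runtime, a power figure and a bottleneck, here is a minimal roofline-style sketch. It is far simpler than the model ExaBounds uses and is not derived from it: it assumes compute, memory traffic and network traffic overlap perfectly, and it uses a flat energy cost per operation and per byte. The class names, fields and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    flops: float      # floating-point operations executed
    mem_bytes: float  # bytes moved to and from memory
    net_bytes: float  # bytes sent over the network


@dataclass
class Architecture:
    peak_flops: float       # FLOP/s
    mem_bandwidth: float    # bytes/s
    net_bandwidth: float    # bytes/s
    static_power: float     # W, drawn regardless of activity
    energy_per_flop: float  # J per floating-point operation
    energy_per_byte: float  # J per byte moved (memory and network alike)


def predict_bounds(profile, arch):
    """Bound the runtime by the slowest resource and estimate average power."""
    times = {
        "compute": profile.flops / arch.peak_flops,
        "memory": profile.mem_bytes / arch.mem_bandwidth,
        "network": profile.net_bytes / arch.net_bandwidth,
    }
    bottleneck = max(times, key=times.get)
    runtime = times[bottleneck]  # assumes perfect overlap of all resources
    energy = (
        arch.static_power * runtime
        + profile.flops * arch.energy_per_flop
        + (profile.mem_bytes + profile.net_bytes) * arch.energy_per_byte
    )
    return {"runtime_s": runtime, "avg_power_w": energy / runtime, "bottleneck": bottleneck}


# Illustrative numbers only.
profile = Profile(flops=1e12, mem_bytes=4e12, net_bytes=1e10)
arch = Architecture(peak_flops=1e12, mem_bandwidth=1e11, net_bandwidth=1e10,
                    static_power=50.0, energy_per_flop=1e-11, energy_per_byte=1e-10)
result = predict_bounds(profile, arch)
```

With these numbers the memory system is the limiting resource. Since each evaluation is a handful of arithmetic operations, a sweep over thousands of candidate architectures stays cheap, which is the advantage of analytic models over simulation that the text points to.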

An overview of a compute node modeled in ExaBounds is shown at right. Such compute nodes can be connected via network topologies such as fat-tree, multi-dimensional torus or HyperX.

Overview of a compute node modeled in ExaBounds

Ask the experts

Andreea Simona Anghel

IBM Research scientist

Gero Dittmann

IBM Research scientist and team lead

Rik Jongerius

IBM Research scientist

Giovanni Mariani

IBM Research scientist

Publications

[1] G. Mariani, A. Anghel, R. Jongerius, G. Dittmann,
“Classification of Thread Profiles for Scaling Application Behavior,”
Journal of Parallel Computing, 66, 1-21, 2017.

[2] G. Mariani, A. Anghel, R. Jongerius, G. Dittmann,
“Predicting cloud performance for HPC applications: A user-oriented approach,”
in Proc. IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2017.

[3] S. Poddar, R. Jongerius, F. Leandro, G. Mariani, G. Dittmann, A. Anghel, H. Corporaal,
“MeSAP: A fast analytic power model for DRAM memories,”
in Proc. IEEE Design, Automation and Test in Europe Conference, pp. 49-54, 2017.

[4] G. Mariani, A. Anghel, R. Jongerius, G. Dittmann,
“Scaling Properties of Parallel Applications to Exascale,”
International Journal of Parallel Programming, 44(5), 975-1002, Springer, 2016.

[5] A. Anghel, L. Vasilescu, G. Mariani, R. Jongerius, G. Dittmann,
“An Instrumentation Approach for Hardware-Agnostic Software Characterization,”
International Journal of Parallel Programming, 44(5), 924-948, 2016.

[6] R. Jongerius,
“Exascale Computer System Design: The Square Kilometer Array,”
PhD dissertation, 2016.

[7] A. Anghel, L. Vasilescu, G. Mariani, R. Jongerius, G. Dittmann,
“An Instrumentation Approach for Hardware-Agnostic Software Characterization,”
in Proc. 12th ACM International Conference on Computing Frontiers (CF’15), New York, USA, pp. 3:1-3:8, 2015.

[8] G. Mariani, A. Anghel, R. Jongerius, G. Dittmann,
“Scaling Application Properties to Exascale,”
in Proc. 12th ACM International Conference on Computing Frontiers (CF’15), New York, USA, pp. 31:1-31:8, 2015.

[9] R. Jongerius, G. Mariani, A. Anghel, E. Vermij, G. Dittmann, H. Corporaal,
“Analytic processor model for fast design-space exploration,”
in Proc. IEEE Int’l Conference on Computer Design (ICCD’15), New York, USA, 2015.

[10] A. Anghel, G. Dittmann, R. Jongerius, and R. Luijten,
“Spatio-temporal locality characterization,”
in Proc. 46th IEEE/ACM Int’l Symposium on Microarchitecture (MICRO-46) Workshops: 1st Workshop on Near-Data Processing, 2013.