Parallel computing

Performance evaluation


Comprehensive performance evaluation of parallel computing systems

In parallel computing systems, a significant share of an application's execution time is spent on communication between processors. For example, a CPMD (Car-Parrinello Molecular Dynamics) application running on a 256-processor cluster spends up to half of its execution time on communication. We expect that a deeper understanding of the dynamics of this communication will enable significant performance improvements.
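To make the potential gain concrete, the following minimal sketch applies an Amdahl-style bound to the communication share of runtime. It is an illustration only: the function name `overall_speedup` and the communication speedup factors are assumptions, and only the 0.5 communication fraction is taken from the CPMD example above.

```python
def overall_speedup(comm_fraction: float, comm_speedup: float) -> float:
    """Amdahl-style bound: only the communication share of runtime is accelerated."""
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be in [0, 1]")
    if comm_speedup <= 0.0:
        raise ValueError("comm_speedup must be positive")
    # Computation time is unchanged; communication time shrinks by comm_speedup.
    return 1.0 / ((1.0 - comm_fraction) + comm_fraction / comm_speedup)


# Communication share of 0.5 as in the CPMD example; the factors below are hypothetical.
for factor in (1.5, 2.0, 4.0):
    gain = overall_speedup(0.5, factor)
    print(f"communication {factor}x faster -> application {gain:.2f}x faster")
```

Under these assumptions, an application that spends half of its time communicating can run at most twice as fast, however much the communication itself is accelerated.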

Our measurement framework integrates hardware and software measurements by synchronizing the application's communication tasks with hardware performance counters. This precise technique allows us to measure the effect of demanding communication patterns on congestion in the cluster's interconnection network. We believe that this holistic approach yields better insight into parallel application performance than observing hardware and software separately.
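The idea of tying communication tasks to counter readings can be sketched as follows. This is not the framework's actual implementation, which is not described here. It assumes that software phases and counter samples share one clock, that the counter is cumulative, and that sample timestamps strictly increase. All names and values (`counter_at`, `attribute`, the sample data) are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

Sample = Tuple[float, int]        # (timestamp in seconds, cumulative counter value)
Phase = Tuple[str, float, float]  # (label, start time, end time)


def counter_at(samples: List[Sample], t: float) -> float:
    """Estimate the cumulative counter value at time t by linear interpolation."""
    times = [ts for ts, _ in samples]
    i = bisect_right(times, t)
    if i == 0:
        return float(samples[0][1])    # before the first sample
    if i == len(samples):
        return float(samples[-1][1])   # at or after the last sample
    (t0, v0), (t1, v1) = samples[i - 1], samples[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def attribute(phases: List[Phase], samples: List[Sample]) -> List[Tuple[str, float]]:
    """Assign to each software phase the counter increase observed during it."""
    return [
        (label, counter_at(samples, end) - counter_at(samples, start))
        for label, start, end in phases
    ]


# Hypothetical data: a cumulative congestion-related counter sampled once per second.
samples = [(0.0, 0), (1.0, 100), (2.0, 400), (3.0, 450)]
phases = [("compute", 0.0, 1.0), ("all-to-all", 1.0, 2.5)]

for label, delta in attribute(phases, samples):
    print(f"{label}: {delta:.0f} counter events")
```

In this made-up data, the counter grows much faster during the communication phase than during the compute phase. A joint hardware and software view is meant to expose this kind of contrast.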

Our measurements have revealed a relationship between parallel application efficiency and interconnect congestion on an IBM pSeries 690 Regatta system. We will extend these insights to a wider range of systems by creating application and interconnection models for future complex parallel computer systems.