PERCS (Productive, Easy-to-use, Reliable Computing System) is a DARPA-funded project to build a 10 PF system based on the IBM POWER7 (P7) processor in the 2011 timeframe. The system uses a novel hierarchical mesh interconnect fabric, with the switching function distributed over many “Hub” chips. Our team works on the performance analysis and has validated both the topology choice and the choice of link bandwidths between and within supernodes. Within this project, we first applied our MPI trace-driven simulation to several key HPC applications. We are continually enhancing this method and are currently adding a capability to evaluate the performance of the MPI libraries used in production HPC applications. We have also validated the GUPS (Giga Updates Per Second) benchmark for the full machine under synthetic traffic load. Using a parallel simulation technique on a large SMP server, we have simulated the maximum-size PERCS machine, consisting of 64k P7 chips, 16k hub chips and 50,000 interconnect links.
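For readers unfamiliar with the GUPS metric mentioned above, the following is a minimal, single-process sketch of how the figure is computed: a table is updated at random locations with XOR, and the update count is divided by the elapsed time. It is not the PERCS simulation or the official benchmark code. The function names `run_updates` and `gups` are hypothetical, and Python's `random` module stands in for the benchmark's own random number stream.

```python
import random
import time


def run_updates(log2_table_size, num_updates, seed=0):
    """Apply random XOR updates to a table of 2**log2_table_size words.

    Simplified illustration only: a real GUPS run distributes the table
    over all nodes, so most updates cross the interconnect.
    """
    size = 1 << log2_table_size
    table = list(range(size))        # initial contents: table[i] = i
    rng = random.Random(seed)        # assumption: stand-in random stream
    mask = size - 1                  # size is a power of two
    start = time.perf_counter()
    for _ in range(num_updates):
        value = rng.getrandbits(64)
        table[value & mask] ^= value  # one update: read-modify-write
    elapsed = time.perf_counter() - start
    return table, elapsed


def gups(num_updates, elapsed_seconds):
    """Giga updates per second: update count / time / 1e9."""
    return num_updates / elapsed_seconds / 1e9


if __name__ == "__main__":
    n = 1_000_000
    _, seconds = run_updates(log2_table_size=20, num_updates=n)
    print(f"{gups(n, seconds):.6f} GUPS")
```

Because each update touches an effectively random address, the metric stresses fine-grained random memory access, and on a distributed machine it stresses the interconnect as well.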
Within PERCS, we are also validating the performance of the Hub chip at the VHDL (VHSIC Hardware Description Language) level. We have developed an approach to circumvent the long run times typical of VHDL-level simulations. In some cases, we use packet traces generated by our ZRL network simulations, or Powerbus traces generated in cycle-accurate CPU simulations by the P7 design team, to assess throughput and latency.