tau  2.25.2
About: TAU (Tuning and Analysis Utilities) is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C++, C, Java and Python.
  Fossies Dox: tau-2.25.2.tar.gz  ("inofficial" and yet experimental doxygen-generated source code documentation)  

APEX: Autonomic Performance Environment for eXascale



One of the key components of the XPRESS project is a new approach to performance observation, measurement, analysis and runtime decision making in order to optimize performance. The particular challenges of accurately measuring the performance characteristics of ParalleX applications requires a new approach to parallel performance observation. The standard model of multiple operating system processes and threads observing themselves in a first-person manner while writing out performance profiles or traces for offline analysis will not adequately capture the full execution context, nor provide opportunities for runtime adaptation within OpenX. The approach taken in the XPRESS project is a new performance measurement system, called (Autonomic Performance Environment for eXascale). APEX will include methods for information sharing between the layers of the software stack, from the hardware through operating and runtime systems, all the way to domain specific or legacy applications. The performance measurement components will incorporate relevant information across stack layers, with merging of third-person performance observation of node-level and global resources, remote processes, and both operating and runtime system threads.



Essentially, APEX is both a measurement system for introspection, as well as a Policy Engine for modifying runtime behavior based on the observations. While APEX has capabilities for generating profile data for post-mortem analysis, the key purpose of the measurement is to provide support for policy enforcement. To that end, APEX is designed to have very low overhead and minimize perturbation of runtime worker thread productivity. APEX supports both start/stop timers and either event-based or periodic counter samples. Measurements are taken synchronously, but profiling statistics and internal state management is performed by (preferably lower-priority) threads distinct from the running application. The heart of APEX is an event handler that dispatches events to registered listeners within APEX. Policy enforcement can trigger synchronously when events are triggered by the OS/RS or application, or can occur asynchronously on a periodic basis.

APEX is a library written in C++, and has both C and C++ external interfaces. While the C interface can be used for either language, some C++ applications prefer to work with namespaces (i.e. apex::*) rather than prefixes (i.e. apex_*). All functionality is supported through both interfaces, and the C interface contains inlined implementations of the C++ code.

While the designed purpose for APEX is supporting the current and future needs of ParalleX runtimes within the XPRESS project (such as HPX3, HPX5), experimental support is also available for introspection of current runtimes such as OpenMP. APEX could potentially be integrated into other runtime systems, such as any of a number of lightweight task based systems. The introspection provided by APEX is intended to be in the third-person model, rather than traditional first-person, per-thread/per-process application profile or tracing measurement. APEX is designed to combine information from the OS, Runtime, hardware and application in order to guide policy decisions.

For distributed communication, APEX provides an API to be implemented for the required communication for a given application. An MPI implementation is provided as a reference, and both HPX3 and HPX5 implementations have been implemented. In this way, APEX is integrated into the observed runtime, and asyncrhonous communication is provided at a lower priority, in order to minimize perturbation of the application.

The direct links to each API are here:

User Manual

For a complete user manual, please see the APEX documentation.


Support for this work was provided through Scientific Discovery through Advanced Computing (SciDAC) program funded by U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (and Basic Energy Sciences/Biological and Environmental Research/High Energy Physics/Fusion Energy Sciences/Nuclear Physics) under award numbers DE-SC0008638, DE-SC0008704, DE-FG02-11ER26050 and DE-SC0006925.