Member "berkeley_upc-2019.4.2/gasnet/pami-conduit/README" (27 May 2019, 7270 Bytes) of package /linux/misc/berkeley_upc-2019.4.2.tar.gz:


GASNet pami-conduit documentation
Paul H. Hargrove <PHHargrove@lbl.gov>

User Information:
-----------------

This is an implementation of the GASNet CORE and EXTENDED
API using the IBM PAMI communication protocol.

Where this conduit runs:
------------------------

pami-conduit implements GASNet over IBM's Parallel Active Messaging Interface
(PAMI), which is available on the IBM Blue Gene/Q, IBM PERCS/POWER 775 systems,
and InfiniBand-connected clusters running IBM's Parallel Environment (PE)
software.  There is a good collection of PAMI-related links available from
https://github.com/jeffhammond/pami-examples/blob/master/README

PAMI is the recommended GASNet conduit on the Blue Gene/Q and POWER 775
(a.k.a. PERCS) systems, and has been developed and tested on both.  On
InfiniBand-connected clusters, ibv-conduit is likely to provide superior
performance.

There are no known minimum required versions of PAMI or related software.

Optional compile-time settings:
-------------------------------

* The following compile-time settings from extended-ref
  (see the extended-ref README):

 + GASNETI_THREADINFO_OPT - optimize thread discovery using hidden local variable

 + GASNETI_LAZY_BEGINFUNCTION - postpone thread discovery to first use

 + GASNETE_SCATTER_EOPS_ACROSS_CACHELINES(1/0) - scatter newly allocated eops
    across cache lines to reduce false sharing

Recognized environment variables:
---------------------------------

* All the standard GASNet environment variables (see top-level README)

* GASNET_BARRIER - barrier algorithm selection
  In addition to the algorithms in the top-level README, there are two
  PAMI-specific values supported:
    PAMIDISSEM - like AMDISSEM, but implemented using PAMI-level AMs.
    PAMIALLREDUCE - barrier matching is implemented in terms of a
                    PAMI-level ALLREDUCE collective operation.
  Currently PAMIDISSEM is the default on all PAMI platforms.

* GASNET_USE_PAMI_COLL - enable use of native-PAMI collectives
  Not all collectives are supported for all input conditions, but when
  support is available this setting controls whether it will be used.
  [NOTE: currently only blocking collectives are implemented over PAMI]
  Additionally, the following allow finer-grained control over which
  collective operations use PAMI_Collective() when GASNET_USE_PAMI_COLL
  is enabled:
    GASNET_USE_PAMI_BROADCAST - gasnet_coll_broadcast functions
    GASNET_USE_PAMI_EXCHANGE  - gasnet_coll_exchange functions
    GASNET_USE_PAMI_GATHER    - gasnet_coll_gather functions
    GASNET_USE_PAMI_GATHERALL - gasnet_coll_gather_all functions
    GASNET_USE_PAMI_SCATTER   - gasnet_coll_scatter functions
  Default value for variables in this family is YES

* GASNET_NETWORKDEPTH - depth of AM Request queues (default 1024)
  This integer parameter sets the limit on the number of outstanding
  Active Message Requests, where outstanding is defined in terms of
  local completion of the network send.
  Too-small values may reduce performance of AM-intensive applications.
  Too-large values may result in excessive buffering requirements in
  AM-intensive applications, which can both reduce performance and
  result in excessive memory use.
  Applications not sending "floods" of AMs will be insensitive to
  the value of this parameter.

* GASNET_AMPOLL_MAX - limit on work done in AMPoll (default 16)
  This integer parameter sets the maximum number of PAMI operations
  to be retired by a call to gasnet_AMPoll().
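
The defaults and on/off switches listed above can be illustrated with a
small sketch.  This is NOT the conduit's parsing code: the helper names
(str_ieq, parse_yesno, parse_count, native_coll_enabled) are hypothetical,
and the accepted yes/no spellings are an assumption made for this example.
Only the variable names and default values (YES, 1024, 16) come from this
README.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Case-insensitive string equality (hypothetical helper). */
int str_ieq(const char *a, const char *b) {
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        a++; b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Interpret a yes/no setting; unset or unrecognized text yields the default.
 * ASSUMPTION: the spellings accepted here are chosen for illustration. */
int parse_yesno(const char *val, int dflt) {
    assert(dflt == 0 || dflt == 1);
    if (val == NULL) return dflt;
    if (str_ieq(val, "yes") || str_ieq(val, "y") || strcmp(val, "1") == 0) return 1;
    if (str_ieq(val, "no")  || str_ieq(val, "n") || strcmp(val, "0") == 0) return 0;
    return dflt;
}

/* Interpret a positive integer setting; unset or malformed text yields the default. */
long parse_count(const char *val, long dflt) {
    char *end;
    long n;
    if (val == NULL || *val == '\0') return dflt;
    n = strtol(val, &end, 10);
    return (*end == '\0' && n > 0) ? n : dflt;
}

/* A native PAMI collective is a candidate only when both the master switch
 * (GASNET_USE_PAMI_COLL) and the per-operation switch are enabled.
 * Both default to YES, as documented above. */
int native_coll_enabled(const char *master, const char *per_op) {
    return parse_yesno(master, 1) && parse_yesno(per_op, 1);
}

/* The documented integer settings, read from the environment with their defaults. */
long networkdepth_from_env(void) { return parse_count(getenv("GASNET_NETWORKDEPTH"), 1024); }
long ampoll_max_from_env(void)   { return parse_count(getenv("GASNET_AMPOLL_MAX"), 16); }
```

For example, native_coll_enabled(getenv("GASNET_USE_PAMI_COLL"),
getenv("GASNET_USE_PAMI_GATHER")) yields 0 when either variable is set to
NO, and 1 when both are unset.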

Known problems:
---------------

* See the GASNet Bugzilla server for details on known bugs:
  https://gasnet-bugs.lbl.gov/

Future work:
------------

The following are planned work items for pami-conduit:

* Use dynamic registration (firehose) when local address is out-of-segment?
  Initial benchmarks seem to show PAMI getting RDMA speeds for transfers of
  sufficient size even when using PAMI_Put/Get, suggesting that some
  dynamic registration is already used internally.
  However, the gap between Put and Rput bandwidth between 2KB and 64KB as
  measured with 1 proc-per-node on PERCS shows that there is currently a
  *possibility* that dynamic registration (firehose) could be beneficial.

* Register bounce buffers used for AM headers and payloads and apply
  the appropriate "use_rdma" hints.

* Use multiple PAMI contexts/endpoints.  At a minimum it would be desirable
  to separate the AM and RDMA for independent progress.  Use of multiple
  endpoints when using pthreads is also worth some implementation effort.
  A separate context used for the exit coordination would prevent deadlock
  when exiting from an AM handler.

* Explore use of PAMI's "remote_async_progress" hint.

* Explore use of bounce buffers to avoid blocking for local completion
  of non-blocking/non-bulk Puts.

* Explore use of conduit-level flow control for AMs, though it is not yet
  certain that this is needed as it was with dcmf-conduit.

* Improve exit handling to raise SIGQUIT for non-collective exits.

* For sufficiently small payloads, AMRequestLong could use a bounce buffer
  to avoid stalling for local completion.

* Explore use of PAMI_Send_immediate() for small enough Medium and/or Long AMs.

==============================================================================

Design Overview:
----------------

* Core API:
  + GASNet's AMs are implemented in terms of PAMI's AMs, and execute
    handlers directly from the PAMI callbacks.
    - Short AMs use PAMI_Send_immediate() because they are small.
    - Medium and Long AMs use PAMI_Send().
    - Medium AMs copy their payloads to bounce buffers to avoid
      stalling for local completion.
    - Long Request AMs block for local completion.
    - LongAsync Request AMs do NOT block for local completion.
    - Long Reply AMs copy their payloads to bounce buffers, like
      a Medium AM, since our running of AM handlers from PAMI's
      callbacks precludes blocking for local completion.
  + The current default barrier is a PAMI-specific implementation of the
    dissemination barrier in terms of PAMI_Send_immediate().
  + GASNet's exit handling is done using a PAMI "all reduce" operation
    (w/ a timeout) to determine the MAX() of the exit codes, and whether
    the exit is collective.  For non-collective exits, the conduit
    currently calls exit(1) and relies on IBM's software to abort the
    job.  However, this does NOT achieve the desired behavior of raising
    SIGQUIT on the non-exiting nodes.
  + PSHM is supported through the default mechanisms.
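
The communication pattern of a dissemination barrier, as used by the
default barrier above, can be sketched as follows.  This shows the
textbook algorithm only, not the conduit's source: the function names are
hypothetical, and the in-memory simulation stands in for what would be one
PAMI_Send_immediate() per node per round in a real run.

```c
#include <assert.h>
#include <stdint.h>

/* Number of rounds: the smallest r with 2^r >= nodes (0 for a single node). */
int dissem_rounds(unsigned nodes) {
    int r = 0;
    assert(nodes >= 1);
    while (((uint64_t)1 << r) < nodes) r++;
    return r;
}

/* Peer that `self` signals in `round`.  Symmetrically, `self` waits for a
 * signal from (self - 2^round) mod nodes. */
unsigned dissem_peer(unsigned self, int round, unsigned nodes) {
    return (unsigned)((self + ((uint64_t)1 << round)) % nodes);
}

/* Simulate the schedule for up to 64 nodes.  heard[i] is a bitmask of the
 * nodes whose arrival node i has learned of, directly or transitively.
 * Returns 1 if every node has heard from all nodes after the last round. */
int dissem_completes(unsigned nodes) {
    uint64_t heard[64], next[64];
    uint64_t all;
    unsigned i;
    int r, rounds;

    assert(nodes >= 1 && nodes <= 64);
    all = (nodes == 64) ? ~(uint64_t)0 : (((uint64_t)1 << nodes) - 1);
    for (i = 0; i < nodes; i++) heard[i] = (uint64_t)1 << i;

    rounds = dissem_rounds(nodes);
    for (r = 0; r < rounds; r++) {
        /* All signals of a round are sent from the state at its start. */
        for (i = 0; i < nodes; i++) next[i] = heard[i];
        for (i = 0; i < nodes; i++)
            next[dissem_peer(i, r, nodes)] |= heard[i];
        for (i = 0; i < nodes; i++) heard[i] = next[i];
    }

    for (i = 0; i < nodes; i++)
        if (heard[i] != all) return 0;
    return 1;
}
```

The number of rounds grows as the base-2 logarithm of the node count
(rounded up), which is why each node sends only a handful of small
messages per barrier even on large jobs.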

* Extended API:
  + GASNET_SEGMENT_FAST and GASNET_SEGMENT_LARGE are identical:
    - The segment is allocated using mmap() via the default mechanisms.
    - The segment is pinned/registered as a single PAMI memory region.
  + All Extended API operations are performed using PAMI_Rput and _Rget
    when both addresses fall in the GASNet segment, and PAMI_Put and
    _Get otherwise.  As a result, GASNET_SEGMENT_EVERYTHING "just works".
  + The blocking operations block for remote completion, of course.
  + The non-blocking NON-BULK Put operations will stall for the required
    local completion, but don't need to stall for remote completion.
    There is not currently any use of bounce buffers for these Puts.
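
The choice between the Rput/Rget and Put/Get paths described above reduces
to a bounds check on both addresses.  The sketch below illustrates that
decision only: the types and names (segment_t, put_path_t, in_segment,
choose_put_path) are hypothetical and do not appear in the conduit, and no
PAMI call is made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one node's GASNet segment. */
typedef struct {
    uintptr_t base;   /* address of the first byte of the segment */
    size_t    size;   /* length of the segment in bytes */
} segment_t;

/* Which family of PAMI calls a transfer would be routed to. */
typedef enum { PATH_RPUT, PATH_PUT } put_path_t;

/* True if [addr, addr+len) lies entirely inside the segment.
 * Written without computing addr+len, which could wrap around. */
int in_segment(const segment_t *seg, uintptr_t addr, size_t len) {
    uintptr_t offset;
    assert(seg != NULL);
    if (addr < seg->base) return 0;
    offset = addr - seg->base;
    return offset <= seg->size && len <= seg->size - offset;
}

/* Registered-memory path only when BOTH ends are in-segment; otherwise
 * fall back to the path that accepts arbitrary addresses. */
put_path_t choose_put_path(const segment_t *local_seg, uintptr_t src,
                           const segment_t *remote_seg, uintptr_t dst,
                           size_t len) {
    return (in_segment(local_seg, src, len) && in_segment(remote_seg, dst, len))
           ? PATH_RPUT : PATH_PUT;
}
```

Because the fallback path accepts any address, a transfer whose source or
destination lies outside the segment still completes, which is the sense
in which GASNET_SEGMENT_EVERYTHING "just works".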