Member "opensaf-5.21.09/src/imm/README.SC_ABSENCE" (31 May 2021, 4238 Bytes) of package /linux/misc/opensaf-5.21.09.tar.gz:


SC ABSENCE: Allow IMMNDs to survive SC absence (5.0)
====================================================
Prior to this enhancement, the absence of both SCs caused the IMMNDs to restart
and AMF to reboot the cluster. With this feature, IMMNDs on payloads continue
to provide limited service until an SC is back.


CONFIGURATION
=============
To enable this feature, the IMMSV_SC_ABSENCE_ALLOWED environment variable must
be set for IMMD (immd.conf):

    export IMMSV_SC_ABSENCE_ALLOWED=900

The value is the number of seconds the cluster will tolerate SC absence. A
value of zero means the feature is disabled.
See immd.conf for more details.


IMMND
=====
With the SC absence feature enabled, IMMNDs on payloads can now be coordinator.
This can happen even when the SCs are not absent.

For example, if the cluster has only one SC and the IMMND on that SC restarts,
one of the IMMNDs on the payloads will be elected as the new coordinator.
Without SC absence enabled, the cluster does not tolerate that situation and a
cluster reboot occurs.

If PBE is configured together with this feature, make sure that the shared file
system (where the sqlite database is located) is accessible from all nodes of
the cluster.

Upon receiving the IMMD down event, payload-based IMMNDs unregister with MDS
and then:
- remove all local clients,
- discard all implementers,
- finalize all admin owners,
- abort all non-critical CCBs.

That means the IMMNDs keep only class definitions and object information in
memory during SC absence.

After this cleanup, the IMMNDs register with MDS again to allow clients to read
objects. Only config data can be read, because there is currently no OI
attached for runtime data.
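The cleanup described above can be modelled with a small sketch. This is only an illustration of the listed steps, not IMMND code: the ImmndState and Ccb types and all field names are hypothetical, and retaining CCBs that are already critical is an assumption drawn from the wording "abort all non-critical CCBs".

```python
from dataclasses import dataclass, field


@dataclass
class Ccb:
    ccb_id: int
    critical: bool  # True once the CCB is in its critical phase


@dataclass
class ImmndState:
    """Toy stand-in for per-node IMMND state; every name here is hypothetical."""
    classes: dict = field(default_factory=dict)   # class name -> definition
    objects: dict = field(default_factory=dict)   # object name -> attributes
    clients: set = field(default_factory=set)
    implementers: set = field(default_factory=set)
    admin_owners: set = field(default_factory=set)
    ccbs: list = field(default_factory=list)
    mds_registered: bool = True


def on_immd_down(state: ImmndState) -> None:
    """Apply the cleanup steps in the order they are listed in the text."""
    state.mds_registered = False      # unregister with MDS
    state.clients.clear()             # remove all local clients
    state.implementers.clear()        # discard all implementers
    state.admin_owners.clear()        # finalize all admin owners
    # Abort non-critical CCBs (assumption: critical ones are retained).
    state.ccbs = [c for c in state.ccbs if c.critical]
    state.mds_registered = True       # register again so clients can read
```

After on_immd_down() returns, only classes, objects and any critical CCBs remain, which matches the statement that just class definitions and object information are kept in memory.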

Other operations on the IMM service return SA_AIS_ERR_TRY_AGAIN during SC
absence.
If you retry the APIs on SA_AIS_ERR_TRY_AGAIN, keep retrying for at least the
amount of time that you set for IMMSV_SC_ABSENCE_ALLOWED.

If you get SA_AIS_ERR_BAD_HANDLE, you must re-initialize the handles.
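The retry guidance above can be sketched as a deadline-based loop. This is a minimal illustration, not the IMM API: op and reinit are hypothetical caller-supplied wrappers around the real API calls, and the one-second retry interval is an assumption. The numeric constants follow the SaAisErrorT values in the SA Forum AIS headers.

```python
import time

# SaAisErrorT values as defined in the SA Forum AIS headers.
SA_AIS_OK = 1
SA_AIS_ERR_TRY_AGAIN = 6
SA_AIS_ERR_BAD_HANDLE = 9


def call_with_retry(op, reinit, absence_allowed_s, interval_s=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Call op() until it stops returning TRY_AGAIN or the deadline passes.

    op and reinit are hypothetical callables returning SaAisErrorT-style
    integers. absence_allowed_s should be at least the value configured
    for IMMSV_SC_ABSENCE_ALLOWED.
    """
    deadline = clock() + absence_allowed_s
    while True:
        rc = op()
        if rc == SA_AIS_ERR_BAD_HANDLE:
            # Handles are no longer valid: re-initialize them first.
            rc = reinit()
            if rc == SA_AIS_OK:
                rc = SA_AIS_ERR_TRY_AGAIN  # fresh handles, retry op
        if rc != SA_AIS_ERR_TRY_AGAIN or clock() >= deadline:
            return rc
        sleep(interval_s)
```

The clock and sleep parameters exist only so the loop can be exercised without real waiting; an application would normally leave them at their defaults.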


IMMD
====
After coming back from SC absence, the active IMMD waits 3 seconds for the
veteran IMMNDs to introduce themselves. If no veteran IMMND introduces itself
within 3 seconds, IMMD starts loading from the repository. This avoids the
race condition where IMMD receives and processes an introduce message from the
local IMMND or a newly joined IMMND before those of the veteran IMMNDs.
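The decision described above can be sketched as a pure function over the introductions that arrive after IMMD becomes active. The tuple layout, the function name and the returned labels are hypothetical; only the 3-second window comes from the text.

```python
VETERAN_WAIT_S = 3.0  # wait window stated in the text above


def decide_after_absence(intros, wait_s=VETERAN_WAIT_S):
    """Decide how IMMD recovers its data after SC absence.

    intros is a list of (arrival_time_s, node_id, is_veteran) tuples, with
    arrival time measured from the moment IMMD became active.
    Returns ("sync_from_veteran", [node ids]) if at least one veteran
    introduced itself within the window, else ("load_from_repository", []).
    """
    veterans = [node for t, node, is_vet in sorted(intros)
                if is_vet and t <= wait_s]
    if veterans:
        return "sync_from_veteran", veterans
    return "load_from_repository", []
```

Note that an early introduction from a non-veteran IMMND does not settle the outcome in this sketch, which is the point of the wait: a veteran arriving later within the window still takes precedence.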

The veteran IMMNDs also include the highest fevs and the latest
implementer/admo/ccb ids in the introduce message, to help IMMD restore these
counters to their state right before SC absence.
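One way to picture the counter restoration is to take the maximum of each counter over all veteran introductions. This is a sketch under that assumption; the text does not state how IMMD combines the values, and the VeteranIntro type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VeteranIntro:
    """Hypothetical view of the counters carried in an introduce message."""
    fevs: int
    impl_id: int
    admo_id: int
    ccb_id: int


def restore_counters(intros):
    """Return the per-counter maximum over all veteran introductions.

    Returns None when no veteran introduced itself, in which case there is
    nothing to restore from.
    """
    if not intros:
        return None
    return VeteranIntro(
        fevs=max(i.fevs for i in intros),
        impl_id=max(i.impl_id for i in intros),
        admo_id=max(i.admo_id for i in intros),
        ccb_id=max(i.ccb_id for i in intros),
    )
```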

IMMD then elects one of the veteran IMMNDs as the new coordinator and the data
is synced to the SC-based IMMNDs. After that, the IMM service is fully
functional again.


SC ABSENCE and 2PBE
===================
Support for absent IMMD is incompatible with 2PBE. If both are configured, 2PBE
wins and the IMMD absence feature is ignored. In this case an error message is
written to the syslog at startup.
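The precedence rule above amounts to a small check at startup. The sketch below is illustrative only: the function name is hypothetical and the log text is a placeholder, not the actual syslog message.

```python
import logging


def effective_sc_absence(sc_absence_allowed_s, two_pbe_enabled,
                         log=logging.getLogger("immd-sketch")):
    """Return the SC absence timeout that is actually in effect.

    If 2PBE is enabled together with a non-zero SC absence value, 2PBE wins
    and the result is 0 (feature disabled).
    """
    if two_pbe_enabled and sc_absence_allowed_s > 0:
        # Placeholder wording; the real message in syslog may differ.
        log.error("SC absence is incompatible with 2PBE; ignoring SC absence")
        return 0
    return sc_absence_allowed_s
```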


SC ABSENCE and ROAMING SC
=========================
When both the SC Absence and Roaming SC features are enabled, a network split
can produce multiple partitioned clusters. If the PBE database is configured on
the local node, many divergent IMM databases can occur. If these partitioned
clusters are then rejoined into one cluster, the behavior is undefined. To
avoid this, IMM implements a mechanism to reboot the nodes that were on a
different partition from the selected coordinator [#2936]:

- IMMND sends a re-introduce message using refresh id 3 with the ex-IMMD node
  id.
- When a payload becomes controller, the IMMD selects the IMMND coordinator
  (prioritizing the local IMMND) and sends the reply message to reboot nodes
  whose ex-IMMD node id differs from the ex-IMMD of the selected coordinator.
- The active IMMD uses the new IMMD_A2S_MSG_INTRO_RSP_2 to checkpoint node info
  with ex-IMMD to the standby IMMD.
- IMMND uses MDS_RED_SUBSCRIBE to learn which IMMD is active and which is
  standby, so that it can discard FEVS from an unknown IMMD, or FEVS received
  while waiting for the re-introduce message to be accepted. This prevents the
  IMMND itself from being restarted due to OUT OF ORDER. This mechanism also
  applies when rejoining multiple headless partitions after a network split.

To enable this mechanism, export IMMSV_COORD_SELECT_NODE=1 in immd.conf.
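The selection rule in the list above can be sketched as follows. This is a simplified model, not the IMMD implementation: the function name and the dict layout are hypothetical, and falling back to the lowest node id when the local IMMND has not introduced itself is an assumption of the sketch.

```python
def select_coordinator_and_reboots(local_node, intros):
    """Pick a coordinator and the nodes that must reboot.

    intros maps each IMMND node id to the ex-IMMD node id it reported in
    its re-introduce message. Returns (coordinator, nodes_to_reboot).
    """
    if not intros:
        return None, []
    # Prioritize the local IMMND. The fallback to the lowest node id is an
    # assumption made for this sketch.
    coord = local_node if local_node in intros else min(intros)
    coord_ex_immd = intros[coord]
    # Nodes that followed a different IMMD were on another partition.
    reboot = sorted(n for n, ex in intros.items() if ex != coord_ex_immd)
    return coord, reboot
```

For example, with intros {3: 1, 4: 1, 5: 2} and local node 3, node 3 becomes coordinator and node 5 is told to reboot, because it reported a different ex-IMMD.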