"Fossies" - the Fresh Open Source Software Archive

Member "git-2.23.0.windows.1/Documentation/technical/rerere.txt" (16 Aug 2019, 6486 Bytes) of package /windows/misc/git-2.23.0.windows.1.zip:


As a special service "Fossies" has tried to format the requested text file into HTML format (style: standard) with prefixed line numbers. Alternatively you can here view or download the uninterpreted source code file.

    1 Rerere
    2 ======
    3 
    4 This document describes the rerere logic.
    5 
    6 Conflict normalization
    7 ----------------------
    8 
    9 To ensure recorded conflict resolutions can be looked up in the rerere
   10 database, even when branches are merged in a different order,
   11 different branches are merged that result in the same conflict, or
   12 when different conflict style settings are used, rerere normalizes the
   13 conflicts before writing them to the rerere database.
   14 
   15 Different conflict styles and branch names are normalized by stripping
   16 the labels from the conflict markers, and removing the common ancestor
   17 version from the `diff3` conflict style. Branches that are merged
   18 in different order are normalized by sorting the conflict hunks.  More
   19 on each of those steps in the following sections.
   20 
   21 Once these two normalization operations are applied, a conflict ID is
   22 calculated based on the normalized conflict, which is later used by
   23 rerere to look up the conflict in the rerere database.
   24 
   25 Removing the common ancestor version
   26 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   27 
   28 Say we have three branches AB, AC and AC2.  The common ancestor of
   29 these branches has a file with a line containing the string "A" (for
   30 brevity this is called "line A" in the rest of the document).  In
   31 branch AB this line is changed to "B", in AC, this line is changed to
   32 "C", and branch AC2 is forked off of AC, after the line was changed to
   33 "C".
   34 
   35 Forking a branch ABAC off of branch AB and then merging AC into it, we
   36 get a conflict like the following:
   37 
   38     <<<<<<< HEAD
   39     B
   40     =======
   41     C
   42     >>>>>>> AC
   43 
   44 Doing the analogous with AC2 (forking a branch ABAC2 off of branch AB
   45 and then merging branch AC2 into it), using the diff3 conflict style,
   46 we get a conflict like the following:
   47 
   48     <<<<<<< HEAD
   49     B
   50     ||||||| merged common ancestors
   51     A
   52     =======
   53     C
   54     >>>>>>> AC2
   55 
   56 By resolving this conflict, to leave line D, the user declares:
   57 
   58     After examining what branches AB and AC did, I believe that making
   59     line A into line D is the best thing to do that is compatible with
   60     what AB and AC wanted to do.
   61 
   62 As branch AC2 refers to the same commit as AC, the above implies that
   63 this is also compatible what AB and AC2 wanted to do.
   64 
   65 By extension, this means that rerere should recognize that the above
   66 conflicts are the same.  To do this, the labels on the conflict
   67 markers are stripped, and the common ancestor version is removed.  The above
   68 examples would both result in the following normalized conflict:
   69 
   70     <<<<<<<
   71     B
   72     =======
   73     C
   74     >>>>>>>
   75 
   76 Sorting hunks
   77 ~~~~~~~~~~~~~
   78 
   79 As before, lets imagine that a common ancestor had a file with line A
   80 its early part, and line X in its late part.  And then four branches
   81 are forked that do these things:
   82 
   83     - AB: changes A to B
   84     - AC: changes A to C
   85     - XY: changes X to Y
   86     - XZ: changes X to Z
   87 
   88 Now, forking a branch ABAC off of branch AB and then merging AC into
   89 it, and forking a branch ACAB off of branch AC and then merging AB
   90 into it, would yield the conflict in a different order.  The former
   91 would say "A became B or C, what now?" while the latter would say "A
   92 became C or B, what now?"
   93 
   94 As a reminder, the act of merging AC into ABAC and resolving the
   95 conflict to leave line D means that the user declares:
   96 
   97     After examining what branches AB and AC did, I believe that
   98     making line A into line D is the best thing to do that is
   99     compatible with what AB and AC wanted to do.
  100 
  101 So the conflict we would see when merging AB into ACAB should be
  102 resolved the same way---it is the resolution that is in line with that
  103 declaration.
  104 
  105 Imagine that similarly previously a branch XYXZ was forked from XY,
  106 and XZ was merged into it, and resolved "X became Y or Z" into "X
  107 became W".
  108 
  109 Now, if a branch ABXY was forked from AB and then merged XY, then ABXY
  110 would have line B in its early part and line Y in its later part.
  111 Such a merge would be quite clean.  We can construct 4 combinations
  112 using these four branches ((AB, AC) x (XY, XZ)).
  113 
  114 Merging ABXY and ACXZ would make "an early A became B or C, a late X
  115 became Y or Z" conflict, while merging ACXY and ABXZ would make "an
  116 early A became C or B, a late X became Y or Z".  We can see there are
  117 4 combinations of ("B or C", "C or B") x ("X or Y", "Y or X").
  118 
  119 By sorting, the conflict is given its canonical name, namely, "an
  120 early part became B or C, a late part becames X or Y", and whenever
  121 any of these four patterns appear, and we can get to the same conflict
  122 and resolution that we saw earlier.
  123 
  124 Without the sorting, we'd have to somehow find a previous resolution
  125 from combinatorial explosion.
  126 
  127 Conflict ID calculation
  128 ~~~~~~~~~~~~~~~~~~~~~~~
  129 
  130 Once the conflict normalization is done, the conflict ID is calculated
  131 as the sha1 hash of the conflict hunks appended to each other,
  132 separated by <NUL> characters.  The conflict markers are stripped out
  133 before the sha1 is calculated.  So in the example above, where we
  134 merge branch AC which changes line A to line C, into branch AB, which
  135 changes line A to line C, the conflict ID would be
  136 SHA1('B<NUL>C<NUL>').
  137 
  138 If there are multiple conflicts in one file, the sha1 is calculated
  139 the same way with all hunks appended to each other, in the order in
  140 which they appear in the file, separated by a <NUL> character.
  141 
  142 Nested conflicts
  143 ~~~~~~~~~~~~~~~~
  144 
  145 Nested conflicts are handled very similarly to "simple" conflicts.
  146 Similar to simple conflicts, the conflict is first normalized by
  147 stripping the labels from conflict markers, stripping the common ancestor
  148 version, and the sorting the conflict hunks, both for the outer and the
  149 inner conflict.  This is done recursively, so any number of nested
  150 conflicts can be handled.
  151 
  152 Note that this only works for conflict markers that "cleanly nest".  If
  153 there are any unmatched conflict markers, rerere will fail to handle
  154 the conflict and record a conflict resolution.
  155 
  156 The only difference is in how the conflict ID is calculated.  For the
  157 inner conflict, the conflict markers themselves are not stripped out
  158 before calculating the sha1.
  159 
  160 Say we have the following conflict for example:
  161 
  162     <<<<<<< HEAD
  163     1
  164     =======
  165     <<<<<<< HEAD
  166     3
  167     =======
  168     2
  169     >>>>>>> branch-2
  170     >>>>>>> branch-3~
  171 
  172 After stripping out the labels of the conflict markers, and sorting
  173 the hunks, the conflict would look as follows:
  174 
  175     <<<<<<<
  176     1
  177     =======
  178     <<<<<<<
  179     2
  180     =======
  181     3
  182     >>>>>>>
  183     >>>>>>>
  184 
  185 and finally the conflict ID would be calculated as:
  186 `sha1('1<NUL><<<<<<<\n3\n=======\n2\n>>>>>>><NUL>')`