"Fossies" - the Fresh Open Source Software Archive

Member "git-bisect-lk2009.html" (15 Dec 2018, 80998 Bytes) of package /linux/misc/git-htmldocs-2.20.1.tar.xz:


As a special service "Fossies" has tried to format the requested source page into HTML format using (guessed) HTML source code syntax highlighting (style: standard) with prefixed line numbers. Alternatively you can here view or download the uninterpreted source code file.

    1 <?xml version="1.0" encoding="UTF-8"?>
    2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    3     "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
    4 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
    5 <head>
    6 <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
    7 <meta name="generator" content="AsciiDoc 8.6.10" />
    8 <title>Fighting regressions with git bisect</title>
    9 <style type="text/css">
   10 /* Shared CSS for AsciiDoc xhtml11 and html5 backends */
   11 
   12 /* Default font. */
   13 body {
   14   font-family: Georgia,serif;
   15 }
   16 
   17 /* Title font. */
   18 h1, h2, h3, h4, h5, h6,
   19 div.title, caption.title,
   20 thead, p.table.header,
   21 #toctitle,
   22 #author, #revnumber, #revdate, #revremark,
   23 #footer {
   24   font-family: Arial,Helvetica,sans-serif;
   25 }
   26 
   27 body {
   28   margin: 1em 5% 1em 5%;
   29 }
   30 
   31 a {
   32   color: blue;
   33   text-decoration: underline;
   34 }
   35 a:visited {
   36   color: fuchsia;
   37 }
   38 
   39 em {
   40   font-style: italic;
   41   color: navy;
   42 }
   43 
   44 strong {
   45   font-weight: bold;
   46   color: #083194;
   47 }
   48 
   49 h1, h2, h3, h4, h5, h6 {
   50   color: #527bbd;
   51   margin-top: 1.2em;
   52   margin-bottom: 0.5em;
   53   line-height: 1.3;
   54 }
   55 
   56 h1, h2, h3 {
   57   border-bottom: 2px solid silver;
   58 }
   59 h2 {
   60   padding-top: 0.5em;
   61 }
   62 h3 {
   63   float: left;
   64 }
   65 h3 + * {
   66   clear: left;
   67 }
   68 h5 {
   69   font-size: 1.0em;
   70 }
   71 
   72 div.sectionbody {
   73   margin-left: 0;
   74 }
   75 
   76 hr {
   77   border: 1px solid silver;
   78 }
   79 
   80 p {
   81   margin-top: 0.5em;
   82   margin-bottom: 0.5em;
   83 }
   84 
   85 ul, ol, li > p {
   86   margin-top: 0;
   87 }
   88 ul > li     { color: #aaa; }
   89 ul > li > * { color: black; }
   90 
   91 .monospaced, code, pre {
   92   font-family: "Courier New", Courier, monospace;
   93   font-size: inherit;
   94   color: navy;
   95   padding: 0;
   96   margin: 0;
   97 }
   98 pre {
   99   white-space: pre-wrap;
  100 }
  101 
  102 #author {
  103   color: #527bbd;
  104   font-weight: bold;
  105   font-size: 1.1em;
  106 }
  107 #email {
  108 }
  109 #revnumber, #revdate, #revremark {
  110 }
  111 
  112 #footer {
  113   font-size: small;
  114   border-top: 2px solid silver;
  115   padding-top: 0.5em;
  116   margin-top: 4.0em;
  117 }
  118 #footer-text {
  119   float: left;
  120   padding-bottom: 0.5em;
  121 }
  122 #footer-badges {
  123   float: right;
  124   padding-bottom: 0.5em;
  125 }
  126 
  127 #preamble {
  128   margin-top: 1.5em;
  129   margin-bottom: 1.5em;
  130 }
  131 div.imageblock, div.exampleblock, div.verseblock,
  132 div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock,
  133 div.admonitionblock {
  134   margin-top: 1.0em;
  135   margin-bottom: 1.5em;
  136 }
  137 div.admonitionblock {
  138   margin-top: 2.0em;
  139   margin-bottom: 2.0em;
  140   margin-right: 10%;
  141   color: #606060;
  142 }
  143 
  144 div.content { /* Block element content. */
  145   padding: 0;
  146 }
  147 
  148 /* Block element titles. */
  149 div.title, caption.title {
  150   color: #527bbd;
  151   font-weight: bold;
  152   text-align: left;
  153   margin-top: 1.0em;
  154   margin-bottom: 0.5em;
  155 }
  156 div.title + * {
  157   margin-top: 0;
  158 }
  159 
  160 td div.title:first-child {
  161   margin-top: 0.0em;
  162 }
  163 div.content div.title:first-child {
  164   margin-top: 0.0em;
  165 }
  166 div.content + div.title {
  167   margin-top: 0.0em;
  168 }
  169 
  170 div.sidebarblock > div.content {
  171   background: #ffffee;
  172   border: 1px solid #dddddd;
  173   border-left: 4px solid #f0f0f0;
  174   padding: 0.5em;
  175 }
  176 
  177 div.listingblock > div.content {
  178   border: 1px solid #dddddd;
  179   border-left: 5px solid #f0f0f0;
  180   background: #f8f8f8;
  181   padding: 0.5em;
  182 }
  183 
  184 div.quoteblock, div.verseblock {
  185   padding-left: 1.0em;
  186   margin-left: 1.0em;
  187   margin-right: 10%;
  188   border-left: 5px solid #f0f0f0;
  189   color: #888;
  190 }
  191 
  192 div.quoteblock > div.attribution {
  193   padding-top: 0.5em;
  194   text-align: right;
  195 }
  196 
  197 div.verseblock > pre.content {
  198   font-family: inherit;
  199   font-size: inherit;
  200 }
  201 div.verseblock > div.attribution {
  202   padding-top: 0.75em;
  203   text-align: left;
  204 }
  205 /* DEPRECATED: Pre version 8.2.7 verse style literal block. */
  206 div.verseblock + div.attribution {
  207   text-align: left;
  208 }
  209 
  210 div.admonitionblock .icon {
  211   vertical-align: top;
  212   font-size: 1.1em;
  213   font-weight: bold;
  214   text-decoration: underline;
  215   color: #527bbd;
  216   padding-right: 0.5em;
  217 }
  218 div.admonitionblock td.content {
  219   padding-left: 0.5em;
  220   border-left: 3px solid #dddddd;
  221 }
  222 
  223 div.exampleblock > div.content {
  224   border-left: 3px solid #dddddd;
  225   padding-left: 0.5em;
  226 }
  227 
  228 div.imageblock div.content { padding-left: 0; }
  229 span.image img { border-style: none; vertical-align: text-bottom; }
  230 a.image:visited { color: white; }
  231 
  232 dl {
  233   margin-top: 0.8em;
  234   margin-bottom: 0.8em;
  235 }
  236 dt {
  237   margin-top: 0.5em;
  238   margin-bottom: 0;
  239   font-style: normal;
  240   color: navy;
  241 }
  242 dd > *:first-child {
  243   margin-top: 0.1em;
  244 }
  245 
  246 ul, ol {
  247     list-style-position: outside;
  248 }
  249 ol.arabic {
  250   list-style-type: decimal;
  251 }
  252 ol.loweralpha {
  253   list-style-type: lower-alpha;
  254 }
  255 ol.upperalpha {
  256   list-style-type: upper-alpha;
  257 }
  258 ol.lowerroman {
  259   list-style-type: lower-roman;
  260 }
  261 ol.upperroman {
  262   list-style-type: upper-roman;
  263 }
  264 
  265 div.compact ul, div.compact ol,
  266 div.compact p, div.compact p,
  267 div.compact div, div.compact div {
  268   margin-top: 0.1em;
  269   margin-bottom: 0.1em;
  270 }
  271 
  272 tfoot {
  273   font-weight: bold;
  274 }
  275 td > div.verse {
  276   white-space: pre;
  277 }
  278 
  279 div.hdlist {
  280   margin-top: 0.8em;
  281   margin-bottom: 0.8em;
  282 }
  283 div.hdlist tr {
  284   padding-bottom: 15px;
  285 }
  286 dt.hdlist1.strong, td.hdlist1.strong {
  287   font-weight: bold;
  288 }
  289 td.hdlist1 {
  290   vertical-align: top;
  291   font-style: normal;
  292   padding-right: 0.8em;
  293   color: navy;
  294 }
  295 td.hdlist2 {
  296   vertical-align: top;
  297 }
  298 div.hdlist.compact tr {
  299   margin: 0;
  300   padding-bottom: 0;
  301 }
  302 
  303 .comment {
  304   background: yellow;
  305 }
  306 
  307 .footnote, .footnoteref {
  308   font-size: 0.8em;
  309 }
  310 
  311 span.footnote, span.footnoteref {
  312   vertical-align: super;
  313 }
  314 
  315 #footnotes {
  316   margin: 20px 0 20px 0;
  317   padding: 7px 0 0 0;
  318 }
  319 
  320 #footnotes div.footnote {
  321   margin: 0 0 5px 0;
  322 }
  323 
  324 #footnotes hr {
  325   border: none;
  326   border-top: 1px solid silver;
  327   height: 1px;
  328   text-align: left;
  329   margin-left: 0;
  330   width: 20%;
  331   min-width: 100px;
  332 }
  333 
  334 div.colist td {
  335   padding-right: 0.5em;
  336   padding-bottom: 0.3em;
  337   vertical-align: top;
  338 }
  339 div.colist td img {
  340   margin-top: 0.3em;
  341 }
  342 
  343 @media print {
  344   #footer-badges { display: none; }
  345 }
  346 
  347 #toc {
  348   margin-bottom: 2.5em;
  349 }
  350 
  351 #toctitle {
  352   color: #527bbd;
  353   font-size: 1.1em;
  354   font-weight: bold;
  355   margin-top: 1.0em;
  356   margin-bottom: 0.1em;
  357 }
  358 
  359 div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 {
  360   margin-top: 0;
  361   margin-bottom: 0;
  362 }
  363 div.toclevel2 {
  364   margin-left: 2em;
  365   font-size: 0.9em;
  366 }
  367 div.toclevel3 {
  368   margin-left: 4em;
  369   font-size: 0.9em;
  370 }
  371 div.toclevel4 {
  372   margin-left: 6em;
  373   font-size: 0.9em;
  374 }
  375 
  376 span.aqua { color: aqua; }
  377 span.black { color: black; }
  378 span.blue { color: blue; }
  379 span.fuchsia { color: fuchsia; }
  380 span.gray { color: gray; }
  381 span.green { color: green; }
  382 span.lime { color: lime; }
  383 span.maroon { color: maroon; }
  384 span.navy { color: navy; }
  385 span.olive { color: olive; }
  386 span.purple { color: purple; }
  387 span.red { color: red; }
  388 span.silver { color: silver; }
  389 span.teal { color: teal; }
  390 span.white { color: white; }
  391 span.yellow { color: yellow; }
  392 
  393 span.aqua-background { background: aqua; }
  394 span.black-background { background: black; }
  395 span.blue-background { background: blue; }
  396 span.fuchsia-background { background: fuchsia; }
  397 span.gray-background { background: gray; }
  398 span.green-background { background: green; }
  399 span.lime-background { background: lime; }
  400 span.maroon-background { background: maroon; }
  401 span.navy-background { background: navy; }
  402 span.olive-background { background: olive; }
  403 span.purple-background { background: purple; }
  404 span.red-background { background: red; }
  405 span.silver-background { background: silver; }
  406 span.teal-background { background: teal; }
  407 span.white-background { background: white; }
  408 span.yellow-background { background: yellow; }
  409 
  410 span.big { font-size: 2em; }
  411 span.small { font-size: 0.6em; }
  412 
  413 span.underline { text-decoration: underline; }
  414 span.overline { text-decoration: overline; }
  415 span.line-through { text-decoration: line-through; }
  416 
  417 div.unbreakable { page-break-inside: avoid; }
  418 
  419 
  420 /*
  421  * xhtml11 specific
  422  *
  423  * */
  424 
  425 div.tableblock {
  426   margin-top: 1.0em;
  427   margin-bottom: 1.5em;
  428 }
  429 div.tableblock > table {
  430   border: 3px solid #527bbd;
  431 }
  432 thead, p.table.header {
  433   font-weight: bold;
  434   color: #527bbd;
  435 }
  436 p.table {
  437   margin-top: 0;
  438 }
  439 /* Because the table frame attribute is overriden by CSS in most browsers. */
  440 div.tableblock > table[frame="void"] {
  441   border-style: none;
  442 }
  443 div.tableblock > table[frame="hsides"] {
  444   border-left-style: none;
  445   border-right-style: none;
  446 }
  447 div.tableblock > table[frame="vsides"] {
  448   border-top-style: none;
  449   border-bottom-style: none;
  450 }
  451 
  452 
  453 /*
  454  * html5 specific
  455  *
  456  * */
  457 
  458 table.tableblock {
  459   margin-top: 1.0em;
  460   margin-bottom: 1.5em;
  461 }
  462 thead, p.tableblock.header {
  463   font-weight: bold;
  464   color: #527bbd;
  465 }
  466 p.tableblock {
  467   margin-top: 0;
  468 }
  469 table.tableblock {
  470   border-width: 3px;
  471   border-spacing: 0px;
  472   border-style: solid;
  473   border-color: #527bbd;
  474   border-collapse: collapse;
  475 }
  476 th.tableblock, td.tableblock {
  477   border-width: 1px;
  478   padding: 4px;
  479   border-style: solid;
  480   border-color: #527bbd;
  481 }
  482 
  483 table.tableblock.frame-topbot {
  484   border-left-style: hidden;
  485   border-right-style: hidden;
  486 }
  487 table.tableblock.frame-sides {
  488   border-top-style: hidden;
  489   border-bottom-style: hidden;
  490 }
  491 table.tableblock.frame-none {
  492   border-style: hidden;
  493 }
  494 
  495 th.tableblock.halign-left, td.tableblock.halign-left {
  496   text-align: left;
  497 }
  498 th.tableblock.halign-center, td.tableblock.halign-center {
  499   text-align: center;
  500 }
  501 th.tableblock.halign-right, td.tableblock.halign-right {
  502   text-align: right;
  503 }
  504 
  505 th.tableblock.valign-top, td.tableblock.valign-top {
  506   vertical-align: top;
  507 }
  508 th.tableblock.valign-middle, td.tableblock.valign-middle {
  509   vertical-align: middle;
  510 }
  511 th.tableblock.valign-bottom, td.tableblock.valign-bottom {
  512   vertical-align: bottom;
  513 }
  514 
  515 
  516 /*
  517  * manpage specific
  518  *
  519  * */
  520 
  521 body.manpage h1 {
  522   padding-top: 0.5em;
  523   padding-bottom: 0.5em;
  524   border-top: 2px solid silver;
  525   border-bottom: 2px solid silver;
  526 }
  527 body.manpage h2 {
  528   border-style: none;
  529 }
  530 body.manpage div.sectionbody {
  531   margin-left: 3em;
  532 }
  533 
  534 @media print {
  535   body.manpage div#toc { display: none; }
  536 }
  537 
  538 
  539 </style>
  540 <script type="text/javascript">
  541 /*<![CDATA[*/
  542 var asciidoc = {  // Namespace.
  543 
  544 /////////////////////////////////////////////////////////////////////
  545 // Table Of Contents generator
  546 /////////////////////////////////////////////////////////////////////
  547 
  548 /* Author: Mihai Bazon, September 2002
  549  * http://students.infoiasi.ro/~mishoo
  550  *
  551  * Table Of Content generator
  552  * Version: 0.4
  553  *
  554  * Feel free to use this script under the terms of the GNU General Public
  555  * License, as long as you do not remove or alter this notice.
  556  */
  557 
  558  /* modified by Troy D. Hanson, September 2006. License: GPL */
  559  /* modified by Stuart Rackham, 2006, 2009. License: GPL */
  560 
  561 // toclevels = 1..4.
  562 toc: function (toclevels) {
  563 
  564   function getText(el) {
  565     var text = "";
  566     for (var i = el.firstChild; i != null; i = i.nextSibling) {
  567       if (i.nodeType == 3 /* Node.TEXT_NODE */) // IE doesn't speak constants.
  568         text += i.data;
  569       else if (i.firstChild != null)
  570         text += getText(i);
  571     }
  572     return text;
  573   }
  574 
  575   function TocEntry(el, text, toclevel) {
  576     this.element = el;
  577     this.text = text;
  578     this.toclevel = toclevel;
  579   }
  580 
  581   function tocEntries(el, toclevels) {
  582     var result = new Array;
  583     var re = new RegExp('[hH]([1-'+(toclevels+1)+'])');
  584     // Function that scans the DOM tree for header elements (the DOM2
  585     // nodeIterator API would be a better technique but not supported by all
  586     // browsers).
  587     var iterate = function (el) {
  588       for (var i = el.firstChild; i != null; i = i.nextSibling) {
  589         if (i.nodeType == 1 /* Node.ELEMENT_NODE */) {
  590           var mo = re.exec(i.tagName);
  591           if (mo && (i.getAttribute("class") || i.getAttribute("className")) != "float") {
  592             result[result.length] = new TocEntry(i, getText(i), mo[1]-1);
  593           }
  594           iterate(i);
  595         }
  596       }
  597     }
  598     iterate(el);
  599     return result;
  600   }
  601 
  602   var toc = document.getElementById("toc");
  603   if (!toc) {
  604     return;
  605   }
  606 
  607   // Delete existing TOC entries in case we're reloading the TOC.
  608   var tocEntriesToRemove = [];
  609   var i;
  610   for (i = 0; i < toc.childNodes.length; i++) {
  611     var entry = toc.childNodes[i];
  612     if (entry.nodeName.toLowerCase() == 'div'
  613      && entry.getAttribute("class")
  614      && entry.getAttribute("class").match(/^toclevel/))
  615       tocEntriesToRemove.push(entry);
  616   }
  617   for (i = 0; i < tocEntriesToRemove.length; i++) {
  618     toc.removeChild(tocEntriesToRemove[i]);
  619   }
  620 
  621   // Rebuild TOC entries.
  622   var entries = tocEntries(document.getElementById("content"), toclevels);
  623   for (var i = 0; i < entries.length; ++i) {
  624     var entry = entries[i];
  625     if (entry.element.id == "")
  626       entry.element.id = "_toc_" + i;
  627     var a = document.createElement("a");
  628     a.href = "#" + entry.element.id;
  629     a.appendChild(document.createTextNode(entry.text));
  630     var div = document.createElement("div");
  631     div.appendChild(a);
  632     div.className = "toclevel" + entry.toclevel;
  633     toc.appendChild(div);
  634   }
  635   if (entries.length == 0)
  636     toc.parentNode.removeChild(toc);
  637 },
  638 
  639 
  640 /////////////////////////////////////////////////////////////////////
  641 // Footnotes generator
  642 /////////////////////////////////////////////////////////////////////
  643 
  644 /* Based on footnote generation code from:
  645  * http://www.brandspankingnew.net/archive/2005/07/format_footnote.html
  646  */
  647 
  648 footnotes: function () {
  649   // Delete existing footnote entries in case we're reloading the footnodes.
  650   var i;
  651   var noteholder = document.getElementById("footnotes");
  652   if (!noteholder) {
  653     return;
  654   }
  655   var entriesToRemove = [];
  656   for (i = 0; i < noteholder.childNodes.length; i++) {
  657     var entry = noteholder.childNodes[i];
  658     if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote")
  659       entriesToRemove.push(entry);
  660   }
  661   for (i = 0; i < entriesToRemove.length; i++) {
  662     noteholder.removeChild(entriesToRemove[i]);
  663   }
  664 
  665   // Rebuild footnote entries.
  666   var cont = document.getElementById("content");
  667   var spans = cont.getElementsByTagName("span");
  668   var refs = {};
  669   var n = 0;
  670   for (i=0; i<spans.length; i++) {
  671     if (spans[i].className == "footnote") {
  672       n++;
  673       var note = spans[i].getAttribute("data-note");
  674       if (!note) {
  675         // Use [\s\S] in place of . so multi-line matches work.
  676         // Because JavaScript has no s (dotall) regex flag.
  677         note = spans[i].innerHTML.match(/\s*\[([\s\S]*)]\s*/)[1];
  678         spans[i].innerHTML =
  679           "[<a id='_footnoteref_" + n + "' href='#_footnote_" + n +
  680           "' title='View footnote' class='footnote'>" + n + "</a>]";
  681         spans[i].setAttribute("data-note", note);
  682       }
  683       noteholder.innerHTML +=
  684         "<div class='footnote' id='_footnote_" + n + "'>" +
  685         "<a href='#_footnoteref_" + n + "' title='Return to text'>" +
  686         n + "</a>. " + note + "</div>";
  687       var id =spans[i].getAttribute("id");
  688       if (id != null) refs["#"+id] = n;
  689     }
  690   }
  691   if (n == 0)
  692     noteholder.parentNode.removeChild(noteholder);
  693   else {
  694     // Process footnoterefs.
  695     for (i=0; i<spans.length; i++) {
  696       if (spans[i].className == "footnoteref") {
  697         var href = spans[i].getElementsByTagName("a")[0].getAttribute("href");
  698         href = href.match(/#.*/)[0];  // Because IE return full URL.
  699         n = refs[href];
  700         spans[i].innerHTML =
  701           "[<a href='#_footnote_" + n +
  702           "' title='View footnote' class='footnote'>" + n + "</a>]";
  703       }
  704     }
  705   }
  706 },
  707 
  708 install: function(toclevels) {
  709   var timerId;
  710 
  711   function reinstall() {
  712     asciidoc.footnotes();
  713     if (toclevels) {
  714       asciidoc.toc(toclevels);
  715     }
  716   }
  717 
  718   function reinstallAndRemoveTimer() {
  719     clearInterval(timerId);
  720     reinstall();
  721   }
  722 
  723   timerId = setInterval(reinstall, 500);
  724   if (document.addEventListener)
  725     document.addEventListener("DOMContentLoaded", reinstallAndRemoveTimer, false);
  726   else
  727     window.onload = reinstallAndRemoveTimer;
  728 }
  729 
  730 }
  731 asciidoc.install();
  732 /*]]>*/
  733 </script>
  734 </head>
  735 <body class="article">
  736 <div id="header">
  737 <h1>Fighting regressions with git bisect</h1>
  738 <span id="author">Christian Couder</span><br />
  739 <span id="email"><code>&lt;<a href="mailto:chriscool@tuxfamily.org">chriscool@tuxfamily.org</a>&gt;</code></span><br />
  740 <span id="revdate">2009/11/08</span>
  741 </div>
  742 <div id="content">
  743 <div class="sect1">
  744 <h2 id="_abstract">Abstract</h2>
  745 <div class="sectionbody">
  746 <div class="paragraph"><p>"git bisect" enables software users and developers to easily find the
  747 commit that introduced a regression. We show why it is important to
  748 have good tools to fight regressions. We describe how "git bisect"
  749 works from the outside and the algorithms it uses inside. Then we
  750 explain how to take advantage of "git bisect" to improve current
  751 practices. And we discuss how "git bisect" could improve in the
  752 future.</p></div>
  753 </div>
  754 </div>
  755 <div class="sect1">
  756 <h2 id="_introduction_to_git_bisect">Introduction to "git bisect"</h2>
  757 <div class="sectionbody">
  758 <div class="paragraph"><p>Git is a Distributed Version Control system (DVCS) created by Linus
  759 Torvalds and maintained by Junio Hamano.</p></div>
  760 <div class="paragraph"><p>In Git like in many other Version Control Systems (VCS), the different
  761 states of the data that is managed by the system are called
  762 commits. And, as VCS are mostly used to manage software source code,
  763 sometimes "interesting" changes of behavior in the software are
  764 introduced in some commits.</p></div>
  765 <div class="paragraph"><p>In fact people are specially interested in commits that introduce a
  766 "bad" behavior, called a bug or a regression. They are interested in
  767 these commits because a commit (hopefully) contains a very small set
  768 of source code changes. And it&#8217;s much easier to understand and
  769 properly fix a problem when you only need to check a very small set of
  770 changes, than when you don&#8217;t know where look in the first place.</p></div>
  771 <div class="paragraph"><p>So to help people find commits that introduce a "bad" behavior, the
  772 "git bisect" set of commands was invented. And it follows of course
  773 that in "git bisect" parlance, commits where the "interesting
  774 behavior" is present are called "bad" commits, while other commits are
  775 called "good" commits. And a commit that introduce the behavior we are
  776 interested in is called a "first bad commit". Note that there could be
  777 more than one "first bad commit" in the commit space we are searching.</p></div>
  778 <div class="paragraph"><p>So "git bisect" is designed to help find a "first bad commit". And to
  779 be as efficient as possible, it tries to perform a binary search.</p></div>
  780 </div>
  781 </div>
  782 <div class="sect1">
  783 <h2 id="_fighting_regressions_overview">Fighting regressions overview</h2>
  784 <div class="sectionbody">
  785 <div class="sect2">
  786 <h3 id="_regressions_a_big_problem">Regressions: a big problem</h3>
  787 <div class="paragraph"><p>Regressions are a big problem in the software industry. But it&#8217;s
  788 difficult to put some real numbers behind that claim.</p></div>
  789 <div class="paragraph"><p>There are some numbers about bugs in general, like a NIST study in
  790 2002 <a href="#1">[1]</a> that said:</p></div>
  791 <div class="quoteblock">
  792 <div class="content">
  793 <div class="paragraph"><p>Software bugs, or errors, are so prevalent and so detrimental that
  794 they cost the U.S. economy an estimated $59.5 billion annually, or
  795 about 0.6 percent of the gross domestic product, according to a newly
  796 released study commissioned by the Department of Commerce&#8217;s National
  797 Institute of Standards and Technology (NIST). At the national level,
  798 over half of the costs are borne by software users and the remainder
  799 by software developers/vendors.  The study also found that, although
  800 all errors cannot be removed, more than a third of these costs, or an
  801 estimated $22.2 billion, could be eliminated by an improved testing
  802 infrastructure that enables earlier and more effective identification
  803 and removal of software defects. These are the savings associated with
  804 finding an increased percentage (but not 100 percent) of errors closer
  805 to the development stages in which they are introduced. Currently,
  806 over half of all errors are not found until "downstream" in the
  807 development process or during post-sale software use.</p></div>
  808 </div>
  809 <div class="attribution">
  810 </div></div>
  811 <div class="paragraph"><p>And then:</p></div>
  812 <div class="quoteblock">
  813 <div class="content">
  814 <div class="paragraph"><p>Software developers already spend approximately 80 percent of
  815 development costs on identifying and correcting defects, and yet few
  816 products of any type other than software are shipped with such high
  817 levels of errors.</p></div>
  818 </div>
  819 <div class="attribution">
  820 </div></div>
  821 <div class="paragraph"><p>Eventually the conclusion started with:</p></div>
  822 <div class="quoteblock">
  823 <div class="content">
  824 <div class="paragraph"><p>The path to higher software quality is significantly improved software
  825 testing.</p></div>
  826 </div>
  827 <div class="attribution">
  828 </div></div>
  829 <div class="paragraph"><p>There are other estimates saying that 80% of the cost related to
  830 software is about maintenance <a href="#2">[2]</a>.</p></div>
  831 <div class="paragraph"><p>Though, according to Wikipedia <a href="#3">[3]</a>:</p></div>
  832 <div class="quoteblock">
  833 <div class="content">
  834 <div class="paragraph"><p>A common perception of maintenance is that it is merely fixing
  835 bugs. However, studies and surveys over the years have indicated that
  836 the majority, over 80%, of the maintenance effort is used for
  837 non-corrective actions (Pigosky 1997). This perception is perpetuated
  838 by users submitting problem reports that in reality are functionality
  839 enhancements to the system.</p></div>
  840 </div>
  841 <div class="attribution">
  842 </div></div>
  843 <div class="paragraph"><p>But we can guess that improving on existing software is very costly
  844 because you have to watch out for regressions. At least this would
  845 make the above studies consistent among themselves.</p></div>
  846 <div class="paragraph"><p>Of course some kind of software is developed, then used during some
  847 time without being improved on much, and then finally thrown away. In
  848 this case, of course, regressions may not be a big problem. But on the
  849 other hand, there is a lot of big software that is continually
  850 developed and maintained during years or even tens of years by a lot
  851 of people. And as there are often many people who depend (sometimes
  852 critically) on such software, regressions are a really big problem.</p></div>
  853 <div class="paragraph"><p>One such software is the Linux kernel. And if we look at the Linux
  854 kernel, we can see that a lot of time and effort is spent to fight
  855 regressions. The release cycle start with a 2 weeks long merge
  856 window. Then the first release candidate (rc) version is tagged. And
  857 after that about 7 or 8 more rc versions will appear with around one
  858 week between each of them, before the final release.</p></div>
  859 <div class="paragraph"><p>The time between the first rc release and the final release is
  860 supposed to be used to test rc versions and fight bugs and especially
  861 regressions. And this time is more than 80% of the release cycle
  862 time. But this is not the end of the fight yet, as of course it
  863 continues after the release.</p></div>
  864 <div class="paragraph"><p>And then this is what Ingo Molnar (a well known Linux kernel
  865 developer) says about his use of git bisect:</p></div>
  866 <div class="quoteblock">
  867 <div class="content">
  868 <div class="paragraph"><p>I most actively use it during the merge window (when a lot of trees
  869 get merged upstream and when the influx of bugs is the highest) - and
  870 yes, there have been cases that i used it multiple times a day. My
  871 average is roughly once a day.</p></div>
  872 </div>
  873 <div class="attribution">
  874 </div></div>
  875 <div class="paragraph"><p>So regressions are fought all the time by developers, and indeed it is
  876 well known that bugs should be fixed as soon as possible, so as soon
  877 as they are found. That&#8217;s why it is interesting to have good tools for
  878 this purpose.</p></div>
  879 </div>
  880 <div class="sect2">
  881 <h3 id="_other_tools_to_fight_regressions">Other tools to fight regressions</h3>
  882 <div class="paragraph"><p>So what are the tools used to fight regressions? They are nearly the
  883 same as those used to fight regular bugs. The only specific tools are
  884 test suites and tools similar as "git bisect".</p></div>
  885 <div class="paragraph"><p>Test suites are very nice. But when they are used alone, they are
  886 supposed to be used so that all the tests are checked after each
  887 commit. This means that they are not very efficient, because many
  888 tests are run for no interesting result, and they suffer from
  889 combinational explosion.</p></div>
  890 <div class="paragraph"><p>In fact the problem is that big software often has many different
  891 configuration options and that each test case should pass for each
  892 configuration after each commit. So if you have for each release: N
  893 configurations, M commits and T test cases, you should perform:</p></div>
  894 <div class="listingblock">
  895 <div class="content">
  896 <pre><code>N * M * T tests</code></pre>
  897 </div></div>
  898 <div class="paragraph"><p>where N, M and T are all growing with the size your software.</p></div>
  899 <div class="paragraph"><p>So very soon it will not be possible to completely test everything.</p></div>
  900 <div class="paragraph"><p>And if some bugs slip through your test suite, then you can add a test
  901 to your test suite. But if you want to use your new improved test
  902 suite to find where the bug slipped in, then you will either have to
  903 emulate a bisection process or you will perhaps bluntly test each
  904 commit backward starting from the "bad" commit you have which may be
  905 very wasteful.</p></div>
  906 </div>
  907 </div>
  908 </div>
  909 <div class="sect1">
  910 <h2 id="_git_bisect_overview">"git bisect" overview</h2>
  911 <div class="sectionbody">
  912 <div class="sect2">
  913 <h3 id="_starting_a_bisection">Starting a bisection</h3>
  914 <div class="paragraph"><p>The first "git bisect" subcommand to use is "git bisect start" to
  915 start the search. Then bounds must be set to limit the commit
  916 space. This is done usually by giving one "bad" and at least one
  917 "good" commit. They can be passed in the initial call to "git bisect
  918 start" like this:</p></div>
  919 <div class="listingblock">
  920 <div class="content">
  921 <pre><code>$ git bisect start [BAD [GOOD...]]</code></pre>
  922 </div></div>
  923 <div class="paragraph"><p>or they can be set using:</p></div>
  924 <div class="listingblock">
  925 <div class="content">
  926 <pre><code>$ git bisect bad [COMMIT]</code></pre>
  927 </div></div>
  928 <div class="paragraph"><p>and:</p></div>
  929 <div class="listingblock">
  930 <div class="content">
  931 <pre><code>$ git bisect good [COMMIT...]</code></pre>
  932 </div></div>
  933 <div class="paragraph"><p>where BAD, GOOD and COMMIT are all names that can be resolved to a
  934 commit.</p></div>
  935 <div class="paragraph"><p>Then "git bisect" will checkout a commit of its choosing and ask the
  936 user to test it, like this:</p></div>
  937 <div class="listingblock">
  938 <div class="content">
  939 <pre><code>$ git bisect start v2.6.27 v2.6.25
  940 Bisecting: 10928 revisions left to test after this (roughly 14 steps)
  941 [2ec65f8b89ea003c27ff7723525a2ee335a2b393] x86: clean up using max_low_pfn on 32-bit</code></pre>
  942 </div></div>
  943 <div class="paragraph"><p>Note that the example that we will use is really a toy example, we
  944 will be looking for the first commit that has a version like
  945 "2.6.26-something", that is the commit that has a "SUBLEVEL = 26" line
  946 in the top level Makefile. This is a toy example because there are
  947 better ways to find this commit with Git than using "git bisect" (for
  948 example "git blame" or "git log -S&lt;string&gt;").</p></div>
  949 </div>
  950 <div class="sect2">
  951 <h3 id="_driving_a_bisection_manually">Driving a bisection manually</h3>
  952 <div class="paragraph"><p>At this point there are basically 2 ways to drive the search. It can
  953 be driven manually by the user or it can be driven automatically by a
  954 script or a command.</p></div>
  955 <div class="paragraph"><p>If the user is driving it, then at each step of the search, the user
  956 will have to test the current commit and say if it is "good" or "bad"
  957 using the "git bisect good" or "git bisect bad" commands respectively
  958 that have been described above. For example:</p></div>
  959 <div class="listingblock">
  960 <div class="content">
  961 <pre><code>$ git bisect bad
  962 Bisecting: 5480 revisions left to test after this (roughly 13 steps)
  963 [66c0b394f08fd89236515c1c84485ea712a157be] KVM: kill file-&gt;f_count abuse in kvm</code></pre>
  964 </div></div>
  965 <div class="paragraph"><p>And after a few more steps like that, "git bisect" will eventually
  966 find a first bad commit:</p></div>
  967 <div class="listingblock">
  968 <div class="content">
  969 <pre><code>$ git bisect bad
  970 2ddcca36c8bcfa251724fe342c8327451988be0d is the first bad commit
  971 commit 2ddcca36c8bcfa251724fe342c8327451988be0d
  972 Author: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
  973 Date:   Sat May 3 11:59:44 2008 -0700
  974 
  975     Linux 2.6.26-rc1
  976 
  977 :100644 100644 5cf82581... 4492984e... M      Makefile</code></pre>
  978 </div></div>
  979 <div class="paragraph"><p>At this point we can see what the commit does, check it out (if it&#8217;s
  980 not already checked out) or tinker with it, for example:</p></div>
  981 <div class="listingblock">
  982 <div class="content">
  983 <pre><code>$ git show HEAD
  984 commit 2ddcca36c8bcfa251724fe342c8327451988be0d
  985 Author: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
  986 Date:   Sat May 3 11:59:44 2008 -0700
  987 
  988     Linux 2.6.26-rc1
  989 
  990 diff --git a/Makefile b/Makefile
  991 index 5cf8258..4492984 100644
  992 --- a/Makefile
  993 +++ b/Makefile
  994 @@ -1,7 +1,7 @@
  995  VERSION = 2
  996  PATCHLEVEL = 6
  997 -SUBLEVEL = 25
  998 -EXTRAVERSION =
  999 +SUBLEVEL = 26
 1000 +EXTRAVERSION = -rc1
 1001  NAME = Funky Weasel is Jiggy wit it
 1002 
 1003  # *DOCUMENTATION*</code></pre>
 1004 </div></div>
 1005 <div class="paragraph"><p>And when we are finished we can use "git bisect reset" to go back to
 1006 the branch we were in before we started bisecting:</p></div>
 1007 <div class="listingblock">
 1008 <div class="content">
 1009 <pre><code>$ git bisect reset
 1010 Checking out files: 100% (21549/21549), done.
 1011 Previous HEAD position was 2ddcca3... Linux 2.6.26-rc1
 1012 Switched to branch 'master'</code></pre>
 1013 </div></div>
 1014 </div>
 1015 <div class="sect2">
 1016 <h3 id="_driving_a_bisection_automatically">Driving a bisection automatically</h3>
 1017 <div class="paragraph"><p>The other way to drive the bisection process is to tell "git bisect"
 1018 to launch a script or command at each bisection step to know if the
 1019 current commit is "good" or "bad". To do that, we use the "git bisect
 1020 run" command. For example:</p></div>
 1021 <div class="listingblock">
 1022 <div class="content">
 1023 <pre><code>$ git bisect start v2.6.27 v2.6.25
 1024 Bisecting: 10928 revisions left to test after this (roughly 14 steps)
 1025 [2ec65f8b89ea003c27ff7723525a2ee335a2b393] x86: clean up using max_low_pfn on 32-bit
 1026 $
 1027 $ git bisect run grep '^SUBLEVEL = 25' Makefile
 1028 running grep ^SUBLEVEL = 25 Makefile
 1029 Bisecting: 5480 revisions left to test after this (roughly 13 steps)
 1030 [66c0b394f08fd89236515c1c84485ea712a157be] KVM: kill file-&gt;f_count abuse in kvm
 1031 running grep ^SUBLEVEL = 25 Makefile
 1032 SUBLEVEL = 25
 1033 Bisecting: 2740 revisions left to test after this (roughly 12 steps)
 1034 [671294719628f1671faefd4882764886f8ad08cb] V4L/DVB(7879): Adding cx18 Support for mxl5005s
 1035 ...
 1036 ...
 1037 running grep ^SUBLEVEL = 25 Makefile
 1038 Bisecting: 0 revisions left to test after this (roughly 0 steps)
 1039 [2ddcca36c8bcfa251724fe342c8327451988be0d] Linux 2.6.26-rc1
 1040 running grep ^SUBLEVEL = 25 Makefile
 1041 2ddcca36c8bcfa251724fe342c8327451988be0d is the first bad commit
 1042 commit 2ddcca36c8bcfa251724fe342c8327451988be0d
 1043 Author: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
 1044 Date:   Sat May 3 11:59:44 2008 -0700
 1045 
 1046     Linux 2.6.26-rc1
 1047 
 1048 :100644 100644 5cf82581... 4492984e... M      Makefile
 1049 bisect run success</code></pre>
 1050 </div></div>
 1051 <div class="paragraph"><p>In this example, we passed "grep <em>^SUBLEVEL = 25</em> Makefile" as
 1052 parameter to "git bisect run". This means that at each step, the grep
 1053 command we passed will be launched. And if it exits with code 0 (that
 1054 means success) then git bisect will mark the current state as
 1055 "good". If it exits with code 1 (or any code between 1 and 127
 1056 included, except the special code 125), then the current state will be
 1057 marked as "bad".</p></div>
 1058 <div class="paragraph"><p>Exit code between 128 and 255 are special to "git bisect run". They
 1059 make it stop immediately the bisection process. This is useful for
 1060 example if the command passed takes too long to complete, because you
 1061 can kill it with a signal and it will stop the bisection process.</p></div>
 1062 <div class="paragraph"><p>It can also be useful in scripts passed to "git bisect run" to "exit
 1063 255" if some very abnormal situation is detected.</p></div>
 1064 </div>
 1065 <div class="sect2">
 1066 <h3 id="_avoiding_untestable_commits">Avoiding untestable commits</h3>
 1067 <div class="paragraph"><p>Sometimes it happens that the current state cannot be tested, for
 1068 example if it does not compile because there was a bug preventing it
 1069 at that time. This is what the special exit code 125 is for. It tells
 1070 "git bisect run" that the current commit should be marked as
 1071 untestable and that another one should be chosen and checked out.</p></div>
 1072 <div class="paragraph"><p>If the bisection process is driven manually, you can use "git bisect
 1073 skip" to do the same thing. (In fact the special exit code 125 makes
 1074 "git bisect run" use "git bisect skip" in the background.)</p></div>
 1075 <div class="paragraph"><p>Or if you want more control, you can inspect the current state using
 1076 for example "git bisect visualize". It will launch gitk (or "git log"
 1077 if the <code>DISPLAY</code> environment variable is not set) to help you find a
 1078 better bisection point.</p></div>
 1079 <div class="paragraph"><p>Either way, if you have a string of untestable commits, it might
 1080 happen that the regression you are looking for has been introduced by
 1081 one of these untestable commits. In this case it&#8217;s not possible to
 1082 tell for sure which commit introduced the regression.</p></div>
 1083 <div class="paragraph"><p>So if you used "git bisect skip" (or the run script exited with
 1084 special code 125) you could get a result like this:</p></div>
 1085 <div class="listingblock">
 1086 <div class="content">
 1087 <pre><code>There are only 'skip'ped commits left to test.
 1088 The first bad commit could be any of:
 1089 15722f2fa328eaba97022898a305ffc8172db6b1
 1090 78e86cf3e850bd755bb71831f42e200626fbd1e0
 1091 e15b73ad3db9b48d7d1ade32f8cd23a751fe0ace
 1092 070eab2303024706f2924822bfec8b9847e4ac1b
 1093 We cannot bisect more!</code></pre>
 1094 </div></div>
 1095 </div>
 1096 <div class="sect2">
 1097 <h3 id="_saving_a_log_and_replaying_it">Saving a log and replaying it</h3>
 1098 <div class="paragraph"><p>If you want to show other people your bisection process, you can get a
 1099 log using for example:</p></div>
 1100 <div class="listingblock">
 1101 <div class="content">
 1102 <pre><code>$ git bisect log &gt; bisect_log.txt</code></pre>
 1103 </div></div>
 1104 <div class="paragraph"><p>And it is possible to replay it using:</p></div>
 1105 <div class="listingblock">
 1106 <div class="content">
 1107 <pre><code>$ git bisect replay bisect_log.txt</code></pre>
 1108 </div></div>
 1109 </div>
 1110 </div>
 1111 </div>
 1112 <div class="sect1">
 1113 <h2 id="_git_bisect_details">"git bisect" details</h2>
 1114 <div class="sectionbody">
 1115 <div class="sect2">
 1116 <h3 id="_bisection_algorithm">Bisection algorithm</h3>
 1117 <div class="paragraph"><p>As the Git commits form a directed acyclic graph (DAG), finding the
 1118 best bisection commit to test at each step is not so simple. Anyway
 1119 Linus found and implemented a "truly stupid" algorithm, later improved
 1120 by Junio Hamano, that works quite well.</p></div>
 1121 <div class="paragraph"><p>So the algorithm used by "git bisect" to find the best bisection
 1122 commit when there are no skipped commits is the following:</p></div>
 1123 <div class="paragraph"><p>1) keep only the commits that:</p></div>
 1124 <div class="paragraph"><p>a) are ancestor of the "bad" commit (including the "bad" commit itself),
 1125 b) are not ancestor of a "good" commit (excluding the "good" commits).</p></div>
 1126 <div class="paragraph"><p>This means that we get rid of the uninteresting commits in the DAG.</p></div>
 1127 <div class="paragraph"><p>For example if we start with a graph like this:</p></div>
 1128 <div class="listingblock">
 1129 <div class="content">
 1130 <pre><code>G-Y-G-W-W-W-X-X-X-X
 1131            \ /
 1132             W-W-B
 1133            /
 1134 Y---G-W---W
 1135  \ /   \
 1136 Y-Y     X-X-X-X
 1137 
 1138 -&gt; time goes this way -&gt;</code></pre>
 1139 </div></div>
 1140 <div class="paragraph"><p>where B is the "bad" commit, "G" are "good" commits and W, X, and Y
 1141 are other commits, we will get the following graph after this first
 1142 step:</p></div>
 1143 <div class="listingblock">
 1144 <div class="content">
 1145 <pre><code>W-W-W
 1146      \
 1147       W-W-B
 1148      /
 1149 W---W</code></pre>
 1150 </div></div>
 1151 <div class="paragraph"><p>So only the W and B commits will be kept. Because commits X and Y will
 1152 have been removed by rules a) and b) respectively, and because commits
 1153 G are removed by rule b) too.</p></div>
 1154 <div class="paragraph"><p>Note for Git users, that it is equivalent as keeping only the commit
 1155 given by:</p></div>
 1156 <div class="listingblock">
 1157 <div class="content">
 1158 <pre><code>git rev-list BAD --not GOOD1 GOOD2...</code></pre>
 1159 </div></div>
 1160 <div class="paragraph"><p>Also note that we don&#8217;t require the commits that are kept to be
 1161 descendants of a "good" commit. So in the following example, commits W
 1162 and Z will be kept:</p></div>
 1163 <div class="listingblock">
 1164 <div class="content">
 1165 <pre><code>G-W-W-W-B
 1166    /
 1167 Z-Z</code></pre>
 1168 </div></div>
 1169 <div class="paragraph"><p>2) starting from the "good" ends of the graph, associate to each
 1170 commit the number of ancestors it has plus one</p></div>
 1171 <div class="paragraph"><p>For example with the following graph where H is the "bad" commit and A
 1172 and D are some parents of some "good" commits:</p></div>
 1173 <div class="listingblock">
 1174 <div class="content">
 1175 <pre><code>A-B-C
 1176      \
 1177       F-G-H
 1178      /
 1179 D---E</code></pre>
 1180 </div></div>
 1181 <div class="paragraph"><p>this will give:</p></div>
 1182 <div class="listingblock">
 1183 <div class="content">
 1184 <pre><code>1 2 3
 1185 A-B-C
 1186      \6 7 8
 1187       F-G-H
 1188 1   2/
 1189 D---E</code></pre>
 1190 </div></div>
 1191 <div class="paragraph"><p>3) associate to each commit: min(X, N - X)</p></div>
 1192 <div class="paragraph"><p>where X is the value associated to the commit in step 2) and N is the
 1193 total number of commits in the graph.</p></div>
 1194 <div class="paragraph"><p>In the above example we have N = 8, so this will give:</p></div>
 1195 <div class="listingblock">
 1196 <div class="content">
 1197 <pre><code>1 2 3
 1198 A-B-C
 1199      \2 1 0
 1200       F-G-H
 1201 1   2/
 1202 D---E</code></pre>
 1203 </div></div>
 1204 <div class="paragraph"><p>4) the best bisection point is the commit with the highest associated
 1205 number</p></div>
 1206 <div class="paragraph"><p>So in the above example the best bisection point is commit C.</p></div>
 1207 <div class="paragraph"><p>5) note that some shortcuts are implemented to speed up the algorithm</p></div>
 1208 <div class="paragraph"><p>As we know N from the beginning, we know that min(X, N - X) can&#8217;t be
 1209 greater than N/2. So during steps 2) and 3), if we would associate N/2
 1210 to a commit, then we know this is the best bisection point. So in this
 1211 case we can just stop processing any other commit and return the
 1212 current commit.</p></div>
 1213 </div>
 1214 <div class="sect2">
 1215 <h3 id="_bisection_algorithm_debugging">Bisection algorithm debugging</h3>
 1216 <div class="paragraph"><p>For any commit graph, you can see the number associated with each
 1217 commit using "git rev-list --bisect-all".</p></div>
 1218 <div class="paragraph"><p>For example, for the above graph, a command like:</p></div>
 1219 <div class="listingblock">
 1220 <div class="content">
 1221 <pre><code>$ git rev-list --bisect-all BAD --not GOOD1 GOOD2</code></pre>
 1222 </div></div>
 1223 <div class="paragraph"><p>would output something like:</p></div>
 1224 <div class="listingblock">
 1225 <div class="content">
 1226 <pre><code>e15b73ad3db9b48d7d1ade32f8cd23a751fe0ace (dist=3)
 1227 15722f2fa328eaba97022898a305ffc8172db6b1 (dist=2)
 1228 78e86cf3e850bd755bb71831f42e200626fbd1e0 (dist=2)
 1229 a1939d9a142de972094af4dde9a544e577ddef0e (dist=2)
 1230 070eab2303024706f2924822bfec8b9847e4ac1b (dist=1)
 1231 a3864d4f32a3bf5ed177ddef598490a08760b70d (dist=1)
 1232 a41baa717dd74f1180abf55e9341bc7a0bb9d556 (dist=1)
 1233 9e622a6dad403b71c40979743bb9d5be17b16bd6 (dist=0)</code></pre>
 1234 </div></div>
 1235 </div>
 1236 <div class="sect2">
 1237 <h3 id="_bisection_algorithm_discussed">Bisection algorithm discussed</h3>
 1238 <div class="paragraph"><p>First let&#8217;s define "best bisection point". We will say that a commit X
 1239 is a best bisection point or a best bisection commit if knowing its
 1240 state ("good" or "bad") gives as much information as possible whether
 1241 the state of the commit happens to be "good" or "bad".</p></div>
 1242 <div class="paragraph"><p>This means that the best bisection commits are the commits where the
 1243 following function is maximum:</p></div>
 1244 <div class="listingblock">
 1245 <div class="content">
 1246 <pre><code>f(X) = min(information_if_good(X), information_if_bad(X))</code></pre>
 1247 </div></div>
 1248 <div class="paragraph"><p>where information_if_good(X) is the information we get if X is good
 1249 and information_if_bad(X) is the information we get if X is bad.</p></div>
 1250 <div class="paragraph"><p>Now we will suppose that there is only one "first bad commit". This
 1251 means that all its descendants are "bad" and all the other commits are
 1252 "good". And we will suppose that all commits have an equal probability
 1253 of being good or bad, or of being the first bad commit, so knowing the
 1254 state of c commits gives always the same amount of information
 1255 wherever these c commits are on the graph and whatever c is. (So we
 1256 suppose that these commits being for example on a branch or near a
 1257 good or a bad commit does not give more or less information).</p></div>
 1258 <div class="paragraph"><p>Let&#8217;s also suppose that we have a cleaned up graph like one after step
 1259 1) in the bisection algorithm above. This means that we can measure
 1260 the information we get in terms of number of commit we can remove from
 1261 the graph..</p></div>
 1262 <div class="paragraph"><p>And let&#8217;s take a commit X in the graph.</p></div>
 1263 <div class="paragraph"><p>If X is found to be "good", then we know that its ancestors are all
 1264 "good", so we want to say that:</p></div>
 1265 <div class="listingblock">
 1266 <div class="content">
 1267 <pre><code>information_if_good(X) = number_of_ancestors(X)  (TRUE)</code></pre>
 1268 </div></div>
 1269 <div class="paragraph"><p>And this is true because at step 1) b) we remove the ancestors of the
 1270 "good" commits.</p></div>
 1271 <div class="paragraph"><p>If X is found to be "bad", then we know that its descendants are all
 1272 "bad", so we want to say that:</p></div>
 1273 <div class="listingblock">
 1274 <div class="content">
 1275 <pre><code>information_if_bad(X) = number_of_descendants(X)  (WRONG)</code></pre>
 1276 </div></div>
 1277 <div class="paragraph"><p>But this is wrong because at step 1) a) we keep only the ancestors of
 1278 the bad commit. So we get more information when a commit is marked as
 1279 "bad", because we also know that the ancestors of the previous "bad"
 1280 commit that are not ancestors of the new "bad" commit are not the
 1281 first bad commit. We don&#8217;t know if they are good or bad, but we know
 1282 that they are not the first bad commit because they are not ancestor
 1283 of the new "bad" commit.</p></div>
 1284 <div class="paragraph"><p>So when a commit is marked as "bad" we know we can remove all the
 1285 commits in the graph except those that are ancestors of the new "bad"
 1286 commit. This means that:</p></div>
 1287 <div class="listingblock">
 1288 <div class="content">
 1289 <pre><code>information_if_bad(X) = N - number_of_ancestors(X)  (TRUE)</code></pre>
 1290 </div></div>
 1291 <div class="paragraph"><p>where N is the number of commits in the (cleaned up) graph.</p></div>
 1292 <div class="paragraph"><p>So in the end this means that to find the best bisection commits we
 1293 should maximize the function:</p></div>
 1294 <div class="listingblock">
 1295 <div class="content">
 1296 <pre><code>f(X) = min(number_of_ancestors(X), N - number_of_ancestors(X))</code></pre>
 1297 </div></div>
 1298 <div class="paragraph"><p>And this is nice because at step 2) we compute number_of_ancestors(X)
 1299 and so at step 3) we compute f(X).</p></div>
 1300 <div class="paragraph"><p>Let&#8217;s take the following graph as an example:</p></div>
 1301 <div class="listingblock">
 1302 <div class="content">
 1303 <pre><code>            G-H-I-J
 1304            /       \
 1305 A-B-C-D-E-F         O
 1306            \       /
 1307             K-L-M-N</code></pre>
 1308 </div></div>
 1309 <div class="paragraph"><p>If we compute the following non optimal function on it:</p></div>
 1310 <div class="listingblock">
 1311 <div class="content">
 1312 <pre><code>g(X) = min(number_of_ancestors(X), number_of_descendants(X))</code></pre>
 1313 </div></div>
 1314 <div class="paragraph"><p>we get:</p></div>
 1315 <div class="listingblock">
 1316 <div class="content">
 1317 <pre><code>            4 3 2 1
 1318             G-H-I-J
 1319 1 2 3 4 5 6/       \0
 1320 A-B-C-D-E-F         O
 1321            \       /
 1322             K-L-M-N
 1323             4 3 2 1</code></pre>
 1324 </div></div>
 1325 <div class="paragraph"><p>but with the algorithm used by git bisect we get:</p></div>
 1326 <div class="listingblock">
 1327 <div class="content">
 1328 <pre><code>            7 7 6 5
 1329             G-H-I-J
 1330 1 2 3 4 5 6/       \0
 1331 A-B-C-D-E-F         O
 1332            \       /
 1333             K-L-M-N
 1334             7 7 6 5</code></pre>
 1335 </div></div>
 1336 <div class="paragraph"><p>So we chose G, H, K or L as the best bisection point, which is better
 1337 than F. Because if for example L is bad, then we will know not only
 1338 that L, M and N are bad but also that G, H, I and J are not the first
 1339 bad commit (since we suppose that there is only one first bad commit
 1340 and it must be an ancestor of L).</p></div>
 1341 <div class="paragraph"><p>So the current algorithm seems to be the best possible given what we
 1342 initially supposed.</p></div>
 1343 </div>
 1344 <div class="sect2">
 1345 <h3 id="_skip_algorithm">Skip algorithm</h3>
 1346 <div class="paragraph"><p>When some commits have been skipped (using "git bisect skip"), then
 1347 the bisection algorithm is the same for step 1) to 3). But then we use
 1348 roughly the following steps:</p></div>
 1349 <div class="paragraph"><p>6) sort the commit by decreasing associated value</p></div>
 1350 <div class="paragraph"><p>7) if the first commit has not been skipped, we can return it and stop
 1351 here</p></div>
 1352 <div class="paragraph"><p>8) otherwise filter out all the skipped commits in the sorted list</p></div>
 1353 <div class="paragraph"><p>9) use a pseudo random number generator (PRNG) to generate a random
 1354 number between 0 and 1</p></div>
 1355 <div class="paragraph"><p>10) multiply this random number with its square root to bias it toward
 1356 0</p></div>
 1357 <div class="paragraph"><p>11) multiply the result by the number of commits in the filtered list
 1358 to get an index into this list</p></div>
 1359 <div class="paragraph"><p>12) return the commit at the computed index</p></div>
 1360 </div>
 1361 <div class="sect2">
 1362 <h3 id="_skip_algorithm_discussed">Skip algorithm discussed</h3>
 1363 <div class="paragraph"><p>After step 7) (in the skip algorithm), we could check if the second
 1364 commit has been skipped and return it if it is not the case. And in
 1365 fact that was the algorithm we used from when "git bisect skip" was
 1366 developed in Git version 1.5.4 (released on February 1st 2008) until
 1367 Git version 1.6.4 (released July 29th 2009).</p></div>
 1368 <div class="paragraph"><p>But Ingo Molnar and H. Peter Anvin (another well known linux kernel
 1369 developer) both complained that sometimes the best bisection points
 1370 all happened to be in an area where all the commits are
 1371 untestable. And in this case the user was asked to test many
 1372 untestable commits, which could be very inefficient.</p></div>
 1373 <div class="paragraph"><p>Indeed untestable commits are often untestable because a breakage was
 1374 introduced at one time, and that breakage was fixed only after many
 1375 other commits were introduced.</p></div>
 1376 <div class="paragraph"><p>This breakage is of course most of the time unrelated to the breakage
 1377 we are trying to locate in the commit graph. But it prevents us to
 1378 know if the interesting "bad behavior" is present or not.</p></div>
 1379 <div class="paragraph"><p>So it is a fact that commits near an untestable commit have a high
 1380 probability of being untestable themselves. And the best bisection
 1381 commits are often found together too (due to the bisection algorithm).</p></div>
 1382 <div class="paragraph"><p>This is why it is a bad idea to just chose the next best unskipped
 1383 bisection commit when the first one has been skipped.</p></div>
 1384 <div class="paragraph"><p>We found that most commits on the graph may give quite a lot of
 1385 information when they are tested. And the commits that will not on
 1386 average give a lot of information are the one near the good and bad
 1387 commits.</p></div>
 1388 <div class="paragraph"><p>So using a PRNG with a bias to favor commits away from the good and
 1389 bad commits looked like a good choice.</p></div>
 1390 <div class="paragraph"><p>One obvious improvement to this algorithm would be to look for a
 1391 commit that has an associated value near the one of the best bisection
 1392 commit, and that is on another branch, before using the PRNG. Because
 1393 if such a commit exists, then it is not very likely to be untestable
 1394 too, so it will probably give more information than a nearly randomly
 1395 chosen one.</p></div>
 1396 </div>
 1397 <div class="sect2">
 1398 <h3 id="_checking_merge_bases">Checking merge bases</h3>
 1399 <div class="paragraph"><p>There is another tweak in the bisection algorithm that has not been
 1400 described in the "bisection algorithm" above.</p></div>
 1401 <div class="paragraph"><p>We supposed in the previous examples that the "good" commits were
 1402 ancestors of the "bad" commit. But this is not a requirement of "git
 1403 bisect".</p></div>
 1404 <div class="paragraph"><p>Of course the "bad" commit cannot be an ancestor of a "good" commit,
 1405 because the ancestors of the good commits are supposed to be
 1406 "good". And all the "good" commits must be related to the bad commit.
 1407 They cannot be on a branch that has no link with the branch of the
 1408 "bad" commit. But it is possible for a good commit to be related to a
 1409 bad commit and yet not be neither one of its ancestor nor one of its
 1410 descendants.</p></div>
 1411 <div class="paragraph"><p>For example, there can be a "main" branch, and a "dev" branch that was
 1412 forked of the main branch at a commit named "D" like this:</p></div>
 1413 <div class="listingblock">
 1414 <div class="content">
 1415 <pre><code>A-B-C-D-E-F-G  &lt;--main
 1416        \
 1417         H-I-J  &lt;--dev</code></pre>
 1418 </div></div>
 1419 <div class="paragraph"><p>The commit "D" is called a "merge base" for branch "main" and "dev"
 1420 because it&#8217;s the best common ancestor for these branches for a merge.</p></div>
 1421 <div class="paragraph"><p>Now let&#8217;s suppose that commit J is bad and commit G is good and that
 1422 we apply the bisection algorithm like it has been previously
 1423 described.</p></div>
 1424 <div class="paragraph"><p>As described in step 1) b) of the bisection algorithm, we remove all
 1425 the ancestors of the good commits because they are supposed to be good
 1426 too.</p></div>
 1427 <div class="paragraph"><p>So we would be left with only:</p></div>
 1428 <div class="listingblock">
 1429 <div class="content">
 1430 <pre><code>H-I-J</code></pre>
 1431 </div></div>
 1432 <div class="paragraph"><p>But what happens if the first bad commit is "B" and if it has been
 1433 fixed in the "main" branch by commit "F"?</p></div>
 1434 <div class="paragraph"><p>The result of such a bisection would be that we would find that H is
 1435 the first bad commit, when in fact it&#8217;s B. So that would be wrong!</p></div>
 1436 <div class="paragraph"><p>And yes it can happen in practice that people working on one branch
 1437 are not aware that people working on another branch fixed a bug! It
 1438 could also happen that F fixed more than one bug or that it is a
 1439 revert of some big development effort that was not ready to be
 1440 released.</p></div>
 1441 <div class="paragraph"><p>In fact development teams often maintain both a development branch and
 1442 a maintenance branch, and it would be quite easy for them if "git
 1443 bisect" just worked when they want to bisect a regression on the
 1444 development branch that is not on the maintenance branch. They should
 1445 be able to start bisecting using:</p></div>
 1446 <div class="listingblock">
 1447 <div class="content">
 1448 <pre><code>$ git bisect start dev main</code></pre>
 1449 </div></div>
 1450 <div class="paragraph"><p>To enable that additional nice feature, when a bisection is started
 1451 and when some good commits are not ancestors of the bad commit, we
 1452 first compute the merge bases between the bad and the good commits and
 1453 we chose these merge bases as the first commits that will be checked
 1454 out and tested.</p></div>
 1455 <div class="paragraph"><p>If it happens that one merge base is bad, then the bisection process
 1456 is stopped with a message like:</p></div>
 1457 <div class="listingblock">
 1458 <div class="content">
 1459 <pre><code>The merge base BBBBBB is bad.
 1460 This means the bug has been fixed between BBBBBB and [GGGGGG,...].</code></pre>
 1461 </div></div>
 1462 <div class="paragraph"><p>where BBBBBB is the sha1 hash of the bad merge base and [GGGGGG,&#8230;]
 1463 is a comma separated list of the sha1 of the good commits.</p></div>
 1464 <div class="paragraph"><p>If some of the merge bases are skipped, then the bisection process
 1465 continues, but the following message is printed for each skipped merge
 1466 base:</p></div>
 1467 <div class="listingblock">
 1468 <div class="content">
 1469 <pre><code>Warning: the merge base between BBBBBB and [GGGGGG,...] must be skipped.
 1470 So we cannot be sure the first bad commit is between MMMMMM and BBBBBB.
 1471 We continue anyway.</code></pre>
 1472 </div></div>
 1473 <div class="paragraph"><p>where BBBBBB is the sha1 hash of the bad commit, MMMMMM is the sha1
 1474 hash of the merge base that is skipped and [GGGGGG,&#8230;]  is a comma
 1475 separated list of the sha1 of the good commits.</p></div>
 1476 <div class="paragraph"><p>So if there is no bad merge base, the bisection process continues as
 1477 usual after this step.</p></div>
 1478 </div>
 1479 </div>
 1480 </div>
 1481 <div class="sect1">
 1482 <h2 id="_best_bisecting_practices">Best bisecting practices</h2>
 1483 <div class="sectionbody">
 1484 <div class="sect2">
 1485 <h3 id="_using_test_suites_and_git_bisect_together">Using test suites and git bisect together</h3>
 1486 <div class="paragraph"><p>If you both have a test suite and use git bisect, then it becomes less
 1487 important to check that all tests pass after each commit. Though of
 1488 course it is probably a good idea to have some checks to avoid
 1489 breaking too many things because it could make bisecting other bugs
 1490 more difficult.</p></div>
 1491 <div class="paragraph"><p>You can focus your efforts to check at a few points (for example rc
 1492 and beta releases) that all the T test cases pass for all the N
 1493 configurations. And when some tests don&#8217;t pass you can use "git
 1494 bisect" (or better "git bisect run"). So you should perform roughly:</p></div>
 1495 <div class="listingblock">
 1496 <div class="content">
 1497 <pre><code>c * N * T + b * M * log2(M) tests</code></pre>
 1498 </div></div>
 1499 <div class="paragraph"><p>where c is the number of rounds of test (so a small constant) and b is
 1500 the ratio of bug per commit (hopefully a small constant too).</p></div>
 1501 <div class="paragraph"><p>So of course it&#8217;s much better as it&#8217;s O(N * T) vs O(N * T * M) if
 1502 you would test everything after each commit.</p></div>
 1503 <div class="paragraph"><p>This means that test suites are good to prevent some bugs from being
 1504 committed and they are also quite good to tell you that you have some
 1505 bugs. But they are not so good to tell you where some bugs have been
 1506 introduced. To tell you that efficiently, git bisect is needed.</p></div>
 1507 <div class="paragraph"><p>The other nice thing with test suites, is that when you have one, you
 1508 already know how to test for bad behavior. So you can use this
 1509 knowledge to create a new test case for "git bisect" when it appears
 1510 that there is a regression. So it will be easier to bisect the bug and
 1511 fix it. And then you can add the test case you just created to your
 1512 test suite.</p></div>
 1513 <div class="paragraph"><p>So if you know how to create test cases and how to bisect, you will be
 1514 subject to a virtuous circle:</p></div>
 1515 <div class="paragraph"><p>more tests &#8658; easier to create tests &#8658; easier to bisect &#8658; more tests</p></div>
 1516 <div class="paragraph"><p>So test suites and "git bisect" are complementary tools that are very
 1517 powerful and efficient when used together.</p></div>
 1518 </div>
 1519 <div class="sect2">
 1520 <h3 id="_bisecting_build_failures">Bisecting build failures</h3>
 1521 <div class="paragraph"><p>You can very easily automatically bisect broken builds using something
 1522 like:</p></div>
 1523 <div class="listingblock">
 1524 <div class="content">
 1525 <pre><code>$ git bisect start BAD GOOD
 1526 $ git bisect run make</code></pre>
 1527 </div></div>
 1528 </div>
 1529 <div class="sect2">
 1530 <h3 id="_passing_sh_c_some_commands_to_git_bisect_run">Passing sh -c "some commands" to "git bisect run"</h3>
 1531 <div class="paragraph"><p>For example:</p></div>
 1532 <div class="listingblock">
 1533 <div class="content">
 1534 <pre><code>$ git bisect run sh -c "make || exit 125; ./my_app | grep 'good output'"</code></pre>
 1535 </div></div>
 1536 <div class="paragraph"><p>On the other hand if you do this often, then it can be worth having
 1537 scripts to avoid too much typing.</p></div>
 1538 </div>
 1539 <div class="sect2">
 1540 <h3 id="_finding_performance_regressions">Finding performance regressions</h3>
 1541 <div class="paragraph"><p>Here is an example script that comes slightly modified from a real
 1542 world script used by Junio Hamano <a href="#4">[4]</a>.</p></div>
 1543 <div class="paragraph"><p>This script can be passed to "git bisect run" to find the commit that
 1544 introduced a performance regression:</p></div>
 1545 <div class="listingblock">
 1546 <div class="content">
 1547 <pre><code>#!/bin/sh
 1548 
 1549 # Build errors are not what I am interested in.
 1550 make my_app || exit 255
 1551 
 1552 # We are checking if it stops in a reasonable amount of time, so
 1553 # let it run in the background...
 1554 
 1555 ./my_app &gt;log 2&gt;&amp;1 &amp;
 1556 
 1557 # ... and grab its process ID.
 1558 pid=$!
 1559 
 1560 # ... and then wait for sufficiently long.
 1561 sleep $NORMAL_TIME
 1562 
 1563 # ... and then see if the process is still there.
 1564 if kill -0 $pid
 1565 then
 1566         # It is still running -- that is bad.
 1567         kill $pid; sleep 1; kill $pid;
 1568         exit 1
 1569 else
 1570         # It has already finished (the $pid process was no more),
 1571         # and we are happy.
 1572         exit 0
 1573 fi</code></pre>
 1574 </div></div>
 1575 </div>
 1576 <div class="sect2">
 1577 <h3 id="_following_general_best_practices">Following general best practices</h3>
 1578 <div class="paragraph"><p>It is obviously a good idea not to have commits with changes that
 1579 knowingly break things, even if some other commits later fix the
 1580 breakage.</p></div>
 1581 <div class="paragraph"><p>It is also a good idea when using any VCS to have only one small
 1582 logical change in each commit.</p></div>
 1583 <div class="paragraph"><p>The smaller the changes in your commit, the most effective "git
 1584 bisect" will be. And you will probably need "git bisect" less in the
 1585 first place, as small changes are easier to review even if they are
 1586 only reviewed by the committer.</p></div>
 1587 <div class="paragraph"><p>Another good idea is to have good commit messages. They can be very
 1588 helpful to understand why some changes were made.</p></div>
 1589 <div class="paragraph"><p>These general best practices are very helpful if you bisect often.</p></div>
 1590 </div>
 1591 <div class="sect2">
 1592 <h3 id="_avoiding_bug_prone_merges">Avoiding bug prone merges</h3>
 1593 <div class="paragraph"><p>First merges by themselves can introduce some regressions even when
 1594 the merge needs no source code conflict resolution. This is because a
 1595 semantic change can happen in one branch while the other branch is not
 1596 aware of it.</p></div>
 1597 <div class="paragraph"><p>For example one branch can change the semantic of a function while the
 1598 other branch add more calls to the same function.</p></div>
 1599 <div class="paragraph"><p>This is made much worse if many files have to be fixed to resolve
 1600 conflicts. That&#8217;s why such merges are called "evil merges". They can
 1601 make regressions very difficult to track down. It can even be
 1602 misleading to know the first bad commit if it happens to be such a
 1603 merge, because people might think that the bug comes from bad conflict
 1604 resolution when it comes from a semantic change in one branch.</p></div>
 1605 <div class="paragraph"><p>Anyway "git rebase" can be used to linearize history. This can be used
 1606 either to avoid merging in the first place. Or it can be used to
 1607 bisect on a linear history instead of the non linear one, as this
 1608 should give more information in case of a semantic change in one
 1609 branch.</p></div>
 1610 <div class="paragraph"><p>Merges can be also made simpler by using smaller branches or by using
 1611 many topic branches instead of only long version related branches.</p></div>
 1612 <div class="paragraph"><p>And testing can be done more often in special integration branches
 1613 like linux-next for the linux kernel.</p></div>
 1614 </div>
 1615 <div class="sect2">
 1616 <h3 id="_adapting_your_work_flow">Adapting your work-flow</h3>
 1617 <div class="paragraph"><p>A special work-flow to process regressions can give great results.</p></div>
 1618 <div class="paragraph"><p>Here is an example of a work-flow used by Andreas Ericsson:</p></div>
 1619 <div class="ulist"><ul>
 1620 <li>
 1621 <p>
 1622 write, in the test suite, a test script that exposes the regression
 1623 </p>
 1624 </li>
 1625 <li>
 1626 <p>
 1627 use "git bisect run" to find the commit that introduced it
 1628 </p>
 1629 </li>
 1630 <li>
 1631 <p>
 1632 fix the bug that is often made obvious by the previous step
 1633 </p>
 1634 </li>
 1635 <li>
 1636 <p>
 1637 commit both the fix and the test script (and if needed more tests)
 1638 </p>
 1639 </li>
 1640 </ul></div>
 1641 <div class="paragraph"><p>And here is what Andreas said about this work-flow <a href="#5">[5]</a>:</p></div>
 1642 <div class="quoteblock">
 1643 <div class="content">
 1644 <div class="paragraph"><p>To give some hard figures, we used to have an average report-to-fix
 1645 cycle of 142.6 hours (according to our somewhat weird bug-tracker
 1646 which just measures wall-clock time). Since we moved to Git, we&#8217;ve
 1647 lowered that to 16.2 hours. Primarily because we can stay on top of
 1648 the bug fixing now, and because everyone&#8217;s jockeying to get to fix
 1649 bugs (we&#8217;re quite proud of how lazy we are to let Git find the bugs
 1650 for us). Each new release results in ~40% fewer bugs (almost certainly
 1651 due to how we now feel about writing tests).</p></div>
 1652 </div>
 1653 <div class="attribution">
 1654 </div></div>
 1655 <div class="paragraph"><p>Clearly this work-flow uses the virtuous circle between test suites
 1656 and "git bisect". In fact it makes it the standard procedure to deal
 1657 with regression.</p></div>
 1658 <div class="paragraph"><p>In other messages Andreas says that they also use the "best practices"
 1659 described above: small logical commits, topic branches, no evil
 1660 merge,&#8230; These practices all improve the bisectability of the commit
 1661 graph, by making it easier and more useful to bisect.</p></div>
 1662 <div class="paragraph"><p>So a good work-flow should be designed around the above points. That
 1663 is making bisecting easier, more useful and standard.</p></div>
 1664 </div>
 1665 <div class="sect2">
 1666 <h3 id="_involving_qa_people_and_if_possible_end_users">Involving QA people and if possible end users</h3>
 1667 <div class="paragraph"><p>One nice about "git bisect" is that it is not only a developer
 1668 tool. It can effectively be used by QA people or even end users (if
 1669 they have access to the source code or if they can get access to all
 1670 the builds).</p></div>
 1671 <div class="paragraph"><p>There was a discussion at one point on the linux kernel mailing list
 1672 of whether it was ok to always ask end user to bisect, and very good
 1673 points were made to support the point of view that it is ok.</p></div>
 1674 <div class="paragraph"><p>For example David Miller wrote <a href="#6">[6]</a>:</p></div>
 1675 <div class="quoteblock">
 1676 <div class="content">
 1677 <div class="paragraph"><p>What people don&#8217;t get is that this is a situation where the "end node
 1678 principle" applies. When you have limited resources (here: developers)
 1679 you don&#8217;t push the bulk of the burden upon them. Instead you push
 1680 things out to the resource you have a lot of, the end nodes (here:
 1681 users), so that the situation actually scales.</p></div>
 1682 </div>
 1683 <div class="attribution">
 1684 </div></div>
 1685 <div class="paragraph"><p>This means that it is often "cheaper" if QA people or end users can do
 1686 it.</p></div>
 1687 <div class="paragraph"><p>What is interesting too is that end users that are reporting bugs (or
 1688 QA people that reproduced a bug) have access to the environment where
 1689 the bug happens. So they can often more easily reproduce a
 1690 regression. And if they can bisect, then more information will be
 1691 extracted from the environment where the bug happens, which means that
 1692 it will be easier to understand and then fix the bug.</p></div>
 1693 <div class="paragraph"><p>For open source projects it can be a good way to get more useful
 1694 contributions from end users, and to introduce them to QA and
 1695 development activities.</p></div>
 1696 </div>
 1697 <div class="sect2">
 1698 <h3 id="_using_complex_scripts">Using complex scripts</h3>
 1699 <div class="paragraph"><p>In some cases like for kernel development it can be worth developing
 1700 complex scripts to be able to fully automate bisecting.</p></div>
 1701 <div class="paragraph"><p>Here is what Ingo Molnar says about that <a href="#7">[7]</a>:</p></div>
 1702 <div class="quoteblock">
 1703 <div class="content">
 1704 <div class="paragraph"><p>i have a fully automated bootup-hang bisection script. It is based on
 1705 "git-bisect run". I run the script, it builds and boots kernels fully
 1706 automatically, and when the bootup fails (the script notices that via
 1707 the serial log, which it continuously watches - or via a timeout, if
 1708 the system does not come up within 10 minutes it&#8217;s a "bad" kernel),
 1709 the script raises my attention via a beep and i power cycle the test
 1710 box. (yeah, i should make use of a managed power outlet to 100%
 1711 automate it)</p></div>
 1712 </div>
 1713 <div class="attribution">
 1714 </div></div>
 1715 </div>
 1716 <div class="sect2">
 1717 <h3 id="_combining_test_suites_git_bisect_and_other_systems_together">Combining test suites, git bisect and other systems together</h3>
 1718 <div class="paragraph"><p>We have seen that test suites and git bisect are very powerful when
 1719 used together. It can be even more powerful if you can combine them
 1720 with other systems.</p></div>
 1721 <div class="paragraph"><p>For example some test suites could be run automatically at night with
 1722 some unusual (or even random) configurations. And if a regression is
 1723 found by a test suite, then "git bisect" can be automatically
 1724 launched, and its result can be emailed to the author of the first bad
 1725 commit found by "git bisect", and perhaps other people too. And a new
 1726 entry in the bug tracking system could be automatically created too.</p></div>
 1727 </div>
 1728 </div>
 1729 </div>
 1730 <div class="sect1">
 1731 <h2 id="_the_future_of_bisecting">The future of bisecting</h2>
 1732 <div class="sectionbody">
 1733 <div class="sect2">
 1734 <h3 id="_git_replace">"git replace"</h3>
 1735 <div class="paragraph"><p>We saw earlier that "git bisect skip" is now using a PRNG to try to
 1736 avoid areas in the commit graph where commits are untestable. The
 1737 problem is that sometimes the first bad commit will be in an
 1738 untestable area.</p></div>
 1739 <div class="paragraph"><p>To simplify the discussion we will suppose that the untestable area is
 1740 a simple string of commits and that it was created by a breakage
 1741 introduced by one commit (let&#8217;s call it BBC for bisect breaking
 1742 commit) and later fixed by another one (let&#8217;s call it BFC for bisect
 1743 fixing commit).</p></div>
 1744 <div class="paragraph"><p>For example:</p></div>
 1745 <div class="listingblock">
 1746 <div class="content">
 1747 <pre><code>...-Y-BBC-X1-X2-X3-X4-X5-X6-BFC-Z-...</code></pre>
 1748 </div></div>
 1749 <div class="paragraph"><p>where we know that Y is good and BFC is bad, and where BBC and X1 to
 1750 X6 are untestable.</p></div>
 1751 <div class="paragraph"><p>In this case if you are bisecting manually, what you can do is create
 1752 a special branch that starts just before the BBC. The first commit in
 1753 this branch should be the BBC with the BFC squashed into it. And the
 1754 other commits in the branch should be the commits between BBC and BFC
 1755 rebased on the first commit of the branch and then the commit after
 1756 BFC also rebased on.</p></div>
 1757 <div class="paragraph"><p>For example:</p></div>
 1758 <div class="listingblock">
 1759 <div class="content">
 1760 <pre><code>      (BBC+BFC)-X1'-X2'-X3'-X4'-X5'-X6'-Z'
 1761      /
 1762 ...-Y-BBC-X1-X2-X3-X4-X5-X6-BFC-Z-...</code></pre>
 1763 </div></div>
 1764 <div class="paragraph"><p>where commits quoted with ' have been rebased.</p></div>
 1765 <div class="paragraph"><p>You can easily create such a branch with Git using interactive rebase.</p></div>
 1766 <div class="paragraph"><p>For example using:</p></div>
 1767 <div class="listingblock">
 1768 <div class="content">
 1769 <pre><code>$ git rebase -i Y Z</code></pre>
 1770 </div></div>
 1771 <div class="paragraph"><p>and then moving BFC after BBC and squashing it.</p></div>
 1772 <div class="paragraph"><p>After that you can start bisecting as usual in the new branch and you
 1773 should eventually find the first bad commit.</p></div>
 1774 <div class="paragraph"><p>For example:</p></div>
 1775 <div class="listingblock">
 1776 <div class="content">
 1777 <pre><code>$ git bisect start Z' Y</code></pre>
 1778 </div></div>
 1779 <div class="paragraph"><p>If you are using "git bisect run", you can use the same manual fix up
 1780 as above, and then start another "git bisect run" in the special
 1781 branch. Or as the "git bisect" man page says, the script passed to
 1782 "git bisect run" can apply a patch before it compiles and test the
 1783 software <a href="#8">[8]</a>. The patch should turn a current untestable commits
 1784 into a testable one. So the testing will result in "good" or "bad" and
 1785 "git bisect" will be able to find the first bad commit. And the script
 1786 should not forget to remove the patch once the testing is done before
 1787 exiting from the script.</p></div>
 1788 <div class="paragraph"><p>(Note that instead of a patch you can use "git cherry-pick BFC" to
 1789 apply the fix, and in this case you should use "git reset --hard
 1790 HEAD^" to revert the cherry-pick after testing and before returning
 1791 from the script.)</p></div>
 1792 <div class="paragraph"><p>But the above ways to work around untestable areas are a little bit
 1793 clunky. Using special branches is nice because these branches can be
 1794 shared by developers like usual branches, but the risk is that people
 1795 will get many such branches. And it disrupts the normal "git bisect"
 1796 work-flow. So, if you want to use "git bisect run" completely
 1797 automatically, you have to add special code in your script to restart
 1798 bisection in the special branches.</p></div>
 1799 <div class="paragraph"><p>Anyway one can notice in the above special branch example that the Z'
 1800 and Z commits should point to the same source code state (the same
 1801 "tree" in git parlance). That&#8217;s because Z' result from applying the
 1802 same changes as Z just in a slightly different order.</p></div>
 1803 <div class="paragraph"><p>So if we could just "replace" Z by Z' when we bisect, then we would
 1804 not need to add anything to a script. It would just work for anyone in
 1805 the project sharing the special branches and the replacements.</p></div>
 1806 <div class="paragraph"><p>With the example above that would give:</p></div>
 1807 <div class="listingblock">
 1808 <div class="content">
 1809 <pre><code>      (BBC+BFC)-X1'-X2'-X3'-X4'-X5'-X6'-Z'-...
 1810      /
 1811 ...-Y-BBC-X1-X2-X3-X4-X5-X6-BFC-Z</code></pre>
 1812 </div></div>
 1813 <div class="paragraph"><p>That&#8217;s why the "git replace" command was created. Technically it
 1814 stores replacements "refs" in the "refs/replace/" hierarchy. These
 1815 "refs" are like branches (that are stored in "refs/heads/") or tags
 1816 (that are stored in "refs/tags"), and that means that they can
 1817 automatically be shared like branches or tags among developers.</p></div>
 1818 <div class="paragraph"><p>"git replace" is a very powerful mechanism. It can be used to fix
 1819 commits in already released history, for example to change the commit
 1820 message or the author. And it can also be used instead of git "grafts"
 1821 to link a repository with another old repository.</p></div>
 1822 <div class="paragraph"><p>In fact it&#8217;s this last feature that "sold" it to the Git community, so
 1823 it is now in the "master" branch of Git&#8217;s Git repository and it should
 1824 be released in Git 1.6.5 in October or November 2009.</p></div>
 1825 <div class="paragraph"><p>One problem with "git replace" is that currently it stores all the
 1826 replacements refs in "refs/replace/", but it would be perhaps better
 1827 if the replacement refs that are useful only for bisecting would be in
 1828 "refs/replace/bisect/". This way the replacement refs could be used
 1829 only for bisecting, while other refs directly in "refs/replace/" would
 1830 be used nearly all the time.</p></div>
 1831 </div>
 1832 <div class="sect2">
 1833 <h3 id="_bisecting_sporadic_bugs">Bisecting sporadic bugs</h3>
 1834 <div class="paragraph"><p>Another possible improvement to "git bisect" would be to optionally
 1835 add some redundancy to the tests performed so that it would be more
 1836 reliable when tracking sporadic bugs.</p></div>
 1837 <div class="paragraph"><p>This has been requested by some kernel developers because some bugs
 1838 called sporadic bugs do not appear in all the kernel builds because
 1839 they are very dependent on the compiler output.</p></div>
 1840 <div class="paragraph"><p>The idea is that every 3 test for example, "git bisect" could ask the
 1841 user to test a commit that has already been found to be "good" or
 1842 "bad" (because one of its descendants or one of its ancestors has been
 1843 found to be "good" or "bad" respectively). If it happens that a commit
 1844 has been previously incorrectly classified then the bisection can be
 1845 aborted early, hopefully before too many mistakes have been made. Then
 1846 the user will have to look at what happened and then restart the
 1847 bisection using a fixed bisect log.</p></div>
 1848 <div class="paragraph"><p>There is already a project called BBChop created by Ealdwulf Wuffinga
 1849 on Github that does something like that using Bayesian Search Theory
 1850 <a href="#9">[9]</a>:</p></div>
 1851 <div class="quoteblock">
 1852 <div class="content">
 1853 <div class="paragraph"><p>BBChop is like <em>git bisect</em> (or equivalent), but works when your bug
 1854 is intermittent. That is, it works in the presence of false negatives
 1855 (when a version happens to work this time even though it contains the
 1856 bug). It assumes that there are no false positives (in principle, the
 1857 same approach would work, but adding it may be non-trivial).</p></div>
 1858 </div>
 1859 <div class="attribution">
 1860 </div></div>
 1861 <div class="paragraph"><p>But BBChop is independent of any VCS and it would be easier for Git
 1862 users to have something integrated in Git.</p></div>
 1863 </div>
 1864 </div>
 1865 </div>
 1866 <div class="sect1">
 1867 <h2 id="_conclusion">Conclusion</h2>
 1868 <div class="sectionbody">
 1869 <div class="paragraph"><p>We have seen that regressions are an important problem, and that "git
 1870 bisect" has nice features that complement very well practices and
 1871 other tools, especially test suites, that are generally used to fight
 1872 regressions. But it might be needed to change some work-flows and
 1873 (bad) habits to get the most out of it.</p></div>
 1874 <div class="paragraph"><p>Some improvements to the algorithms inside "git bisect" are possible
 1875 and some new features could help in some cases, but overall "git
 1876 bisect" works already very well, is used a lot, and is already very
 1877 useful. To back up that last claim, let&#8217;s give the final word to Ingo
 1878 Molnar when he was asked by the author how much time does he think
 1879 "git bisect" saves him when he uses it:</p></div>
 1880 <div class="quoteblock">
 1881 <div class="content">
 1882 <div class="paragraph"><p>a <em>lot</em>.</p></div>
 1883 <div class="paragraph"><p>About ten years ago did i do my first <em>bisection</em> of a Linux patch
 1884 queue. That was prior the Git (and even prior the BitKeeper) days. I
 1885 literally days spent sorting out patches, creating what in essence
 1886 were standalone commits that i guessed to be related to that bug.</p></div>
 1887 <div class="paragraph"><p>It was a tool of absolute last resort. I&#8217;d rather spend days looking
 1888 at printk output than do a manual <em>patch bisection</em>.</p></div>
 1889 <div class="paragraph"><p>With Git bisect it&#8217;s a breeze: in the best case i can get a ~15 step
 1890 kernel bisection done in 20-30 minutes, in an automated way. Even with
 1891 manual help or when bisecting multiple, overlapping bugs, it&#8217;s rarely
 1892 more than an hour.</p></div>
 1893 <div class="paragraph"><p>In fact it&#8217;s invaluable because there are bugs i would never even
 1894 <em>try</em> to debug if it wasn&#8217;t for git bisect. In the past there were bug
 1895 patterns that were immediately hopeless for me to debug - at best i
 1896 could send the crash/bug signature to lkml and hope that someone else
 1897 can think of something.</p></div>
 1898 <div class="paragraph"><p>And even if a bisection fails today it tells us something valuable
 1899 about the bug: that it&#8217;s non-deterministic - timing or kernel image
 1900 layout dependent.</p></div>
 1901 <div class="paragraph"><p>So git bisect is unconditional goodness - and feel free to quote that
 1902 ;-)</p></div>
 1903 </div>
 1904 <div class="attribution">
 1905 </div></div>
 1906 </div>
 1907 </div>
 1908 <div class="sect1">
 1909 <h2 id="_acknowledgments">Acknowledgments</h2>
 1910 <div class="sectionbody">
 1911 <div class="paragraph"><p>Many thanks to Junio Hamano for his help in reviewing this paper, for
 1912 reviewing the patches I sent to the Git mailing list, for discussing
 1913 some ideas and helping me improve them, for improving "git bisect" a
 1914 lot and for his awesome work in maintaining and developing Git.</p></div>
 1915 <div class="paragraph"><p>Many thanks to Ingo Molnar for giving me very useful information that
 1916 appears in this paper, for commenting on this paper, for his
 1917 suggestions to improve "git bisect" and for evangelizing "git bisect"
 1918 on the linux kernel mailing lists.</p></div>
 1919 <div class="paragraph"><p>Many thanks to Linus Torvalds for inventing, developing and
 1920 evangelizing "git bisect", Git and Linux.</p></div>
 1921 <div class="paragraph"><p>Many thanks to the many other great people who helped one way or
 1922 another when I worked on Git, especially to Andreas Ericsson, Johannes
 1923 Schindelin, H. Peter Anvin, Daniel Barkalow, Bill Lear, John Hawley,
 1924 Shawn O. Pierce, Jeff King, Sam Vilain, Jon Seymour.</p></div>
 1925 <div class="paragraph"><p>Many thanks to the Linux-Kongress program committee for choosing the
 1926 author to given a talk and for publishing this paper.</p></div>
 1927 </div>
 1928 </div>
 1929 <div class="sect1">
 1930 <h2 id="_references">References</h2>
 1931 <div class="sectionbody">
 1932 <div class="ulist"><ul>
 1933 <li>
 1934 <p>
 1935 <a id="1"></a>[1] <a href="https://www.nist.gov/sites/default/files/documents/director/planning/report02-3.pdf"><em>The Economic Impacts of Inadequate Infratructure for Software Testing</em>.  Nist Planning Report 02-3</a>, see Executive Summary and Chapter 8.
 1936 </p>
 1937 </li>
 1938 <li>
 1939 <p>
 1940 <a id="2"></a>[2] <a href="http://www.oracle.com/technetwork/java/codeconvtoc-136057.html"><em>Code Conventions for the Java Programming Language</em>. Sun Microsystems.</a>
 1941 </p>
 1942 </li>
 1943 <li>
 1944 <p>
 1945 <a id="3"></a>[3] <a href="https://en.wikipedia.org/wiki/Software_maintenance"><em>Software maintenance</em>. Wikipedia.</a>
 1946 </p>
 1947 </li>
 1948 <li>
 1949 <p>
 1950 <a id="4"></a>[4] <a href="https://public-inbox.org/git/7vps5xsbwp.fsf_-_@assigned-by-dhcp.cox.net/">Junio C Hamano. <em>Automated bisect success story</em>.</a>
 1951 </p>
 1952 </li>
 1953 <li>
 1954 <p>
 1955 <a id="5"></a>[5] <a href="https://lwn.net/Articles/317154/">Christian Couder. <em>Fully automated bisecting with "git bisect run"</em>. LWN.net.</a>
 1956 </p>
 1957 </li>
 1958 <li>
 1959 <p>
 1960 <a id="6"></a>[6] <a href="https://lwn.net/Articles/277872/">Jonathan Corbet. <em>Bisection divides users and developers</em>. LWN.net.</a>
 1961 </p>
 1962 </li>
 1963 <li>
 1964 <p>
 1965 <a id="7"></a>[7] <a href="http://marc.info/?l=linux-kernel&amp;m=119702753411680&amp;w=2">Ingo Molnar. <em>Re: BUG 2.6.23-rc3 can&#8217;t see sd partitions on Alpha</em>. Linux-kernel mailing list.</a>
 1966 </p>
 1967 </li>
 1968 <li>
 1969 <p>
 1970 <a id="8"></a>[8] <a href="https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html">Junio C Hamano and the git-list. <em>git-bisect(1) Manual Page</em>. Linux Kernel Archives.</a>
 1971 </p>
 1972 </li>
 1973 <li>
 1974 <p>
 1975 <a id="9"></a>[9] <a href="https://github.com/Ealdwulf/bbchop">Ealdwulf. <em>bbchop</em>. GitHub.</a>
 1976 </p>
 1977 </li>
 1978 </ul></div>
 1979 </div>
 1980 </div>
 1981 </div>
 1982 <div id="footnotes"><hr /></div>
 1983 <div id="footer">
 1984 <div id="footer-text">
 1985 Last updated
 1986  2018-12-14 12:53:38 JST
 1987 </div>
 1988 </div>
 1989 </body>
 1990 </html>