"Fossies" - the Fresh Open Source Software Archive

Member "haproxy-2.0.0/doc/coding-style.txt" (16 Jun 2019, 38582 Bytes) of package /linux/misc/haproxy-2.0.0.tar.gz:


As a special service "Fossies" has tried to format the requested text file into HTML format (style: standard) with prefixed line numbers. Alternatively you can here view or download the uninterpreted source code file. See also the latest Fossies "Diffs" side-by-side code changes report for "coding-style.txt": 1.9.8_vs_2.0.0.

    1 2015/09/21 - HAProxy coding style - Willy Tarreau <w@1wt.eu>
    2 ------------------------------------------------------------
    3 
    4 A number of contributors are often embarrassed with coding style issues, they
    5 don't always know if they're doing it right, especially since the coding style
    6 has elvoved along the years. What is explained here is not necessarily what is
    7 applied in the code, but new code should as much as possible conform to this
    8 style. Coding style fixes happen when code is replaced. It is useless to send
    9 patches to fix coding style only, they will be rejected, unless they belong to
   10 a patch series which needs these fixes prior to get code changes. Also, please
   11 avoid fixing coding style in the same patches as functional changes, they make
   12 code review harder.
   13 
   14 A good way to quickly validate your patch before submitting it is to pass it
   15 through the Linux kernel's checkpatch.pl utility which can be downloaded here :
   16 
   17    http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/scripts/checkpatch.pl
   18 
   19 Running it with the following options relaxes its checks to accommodate to the
   20 extra degree of freedom that is tolerated in HAProxy's coding style compared to
   21 the stricter style used in the kernel :
   22 
   23    checkpatch.pl -q --max-line-length=160 --no-tree --no-signoff \
   24                  --ignore=LEADING_SPACE,CODE_INDENT,DEEP_INDENTATION \
   25                  --ignore=ELSE_AFTER_BRACE  < patch
   26 
   27 You can take its output as hints instead of strict rules, but in general its
   28 output will be accurate and it may even spot some real bugs.
   29 
   30 When modifying a file, you must accept the terms of the license of this file
   31 which is recalled at the top of the file, or is explained in the LICENSE file,
   32 or if not stated, defaults to LGPL version 2.1 or later for files in the
   33 'include' directory, and GPL version 2 or later for all other files.
   34 
   35 When adding a new file, you must add a copyright banner at the top of the
   36 file with your real name, e-mail address and a reminder of the license.
   37 Contributions under incompatible licenses or too restrictive licenses might
   38 get rejected. If in doubt, please apply the principle above for existing files.
   39 
   40 All code examples below will intentionally be prefixed with "  | " to mark
   41 where the code aligns with the first column, and tabs in this document will be
   42 represented as a series of 8 spaces so that it displays the same everywhere.
   43 
   44 
   45 1) Indentation and alignment
   46 ----------------------------
   47 
   48 1.1) Indentation
   49 ----------------
   50 
   51 Indentation and alignment are two completely different things that people often
   52 get wrong. Indentation is used to mark a sub-level in the code. A sub-level
   53 means that a block is executed in the context of another block (eg: a function
   54 or a condition) :
   55 
   56   | main(int argc, char **argv)
   57   | {
   58   |         int i;
   59   | 
   60   |         if (argc < 2)
   61   |                 exit(1);
   62   | }
   63 
   64 In the example above, the code belongs to the main() function and the exit()
   65 call belongs to the if statement. Indentation is made with tabs (\t, ASCII 9),
   66 which allows any developer to configure their preferred editor to use their
   67 own tab size and to still get the text properly indented. Exactly one tab is
   68 used per sub-level. Tabs may only appear at the beginning of a line or after
   69 another tab. It is illegal to put a tab after some text, as it mangles displays
   70 in a different manner for different users (particularly when used to align
   71 comments or values after a #define). If you're tempted to put a tab after some
   72 text, then you're doing it wrong and you need alignment instead (see below).
   73 
   74 Note that there are places where the code was not properly indented in the
   75 past. In order to view it correctly, you may have to set your tab size to 8
   76 characters.
   77 
   78 
   79 1.2) Alignment
   80 --------------
   81 
   82 Alignment is used to continue a line in a way to makes things easier to group
   83 together. By definition, alignment is character-based, so it uses spaces. Tabs
   84 would not work because for one tab there would not be as many characters on all
   85 displays. For instance, the arguments in a function declaration may be broken
   86 into multiple lines using alignment spaces :
   87 
   88   | int http_header_match2(const char *hdr, const char *end,
   89   |                        const char *name, int len)
   90   | {
   91   | ...
   92   | }
   93 
   94 In this example, the "const char *name" part is aligned with the first
   95 character of the group it belongs to (list of function arguments). Placing it
   96 here makes it obvious that it's one of the function's arguments. Multiple lines
   97 are easy to handle this way. This is very common with long conditions too :
   98 
   99   |         if ((len < eol - sol) &&
  100   |             (sol[len] == ':') &&
  101   |             (strncasecmp(sol, name, len) == 0)) {
  102   |                 ctx->del = len;
  103   |         }
  104 
  105 If we take again the example above marking tabs with "[-Tabs-]" and spaces
  106 with "#", we get this :
  107 
  108   | [-Tabs-]if ((len < eol - sol) &&
  109   | [-Tabs-]####(sol[len] == ':') &&
  110   | [-Tabs-]####(strncasecmp(sol, name, len) == 0)) {
  111   | [-Tabs-][-Tabs-]ctx->del = len;
  112   | [-Tabs-]}
  113 
  114 It is worth noting that some editors tend to confuse indentations and aligment.
  115 Emacs is notoriously known for this brokenness, and is responsible for almost
  116 all of the alignment mess. The reason is that Emacs only counts spaces, tries
  117 to fill as many as possible with tabs and completes with spaces. Once you know
  118 it, you just have to be careful, as alignment is not used much, so generally it
  119 is just a matter of replacing the last tab with 8 spaces when this happens.
  120 
  121 Indentation should be used everywhere there is a block or an opening brace. It
  122 is not possible to have two consecutive closing braces on the same column, it
  123 means that the innermost was not indented.
  124 
  125 Right :
  126 
  127   | main(int argc, char **argv)
  128   | {
  129   |         if (argc > 1) {
  130   |                 printf("Hello\n");
  131   |         }
  132   |         exit(0);
  133   | }
  134 
  135 Wrong :
  136 
  137   | main(int argc, char **argv)
  138   | {
  139   | if (argc > 1) {
  140   |         printf("Hello\n");
  141   | }
  142   | exit(0);
  143   | }
  144 
  145 A special case applies to switch/case statements. Due to my editor's settings,
  146 I've been used to align "case" with "switch" and to find it somewhat logical
  147 since each of the "case" statements opens a sublevel belonging to the "switch"
  148 statement. But indenting "case" after "switch" is accepted too. However in any
  149 case, whatever follows the "case" statement must be indented, whether or not it
  150 contains braces :
  151 
  152   | switch (*arg) {
  153   | case 'A': {
  154   |         int i;
  155   |         for (i = 0; i < 10; i++)
  156   |                 printf("Please stop pressing 'A'!\n");
  157   |         break;
  158   | }
  159   | case 'B':
  160   |         printf("You pressed 'B'\n");
  161   |         break;
  162   | case 'C':
  163   | case 'D':
  164   |         printf("You pressed 'C' or 'D'\n");
  165   |         break;
  166   | default:
  167   |         printf("I don't know what you pressed\n");
  168   | }
  169 
  170 
  171 2) Braces
  172 ---------
  173 
  174 Braces are used to delimit multiple-instruction blocks. In general it is
  175 preferred to avoid braces around single-instruction blocks as it reduces the
  176 number of lines :
  177 
  178 Right :
  179 
  180   | if (argc >= 2)
  181   |         exit(0);
  182 
  183 Wrong :
  184 
  185   | if (argc >= 2) {
  186   |         exit(0);
  187   | }
  188 
  189 But it is not that strict, it really depends on the context. It happens from
  190 time to time that single-instruction blocks are enclosed within braces because
  191 it makes the code more symmetrical, or more readable. Example :
  192 
  193   | if (argc < 2) {
  194   |         printf("Missing argument\n");
  195   |         exit(1);
  196   | } else {
  197   |         exit(0);
  198   | }
  199 
  200 Braces are always needed to declare a function. A function's opening brace must
  201 be placed at the beginning of the next line :
  202 
  203 Right :
  204 
  205   | int main(int argc, char **argv)
  206   | {
  207   |         exit(0);
  208   | }
  209 
  210 Wrong :
  211 
  212   | int main(int argc, char **argv) {
  213   |         exit(0);
  214   | }
  215 
  216 Note that a large portion of the code still does not conforms to this rule, as
  217 it took years to get all authors to adapt to this more common standard which
  218 is now preferred, as it avoids visual confusion when function declarations are
  219 broken on multiple lines :
  220 
  221 Right :
  222 
  223   | int foo(const char *hdr, const char *end,
  224   |         const char *name, const char *err,
  225   |         int len)
  226   | {
  227   |         int i;
  228 
  229 Wrong :
  230 
  231   | int foo(const char *hdr, const char *end,
  232   |         const char *name, const char *err,
  233   |         int len) {
  234   |         int i;
  235 
  236 Braces should always be used where there might be an ambiguity with the code
  237 later. The most common example is the stacked "if" statement where an "else"
  238 may be added later at the wrong place breaking the code, but it also happens
  239 with comments or long arguments in function calls. In general, if a block is
  240 more than one line long, it should use braces.
  241 
  242 Dangerous code waiting of a victim :
  243 
  244   | if (argc < 2)
  245   |         /* ret must not be negative here */
  246   |         if (ret < 0)
  247   |                 return -1;
  248 
  249 Wrong change :
  250 
  251   | if (argc < 2)
  252   |         /* ret must not be negative here */
  253   |         if (ret < 0)
  254   |                 return -1;
  255   | else
  256   |         return 0;
  257 
  258 It will do this instead of what your eye seems to tell you :
  259 
  260   | if (argc < 2)
  261   |         /* ret must not be negative here */
  262   |         if (ret < 0)
  263   |                 return -1;
  264   |         else
  265   |                 return 0;
  266 
  267 Right :
  268 
  269   | if (argc < 2) {
  270   |         /* ret must not be negative here */
  271   |         if (ret < 0)
  272   |                 return -1;
  273   | }
  274   | else
  275   |         return 0;
  276 
  277 Similarly dangerous example :
  278 
  279   | if (ret < 0)
  280   |         /* ret must not be negative here */
  281   |         complain();
  282   | init();
  283 
  284 Wrong change to silent the annoying message :
  285 
  286   | if (ret < 0)
  287   |         /* ret must not be negative here */
  288   |         //complain();
  289   | init();
  290 
  291 ... which in fact means :
  292 
  293   | if (ret < 0)
  294   |         init();
  295 
  296 
  297 3) Breaking lines
  298 -----------------
  299 
  300 There is no strict rule for line breaking. Some files try to stick to the 80
  301 column limit, but given that various people use various tab sizes, it does not
  302 make much sense. Also, code is sometimes easier to read with less lines, as it
  303 represents less surface on the screen (since each new line adds its tabs and
  304 spaces). The rule is to stick to the average line length of other lines. If you
  305 are working in a file which fits in 80 columns, try to keep this goal in mind.
  306 If you're in a function with 120-chars lines, there is no reason to add many
  307 short lines, so you can make longer lines.
  308 
  309 In general, opening a new block should lead to a new line. Similarly, multiple
  310 instructions should be avoided on the same line. But some constructs make it
  311 more readable when those are perfectly aligned :
  312 
  313 A copy-paste bug in the following construct will be easier to spot :
  314 
  315   | if (omult % idiv == 0) { omult /= idiv; idiv = 1; }
  316   | if (idiv % omult == 0) { idiv /= omult; omult = 1; }
  317   | if (imult % odiv == 0) { imult /= odiv; odiv = 1; }
  318   | if (odiv % imult == 0) { odiv /= imult; imult = 1; }
  319 
  320 than in this one :
  321 
  322   | if (omult % idiv == 0) {
  323   |         omult /= idiv;
  324   |         idiv = 1;
  325   | }
  326   | if (idiv % omult == 0) {
  327   |         idiv /= omult;
  328   |         omult = 1;
  329   | }
  330   | if (imult % odiv == 0) {
  331   |         imult /= odiv;
  332   |         odiv = 1;
  333   | }
  334   | if (odiv % imult == 0) {
  335   |         odiv /= imult;
  336   |         imult = 1;
  337   | }
  338 
  339 What is important is not to mix styles. For instance there is nothing wrong
  340 with having many one-line "case" statements as long as most of them are this
  341 short like below :
  342 
  343   | switch (*arg) {
  344   | case 'A': ret = 1; break;
  345   | case 'B': ret = 2; break;
  346   | case 'C': ret = 4; break;
  347   | case 'D': ret = 8; break;
  348   | default : ret = 0; break;
  349   | }
  350 
  351 Otherwise, prefer to have the "case" statement on its own line as in the
  352 example in section 1.2 about alignment. In any case, avoid to stack multiple
  353 control statements on the same line, so that it will never be the needed to
  354 add two tab levels at once :
  355 
  356 Right :
  357 
  358   | switch (*arg) {
  359   | case 'A':
  360   |         if (ret < 0)
  361   |                 ret = 1;
  362   |         break;
  363   | default : ret = 0; break;
  364   | }
  365 
  366 Wrong :
  367 
  368   | switch (*arg) {
  369   | case 'A': if (ret < 0)
  370   |                 ret = 1;
  371   |         break;
  372   | default : ret = 0; break;
  373   | }
  374 
  375 Right :
  376 
  377   | if (argc < 2)
  378   |         if (ret < 0)
  379   |                 return -1;
  380 
  381 or Right :
  382 
  383   | if (argc < 2)
  384   |         if (ret < 0) return -1;
  385 
  386 but Wrong :
  387 
  388   | if (argc < 2) if (ret < 0) return -1;
  389 
  390 
  391 When complex conditions or expressions are broken into multiple lines, please
  392 do ensure that alignment is perfectly appropriate, and group all main operators
  393 on the same side (which you're free to choose as long as it does not change for
  394 every block. Putting binary operators on the right side is preferred as it does
  395 not mangle with alignment but various people have their preferences.
  396 
  397 Right :
  398 
  399   | if ((txn->flags & TX_NOT_FIRST) &&
  400   |     ((req->flags & BF_FULL) ||
  401   |      req->r < req->lr ||
  402   |      req->r > req->data + req->size - global.tune.maxrewrite)) {
  403   |         return 0;
  404   | }
  405 
  406 Right :
  407 
  408   | if ((txn->flags & TX_NOT_FIRST)
  409   |     && ((req->flags & BF_FULL)
  410   |         || req->r < req->lr
  411   |         || req->r > req->data + req->size - global.tune.maxrewrite)) {
  412   |         return 0;
  413   | }
  414 
  415 Wrong :
  416 
  417   | if ((txn->flags & TX_NOT_FIRST) &&
  418   |    ((req->flags & BF_FULL) ||
  419   |      req->r < req->lr
  420   |    || req->r > req->data + req->size - global.tune.maxrewrite)) {
  421   |         return 0;
  422   | }
  423 
  424 If it makes the result more readable, parenthesis may even be closed on their
  425 own line in order to align with the opening one. Note that should normally not
  426 be needed because such code would be too complex to be digged into.
  427 
  428 The "else" statement may either be merged with the closing "if" brace or lie on
  429 its own line. The later is preferred but it adds one extra line to each control
  430 block which is annoying in short ones. However, if the "else" is followed by an
  431 "if", then it should really be on its own line and the rest of the if/else
  432 blocks must follow the same style.
  433 
  434 Right :
  435 
  436   | if (a < b) {
  437   |         return a;
  438   | }
  439   | else {
  440   |         return b;
  441   | }
  442 
  443 Right :
  444 
  445   | if (a < b) {
  446   |         return a;
  447   | } else {
  448   |         return b;
  449   | }
  450 
  451 Right :
  452 
  453   | if (a < b) {
  454   |         return a;
  455   | }
  456   | else if (a != b) {
  457   |         return b;
  458   | }
  459   | else {
  460   |         return 0;
  461   | }
  462 
  463 Wrong :
  464 
  465   | if (a < b) {
  466   |         return a;
  467   | } else if (a != b) {
  468   |         return b;
  469   | } else {
  470   |         return 0;
  471   | }
  472 
  473 Wrong :
  474 
  475   | if (a < b) {
  476   |         return a;
  477   | }
  478   | else if (a != b) {
  479   |         return b;
  480   | } else {
  481   |         return 0;
  482   | }
  483 
  484 
  485 4) Spacing
  486 ----------
  487 
  488 Correctly spacing code is very important. When you have to spot a bug at 3am,
  489 you need it to be clear. When you expect other people to review your code, you
  490 want it to be clear and don't want them to get nervous when trying to find what
  491 you did.
  492 
  493 Always place spaces around all binary or ternary operators, commas, as well as
  494 after semi-colons and opening braces if the line continues :
  495 
  496 Right :
  497 
  498   | int ret = 0;
  499   | /* if (x >> 4) { x >>= 4; ret += 4; } */
  500   | ret += (x >> 4) ? (x >>= 4, 4) : 0;
  501   | val = ret + ((0xFFFFAA50U >> (x << 1)) & 3) + 1;
  502 
  503 Wrong :
  504 
  505   | int ret=0;
  506   | /* if (x>>4) {x>>=4;ret+=4;} */
  507   | ret+=(x>>4)?(x>>=4,4):0;
  508   | val=ret+((0xFFFFAA50U>>(x<<1))&3)+1;
  509 
  510 Never place spaces after unary operators (&, *, -, !, ~, ++, --) nor cast, as
  511 they might be confused with they binary counterpart, nor before commas or
  512 semicolons :
  513 
  514 Right :
  515 
  516   | bit = !!(~len++ ^ -(unsigned char)*x);
  517 
  518 Wrong :
  519 
  520   | bit = ! ! (~len++ ^ - (unsigned char) * x) ;
  521 
  522 Note that "sizeof" is a unary operator which is sometimes considered as a
  523 language keyword, but in no case it is a function. It does not require
  524 parenthesis so it is sometimes followed by spaces and sometimes not when
  525 there are no parenthesis. Most people do not really care as long as what
  526 is written is unambiguous.
  527 
  528 Braces opening a block must be preceded by one space unless the brace is
  529 placed on the first column :
  530 
  531 Right :
  532 
  533   | if (argc < 2) {
  534   | }
  535 
  536 Wrong :
  537 
  538   | if (argc < 2){
  539   | }
  540 
  541 Do not add unneeded spaces inside parenthesis, they just make the code less
  542 readable.
  543 
  544 Right :
  545 
  546   | if (x < 4 && (!y || !z))
  547   |         break;
  548 
  549 Wrong :
  550 
  551   | if ( x < 4 && ( !y || !z ) )
  552   |         break;
  553 
  554 Language keywords must all be followed by a space. This is true for control
  555 statements (do, for, while, if, else, return, switch, case), and for types
  556 (int, char, unsigned). As an exception, the last type in a cast does not take
  557 a space before the closing parenthesis). The "default" statement in a "switch"
  558 construct is generally just followed by the colon. However the colon after a
  559 "case" or "default" statement must be followed by a space.
  560 
  561 Right :
  562 
  563   | if (nbargs < 2) {
  564   |         printf("Missing arg at %c\n", *(char *)ptr);
  565   |         for (i = 0; i < 10; i++) beep();
  566   |         return 0;
  567   | }
  568   | switch (*arg) {
  569 
  570 Wrong :
  571 
  572   | if(nbargs < 2){
  573   |         printf("Missing arg at %c\n", *(char*)ptr);
  574   |         for(i = 0; i < 10; i++)beep();
  575   |         return 0;
  576   | }
  577   | switch(*arg) {
  578 
  579 Function calls are different, the opening parenthesis is always coupled to the
  580 function name without any space. But spaces are still needed after commas :
  581 
  582 Right :
  583 
  584   | if (!init(argc, argv))
  585   |         exit(1);
  586 
  587 Wrong :
  588 
  589   | if (!init (argc,argv))
  590   |         exit(1);
  591 
  592 
  593 5) Excess or lack of parenthesis
  594 --------------------------------
  595 
  596 Sometimes there are too many parenthesis in some formulas, sometimes there are
  597 too few. There are a few rules of thumb for this. The first one is to respect
  598 the compiler's advice. If it emits a warning and asks for more parenthesis to
  599 avoid confusion, follow the advice at least to shut the warning. For instance,
  600 the code below is quite ambiguous due to its alignment :
  601 
  602   | if (var1 < 2 || var2 < 2 &&
  603   |     var3 != var4) {
  604   |         /* fail */
  605   |         return -3;
  606   | }
  607 
  608 Note that this code does :
  609 
  610   | if (var1 < 2 || (var2 < 2 && var3 != var4)) {
  611   |         /* fail */
  612   |         return -3;
  613   | }
  614 
  615 But maybe the author meant :
  616 
  617   | if ((var1 < 2 || var2 < 2) && var3 != var4) {
  618   |         /* fail */
  619   |         return -3;
  620   | }
  621 
  622 A second rule to put parenthesis is that people don't always know operators
  623 precedence too well. Most often they have no issue with operators of the same
  624 category (eg: booleans, integers, bit manipulation, assignment) but once these
  625 operators are mixed, it causes them all sort of issues. In this case, it is
  626 wise to use parenthesis to avoid errors. One common error concerns the bit
  627 shift operators because they're used to replace multiplies and divides but
  628 don't have the same precedence :
  629 
  630 The expression :
  631 
  632   | x = y * 16 + 5;
  633 
  634 becomes :
  635 
  636   | x = y << 4 + 5;
  637 
  638 which is wrong because it is equivalent to :
  639 
  640   | x = y << (4 + 5);
  641 
  642 while the following was desired instead :
  643 
  644   | x = (y << 4) + 5;
  645 
  646 It is generally fine to write boolean expressions based on comparisons without
  647 any parenthesis. But on top of that, integer expressions and assignments should
  648 then be protected. For instance, there is an error in the expression below
  649 which should be safely rewritten :
  650 
  651 Wrong :
  652 
  653   | if (var1 > 2 && var1 < 10 ||
  654   |     var1 > 2 + 256 && var2 < 10 + 256 ||
  655   |     var1 > 2 + 1 << 16 && var2 < 10 + 2 << 16)
  656   |         return 1;
  657 
  658 Right (may remove a few parenthesis depending on taste) :
  659 
  660   | if ((var1 > 2 && var1 < 10) ||
  661   |     (var1 > (2 + 256) && var2 < (10 + 256)) ||
  662   |     (var1 > (2 + (1 << 16)) && var2 < (10 + (1 << 16))))
  663   |         return 1;
  664 
  665 The "return" statement is not a function, so it takes no argument. It is a
  666 control statement which is followed by the expression to be returned. It does
  667 not need to be followed by parenthesis :
  668 
  669 Wrong :
  670 
  671   | int ret0()
  672   | {
  673   |         return(0);
  674   | }
  675 
  676 Right :
  677 
  678   | int ret0()
  679   | {
  680   |         return 0;
  681   | }
  682 
  683 Parenthesisis are also found in type casts. Type casting should be avoided as
  684 much as possible, especially when it concerns pointer types. Casting a pointer
  685 disables the compiler's type checking and is the best way to get caught doing
  686 wrong things with data not the size you expect. If you need to manipulate
  687 multiple data types, you can use a union instead. If the union is really not
  688 convenient and casts are easier, then try to isolate them as much as possible,
  689 for instance when initializing function arguments or in another function. Not
  690 proceeding this way causes huge risks of not using the proper pointer without
  691 any notification, which is especially true during copy-pastes.
  692 
  693 Wrong :
  694 
  695   | void *check_private_data(void *arg1, void *arg2)
  696   | {
  697   |         char *area;
  698   |
  699   |         if (*(int *)arg1 > 1000)
  700   |                 return NULL;
  701   |         if (memcmp(*(const char *)arg2, "send(", 5) != 0))
  702   |                 return NULL;
  703   |         area = malloc(*(int *)arg1);
  704   |         if (!area)
  705   |                 return NULL;
  706   |         memcpy(area, *(const char *)arg2 + 5, *(int *)arg1);
  707   |         return area;
  708   | }
  709 
  710 Right :
  711 
  712   | void *check_private_data(void *arg1, void *arg2)
  713   | {
  714   |         char *area;
  715   |         int len = *(int *)arg1;
  716   |         const char *msg = arg2;
  717   |
  718   |         if (len > 1000)
  719   |                 return NULL;
  720   |         if (memcmp(msg, "send(", 5) != 0)
  721   |                 return NULL;
  722   |         area = malloc(len);
  723   |         if (!area)
  724   |                 return NULL;
  725   |         memcpy(area, msg + 5, len);
  726   |         return area;
  727   | }
  728 
  729 
  730 6) Ambiguous comparisons with zero or NULL
  731 ------------------------------------------
  732 
  733 In C, '0' has no type, or it has the type of the variable it is assigned to.
  734 Comparing a variable or a return value with zero means comparing with the
  735 representation of zero for this variable's type. For a boolean, zero is false.
  736 For a pointer, zero is NULL. Very often, to make things shorter, it is fine to
  737 use the '!' unary operator to compare with zero, as it is shorter and easier to
  738 remind or understand than a plain '0'. Since the '!' operator is read "not", it
  739 helps read code faster when what follows it makes sense as a boolean, and it is
  740 often much more appropriate than a comparison with zero which makes an equal
  741 sign appear at an undesirable place. For instance :
  742 
  743   | if (!isdigit(*c) && !isspace(*c))
  744   |         break;
  745 
  746 is easier to understand than :
  747 
  748   | if (isdigit(*c) == 0 && isspace(*c) == 0)
  749   |         break;
  750 
  751 For a char this "not" operator can be reminded as "no remaining char", and the
  752 absence of comparison to zero implies existence of the tested entity, hence the
  753 simple strcpy() implementation below which automatically stops once the last
  754 zero is copied :
  755 
  756   | void my_strcpy(char *d, const char *s)
  757   | {
  758   |         while ((*d++ = *s++));
  759   | }
  760 
  761 Note the double parenthesis in order to avoid the compiler telling us it looks
  762 like an equality test.
  763 
  764 For a string or more generally any pointer, this test may be understood as an
  765 existence test or a validity test, as the only pointer which will fail to
  766 validate equality is the NULL pointer :
  767 
  768   | area = malloc(1000);
  769   | if (!area)
  770   |         return -1;
  771 
  772 However sometimes it can fool the reader. For instance, strcmp() precisely is
  773 one of such functions whose return value can make one think the opposite due to
  774 its name which may be understood as "if strings compare...". Thus it is strongly
  775 recommended to perform an explicit comparison with zero in such a case, and it
  776 makes sense considering that the comparison's operator is the same that is
  777 wanted to compare the strings (note that current config parser lacks a lot in
  778 this regards) :
  779 
  780     strcmp(a, b) == 0  <=>  a == b
  781     strcmp(a, b) != 0  <=>  a != b
  782     strcmp(a, b) <  0  <=>  a <  b
  783     strcmp(a, b) >  0  <=>  a >  b
  784 
  785 Avoid this :
  786 
  787   | if (strcmp(arg, "test"))
  788   |         printf("this is not a test\n");
  789   |
  790   | if (!strcmp(arg, "test"))
  791   |         printf("this is a test\n");
  792 
  793 Prefer this :
  794 
  795   | if (strcmp(arg, "test") != 0)
  796   |         printf("this is not a test\n");
  797   |
  798   | if (strcmp(arg, "test") == 0)
  799   |         printf("this is a test\n");
  800 
  801 
  802 7) System call returns
  803 ----------------------
  804 
  805 This is not directly a matter of coding style but more of bad habits. It is
  806 important to check for the correct value upon return of syscalls. The proper
  807 return code indicating an error is described in its man page. There is no
  808 reason to consider wider ranges than what is indicated. For instance, it is
  809 common to see such a thing :
  810 
  811   | if ((fd = open(file, O_RDONLY)) < 0)
  812   |         return -1;
  813 
  814 This is wrong. The man page says that -1 is returned if an error occurred. It
  815 does not suggest that any other negative value will be an error. It is possible
  816 that a few such issues have been left in existing code. They are bugs for which
  817 fixes are accepted, even though they're currently harmless since open() is not
  818 known for returning negative values at the moment.
  819 
  820 
  821 8) Declaring new types, names and values
  822 ----------------------------------------
  823 
  824 Please refrain from using "typedef" to declare new types, they only obfuscate
  825 the code. The reader never knows whether he's manipulating a scalar type or a
  826 struct. For instance it is not obvious why the following code fails to build :
  827 
  828   | int delay_expired(timer_t exp, timer_us_t now)
  829   | {
  830   |         return now >= exp;
  831   | }
  832 
  833 With the types declared in another file this way :
  834 
  835   | typedef unsigned int timer_t;
  836   | typedef struct timeval timer_us_t;
  837 
  838 This cannot work because we're comparing a scalar with a struct, which does
  839 not make sense. Without a typedef, the function would have been written this
  840 way without any ambiguity and would not have failed :
  841 
  842   | int delay_expired(unsigned int exp, struct timeval *now)
  843   | {
  844   |         return now >= exp->tv_sec;
  845   | }
  846 
  847 Declaring special values may be done using enums. Enums are a way to define
  848 structured integer values which are related to each other. They are perfectly
  849 suited for state machines. While the first element is always assigned the zero
  850 value, not everybody knows that, especially people working with multiple
  851 languages all the day. For this reason it is recommended to explicitly force
  852 the first value even if it's zero. The last element should be followed by a
  853 comma if it is planned that new elements might later be added, this will make
  854 later patches shorter. Conversely, if the last element is placed in order to
  855 get the number of possible values, it must not be followed by a comma and must
  856 be preceded by a comment :
  857 
  858   | enum {
  859   |         first = 0,
  860   |         second,
  861   |         third,
  862   |         fourth,
  863   | };
  864 
  865 
  866   | enum {
  867   |         first = 0,
  868   |         second,
  869   |         third,
  870   |         fourth,
  871   |         /* nbvalues must always be placed last */
  872   |         nbvalues
  873   | };
  874 
  875 Structure names should be short enough not to mangle function declarations,
  876 and explicit enough to avoid confusion (which is the most important thing).
  877 
  878 Wrong :
  879 
  880   | struct request_args { /* arguments on the query string */
  881   |         char *name;
  882   |         char *value;
  883   |         struct misc_args *next;
  884   | };
  885 
  886 Right :
  887 
  888   | struct qs_args { /* arguments on the query string */
  889   |         char *name;
  890   |         char *value;
  891   |         struct qs_args *next;
  892   | }
  893 
  894 
  895 When declaring new functions or structures, please do not use CamelCase, which
  896 is a style where upper and lower case are mixed in a single word. It causes a
  897 lot of confusion when words are composed from acronyms, because it's hard to
  898 stick to a rule. For instance, a function designed to generate an ISN (initial
  899 sequence number) for a TCP/IP connection could be called :
  900 
  901   - generateTcpipIsn()
  902   - generateTcpIpIsn()
  903   - generateTcpIpISN()
  904   - generateTCPIPISN()
  905   etc...
  906 
  907 None is right, none is wrong, these are just preferences which might change
  908 along the code. Instead, please use an underscore to separate words. Lowercase
  909 is preferred for the words, but if acronyms are upcased it's not dramatic. The
  910 real advantage of this method is that it creates unambiguous levels even for
  911 short names.
  912 
  913 Valid examples :
  914 
  915   - generate_tcpip_isn()
  916   - generate_tcp_ip_isn()
  917   - generate_TCPIP_ISN()
  918   - generate_TCP_IP_ISN()
  919 
  920 Another example is easy to understand when 3 arguments are involved in naming
  921 the function :
  922 
  923 Wrong (naming conflict) :
  924 
  925   | /* returns A + B * C */
  926   | int mulABC(int a, int b, int c)
  927   | {
  928   |         return a + b * c;
  929   | }
  930   |
  931   | /* returns (A + B) * C */
  932   | int mulABC(int a, int b, int c)
  933   | {
  934   |         return (a + b) * c;
  935   | }
  936 
  937 Right (unambiguous naming) :
  938 
  939   | /* returns A + B * C */
  940   | int mul_a_bc(int a, int b, int c)
  941   | {
  942   |         return a + b * c;
  943   | }
  944   |
  945   | /* returns (A + B) * C */
  946   | int mul_ab_c(int a, int b, int c)
  947   | {
  948   |         return (a + b) * c;
  949   | }
  950 
  951 Whenever you manipulate pointers, try to declare them as "const", as it will
  952 save you from many accidental misuses and will only cause warnings to be
  953 emitted when there is a real risk. In the examples below, it is possible to
  954 call my_strcpy() with a const string only in the first declaration. Note that
  955 people who ignore "const" are often the ones who cast a lot and who complain
  956 from segfaults when using strtok() !
  957 
  958 Right :
  959 
  960   | void my_strcpy(char *d, const char *s)
  961   | {
  962   |         while ((*d++ = *s++));
  963   | }
  964   |
  965   | void say_hello(char *dest)
  966   | {
  967   |         my_strcpy(dest, "hello\n");
  968   | }
  969 
  970 Wrong :
  971 
  972   | void my_strcpy(char *d, char *s)
  973   | {
  974   |         while ((*d++ = *s++));
  975   | }
  976   |
  977   | void say_hello(char *dest)
  978   | {
  979   |         my_strcpy(dest, "hello\n");
  980   | }
  981 
  982 
  983 9) Getting macros right
  984 -----------------------
  985 
  986 It is very common for macros to do the wrong thing when used in a way their
  987 author did not have in mind. For this reason, macros must always be named with
  988 uppercase letters only. This is the only way to catch the developer's eye when
  989 using them, so that he double-checks whether he's taking risks or not. First,
  990 macros must never ever be terminated by a semi-colon, or they will close the
  991 wrong block once in a while. For instance, the following will cause a build
  992 error before the "else" due to the double semi-colon :
  993 
  994 Wrong :
  995 
  996   | #define WARN printf("warning\n");
  997   | ...
  998   |         if (a < 0)
  999   |                 WARN;
 1000   |         else
 1001   |                 a--;
 1002 
 1003 Right :
 1004 
 1005   | #define WARN printf("warning\n")
 1006 
 1007 If multiple instructions are needed, then use a do { } while (0) block, which
 1008 is the only construct which respects *exactly* the semantics of a single
 1009 instruction :
 1010 
 1011   | #define WARN do { printf("warning\n"); log("warning\n"); } while (0)
 1012   | ...
 1013   |
 1014   |         if (a < 0)
 1015   |                 WARN;
 1016   |         else
 1017   |                 a--;
 1018 
 1019 Second, do not put unprotected control statements in macros, they will
 1020 definitely cause bugs :
 1021 
 1022 Wrong :
 1023 
 1024   | #define WARN if (verbose) printf("warning\n")
 1025   | ...
 1026   |         if (a < 0)
 1027   |                 WARN;
 1028   |         else
 1029   |                 a--;
 1030 
 1031 Which is equivalent to the undesired form below :
 1032 
 1033   |         if (a < 0)
 1034   |                 if (verbose)
 1035   |                         printf("warning\n");
 1036   |                 else
 1037   |                         a--;
 1038 
 1039 Right way to do it :
 1040 
 1041   | #define WARN do { if (verbose) printf("warning\n"); } while (0)
 1042   | ...
 1043   |         if (a < 0)
 1044   |                 WARN;
 1045   |         else
 1046   |                 a--;
 1047 
 1048 Which is equivalent to :
 1049 
 1050   |         if (a < 0)
 1051   |                 do { if (verbose) printf("warning\n"); } while (0);
 1052   |         else
 1053   |                 a--;
 1054 
 1055 Macro parameters must always be surrounded by parenthesis, and must never be
 1056 duplicated in the same macro unless explicitly stated. Also, macros must not be
 1057 defined with operators without surrounding parenthesis. The MIN/MAX macros are
 1058 a pretty common example of multiple misuses, but this happens as early as when
 1059 using bit masks. Most often, in case of any doubt, try to use inline functions
 1060 instead.
 1061 
 1062 Wrong :
 1063 
 1064   | #define MIN(a, b) a < b ? a : b
 1065   |
 1066   | /* returns 2 * min(a,b) + 1 */
 1067   | int double_min_p1(int a, int b)
 1068   | {
 1069   |         return 2 * MIN(a, b) + 1;
 1070   | }
 1071 
 1072 What this will do :
 1073 
 1074   | int double_min_p1(int a, int b)
 1075   | {
 1076   |         return 2 * a < b ? a : b + 1;
 1077   | }
 1078 
 1079 Which is equivalent to :
 1080 
 1081   | int double_min_p1(int a, int b)
 1082   | {
 1083   |         return (2 * a) < b ? a : (b + 1);
 1084   | }
 1085 
 1086 The first thing to fix is to surround the macro definition with parenthesis to
 1087 avoid this mistake :
 1088 
 1089   | #define MIN(a, b) (a < b ? a : b)
 1090 
 1091 But this is still not enough, as can be seen in this example :
 1092 
 1093   | /* compares either a or b with c */
 1094   | int min_ab_c(int a, int b, int c)
 1095   | {
 1096   |         return MIN(a ? a : b, c);
 1097   | }
 1098 
 1099 Which is equivalent to :
 1100 
 1101   | int min_ab_c(int a, int b, int c)
 1102   | {
 1103   |         return (a ? a : b < c ? a ? a : b : c);
 1104   | }
 1105 
 1106 Which in turn means a totally different thing due to precedence :
 1107 
 1108   | int min_ab_c(int a, int b, int c)
 1109   | {
 1110   |         return (a ? a : ((b < c) ? (a ? a : b) : c));
 1111   | }
 1112 
 1113 This can be fixed by surrounding *each* argument in the macro with parenthesis:
 1114 
 1115   | #define MIN(a, b) ((a) < (b) ? (a) : (b))
 1116 
 1117 But this is still not enough, as can be seen in this example :
 1118 
 1119   | int min_ap1_b(int a, int b)
 1120   | {
 1121   |         return MIN(++a, b);
 1122   | }
 1123 
 1124 Which is equivalent to :
 1125 
 1126   | int min_ap1_b(int a, int b)
 1127   | {
 1128   |         return ((++a) < (b) ? (++a) : (b));
 1129   | }
 1130 
 1131 Again, this is wrong because "a" is incremented twice if below b. The only way
 1132 to fix this is to use a compound statement and to assign each argument exactly
 1133 once to a local variable of the same type :
 1134 
 1135   | #define MIN(a, b) ({ typeof(a) __a = (a); typeof(b) __b = (b);  \
 1136   |                      ((__a) < (__b) ? (__a) : (__b));           \
 1137   |                   })
 1138 
 1139 At this point, using static inline functions is much cleaner if a single type
 1140 is to be used :
 1141 
 1142   | static inline int min(int a, int b)
 1143   | {
 1144   |         return a < b ? a : b;
 1145   | }
 1146 
 1147 
 1148 10) Includes
 1149 ------------
 1150 
 1151 Includes are as much as possible listed in alphabetically ordered groups :
 1152   - the libc-standard includes (those without any path component)
 1153   - the includes more or less system-specific (sys/*, netinet/*, ...)
 1154   - includes from the local "common" subdirectory
 1155   - includes from the local "types" subdirectory
 1156   - includes from the local "proto" subdirectory
 1157 
 1158 Each section is just visually delimited from the other ones using an empty
 1159 line. The two first ones above may be merged into a single section depending on
 1160 developer's preference. Please do not copy-paste include statements from other
 1161 files. Having too many includes significantly increases build time and makes it
 1162 hard to find which ones are needed later. Just include what you need and if
 1163 possible in alphabetical order so that when something is missing, it becomes
 1164 obvious where to look for it and where to add it.
 1165 
 1166 All files should include <common/config.h> because this is where build options
 1167 are prepared.
 1168 
 1169 Header files are split in two directories ("types" and "proto") depending on
 1170 what they provide. Types, structures, enums and #defines must go into the
 1171 "types" directory. Function prototypes and inlined functions must go into the
 1172 "proto" directory. This split is because of inlined functions which
 1173 cross-reference types from other files, which cause a chicken-and-egg problem
 1174 if the functions and types are declared at the same place.
 1175 
 1176 All headers which do not depend on anything currently go to the "common"
 1177 subdirectory, but are equally well placed into the "proto" directory. It is
 1178 possible that one day the "common" directory will disappear.
 1179 
 1180 Include files must be protected against multiple inclusion using the common
 1181 #ifndef/#define/#endif trick with a tag derived from the include file and its
 1182 location.
 1183 
 1184 
 1185 11) Comments
 1186 ------------
 1187 
 1188 Comments are preferably of the standard 'C' form using /* */. The C++ form "//"
 1189 are tolerated for very short comments (eg: a word or two) but should be avoided
 1190 as much as possible. Multi-line comments are made with each intermediate line
 1191 starting with a star aligned with the first one, as in this example :
 1192 
 1193   | /*
 1194   |  * This is a multi-line
 1195   |  * comment.
 1196   |  */
 1197 
 1198 If multiple code lines need a short comment, try to align them so that you can
 1199 have multi-line sentences. This is rarely needed, only for really complex
 1200 constructs.
 1201 
 1202 Do not tell what you're doing in comments, but explain why you're doing it if
 1203 it seems not to be obvious. Also *do* indicate at the top of function what they
 1204 accept and what they don't accept. For instance, strcpy() only accepts output
 1205 buffers at least as large as the input buffer, and does not support any NULL
 1206 pointer. There is nothing wrong with that if the caller knows it.
 1207 
 1208 Wrong use of comments :
 1209 
 1210   | int flsnz8(unsigned int x)
 1211   | {
 1212   |         int ret = 0;                         /* initialize ret */
 1213   |         if (x >> 4) { x >>= 4; ret += 4; }   /* add 4 to ret if needed */
 1214   |         return ret + ((0xFFFFAA50U >> (x << 1)) & 3) + 1; /* add ??? */
 1215   | }
 1216   | ...
 1217   | bit = ~len + (skip << 3) + 9;        /* update bit */
 1218 
 1219 Right use of comments :
 1220 
 1221   | /* This function returns the positoin of the highest bit set in the lowest
 1222   |  * byte of <x>, between 0 and 7. It only works if <x> is non-null. It uses
 1223   |  * a 32-bit value as a lookup table to return one of 4 values for the
 1224   |  * highest 16 possible 4-bit values.
 1225   |  */
 1226   | int flsnz8(unsigned int x)
 1227   | {
 1228   |         int ret = 0;
 1229   |         if (x >> 4) { x >>= 4; ret += 4; }
 1230   |         return ret + ((0xFFFFAA50U >> (x << 1)) & 3) + 1;
 1231   | }
 1232   | ...
 1233   | bit = ~len + (skip << 3) + 9; /* (skip << 3) + (8 - len), saves 1 cycle */
 1234 
 1235 
 1236 12) Use of assembly
 1237 -------------------
 1238 
 1239 There are many projects where use of assembly code is not welcome. There is no
 1240 problem with use of assembly in haproxy, provided that :
 1241 
 1242   a) an alternate C-form is provided for architectures not covered
 1243   b) the code is small enough and well commented enough to be maintained
 1244 
 1245 It is important to take care of various incompatibilities between compiler
 1246 versions, for instance regarding output and cloberred registers. There are
 1247 a number of documentations on the subject on the net. Anyway if you are
 1248 fiddling with assembly, you probably know that already.
 1249 
 1250 Example :
 1251   | /* gcc does not know when it can safely divide 64 bits by 32 bits. Use this
 1252   |  * function when you know for sure that the result fits in 32 bits, because
 1253   |  * it is optimal on x86 and on 64bit processors.
 1254   |  */
 1255   | static inline unsigned int div64_32(unsigned long long o1, unsigned int o2)
 1256   | {
 1257   |         unsigned int result;
 1258   | #ifdef __i386__
 1259   |         asm("divl %2"
 1260   |             : "=a" (result)
 1261   |             : "A"(o1), "rm"(o2));
 1262   | #else
 1263   |         result = o1 / o2;
 1264   | #endif
 1265   |         return result;
 1266   | }
 1267