"Fossies" - the Fresh Open Source Software Archive

Member "complexity-1.13/doc/compute.texi" (28 Jul 2018, 10593 Bytes) of package /linux/privat/complexity-1.13.tar.xz:


As a special service "Fossies" has tried to format the requested text file into HTML format (style: standard) with prefixed line numbers. Alternatively you can here view or download the uninterpreted source code file.

    1 @page
    2 @node    Complexity Computation
    3 @chapter Complexity Computation
    4 @cindex  Complexity Computation
    5 
    6 The principal goal
    7 Fundamentally, this program counts lines of non-comment source lines,
    8 multiplies by a ``nesting factor'' for each level of logic nesting and
    9 divides by a scaling factor so that the typical results lie roughly in
   10 the same range as @code{pmccabe} results.  That happens to be approximately 20.
   11 
   12 @menu
   13 * parsing::             Parsing Method
   14 * scoring algorithm::   Complexity Measurement Algorithm
   15 * scores::              Complexity Scores
   16 * stats::               Complexity Statistics
   17 * tuning::              Scoring Adjustments
   18 @end menu
   19 
   20 @node     parsing
   21 @section  Parsing Method
   22 @cindex   parsing
   23 
   24 The method chosen for parsing the source has an effect on what gets
   25 seen (scored) by the program.
   26 
   27 @menu
   28 * complexity parsing::  Complexity Measurement Parsing
   29 * post-pp parsing::     Post-PreProcessing Parsing
   30 * during pp parsing::   During PreProcessing Parsing
   31 * pmccabe parsing::     @code{pmccabe} Parsing
   32 @end menu
   33 
   34 @node       complexity parsing
   35 @subsection Complexity Measurement Parsing
   36 
   37 This program examines the actual source a human looks at when the
   38 file is opened, provided it is not pre-processed by @code{unifdef},
   39 @xref{complexity unifdef, @code{unifdef}}.  This was chosen because
   40 uncompiled code adds to the complexity of what a human must understand.
   41 However, sometimes the source will contain unbalanced braces a la:
   42 @example
   43 #if FOO
   44   for (int ix = foo;;) @{
   45 #else
   46   for (int ix = bar;;) @{
   47 #endif
   48     code...
   49   @}
   50 @end example
   51 rendering code that cannot be parsed correctly.  @code{unifdef}-ing
   52 makes it parsable.  Unfortunately, because the practice of @code{ifdef}-ing
   53 unbalanced curly braces is so common, this program cannot rely on
   54 finding the correct closing brace.
   55 
   56 @strong{CAVEAT}: for the purposes of this program, procedures end
   57 when either a matching closing brace is found @emph{or} a closing curly brace
   58 is found in column 1, whichever comes first.  If the closing brace
   59 in column one does not match the procedure opening brace, the
   60 procedure is considered unscorable.
   61 
   62 Fortunately, unscorable procedures are relatively unusual.
   63 
   64 @strong{CAVEAT2}: K&R procedure headers are not recognized.
   65 If anything other than an opening curly brace appears after
   66 the parameter list will cause the code recognizer to go back
   67 into ``look for a procedure header'' mode.  K&R procedures
   68 are not just not scored, they are completely ignored.
   69 
   70 This should probably get fixed, though.
   71 
   72 @node       post-pp parsing
   73 @subsection Post-PreProcessing Parsing
   74 
   75 Another approach would be to use the @code{C} compiler and analize the
   76 tokens coming out of the preprocessor.  The drawbacks are that macro
   77 expansions will add to the complexity, even though they do not add to
   78 human perceived complexity, and uncompiled code do not add to the
   79 complexity measure.  The benefit, of course, is that you know for
   80 certain where a procedure body starts and ends.
   81 
   82 @node       during pp parsing
   83 @subsection During PreProcessing Parsing
   84 
   85 This would require going into the C preprocessor code and cause macros
   86 to not be expanded.  Again, the great benefit is that you know for
   87 certain you can find the starting and ending braces for every
   88 procedure body.  The downsides are the extra work and, again, the
   89 uncompiled code won't get counted in the complexity measure.
   90 
   91 This might be a useful exercise to do some day, just to see
   92 how helpful it might be.  Being able to recognize all procedure
   93 bodies without fail would be a good thing.
   94 
   95 @node       pmccabe parsing
   96 @subsection @code{pmccabe} Parsing
   97 
   98 The @code{pmccabe} parsing actually inspired the method for this program.
   99 Thd difference is that @code{pmccabe} will always keep scanning until
  100 a procedure body's closing curly brace is found, even if that means
  101 counting the code from several following procedure definitions.
  102 The consequence of this is that this program's code will see some
  103 procedures that @code{pmccabe} will not, and vice versa.
  104 
  105 @node    scoring algorithm
  106 @section Complexity Measurement Algorithm
  107 
  108 Fundamentally, this program counts non-comment source lines and
  109 examines elements of parenthesized expressions.  This score
  110 is multiplied by a nesting scoring factor for each layer of code nesting.
  111 
  112 A parenthesized expression is scanned for operators.  If they are all
  113 arithmetic operators, or all arithmetic and one relational operator,
  114 the score is zero.  If all the operators are boolean @code{and}s or
  115 they are all @code{or}s, then the score is one.  An assignment
  116 operator with arithmetic operators also scores one.  If you mix
  117 relational operators and all @code{and}s or all @code{or}s, the score
  118 is the number of boolean elements.  If you mix @code{and}s and
  119 @code{or}s at the same parenthetical level, the two counts are
  120 multiplied, unless the boolean element count is higher.
  121 
  122 Fundamentally, do not use multiple relational or boolean operators at
  123 the same parenthetical level, unless they are all boolean @code{and}s
  124 or they are all boolean @code{or}s.  If you use boolean operators and
  125 relational operators in one expression, you are charged one statement
  126 for each boolean element.
  127 
  128 After scoring each statement and any parenthesized expressions, the
  129 score is multiplied by any encompassing controlled block and added to
  130 the score of that block.  A ``controlled block'' is a curly-braced
  131 collection of statements controlled by one of the statement
  132 controlling statements @code{do}, @code{for}, @code{else}, @code{if},
  133 @code{switch}, or @code{while}.  Stand alone blocks for scoping local
  134 variables do not trigger the multiplier.
  135 
  136 You may trace the scores of parenthesized expressions and code
  137 blocks (@pxref{complexity trace, trace output file}).  You will see
  138 the raw score of the code block or expression.
  139 
  140 The final score is the outermost score divided by the ``scaling factor'',
  141 @xref{complexity scale, complexity scaling factor}.
  142 
  143 @node     scores
  144 @section  Complexity Scores
  145 @cindex   scores
  146 
  147 The ``Complexity Scores'' table shows the score of each procedure identified
  148 that also exceeded the threshold score,
  149 @xref{complexity threshold, ---threshold}.  The entries on each line are:
  150 
  151 @itemize  @bullet
  152 @item
  153 The computed score
  154 @item
  155 The number of lines between the opening and closing curly braces
  156 @item
  157 The number of non-comment, non-blank lines found there
  158 @item
  159 The name of the source file
  160 @item
  161 The line number of the opening curly brace
  162 @item
  163 The name of the procedure
  164 @end itemize
  165 
  166 The output is sorted by the score and then the number of non-comment lines.
  167 Procedures with scores below the threshold are not displayed.
  168 
  169 @node     stats
  170 @section  Complexity Statistics
  171 @cindex   statistics
  172 
  173 The statistics are displayed both as a table and as a histogram,
  174 @xref{Example Output}.  It is under the control of the
  175 @ref{complexity histogram, ---histogram} option.
  176 The statistics are for each non-comment
  177 source line and each source line is given the score of its
  178 encompassing procedure.  This way, larger procedures are given
  179 proportionally more weight than one line procedures.
  180 
  181 The histogram is broken up into three ranges.  Scores of 0 through 99
  182 are displayed in 10 point groupings, 100 through 999 in 100 point
  183 groupings and 1000 and above (good grief!!, but they exist) are in
  184 1000 point groupings.  The number of asterisks represent the number
  185 of lines of code that are in procedures that score in the specified
  186 range.
  187 
  188 The tabular statistics are also based on lines, not procedures.
  189 @table @samp
  190 @item Average line score
  191 This is the procedure score times the non-comment
  192 line count, all added up and divided by the total non-comment source
  193 lines found.
  194 @item 25%-ile score
  195 @itemx 50%-ile score
  196 @itemx 75%-ile score
  197 @itemx Highest score
  198 Since the distribution of scores is nothing like a bell curve, the mean
  199 and standard deviation do not give a very clear picture of the distribution
  200 of the scores.  Typically, the standard deviation is larger than the average
  201 score.  So, instead the program prints the the four quartile scores.
  202 The score for which 25, 50, and 75 percent of code is scored less than,
  203 plus the highest scoring procedure (100 percent of code scores less than
  204 or equal to that score).
  205 @item Unscored procedures
  206 If any procedures were found that could not be scored, the number of such
  207 procedures is printed.
  208 @end table
  209 
  210 @node     tuning
  211 @section  Scoring Adjustments
  212 @cindex   tuning
  213 @cindex   scores
  214 
  215 Scores can be adjusted with three different options:
  216 @table @samp
  217 @item nesting-penalty
  218 @xref{complexity nesting-penalty, ---nesting-penalty}.
  219 @item demi-nesting-penalty
  220 @xref{complexity demi-nesting-penalty, ---demi-nesting-penalty}.
  221 @item scale
  222 @xref{complexity scale, ---scale}.
  223 @end table
  224 
  225 The raw score is the number of lines or statements, whichever is
  226 greater, adjusted by a factor for the depth of the logic.  Statements
  227 are nested when they are inside of a block of statements for a
  228 ``block'' statement (viz., ``do'', ``for'', ``if'', ``switch'' or
  229 ``while'').  Statements within blocks used to constrain the scope of
  230 variables (not controlled by a block statement) are not multiplied by
  231 this factor.
  232 
  233 Expressions are nested when contained within parentheses.
  234 The @i{cost} of these is different.  Block level nesting multiplies the
  235 score for the block by the @code{--nesting-penalty} factor (2.0 by default).
  236 Nested expressions are multiplied by the @code{--demi-nesting-penalty},
  237 the square root of @code{--nesting-penalty} by default.
  238 
  239 Some attempt is made to judge the complexity of an expression.
  240 A complicated expression is one that contains an assignment operator,
  241 more than one relation operator, or a mixture of ``and'' and ``or''
  242 operators with any other different kind of non-arithmetic operator.
  243 Expression scores are minimized by:
  244 
  245 @itemize  @bullet
  246 @item
  247 Doing assignments outside of
  248 boolean expressions, or at least parenthesizing them.
  249 @item
  250 Parenthesizing each relationship operation in an expression
  251 of multiple ``and'' and/or ``or'' operations.  Yes, precedence
  252 parses them correctly, but it is less clear.
  253 @item
  254 Parenthesizing groups of ``and'' and ``or'' operations so that
  255 operators of only one type appear at one level.  For example,
  256 the first expression below instead of the second.  Yes, precedence
  257 means the effect is the same, but we're after code clarity so that
  258 correctness is more obvious.
  259 @example
  260 1: ((a && b) || (c && d))
  261 2: (a && b || c && d)
  262 @end example
  263 The first adds 2 to the raw score (before dividing by the scaling factor).
  264 The latter will add 5, assuming a @code{demi-nesting-penalty} of @code{1.41}.
  265 @end itemize