
% Source: libsafe-2.0-16/doc/whitepaper-2.0/whitepaper-20.tex (19 Apr 2001, 17426 bytes), from package libsafe-2.0-16.tgz


\documentclass[]{article}
\usepackage{epsfig}
\usepackage{setspace}
\usepackage{fancyheadings}
%\usepackage{threeparttable}
%\usepackage{graphicx}
%\usepackage[lineno5]{lgrind}
%\usepackage[hang]{subfigure}
\usepackage{url}
\usepackage{entry}

\setlength{\oddsidemargin}{0in}
\setlength{\topmargin}{-0.5in}
\setlength{\textheight}{9.0in}
\setlength{\textwidth}{6.5in}

\pagestyle{fancyplain}
\lhead{}
\chead{
    libsafe-2.0 White Paper \\
    {\bf VERSION 3-21-01} 
}
\rhead{}

\newcommand{\compress}{
    \parskip 0in
    \topsep 0in
    \itemsep 0in
    \partopsep 0in
}

\newlength{\figwidth}
\setlength{\figwidth}{\columnwidth}

\singlespacing


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}

\bibliographystyle{plain}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Title page

\begin{singlespace}

\title{ Libsafe 2.0: Detection of Format String Vulnerability Exploits }

\author{
    Timothy Tsai and Navjot Singh \\
    Avaya Labs, Avaya Inc. \\
    600 Mountain Ave \\
    Murray Hill, NJ  07974  USA \\
    \{ttsai,singh\}@avaya.com \\
    %{\tt http://www.research.avayalabs.com/project/libsafe.html}
}

\date{February 6, 2001}

\maketitle

\end{singlespace}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Abstract

\begin{singlespace}

\begin{abstract}

This white paper describes a significant new feature of libsafe version 2.0:
the ability to detect and handle format string vulnerability exploits.  Such
exploits have recently garnered attention in security advisories, discussion
lists, web sites devoted to security, and even conventional media such as
television and newspapers.  Examples of vulnerable software include {\tt
wu-ftpd} (a common FTP daemon) and {\tt bind} (a DNS [Domain Name System]
server).  This paper describes the vulnerability and the technique libsafe uses
to detect and handle exploits.

\begin{Ventry}{NOTE}
\item[NOTE]
    This paper describes only one feature of libsafe version 2.0:
    the ability to detect and handle format string vulnerability exploits.
    Other features include support for code compiled without frame pointer
    instructions, extra debugging facilities, and bug fixes.  See \cite{usenix}
    for details of the original version of libsafe.
\end{Ventry}

\end{abstract}

\end{singlespace}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Introduction}
\label{sec:introduction}

Buffer overflow exploits constitute perhaps the most common form of computer
security attack~\cite{SSP89*326,Rochlis89,Seeley89}.  Such exploits take
advantage of programming errors to overflow buffers, thus writing unintended
data to the part of memory that immediately follows the targeted buffers.  If
the targeted buffer resides on the process stack, then the exploit often
attempts to overwrite a return address on the stack, which, for a privileged
process, can give the attacker root access to the machine.  The original
version of libsafe, version 1.3~\cite{usenix}, presented a significant advance
in the detection and handling of buffer overflow attacks by offering a solution
that detects a large number of exploits with low overhead and tremendous ease
of use\footnote{Libsafe requires no specific security expertise and can be
installed in under one minute!}.

Recently, another widespread vulnerability has received a great deal of
attention:  the format string vulnerability~\cite{bind_report,wuftpd_report}.
The latest version of libsafe, version 2.0, implements a solution for detecting
and handling the most dangerous format string vulnerability exploits, while
preserving the low overhead and ease of use of the original libsafe.

The most common source of this vulnerability is the ubiquitous {\tt printf()}
function.  Consider the following vulnerable piece of code:

\begin{verbatim}
    printf("%x %x %x %x\n");
\end{verbatim}

The above code will usually compile with no warnings\footnote{For {\tt gcc},
warnings are produced with the {\tt -Wall} option, but not with the default
warning level.}, even though it lacks the four arguments that the format
string requires.  If this code is executed, it will print out four hexadecimal
numbers, corresponding to the values on the stack where it expects the missing
arguments to be present.  This allows an attacker to examine the contents of
the stack.

The following code illustrates an even more insidious form of the format string
vulnerability:

\begin{verbatim}
    printf("%.*d%n\n", (int) start_attack_code, 0, return_addr_ptr);
\end{verbatim}

The above example takes advantage of a relatively seldom used {\tt printf()}
specifier:  {\tt \%n}.  This specifier calculates the current number of
characters produced by the {\tt printf()} function and writes this number to
the memory location indicated by the corresponding pointer in the argument
list.  In our example, the pointer is {\tt return\_addr\_ptr}.  The astute
observer may realize at this point that a malicious attacker can potentially
overwrite any memory location, including locations containing return addresses.
Furthermore, the above form of the {\tt printf()} statement controls the exact
number that is written to the memory location.  The {\tt \%.*d} conversion
takes its precision from the argument list, so it emits exactly {\tt
start\_attack\_code} characters before {\tt \%n} is processed.  Our example
therefore writes the value {\tt start\_attack\_code} to the location {\tt
return\_addr\_ptr}.  Assuming that {\tt start\_attack\_code} is the starting
address for some attack code, the next return from that exploited function will
cause the attack code to be executed.  Often, this attack code causes a shell
to be started, and if the process under attack is privileged (as is the case
with many daemon processes), then an attacker can obtain a root shell.
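
The counting behavior of {\tt \%n} can be observed harmlessly by pointing it at
an ordinary local variable.  The following is a minimal sketch, not code from
libsafe; the function name {\tt count\_written\_by\_n()} is hypothetical, and
the output goes to a local buffer through {\tt snprintf()} rather than to the
terminal.

\begin{verbatim}
#include <stdio.h>

/* Hypothetical helper: format a field of `width` zero digits and let %n
 * record how many characters have been produced so far.  `width` is
 * assumed to be smaller than the local buffer. */
int count_written_by_n(int width)
{
    char out[256];
    int count = -1;

    /* "%.*d" takes its precision from the argument list, so the caller
     * chooses how many characters precede %n.  %n then stores that
     * count in the int that &count points to. */
    snprintf(out, sizeof out, "%.*d%n", width, 0, &count);
    return count;
}
\end{verbatim}

Calling {\tt count\_written\_by\_n(10)} stores 10 in {\tt count}.  An exploit
differs only in that the pointer refers to a return address and the width is
chosen to be the address of the attack code.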

Fortunately, it takes a bit more ingenuity to actually take advantage of this
vulnerability.  Usually, vulnerable code occurs in a form similar to the
following:

\begin{verbatim}
    if (illegal_command(command)) {
        sprintf(error_msg, "Illegal command: %s", command);
        ...
        syslog(LOG_WARNING, error_msg);
        return;
    }
\end{verbatim}

In this example, {\tt command} is a character buffer that contains a command
from the user.  If the command is illegal, then the {\tt sprintf()} statement
forms an error message that is passed to {\tt syslog()}.  Under normal
circumstances, {\tt syslog()} will simply append {\tt error\_msg} to the
appropriate log file.  However, {\tt syslog()} treats {\tt error\_msg} as a
format string.  If {\tt command} contains {\tt printf()} specifiers, such as
those in the first two code examples, then {\tt syslog()} interprets them, and
the attacker can read the stack or write to memory as described above.
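
For comparison, the conventional source-level repair is to pass user-supplied
text only as an argument to a constant format string.  The sketch below is an
illustration of that general practice, not part of libsafe, which instead
protects binaries that have not been repaired.  The function names and the
256-byte buffer are assumptions made for the example.

\begin{verbatim}
#include <stdio.h>
#include <syslog.h>

/* Hypothetical helper: build the error message.  `command` is only ever
 * an argument, never part of the format string.  Returns the length that
 * snprintf() reports. */
int format_illegal_command(char *error_msg, size_t size, const char *command)
{
    return snprintf(error_msg, size, "Illegal command: %s", command);
}

/* Hypothetical helper: log the message through a constant "%s" format,
 * so any specifiers inside `command` are logged as plain text. */
void log_illegal_command(const char *command)
{
    char error_msg[256];

    format_illegal_command(error_msg, sizeof error_msg, command);
    syslog(LOG_WARNING, "%s", error_msg);
}
\end{verbatim}

With this form, a command such as {\tt \%x \%x \%n} appears verbatim in the log
and is never interpreted.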

Such code vulnerabilities exist in real life, and the
corresponding exploits also exist.  In fact, the existence of these and similar
vulnerabilities and the relative ease of obtaining exploits have largely led to
the prevalence of so-called ``script kiddies,'' or attackers who systematically
attack remote machines using downloaded scripts in the hopes of finding a
machine that is vulnerable.  Such attackers often possess only a rudimentary
knowledge of networks and systems.  However, they often find great success due
to the surprisingly large number of Internet-connected machines that execute
vulnerable software.  Part of the problem is the complexity of system
maintenance.  Making sure that one's machine has the latest version of every
software package is not simple, especially since system maintenance is
often a secondary responsibility.  Also, some vulnerabilities are still mostly
unknown, and software updates to fix the problem may not yet be available.

This is where libsafe version 2.0 is valuable.  Libsafe version 2.0 will foil
all format string vulnerability exploits that attempt to overwrite return
addresses on the stack.  If such an attack is attempted, libsafe will log a
warning and terminate the targeted process.  As with version 1.3, installation
is extremely easy and requires no knowledge of the system, applications,
exploits, or even libsafe itself.  Also, because libsafe incurs relatively
little overhead, it can be used to protect all processes on a machine, thereby
potentially detecting exploits of vulnerabilities that are not yet known.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Implementation}
\label{sec:implementation}

%interception
%    -- glibc code
%ra check
%    -- uint _libsafe_raVariableP(void *addr)
%span check
%    -- uint _libsafe_span_stack_frames(void *start_addr, void *end_addr)
%handling
%    -- _libsafe_die()

The implementation of format string vulnerability detection in libsafe version
2.0 borrows heavily from the basic detection mechanism in version 1.3.  There
are three main steps in the detection mechanism:

\begin{Ventry}{Violation handling}
\item[Interception]
    Libsafe executes its own version of selected vulnerable functions.
\item[Safety check]
    Libsafe determines if the function can be safely executed.
\item[Violation handling]
    If the function cannot be safely executed, libsafe executes warning and
    termination actions.
\end{Ventry}

%----------------------------------------------------------------------

\subsection{Interception}
\label{subsec:interception}

The basic idea behind libsafe is the interception of vulnerable functions by
safer alternatives that first check to make sure that the functions can be
safely executed based on their arguments.  If the check passes, libsafe either
calls the original function or executes code that is functionally equivalent.
Otherwise, warnings are posted and the process is terminated.

Libsafe is able to intercept functions (i.e., substitute its alternatives in
place of the original functions) because it is implemented as a shared library
that is loaded into memory before the standard library (i.e., {\tt
/lib/libc.so}).  For Linux systems, the run-time loader, {\tt ld.so}, is
responsible for loading the various program code and libraries into memory.
For programs that require the standard library, {\tt ld.so} loads this library
into memory and links all references to library functions in the program code
to the library functions.  If libsafe is activated, {\tt ld.so} loads the
libsafe library into memory before the standard library.  Because the libsafe
alternative functions have the same names as the original standard library
functions, {\tt ld.so} uses the libsafe functions in place of the standard
library functions.

Most of the libsafe functions perform a safety check and then call the original
function or a safer alternative (e.g., {\tt snprintf()} is called in place of
{\tt sprintf()}).  However, two functions are treated differently:  {\tt
\_IO\_vfprintf()} and {\tt \_IO\_vfscanf()}\footnote{{\tt \_IO\_vfprintf()} and
{\tt \_IO\_vfscanf()} are the core functions that all other {\tt *printf()} and
{\tt *scanf()} functions eventually call.  Thus, intercepting these two core
functions effectively intercepts the entire family of {\tt *printf()} and {\tt
*scanf()} functions.  Note: {\tt syslog()} also eventually calls {\tt
\_IO\_vfprintf()}.}.  For {\tt \_IO\_vfprintf()} and {\tt \_IO\_vfscanf()}, the
original source code from libc-2.1.3-91 is incorporated directly into libsafe.
Libsafe needs the original source code because the safety checks for these two
functions require knowledge of local variables.
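
The ``check, then call a safer alternative'' pattern can be sketched as
follows.  This is an illustration only and not libsafe source: the name {\tt
bounded\_sprintf()} is hypothetical, and the {\tt limit} argument is simply
supplied by the caller, since this paper does not describe how libsafe derives
the bound for a destination buffer.

\begin{verbatim}
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical sprintf() replacement: format into dest, but never write
 * more than `limit` bytes.  Returns the number of characters written,
 * or -1 if the complete output would not fit. */
int bounded_sprintf(char *dest, size_t limit, const char *format, ...)
{
    va_list ap;
    int needed;

    va_start(ap, format);
    /* vsnprintf() truncates at `limit` and reports the full length. */
    needed = vsnprintf(dest, limit, format, ap);
    va_end(ap);

    if (needed < 0 || (size_t)needed >= limit)
        return -1;          /* the safety check failed */
    return needed;
}
\end{verbatim}

A wrapper of this kind would report a violation when it returns {\tt -1}; the
actions libsafe takes in that case are described in
Section~\ref{subsec:handling}.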


%----------------------------------------------------------------------

\subsection{Safety check}
\label{subsec:safety_check}

The safety checks are highly specific to each function.  For
{\tt \_IO\_vfprintf()}, libsafe performs two checks:

\begin{Lentry}
\item[Return address and frame pointer check]
    For each {\tt \%n} specifier, libsafe checks the associated pointer
    argument.  Each such pointer argument is passed to {\tt
    \_libsafe\_raVariableP(void *addr)}, where {\tt addr} is the pointer
    argument.  {\tt \_libsafe\_raVariableP(void *addr)} returns {\tt 1} only if
    it determines that {\tt addr} points to a return address or a frame pointer
    on the stack.  Otherwise, it returns {\tt 0}, which means that {\tt addr}
    points either to a location that is not on the stack or to a stack
    location that is not a return address or a frame pointer.  If {\tt
    \_libsafe\_raVariableP()} returns {\tt 1}, then libsafe has found a
    violation.
\item[Frame span check]
    The argument list for any function should always be contained within a
    single stack frame.  Attacks that attempt to probe the stack using
    statements such as {\tt printf("\%x \%x ...")} consume more arguments than
    were actually passed, so the argument list may extend beyond the current
    stack frame.  The {\tt
    \_libsafe\_span\_stack\_frames(void *start\_addr, void *end\_addr)}
    function returns {\tt 1} only if {\tt start\_addr} and {\tt end\_addr} are
    located in two different stack frames.  If {\tt
    \_libsafe\_span\_stack\_frames()} returns {\tt 1}, then libsafe has found a
    violation.
\end{Lentry}
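
The logic of the two checks can be sketched against a simplified model of the
stack.  This is not the libsafe implementation: the real functions take only
the addresses shown above and locate the frames themselves, whereas the
hypothetical functions below receive the addresses of the saved frame pointers
in an array {\tt fp}.  The sketch also assumes an x86-style layout in which the
return address occupies the word directly above the saved frame pointer.

\begin{verbatim}
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the return address and frame pointer check.
 * fp[0..n-1] hold the addresses of the saved frame pointers.  Returns 1
 * if addr is one of those slots or the return address slot assumed to
 * sit directly above it, else 0. */
int points_to_ra_or_fp(const uintptr_t *fp, size_t n, uintptr_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr == fp[i] || addr == fp[i] + sizeof(void *))
            return 1;
    }
    return 0;
}

/* Hypothetical model of the frame span check.  Returns 1 if a saved
 * frame pointer lies strictly between start and end, which means the
 * two addresses belong to different stack frames. */
int spans_stack_frames(const uintptr_t *fp, size_t n,
                       uintptr_t start, uintptr_t end)
{
    for (size_t i = 0; i < n; i++) {
        if (start < fp[i] && fp[i] < end)
            return 1;
    }
    return 0;
}
\end{verbatim}

In this model, a {\tt \%n} pointer for which {\tt points\_to\_ra\_or\_fp()}
returns {\tt 1}, or an argument list for which {\tt spans\_stack\_frames()}
returns {\tt 1}, would be treated as a violation.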

\begin{figure}[htbp]
\centerline{\psfig{figure=stack.eps,height=3.5in}}
\caption{Stack Frames}
\label{fig:stack_frames}
\end{figure}

To perform these two checks, libsafe determines the locations and sizes of the
frames on the stack.  Figure~\ref{fig:stack_frames} illustrates the
organization of a process stack.  The beginning of each stack frame is
indicated by the presence of a frame pointer that points back to the previous
stack frame.  Libsafe finds each stack frame by starting at the top-most frame
and traversing the frame pointers until it finds the stack frame for {\tt
main()}.  The top-most frame corresponds to a libsafe function.  Within this
libsafe function, the frame pointer is found by using the gcc built-in function
{\tt \_\_builtin\_frame\_address(0)}.  The return address back into the calling
function is stored immediately next to each frame pointer.  This technique
works for most processes, with a few exceptions.  Certain compilers may not
produce code that places frame pointers on the stack (e.g., {\tt gcc
-fomit-frame-pointer}), and some customized compilers may not locate return
addresses immediately next to the frame pointer (e.g., the StackGuard
compiler~\cite{stackguard98}).
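
The traversal of frame pointers can be illustrated on a simulated stack.  The
sketch below is not libsafe code and does not read the real process stack.
It assumes that the stack is modeled as an array of words in which each frame
pointer slot holds the index of the caller's frame pointer slot, with callers
at higher indices.  The function name and parameters are hypothetical.

\begin{verbatim}
#include <stddef.h>

/* Hypothetical frame walk over a simulated stack of `len` words.
 * Starting at index `start`, follow the chain of saved frame pointers
 * and record each slot index in out[] (at most `max` entries).
 * Returns the number of frames found. */
size_t collect_frame_pointers(const size_t *stack, size_t len,
                              size_t start, size_t *out, size_t max)
{
    size_t count = 0;
    size_t fp = start;

    while (fp < len && count < max) {
        size_t next = stack[fp];

        out[count++] = fp;
        /* The caller's frame must lie at a higher index.  Anything else
         * ends the walk: the outermost frame, or a broken chain. */
        if (next <= fp)
            break;
        fp = next;
    }
    return count;
}
\end{verbatim}

The bounds test in the loop matters in practice: a chain that leads outside
the stack, as can happen in code compiled without frame pointers, ends the
walk instead of being followed.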

%----------------------------------------------------------------------

\subsection{Violation handling}
\label{subsec:handling}

If libsafe finds a violation during a safety check, then it performs the
actions in Table~\ref{tab:actions}.

\begin{table}[htbp]
\begin{center}
\caption{Libsafe Actions After Finding a Violation}
\label{tab:actions}
\begin{tabular}{|l||c|c|} \hline
Action  & Default   & Optional? \\ \hline\hline
Terminate process
    & Off/On    & Not optional  \\ \hline
Add an entry to {\tt /var/log/secure} using {\tt syslog()}
    & On        & Optional \\ \hline
Print a warning to {\tt stderr}
    & On        & Not optional \\ \hline
Dump a hexadecimal version of the stack contents to a file
    & Off       & Optional \\ \hline
Send email to a list of recipients
    & Off       & Optional \\ \hline
Produce a core dump by calling {\tt abort()}
    & Off       & Optional \\ \hline
\end{tabular}
\end{center}
\end{table}

The main libsafe action after detecting a violation is to terminate the
process.  Data integrity after a violation cannot be assured, and therefore,
the safest course of action is to terminate the entire process.  However, for
violations of the return address and frame pointer check, libsafe can
optionally allow the process to continue execution.  This exception is based on
the assumption that programmers will almost never (or at least should never)
produce code that attempts to use the {\tt \%n} specifier to overwrite a return
address or frame pointer.  In practice, most occurrences of such attacks result
from processing user input that unexpectedly contains the {\tt \%n} specifier.
In such instances, since the input is garbage, libsafe can usually allow the
process to continue to process the input as long as the {\tt \%n} specifier is
not permitted to write to memory.

%----------------------------------------------------------------------

\subsection{Notes}
\label{subsec:notes}

\begin{enumerate}
\item Libsafe relies on the location of frame pointers on the stack to
    determine the location of stack frames and return addresses.  Some programs
    have been compiled without code to embed frame pointers on the stack (e.g.,
    by using {\tt gcc -fomit-frame-pointer}).  For such code, libsafe will
    automatically detect the absence of frame pointers on the stack and allow
    the program to execute normally.  However, it will not be able to detect
    any exploits against such programs.
\item Libsafe is linked with glibc and is incompatible with libc5.  If you have
    a program that is linked with libc5, you will need to either obtain an
    updated version linked with glibc or recompile the source code yourself
    with glibc.
\end{enumerate}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Software Availability}
\label{sec:software_availability}

Libsafe version 2.0 has not yet been released to the general public.  However,
we intend to release the software under the GNU Lesser General Public License
in the near future.  Please contact Timothy Tsai (ttsai@avaya.com) if
you have any questions or are interested in evaluating the software.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{singlespace}
%\compress
\bibliography{whitepaper-20}
\end{singlespace}

\end{document}
