Member "libsafe-2.0-16/doc/whitepaper-1.3/whitepaper-13.tex" (6 Feb 2001, 34381 Bytes) of package /linux/misc/old/libsafe-2.0-16.tgz:


    1 \documentclass[]{article}
    2 \usepackage{epsfig}
    3 \usepackage{setspace}
    4 \usepackage{fancyheadings}
    5 \usepackage{threeparttable}
    6 \usepackage{graphicx}
    7 \usepackage[lineno5]{lgrind}
    8 \usepackage[hang]{subfigure}
    9 \usepackage{url}
   10 \usepackage{entry}
   11 
   12 \setlength{\oddsidemargin}{0in}
   13 \setlength{\topmargin}{-0.5in}
   14 \setlength{\textheight}{9.0in}
   15 \setlength{\textwidth}{6.5in}
   16 
   17 \pagestyle{fancyplain}
   18 \lhead{}
   19 \chead{White Paper}
   20 \rhead{}
   21 
   22 \newcommand{\compress}{
   23     \parskip 0in
   24     \topsep 0in
   25     \itemsep 0in
   26     \partopsep 0in
   27 }
   28 
   29 \newlength{\figwidth}
   30 \setlength{\figwidth}{\columnwidth}
   31 
   32 \hyphenation{char-ac-ter-is-tics}
   33 
   34 \renewcommand{\textfraction}{.01}
   35 
   36 \singlespacing
   37 
   38 
   39 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
   40 
   41 \begin{document}
   42 
   43 \bibliographystyle{plain}
   44 
   45 
   46 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
   47 % Title page
   48 
   49 \begin{singlespace}
   50 
   51 \title{ Libsafe: Protecting Critical Elements of Stacks }
   52 
   53 \author{
   54     Arash Baratloo, Timothy Tsai, and Navjot Singh \\
   55     Bell Labs, Lucent Technologies \\
   56     600 Mountain Ave \\
   57     Murray Hill, NJ  07974  USA \\
   58     \{arash,ttsai,singh\}@research.bell-labs.com \\
   59     {\tt http://www.bell-labs.com/org/11356/libsafe.html}
   60 }
   61 
   62 \date{December 25, 1999}
   63 
   64 \maketitle
   65 
   66 \end{singlespace}
   67 
   68 
   69 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
   70 % Abstract
   71 
   72 \begin{singlespace}
   73 
   74 \begin{abstract}
   75 
   76 The exploitation of buffer overflow vulnerabilities in process stacks
   77 constitutes a significant portion of security attacks.  We present a new method
   78 to detect and handle such attacks.  In contrast to previous methods, this new
   79 method works with any existing pre-compiled executable and can be used
   80 transparently, even on a system-wide basis.  The method intercepts all calls to
   81 library functions that are known to be vulnerable.  A substitute version of the
   82 corresponding function implements the original functionality, but in a manner
   83 that ensures that any buffer overflows are contained within the current stack
   84 frame.  This method has been implemented on Linux as a dynamically loadable
   85 library called {\em libsafe}.  Libsafe has been shown to detect several known
   86 attacks and can potentially prevent yet unknown attacks.  Experiments indicate
   87 that the performance overhead of libsafe is negligible.
   88 
   89 \end{abstract}
   90 
   91 \end{singlespace}
   92 
   93 
   94 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
   95 
   96 \section{Introduction}
   97 \label{sec:introduction}
   98 
   99 As the Internet has grown, the opportunities for attempts to access remote
  100 systems improperly have increased.  Several security attacks, such as the 1988
  101 Internet Worm~\cite{SSP89*326,Rochlis89,Seeley89}, have even become entrenched
  102 in Internet history.  Some attacks, such as the Internet Worm, merely annoy or
  103 occupy system resources.  However, other attacks are more insidious because
  104 they seize root privileges and modify, corrupt, or steal data.
  105 
  106 \begin{figure}[htbp]
  107 \centerline{
  108     \psfig{figure=cert_absolute.eps,width=.47\textwidth}
  109     \hspace{.2in}
  110     \psfig{figure=cert_percentages.eps,width=.47\textwidth}
  111 }
  112 \caption{Number of Reported CERT Security Advisories Attributable to Buffer
  113     Overflow (Data from \cite{wagner00})}
  114 \label{fig:attack_increase}
  115 \end{figure}
  116 
  117 Perhaps the most common form of attack takes advantage of the buffer overflow
  118 bug.  Figure~\ref{fig:attack_increase} shows the increase in the number of
  119 reported CERT \cite{cert} security advisories that are based on buffer
  120 overflow.  In recent years, attacks that exploit buffer overflow bugs have
  121 accounted for approximately half of all reported CERT advisories.  The buffer
  122 overflow bug may be due to errors in specifying function prototypes or in
  123 implementing functions.  In either case, an inordinately large amount of data
  124 is written to the buffer, thus overflowing it and overwriting the memory
  125 immediately following the end of the buffer.  The overflow injects additional
  126 code into an unsuspecting process and then hijacks control of that process to
  127 execute the injected code.  The hijacking of control is usually accomplished by
  128 overwriting return addresses on the process stack or by overwriting function
  129 pointers in the process memory.  In either case, an instruction that alters the
  130 control flow (such as a return, call, or jump instruction) may inadvertently
  131 transfer execution to the wrong address that points at the injected code
  132 instead of the intended code.
  133 
  134 \begin{table*}[htbp]
  135 \begin{center}
  136 \caption{Partial List of Unsafe Functions in the Standard C Library}
  137 \label{table:unsafe-functions}
  138 \begin{tabular}{|l|l|} \hline
  139 Function prototype  & Potential problem \\ \hline \hline
  140 
  141 {\tt strcpy(char *dest, const char *src)} & May overflow the {\tt dest}
  142 buffer. \\
  143 
  144 {\tt strcat(char *dest, const char *src)} & May overflow the {\tt dest}
  145 buffer. \\
  146 
  147 {\tt getwd(char *buf)} & May overflow the {\tt buf} buffer. \\
  148 
  149 {\tt gets(char *s)} & May overflow the {\tt s} buffer. \\
  150 
  151 {\tt fscanf(FILE *stream, const char *format, ...)} & May overflow its
  152 arguments. \\
  153 
  154 {\tt scanf(const char *format, ...)} & May overflow its
  155 arguments. \\
  156 
  157 {\tt realpath(char *path, char resolved\_path[])} & May overflow the
  158 {\tt resolved\_path} buffer. \\
  159 
  160 {\tt sprintf(char *str, const char *format, ...)} & May overflow the
  161 {\tt str} buffer. \\
  162 
  163 \hline
  164 \end{tabular}
  165 \end{center}
  166 \end{table*}
  167 
  168 Programs written in C have always been plagued with buffer overflows.  Two
  169 factors contribute to this problem.  First, the C programming language does not
  170 automatically bounds-check array and pointer references.  Second, and more
  171 importantly, many of the functions provided by the standard C library, such as
  172 those listed in Table~\ref{table:unsafe-functions}, are unsafe.  Therefore, it
  173 is up to the programmers to check explicitly that the use of these functions
  174 cannot overflow buffers.  However, programmers often omit these checks.
  175 Consequently, many programs are plagued with buffer overflows, which makes them
  176 vulnerable to security attacks.
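
The explicit check mentioned above might look like the following minimal
sketch in C.  It assumes a C99 library that provides {\tt snprintf()}; the
helper name {\tt format\_greeting} and the format string are hypothetical and
serve only as an illustration.

\begin{verbatim}
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a greeting into dest without writing
 * more than dest_size bytes.  Returns 0 on success and -1 if the
 * output was truncated or a formatting error occurred. */
int format_greeting(char *dest, size_t dest_size, const char *name)
{
    /* snprintf() returns the length the full output would have had,
     * not counting the terminating null character. */
    int n = snprintf(dest, dest_size, "hello, %s", name);

    if (n < 0 || (size_t)n >= dest_size) {
        return -1;          /* error, or output did not fit */
    }
    return 0;
}
\end{verbatim}

Unlike {\tt sprintf()}, this call never writes more than {\tt dest\_size}
bytes, and the return value lets the caller detect truncation instead of
silently overflowing the buffer.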
  177 
  178 Preventing buffer overflows is clearly desirable.  If one did not have
  179 access to a C program's source code, the general problem of
  180 automatically bounds-checking array and pointer references is very
  181 difficult, if not impossible.  So at first, it might seem natural to
  182 dismiss any attempts to perform automatic bounds checking at runtime
  183 when one does not have access to the source code.  One of the
  184 contributions of this paper is to demonstrate that by leveraging some
  185 information that is available only at runtime, together with
  186 context-specific security knowledge, one can automatically prevent
  187 security attacks that exploit unsafe functions to overflow stack
  188 buffers.  Such an exploit is illustrated in the following example.
  189 
  190 
  191 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  192 
  193 \section{Buffer Overflow Exploit}
  194 \label{sec:exploit}
  195 
  196 The most general form of security attack achieves two goals:
  197 \begin{enumerate}
  198 \compress
  199 \item
  200 Inject the attack code, which is typically a small sequence of
  201 instructions that spawns a shell, into a running process.
  202 \item
  203 Change the execution path of the running process to execute the attack code.
  204 \end{enumerate}
  205 It is important to note that these two goals are mutually
  206 dependent: injecting attack code without the ability to execute it is
  207 not a security vulnerability.
  208 
  209 By far, the most popular form of buffer overflow exploitation is to attack
  210 buffers on the stack, referred to as the {\em stack smashing attack}.  As is
  211 discussed below, the reason for this popularity is that overflowing stack
  212 buffers can achieve {\em both goals simultaneously}.  Another form of buffer
  213 overflow attack, known as the {\em heap smashing attack}, targets buffers
  214 residing on the heap (a similar attack involves buffers residing in data
  215 space).  Heap smashing attacks are much harder to exploit, simply because it is
  216 difficult to change the execution path of a running process by overflowing heap
  217 buffers.  For this reason, heap smashing attacks are far less prevalent.
  218 
  219 \begin{figure*}
  220 {\parbox {\figwidth}{\lgrindfile{t1.tex}}}
  221 \caption{A Sample Program to Demonstrate a Stack Smashing Attack}
  222 \label{fig:sample-exploit}
  223 \end{figure*}
  224 
  225 \begin{figure*}[htbp]
  226 \centering
  227 \subfigure[before the attack\label{fig:stack-smashinga}]{\includegraphics*[height=3.2in]{exploit1.eps}}
  228 \subfigure[after injecting the attack code\label{fig:stack-smashingb}]{\includegraphics*[height=3.2in]{exploit2.eps}}
  229 \subfigure[executing the attack code\label{fig:stack-smashingc}]{\includegraphics*[height=3.2in]{exploit3.eps}}
  230 \caption{Buffer Overflow on Process Stack}
  231 \label{fig:stack-smashing}
  232 \end{figure*}
  233 
  234 A complete C program to demonstrate the stack smashing attack is shown
  235 in Figure~\ref{fig:sample-exploit}.  Figure~\ref{fig:stack-smashing}
  236 illustrates the address space of a process undergoing this attack.
  237 The process stack after executing the initialization code and entering
  238 the {\tt main()} function (but before executing any of the
  239 instructions) is illustrated in Figure~\ref{fig:stack-smashinga}.
  240 Notice the structure of the top stack frame (i.e., the stack frame for
  241 {\tt main()}).  This stack frame contains, in order, the function
  242 parameters, the return address of the calling function, the previous
  243 frame pointer, and finally the stack variable {\tt buffer}.  Looking
  244 at the sample program in Figure~\ref{fig:sample-exploit}, a sequence
  245 of instructions for spawning a shell is stored in a string variable
  246 called {\tt shellcode} (lines 3-6).  The two {\tt for} loops in the
  247 {\tt main} function prepare the attack code by writing two sequences
  248 of bytes to {\tt large\_string}: the {\tt for} loop starting on
  249 line~16 writes the (future) starting address of the attack code; then
  250 the {\tt for} loop starting on line~18 copies the attack code
  251 (excluding the terminating null character).  The stack is smashed on
  252 line~20 by the {\tt strcpy()} function.
  253 Figure~\ref{fig:stack-smashingb} depicts the process' stack space
  254 after executing the {\tt strcpy()} call.  Notice how the unsafe use of
  255 {\tt strcpy()} simultaneously achieves both requirements of the stack
  256 smashing attack: (1) it injects the attack code by writing it on the
  257 process' stack space, and (2) by overwriting the return address with
  258 the address of the attack code, it instruments the stack to alter the
  259 execution path.  The attack completes once the {\tt return} statement
  260 on line~21 is executed: the instruction pointer ``jumps'' and starts
  261 executing the attack code.  This step is illustrated in
  262 Figure~\ref{fig:stack-smashingc}.
  263 
  264 In a real security attack, the attack code would normally come from an
  265 environment variable, user input, or even worse, from a network connection.  A
  266 successful attack on a privileged process would give the attacker an
  267 interactive shell with the user-ID of {\tt root}!
  268 
  269 
  270 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  271 
  272 \section{Related Work}
  273 \label{sec:related_work}
  274 
  275 The Internet Worm that infected tens of thousands of hosts in 1988 was
  276 one of the first well-known buffer overflow attacks, although there
  277 is some anecdotal evidence that buffer overflow attacks date back to
  278 the 1960's~\cite{cowan99}.  In particular, the Internet Worm exploited
  279 a buffer overflow vulnerability of the finger daemon.  The proportion
  280 of attacks based on buffer overflows is increasing each year---in
  281 recent years, buffer overflow attacks have become the most widely used
  282 type of security attack~\cite{wagner00}.  Among such attacks, the
  283 stack smashing attack is the most popular
  284 form~\cite{Instenes:1997:SSW,thomas99}.
  285 
  286 The majority of buffer overflow attacks, including the one used by the
  287 Internet Worm, are based on the stack smashing attack.  Detailed descriptions of
  288 stack smashing attacks are presented in~\cite{smith97,thomas99}, and
  289 cook-book-like recipes are presented in~\cite{Mudge95,aleph198,dildog}.
  290 
  291 Researchers in the areas of operating systems, static code analyzers
  292 and compilers, and run-time middleware systems have proposed solutions
  293 to thwart stack smashing attacks.  In most operating
  294 systems the stack region is marked as executable, which means that
  295 code located in the stack memory can be executed.  Because this
  296 ``feature'' is used by stack smashing attacks, making the stack
  297 non-executable is a commonly proposed method for preventing overflow
  298 attacks.  A kernel patch removing the stack execution permission has
  299 been made available~\cite{openwall}.  This approach, however, has some
  300 drawbacks.  First, patching and recompiling the kernel is not feasible
  301 for everyone.  Second, {\em nested function calls} or {\em trampoline
  302 functions}, which are used extensively by LISP interpreters and
  303 Objective C compilers, and the most common implementation of signal
  304 handler returns on Unix (as well as Linux), rely on an executable
  305 stack to work properly.  And finally, an alternative attack on stacks
  306 known as {\em return-into-libc}, which directs the program control
  307 into code located in shared libraries, cannot be prevented by making
  308 the stack non-executable~\cite{woj98}.  For these reasons, Linus
  309 Torvalds has consistently refused to incorporate this change into the
  310 Linux kernel~\cite{linux98a}.
  311 
  312 Snarskii has developed a custom implementation of the standard C
  313 library for FreeBSD~\cite{snarskii97}.  Similar to libsafe, this
  314 library targets the set of unsafe functions, and inspects the process
  315 stack to detect buffer overflows that write across frame pointers.  In
  316 contrast to libsafe, this is a custom implementation and replaces the
  317 standard C library.
  318 
  319 Several commonly used tools, such as Lint~\cite{lint78}, and those
  320 proposed in~\cite{Evans96} use compile-time analysis to detect common
  321 programming errors.  Existing compilers have also been augmented to
  322 perform bounds-checking~\cite{gcc-extensions}.  These projects have
  323 demonstrated limited success in preventing the general buffer
  324 overflow problem.  Wagner {\em et al.\/} have recently proposed the
  325 use of compile-time range analysis to ensure the ``safe'' use of C
  326 library functions~\cite{wagner00}.  Similar to our libsafe method,
  327 this project specifically concentrates on the set of unsafe library
  328 functions.  However, unlike our approach, this method requires source
  329 code, which is not always available, and may produce false positives:
  330 a correct program may produce warning or error messages.
  331 
  332 StackGuard~\cite{stackguard98} is another compiler extension that
  333 instruments the generated code with stack-bounds checks.
  334 Specifically, on function entry, a {\em canary} is placed near the
  335 caller's return address on the stack.  Before the function returns to
  336 the caller, the validity of this canary is checked and the program is
  337 terminated if a discrepancy is detected.  This approach works on the
  338 assumption that if the return address is tampered with (due to buffer
  339 overflows), the canary will also be modified, thus causing validation
  340 of the canary to fail.  With the exception of a few programs, this
  341 approach has been shown to be effective.  In contrast to libsafe,
  342 StackGuard introduces a noticeable run-time overhead.  Furthermore,
  343 StackGuard requires source code access, and there are some programs,
  344 such as Netscape Navigator, Adobe Acrobat Reader, and Star Office,
  345 that it does not currently support.
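
The canary check can be illustrated with a small simulation.  The sketch
below is not StackGuard's code: it models a stack frame as a byte array with
assumed offsets, and every name in it ({\tt sim\_frame}, {\tt CANARY\_VALUE},
and the three functions) is hypothetical.

\begin{verbatim}
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CANARY_VALUE 0xDEADBEEFu  /* arbitrary illustrative constant */

/* Assumed layout of the simulated frame, from low to high address:
 * a 16-byte buffer, a 4-byte canary, a 4-byte return address. */
enum { BUF_SIZE = 16, CANARY_OFF = 16, RET_OFF = 20, FRAME_SIZE = 24 };

struct sim_frame { unsigned char bytes[FRAME_SIZE]; };

/* Function entry: store the canary next to the return address. */
void frame_enter(struct sim_frame *f, uint32_t ret)
{
    uint32_t canary = CANARY_VALUE;

    memset(f->bytes, 0, FRAME_SIZE);
    memcpy(f->bytes + CANARY_OFF, &canary, sizeof canary);
    memcpy(f->bytes + RET_OFF, &ret, sizeof ret);
}

/* Models an unchecked write that starts at the buffer.  The length is
 * clamped to the frame so the simulation itself stays in bounds. */
void frame_unsafe_write(struct sim_frame *f, const void *data, size_t n)
{
    if (n > FRAME_SIZE) {
        n = FRAME_SIZE;
    }
    memcpy(f->bytes, data, n);
}

/* Function exit: returns 1 if the canary is unchanged, 0 otherwise. */
int frame_canary_intact(const struct sim_frame *f)
{
    uint32_t canary;

    memcpy(&canary, f->bytes + CANARY_OFF, sizeof canary);
    return canary == CANARY_VALUE;
}
\end{verbatim}

In this model, a write of at most {\tt BUF\_SIZE} bytes leaves the canary
intact, whereas any write long enough to reach the return address must first
pass over the canary and change it.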
  346 
  347 Janus~\cite{goldberg96:secure} is a run-time sand-boxing environment
  348 that confines each application to a set of predefined operations.  It
  349 works on the principle that ``an application can do little harm if its
  350 access to the underlying operating system is appropriately
  351 restricted.''  It relies on the operating system's debugging features,
  352 such as {\tt trace} and {\tt strace}, to observe and to confine a
  353 process to a sand-box.  Similar to our work, this approach works with
  354 existing binary applications and does not require an application's
  355 source code.  However, unlike our approach, Janus does not work with
  356 applications that legitimately need high privileges.  For example, the
  357 Unix {\tt login} process requires a high level of privilege to
  358 execute, but Janus is unable to selectively allow legitimate
  359 privileges while denying unauthorized privileges.  This inherent
  360 limitation prevents Janus from being applied to highly privileged
  361 applications, where secure execution is most critical.
  362 
  363 
  364 
  365 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  366 
  367 \section{Libsafe}
  368 \label{sec:overview}
  369 
  370 This paper presents a novel method for detecting and handling
  371 buffer overflow attacks.  In contrast to previous methods, and without requiring
  372 source code, our method can transparently protect processes against stack
  373 smashing attacks, even on a system-wide basis.  The method intercepts all calls
  374 to library functions that are known to be vulnerable.  A substitute version of
  375 the corresponding function implements the original functionality, but in a
  376 manner that ensures that any buffer overflows are contained within the current
  377 stack frame.
  378 
  379 The key idea is the ability to estimate a safe upper limit on the size of
  380 buffers automatically.  This estimation cannot be performed at compile time
  381 because the size of the buffer may not be known at that time.  Thus, the
  382 calculation of the buffer size must be made after the start of the function in
  383 which the buffer is accessed.  Our method is able to determine the maximum
  384 buffer size by realizing that such local buffers cannot extend beyond the end
  385 of the current stack frame.  This realization allows the substitute version of
  386 the function to limit buffer writes within the estimated buffer size.  Thus,
  387 the return address from that function, which is located on the stack, cannot be
  388 overwritten and control of the process cannot be commandeered.
  389 
  390 \begin{table}[thbp]
  391 \caption{List of Some Known Exploits That Are Detected}
  392 \label{tab:detected_exploits}
  393 \begin{center}
  394 \begin{tabular}{|l|l|l|} \hline
  395 Program Name    & Version   & Description \\ \hline\hline
  396 xlockmore   & 3.10      & Program to lock an X Window display \\ \hline
  397 amd     & 6.0       & Automatic remote file system mount daemon \\
  398 \hline
  399 imapd       & 3.6       & IMAP mail server \\ \hline
  400 elm     & 2.5 PL0pre8   & ELM mail user agent \\ \hline
  401 SuperProbe  & 2.11      & Program to probe for and identify video
  402 hardware \\ \hline
  403 \end{tabular}
  404 \end{center}
  405 \end{table}
  406 
  407 We have implemented the previously described method on Linux as a
  408 dynamically loadable library called {\em libsafe}.  Libsafe has
  409 demonstrated its ability to detect and prevent known security attacks
  410 on several commonly used applications, including those listed in
  411 Table~\ref{tab:detected_exploits}.\footnote{The security attacks are
  412 available from Crv's Security Bugware Page
  413 (\url{http://oliver.efri.hr/~crv/}).}  Libsafe's key benefit,
  414 moreover, is its ability to prevent yet unknown attacks.
  415 
  416 \begin{table}[thbp]
  417 \caption{Summary of Detection Technique Characteristics}
  418 \label{tab:summary}
  419 \begin{threeparttable}
  420 \begin{center}
  421 \begin{tabular}{|l||*{5}{p{.8in}|}} \hline
  422     & \multicolumn{5}{|c|}{Instrumentation Techniques} \\ \cline{2-6}
  423     & None
  424     & Libsafe
  425     & StackGuard
  426     & Janus 
  427     & Non-Executable Stack \\ \hline\hline
  428 
  429 \multicolumn{6}{|c|}{} \\
  430 \multicolumn{6}{|l|}{Effectiveness (what types of errors are handled?)} \\
  431 \hline
  432 Kernel Errors
  433     & No
  434     & No
  435     & Yes
  436     & No 
  437     & Yes \\ \hline
  438 Specification Errors
  439     & No
  440     & Yes
  441     & Yes\tnote{a}
  442     & Maybe\tnote{b} 
  443     & Maybe\tnote{c} \\ \hline
  444 Implementation Errors
  445     & No
  446     & Maybe\tnote{d}
  447     & Yes\tnote{a}
  448     & Maybe\tnote{b} 
  449     & Maybe\tnote{c} \\ \hline
  450 User Code Errors
  451     & No
  452     & No
  453     & Yes
  454     & Maybe\tnote{b} 
  455     & Maybe\tnote{c} \\ \hline
  456 
  457 \multicolumn{6}{|c|}{} \\
  458 \multicolumn{6}{|l|}{Other characteristics} \\ \hline
  459 Performance Overhead
  460     & None
  461     & Very low
  462     & Medium
  463     & Medium 
  464     & None \\ \hline
  465 Disk Usage Overhead
  466     & None
  467     & Very low
  468     & Low
  469     & Very low 
  470     & None \\ \hline
  471 Source Code Needed
  472     & No
  473     & No
  474     & Yes
  475     & No 
  476     & No \\ \hline
  477 Ease of Use
  478     & ---
  479     & Very easy
  480     & Easy\tnote{e}
  481     & Easy-Medium\tnote{f} 
  482     & Easy-Medium\tnote{g} \\ \hline
  483 
  484 \end{tabular}
  485 \begin{tablenotes}
  486 \compress
  487 \item[a] If libraries are instrumented.
  488 \item[b] Cannot catch hijacked privileges that are similar to
  489     legitimate privileges.
  490 \item[c] For certain types of exploits (see Section~\ref{sec:related_work}).
  491 \item[d] If we know which functions have errors.
  492 \item[e] Source code must be recompiled, and the compiler may also need to be
  493     recompiled.
  494 \item[f] Policies need to be written.
  495 \item[g] Kernel may need to be patched and recompiled.
  496 \end{tablenotes}
  497 \end{center}
  498 \end{threeparttable}
  499 \end{table}
  500 
  501 The characteristics of libsafe are shown in Table~\ref{tab:summary} along with the
  502 corresponding characteristics of three alternative methods (StackGuard, Janus, and
  503 the non-executable stack) described earlier in Section~\ref{sec:related_work}.  The first
  504 instrumentation technique labeled ``None'' is presented as a point of
  505 comparison and represents the original program with no modifications.  The
  506 upper half of Table~\ref{tab:summary} describes the types of errors that each
  507 method is able to handle.  Specification and implementation errors refer
  508 specifically to errors in standard library functions as described in the
  509 introductory section.  Kernel errors and user code errors refer to
  510 implementation errors in kernel code and user code, respectively.  The bottom
  511 half of the table describes other characteristics.  The performance overhead
  512 includes only the run-time overhead.  Time spent during configuration and
  513 compilation is not included.  The disk usage overhead is the extra disk space
  514 required due to additional shared libraries, increased executable binary file
  515 sizes, and configuration files.  The next to last row indicates whether source
  516 code is needed for that method.  The ease of use considers the complexity and
  517 time requirement of human efforts needed for configuration and compilation.
  518 
  519 
  520 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  521 
  522 \section{Implementation}
  523 \label{sec:implementation}
  524 
  525 The fundamental observations forming the basis of the libsafe library are the
  526 following:
  527 \begin{itemize}
  528 \compress
  529 \item
  530 Overflowing a stack variable---that is, injecting the attack code into a
  531 running process---does not necessarily lead to a successful stack smashing
  532 attack.  The attack must also divert the execution sequence of a process to run
  533 the attack code.
  534 \item
  535 Although buffer overflows cannot be stopped in general, automatic and
  536 transparent run-time mechanisms can prevent the overflow from corrupting a
  537 return address and altering the control flow of a process.
  538 \end{itemize}
  539 
  540 Refer to Figure~\ref{fig:stack-smashinga} for an example.  The {\tt
  541 strcpy()} function cannot determine the exact size of the destination
  542 variable {\tt buffer}.  At the time {\tt strcpy()} is called, the
  543 frame pointer (i.e., {\tt ebp} register in the Intel Architecture)
  544 will be pointing to a memory location containing the previous frame's
  545 frame pointer.  Furthermore, this memory address separates the stack
  546 variables (local to the current function) from the function arguments.
  547 Continuing with the example of Figure~\ref{fig:stack-smashinga}, the
  548 size of {\tt buffer} and all other stack variables residing on the top
  549 frame cannot extend beyond the frame pointer---this is a safe upper
  550 limit.  The size of variables residing on previous stack
  551 frames---below the top frame---can be bounded by traversing frame
  552 pointers to determine the stack frame locations and sizes for those
  553 variables.  A correct C program should never explicitly modify any
  554 stored frame pointers, nor should it explicitly modify any return
  555 addresses (located next to the frame pointers).  We use this knowledge
  556 to detect and limit stack buffer overflows.  As a result, the attack
  557 executed by calling {\tt strcpy()} can be detected and terminated
  558 before the return address is corrupted (as in
  559 Figure~\ref{fig:stack-smashingb}).
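
The frame pointer traversal described above can be sketched as a portable
simulation.  This is not libsafe's implementation, which reads the real frame
pointer register; here the stack is modeled as an array of words indexed by
address, and the names {\tt max\_write\_len} and {\tt NO\_FRAME} are
assumptions made for illustration.

\begin{verbatim}
#include <stddef.h>

#define NO_FRAME ((size_t)-1)  /* marks the outermost frame in this model */

/* stack[i] models the word at stack address i.  At a frame pointer
 * location the word holds the caller's frame pointer, which lies at a
 * higher address because the stack grows downward.  fp is the current
 * (innermost) frame pointer and addr is the start of a stack buffer.
 * Returns the number of words that may be written at addr before the
 * next saved frame pointer is reached, or 0 if addr does not lie
 * below any frame pointer in the chain. */
size_t max_write_len(const size_t *stack, size_t stack_words,
                     size_t fp, size_t addr)
{
    while (fp != NO_FRAME && fp < stack_words) {
        size_t next;

        if (addr < fp) {
            return fp - addr;   /* safe upper limit for this buffer */
        }
        next = stack[fp];       /* follow the chain to the caller */
        if (next != NO_FRAME && next <= fp) {
            break;              /* chain is corrupt: stop walking */
        }
        fp = next;
    }
    return 0;
}
\end{verbatim}

For a buffer in the top frame the loop returns on its first iteration; for a
buffer in an earlier frame it follows saved frame pointers until it finds the
first one located above the buffer.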
  560 
  561 Libsafe implements the above technique.  It is implemented as a dynamically
  562 loadable library that is preloaded with every process it needs to protect.  The
  563 preloading injects the libsafe library between the program code and the
  564 dynamically loadable standard C library functions.  The library can then
  565 intercept and bounds-check the arguments before allowing the standard C library
  566 functions to execute.  In particular, it intercepts the unsafe functions listed
  567 in Table~\ref{table:unsafe-functions} to provide the following guarantees:
  568 \begin{itemize}
  569 \compress
  570 \item Correct programs will execute correctly, i.e., no false positives.
  571 \item The frame pointers, and more importantly return addresses, can never be
  572     overwritten by an intercepted function.  In most cases, an overflow
  573     that leads to overwriting the return address can be detected.
  574 \end{itemize}
  575 
  576 \begin{figure}[tbp]
  577 \centerline{\psfig{figure=inter1.eps,height=3.in}}
  578 \caption{Libsafe Containment of Buffer Overflow}
  579 \label{fig:intercept}
  580 \end{figure}
  581 
  582 Figure~\ref{fig:intercept} illustrates the memory of a process that
  583 has been linked with the libsafe library, and in particular, it shows
  584 the new implementation of {\tt strcpy()} in the libsafe library.  Once
  585 the program invokes {\tt strcpy()}, the version implemented in the
  586 libsafe library gets executed---this is due to the order in which the
  587 libraries were loaded.  The libsafe implementation of the {\tt
  588 strcpy()} function first computes the length of the source string and
  589 the upper bound on the size of the destination buffer (as explained
  590 above).  It then verifies that the length of the source string is less
  591 than the bound on the destination buffer.  If the verification
  592 succeeds, then {\tt strcpy()} calls {\tt memcpy()} (implemented in
  593 the standard C library) to perform the operation.  However, if the
  594 verification fails, {\tt strcpy()} creates a {\tt syslog} entry and
  595 terminates the program.  A similar approach is applied to the other
  596 unsafe functions in the standard C library.
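
The check performed by the substitute {\tt strcpy()} can be sketched as
follows.  This is a simplified stand-in rather than libsafe's source: the
function name {\tt bounded\_strcpy} is hypothetical, the bound is passed in
by the caller instead of being derived from the frame pointer, and a failed
check is reported through the return value instead of a {\tt syslog} entry
and program termination.

\begin{verbatim}
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the substitute strcpy() logic.  bound is
 * the safe upper limit on the size of dest, in bytes.  Returns dest on
 * success, or NULL (leaving dest untouched) when the source string is
 * not shorter than the bound. */
char *bounded_strcpy(char *dest, const char *src, size_t bound)
{
    size_t len = strlen(src);    /* length of the source string */

    if (len >= bound) {
        return NULL;             /* verification failed: do not copy */
    }
    memcpy(dest, src, len + 1);  /* len + 1 also copies the '\0' */
    return dest;
}
\end{verbatim}

Because the length is verified before any byte is written, a rejected copy
leaves the destination, and everything located after it, unmodified.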
  597 
  598 The libsafe library has been implemented on Linux.  It uses the
  599 preload feature of dynamically loadable ELF libraries to automatically
  600 and transparently load with processes it needs to protect.  In
  601 essence, it can be used in one of two ways: (1) by defining the
  602 environment variable {\tt LD\_PRELOAD}, or (2) by listing the library
  603 in {\tt /etc/ld.so.preload}.  The former approach allows per-process
  604 control, whereas the latter approach automatically loads the libsafe
  605 library machine-wide.
  606 
  607 The libsafe library does not use any Linux-specific feature of ELF; these ELF
  608 features are available for many other versions of Unix such as Solaris, and
  609 have been used for other purposes~\cite{Alexandrov:1997:EOS,zlibc}.
  610 Furthermore, an alternative technique with a similar feature can be used for
  611 Windows NT~\cite{mediating_connectors,sosp93*80}.
  612 
  613 We have installed the libsafe library on a Linux machine.  The library is
  614 automatically loaded with every process and transparently protects each process
  615 from stack smashing attacks.  The protected applications include daemon
  616 processes such as Apache HTTP server, sendmail, and NFS server, as well as
  617 those started by users such as XFree86 server, Enlightenment window manager,
  618 GNU Emacs, Netscape Navigator, and Adobe Acrobat Reader.  We have used this
  619 machine for over a week and found the machine to be stable and running without
  620 a noticeable performance hit.


\section{Performance}
\label{sec:performance}

The libsafe library is effective in detecting and preventing stack smashing
attacks.  Extra code is needed to perform this detection, and that extra code
incurs a performance overhead.  In this section we quantify the performance
overhead associated with use of the libsafe library.
Section~\ref{sec:kernel_tests} describes the overheads associated with
synthetic kernel programs to illustrate the range of possible overheads.
Section~\ref{sec:applicaton_tests} gives performance data for a selected set of
actual applications.

All experiments were conducted on a 400 MHz Pentium II machine with 128 MB of
memory running Red Hat Linux version 6.0.  Libsafe and all programs in
Sections~\ref{sec:kernel_tests} and~\ref{sec:applicaton_tests} were compiled
(and optimized using {\tt -O2}) with the GCC compiler, version 2.91.66.

%------------------------------------------------------------------------------

\subsection{Kernel Tests}
\label{sec:kernel_tests}

The first time each libsafe function is activated, its initialization makes a
{\tt dlsym()} call for each libc function that the libsafe function calls.
Because the libsafe function has the same name as the corresponding libc
version, the {\tt dlsym()} call is needed to obtain a pointer to the libc
function.  Each {\tt dlsym()} call requires 1.26~$\mu$s.
The interception and redirection of a C library function consists of an
additional user-level function call, which adds approximately 0.04~$\mu$s of
overhead.
% ---certainly an acceptable result.
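
The one-time lookup described above can be sketched as follows.  This is
a minimal illustration rather than the libsafe source: the wrapper name
{\tt wrapped\_strcpy} and the use of {\tt RTLD\_NEXT} are assumptions,
and the bounds check itself is omitted.

```c
#define _GNU_SOURCE          /* exposes RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stddef.h>

#ifndef RTLD_NEXT
/* Assumption: fallback using glibc's value, in case the feature macro
 * above was defined after the system headers were first included. */
#define RTLD_NEXT ((void *) -1L)
#endif

typedef char *(*strcpy_fn)(char *, const char *);

/* Hypothetical wrapper name; an interposing library would export this
 * function under the name strcpy instead. */
char *wrapped_strcpy(char *dest, const char *src)
{
    static strcpy_fn real_strcpy = NULL;

    if (real_strcpy == NULL) {
        /* One-time cost: find the next definition of strcpy in the
         * library search order, i.e., the C library's version. */
        real_strcpy = (strcpy_fn)dlsym(RTLD_NEXT, "strcpy");
        if (real_strcpy == NULL)
            return NULL;     /* lookup failed */
    }

    /* A real interposer would perform its bounds check here. */
    return real_strcpy(dest, src);
}
```

Caching the pointer in a static variable means that the {\tt dlsym()}
cost is paid only on the first call; later calls pay only for the extra
function call.  On some systems the program must be linked with
{\tt -ldl}.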

\begin{figure}[htbp]
\centerline{\psfig{figure=kernel_performance.eps,width=5.5in}}
\caption{Performance of Libsafe Functions}
\label{fig:kernel_performance}
\end{figure}

To quantify the performance overhead of the libsafe library we measure
the execution times of five unsafe C library functions and compare the
results with our ``safe'' versions.  The results are depicted in
Figure~\ref{fig:kernel_performance}.  Reported times are ``wall
clock'' elapsed times as reported by {\tt gettimeofday()}.  An
interesting observation is that the libsafe versions of several
functions outperform the original versions.  This behavior is
repeatable, and we have observed consistent findings on different
machines and operating system versions.  This effect is due both to
low-level optimizations and to the fact that libsafe's implementation
of most functions differs from that of the C library.  For example,
consider the performance of the {\tt getwd()} and {\tt sprintf()}
functions.  Our libsafe library replaces these functions with
equivalent safe versions.  In particular, {\tt getwd()} is replaced
with {\tt getcwd()} and {\tt sprintf()} is replaced with {\tt
snprintf()}; on Linux, the safe versions execute faster.
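
A wall-clock measurement of this kind can be sketched as follows.  The
harness below is an assumption about how such a test might be written,
not the code used for the reported results; the function names and the
averaging over many iterations are illustrative.

```c
#include <stddef.h>
#include <string.h>
#include <sys/time.h>

/* Elapsed wall-clock time between two samples, in microseconds. */
double elapsed_us(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1e6
         + (end->tv_usec - start->tv_usec);
}

/* Hypothetical harness: mean wall-clock time of one strcpy() call,
 * averaged over `iterations` calls (which must be positive) so that
 * the timer resolution does not dominate.  dest must have room for
 * strlen(src) + 1 bytes.  An optimizing compiler may simplify the
 * loop, so real measurements need more care than this sketch takes. */
double time_strcpy_us(char *dest, const char *src, long iterations)
{
    struct timeval start, end;
    volatile char sink = 0;

    gettimeofday(&start, NULL);
    for (long i = 0; i < iterations; i++) {
        strcpy(dest, src);
        sink ^= dest[0];     /* keep the copied data observable */
    }
    gettimeofday(&end, NULL);
    (void)sink;

    return elapsed_us(&start, &end) / (double)iterations;
}
```

Running the same harness with and without the preloaded library gives
the two values being compared for a given string length.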

The figure also shows that the libsafe library can slow down the
string operations {\tt strcpy()} and {\tt strcat()} by as much as
0.5~$\mu$s per function call.  However, as the string size increases,
the absolute overhead decreases because the execution time of the safe
versions increases more slowly than that of the unsafe versions.  In
fact, the safe version of {\tt strcat()} used with strings longer than
256 bytes is actually faster than the unsafe version.  This is an
example of how using a different implementation (e.g., using {\tt
memcpy()} to copy a string) can outperform the standard implementation
for certain cases.
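
To make the {\tt memcpy()} remark concrete, the sketch below appends a
string by computing both lengths once and then copying in a single
block.  It is an illustration only: the name {\tt bounded\_strcat} is
hypothetical, the size limit is passed in by the caller rather than
derived from the stack frame, and an overflow is reported by returning
{\tt NULL} rather than by terminating the program.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to dest, where dest_size is the total
 * number of bytes available at dest.  Returns dest, or NULL if dest is
 * not NUL-terminated within dest_size bytes or if the result (including
 * the terminating NUL) would not fit. */
char *bounded_strcat(char *dest, const char *src, size_t dest_size)
{
    const char *dest_end = memchr(dest, '\0', dest_size);
    size_t dest_len, src_len;

    if (dest_end == NULL)
        return NULL;
    dest_len = (size_t)(dest_end - dest);
    src_len = strlen(src);

    /* Need src_len + 1 bytes in the dest_size - dest_len that remain. */
    if (src_len >= dest_size - dest_len)
        return NULL;

    /* One block copy, including the terminating NUL. */
    memcpy(dest + dest_len, src, src_len + 1);
    return dest;
}
```

Because the lengths are known before the copy starts, the copy itself
can use whatever block-copy optimizations {\tt memcpy()} provides.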

The slowdown effect of {\tt strcpy()} is observed in the {\tt realpath()}
experiment.  When a program calls {\tt realpath()}, the libsafe library calls
{\tt realpath()} but stores the result in a buffer in its own memory region.
It then uses {\tt strcpy()} to copy the result to the final destination.  As
Figure~\ref{fig:kernel_performance} shows, the slowdown effect of {\tt
strcpy()} on {\tt realpath()} is less than 0.05~$\mu$s.
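
The two-step structure described above can be sketched as follows.  This
is a simplified illustration under stated assumptions: the name
{\tt bounded\_realpath} is hypothetical, the destination size is passed
in by the caller, and a result that does not fit is reported by
returning {\tt NULL}.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096   /* assumption: fallback if the system omits it */
#endif

/* Hypothetical helper: resolve path into a private scratch buffer first,
 * then copy the result to the caller's buffer only if it fits.
 * Returns resolved on success, NULL on failure. */
char *bounded_realpath(const char *path, char *resolved, size_t resolved_size)
{
    char scratch[PATH_MAX];

    if (realpath(path, scratch) == NULL)
        return NULL;                     /* resolution failed */
    if (strlen(scratch) >= resolved_size)
        return NULL;                     /* would overflow the destination */

    strcpy(resolved, scratch);           /* the extra copy measured above */
    return resolved;
}
```

The extra {\tt strcpy()} is the only additional work on the success
path, which is consistent with the small overhead reported for
{\tt realpath()}.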

%------------------------------------------------------------------------------

\subsection{Application Tests}
\label{sec:applicaton_tests}

We used four real-world applications to illustrate the performance
overhead associated with libsafe.  The applications are {\tt
quicksort} (a CPU-bound program), {\tt imapd} (a network-bound
program), {\tt tar} (an I/O-bound program), and {\tt xv} (a CPU- and
video-bound program).  Figure~\ref{fig:application_performance} shows
the execution time for each of these applications using (1) the
original libc (i.e., without libsafe), (2) the libsafe method, and (3)
StackGuard.  The execution times are based on 100 runs and are given
in seconds, with associated 95\% confidence intervals.  Reported times
are ``wall clock'' elapsed times as reported by {\tt /bin/time}.
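
The text does not state how the confidence intervals were computed.  One
standard choice, given here only as an assumption, is the Student-$t$
interval for the mean of $n$ independent runs with execution times
$x_1,\ldots,x_n$:
\[
\bar{x} \pm t_{0.975,\,n-1}\,\frac{s}{\sqrt{n}},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2 .
\]
For $n = 100$ runs, $t_{0.975,\,99} \approx 1.98$, so the half-width of
the interval is roughly $0.2\,s$.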

\begin{figure}[htbp]
\centerline{\psfig{figure=application_performance.eps,width=5.5in}}
\caption{Mean Execution Times (with 95\% confidence intervals) of Sample
    Applications}
\label{fig:application_performance}
\end{figure}

Figure~\ref{fig:application_performance} shows that the overheads
associated with all detection methods are reasonable.  Libsafe is the
most efficient method because only the unsafe library functions are
intercepted.  The overall application test results are encouraging.
We have installed and used libsafe on one of our own machines, and in
practice we have found that this overhead is not noticeable.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Conclusions}
\label{sec:conclusions}

We have described a new method for preventing stack smashing attacks that rely
on corrupting the return address, and have implemented this method as a
dynamically loaded library called libsafe.  The libsafe library instruments a
small set of library functions that are known to be vulnerable to buffer
overflows.

An interesting finding is the performance of libsafe.  We anticipated
a low performance overhead at the outset of this project, and we were
happily surprised to find how little this overhead is in practice.
Because of low-level optimizations and because libsafe's
implementation of most functions differs from that of the C library,
for some applications we actually observed a speedup.  This is
encouraging since it indicates the viability of this approach.
Furthermore, the elegance and simplicity of instrumenting the standard
C library lead to a stable implementation.

We believe that the stability, minimal performance overhead, and ease of
implementation (i.e., no modification or recompilation of source code) of
libsafe make it an attractive first line of defense against stack smashing
attacks.  We have demonstrated its effectiveness by testing it against several
known buffer overflow attacks, but its real benefit, we believe, is its ability
to prevent as yet unknown attacks.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{singlespace}
\compress
\bibliography{whitepaper-13}
\end{singlespace}

\end{document}

% LocalWords: LocalWords
% LocalWords: Libsafe libsafe Torvalds
% LocalWords: NJ ttsai singh
% LocalWords: CERT Advisories trampoline Netscape StackGuard IMAP
% LocalWords: strcpy strcat preload
% LocalWords: eps shellcode dlsym libc