% Source file: libsafe-2.0-16/doc/whitepaper-2.0/whitepaper-20.tex (19 Apr 2001), from package libsafe-2.0-16.tgz
\documentclass[]{article}
\usepackage{epsfig}
\usepackage{setspace}
\usepackage{fancyheadings}
%\usepackage{threeparttable}
%\usepackage{graphicx}
%\usepackage[lineno5]{lgrind}
%\usepackage[hang]{subfigure}
\usepackage{url}
\usepackage{entry}

\setlength{\oddsidemargin}{0in}
\setlength{\topmargin}{-0.5in}
\setlength{\textheight}{9.0in}
\setlength{\textwidth}{6.5in}

\pagestyle{fancyplain}
\lhead{}
\chead{
    libsafe-2.0 White Paper \\
    {\bf VERSION 3-21-01}
}
\rhead{}

\newcommand{\compress}{
    \parskip 0in
    \topsep 0in
    \itemsep 0in
    \partopsep 0in
}

\newlength{\figwidth}
\setlength{\figwidth}{\columnwidth}

\singlespacing


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}

\bibliographystyle{plain}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Title page

\begin{singlespace}

\title{ Libsafe 2.0: Detection of Format String Vulnerability Exploits }

\author{
    Timothy Tsai and Navjot Singh \\
    Avaya Labs, Avaya Inc. \\
    600 Mountain Ave \\
    Murray Hill, NJ 07974 USA \\
    \{ttsai,singh\}@avaya.com \\
    %{\tt http://www.research.avayalabs.com/project/libsafe.html}
}

\date{February 6, 2001}

\maketitle

\end{singlespace}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Abstract

\begin{singlespace}

\begin{abstract}

This white paper describes a significant new feature of libsafe version 2.0:
the ability to detect and handle format string vulnerability exploits. Such
exploits have recently garnered attention in security advisories, discussion
lists, web sites devoted to security, and even conventional media such as
television and newspapers. Examples of vulnerable software include {\tt
wu-ftpd} (a common FTP daemon) and {\tt bind} (a DNS [Domain Name System]
server). This paper describes the vulnerability and the technique libsafe uses
to detect and handle exploits.

\begin{Ventry}{NOTE}
    \item[NOTE]
    This paper describes only one feature of libsafe version 2.0:
    the ability to detect and handle format string vulnerability exploits.
    Other features include support for code compiled without frame pointer
    instructions, extra debugging facilities, and bug fixes. See \cite{usenix}
    for details of the original version of libsafe.
\end{Ventry}

\end{abstract}

\end{singlespace}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Introduction}
\label{sec:introduction}

Buffer overflow exploits constitute perhaps the most common form of computer
security attack~\cite{SSP89*326,Rochlis89,Seeley89}. Such exploits take
advantage of programming errors to overflow buffers, thus writing unintended
data to the part of memory that immediately follows the targeted buffers. If
the targeted buffer resides on the process stack, the exploit typically
attempts to overwrite a return address on the stack; when the victim process
is privileged, a successful attack often gives the attacker root access to the
machine. The original version of libsafe,
version 1.3~\cite{usenix}, presented a significant advance in the detection and
handling of buffer overflow attacks by offering a solution that detects a large
number of exploits with low overhead and tremendous ease of
use\footnote{Libsafe requires no specific security expertise and can be
installed in under one minute!}.

Recently, another widespread vulnerability has received a great deal of
attention: the format string vulnerability~\cite{bind_report,wuftpd_report}.
The latest version of libsafe, version 2.0, implements a solution for detecting
and handling the most dangerous format string vulnerability exploits, while
preserving the low overhead and ease of use of the original libsafe.

The most common source of this vulnerability is the ubiquitous {\tt printf()}
function. Consider the following vulnerable piece of code:

\begin{verbatim}
printf("%x %x %x %x\n");
\end{verbatim}

The above code will usually compile with no warnings\footnote{For {\tt gcc},
warnings are produced with the {\tt -Wall} option, but not with the default
warning level.}, even though it obviously lacks the required number of
arguments. If this code is executed, it will print out four hexadecimal
numbers, corresponding to the values on the stack where it expects the missing
arguments to be present. This allows an attacker to examine the contents of
the stack.

The following code illustrates an even more insidious form of the format string
vulnerability:

\begin{verbatim}
printf("%.*d%n\n", (int) start_attack_code, 0, return_addr_ptr);
\end{verbatim}

The above example takes advantage of a relatively seldom used {\tt printf()}
specifier: {\tt \%n}. This specifier causes {\tt printf()} to write the number
of characters it has produced so far to the memory location indicated by the
corresponding pointer in the argument list. In our example, the pointer is
{\tt return\_addr\_ptr}. The astute
observer may realize at this point that a malicious attacker can potentially
overwrite any memory location, including locations containing return addresses.
Furthermore, the above form of the {\tt printf()} statement controls the exact
number that is written to the memory location: {\tt \%.*d} takes its precision
from the argument list and therefore emits exactly that many digits before
{\tt \%n} is processed. Our example thus writes the value
{\tt start\_attack\_code} to the location {\tt return\_addr\_ptr}. Assuming
that {\tt start\_attack\_code} is the starting address for some attack code,
the next return from that exploited function will cause the attack code to be
executed. Often, this attack code causes a shell to be started, and if the
process under attack is privileged (as is the case with many daemon processes),
then an attacker can obtain a root shell.
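
To make the behaviour of {\tt \%n} concrete, the following benign sketch reproduces the counting mechanism without touching a return address. The helper name {\tt count\_written\_by\_n()} is hypothetical and not part of libsafe; it formats a zero with a caller-chosen precision into a local buffer and reports what {\tt \%n} stored. It assumes the precision is smaller than the buffer.

```c
#include <stdio.h>

/* Hypothetical helper (not part of libsafe): returns the value that %n
 * stores after "%.*d" has printed a zero with precision `width`.
 * Assumes 0 < width < 256 so that the output fits in the local buffer. */
int count_written_by_n(int width)
{
    char buf[256];
    int count = -1;

    /* "%.*d" takes its precision from the argument list and pads the value
     * 0 with leading zeros to `width` digits; "%n" then stores the number
     * of characters produced so far into `count`. */
    snprintf(buf, sizeof buf, "%.*d%n", width, 0, &count);
    return count;
}
```

Under these assumptions, {\tt count\_written\_by\_n(12)} returns {\tt 12}. An attacker uses the same mechanism, but substitutes the address of the attack code for the precision and a pointer to a saved return address for the counter.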

Fortunately, it takes a bit more ingenuity to actually take advantage of this
vulnerability. Usually, vulnerable code occurs in a form similar to the
following:

\begin{verbatim}
if (illegal_command(command)) {
    sprintf(error_msg, "Illegal command: %s", command);
    ...
    syslog(LOG_WARNING, error_msg);
    return;
}
\end{verbatim}

In this example, {\tt command} is a character buffer that contains a command
from the user. If the command is illegal, then the {\tt sprintf()} statement
forms an error message that is passed to {\tt syslog()}. Under normal
circumstances, {\tt syslog()} will simply append {\tt error\_msg} to the
appropriate log file. However, if {\tt command} contains {\tt printf()}
specifiers, such as those in the first two code examples, then bad things can
happen.
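
Independently of libsafe, the usual source-level repair for this pattern is to pass user-influenced text only as an argument to a constant format string. The sketch below illustrates the idea; the helper names {\tt format\_error()} and {\tt log\_message()} are hypothetical, and an in-memory buffer stands in for the log so that the effect can be observed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds the error message with a bounded write, so an
 * overlong command cannot overflow `out`. */
void format_error(char *out, size_t out_size, const char *command)
{
    snprintf(out, out_size, "Illegal command: %s", command);
}

/* Hypothetical stand-in for the logging call. The message is passed as an
 * argument to the constant format "%s", so any specifiers it contains are
 * copied literally instead of being interpreted. With syslog() the
 * equivalent call would be: syslog(LOG_WARNING, "%s", msg); */
int log_message(char *dest, size_t dest_size, const char *msg)
{
    return snprintf(dest, dest_size, "%s", msg);
}
```

With this arrangement, a command such as {\tt \%x \%x \%n} ends up in the log verbatim rather than being treated as a set of conversion specifiers.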

Such vulnerabilities exist in deployed software, and so do the
corresponding exploits. In fact, the existence of these and similar
vulnerabilities, together with the relative ease of obtaining exploits, has
largely led to the prevalence of so-called ``script kiddies,'' or attackers who
systematically attack remote machines using downloaded scripts in the hopes of
finding a machine that is vulnerable. Such attackers often possess only a
rudimentary knowledge of networks and systems. However, they often find great
success due to the surprisingly large number of Internet-connected machines
that execute vulnerable software. Part of the problem is the complexity of
system maintenance. Making sure that one's machine has the latest version of
every software package is not simple, especially since system maintenance is
often a secondary responsibility. Also, some vulnerabilities are still mostly
unknown, and software updates to fix the problem may not yet be available.

This is where libsafe version 2.0 is valuable. Libsafe version 2.0 will foil
all format string vulnerability exploits that attempt to overwrite return
addresses on the stack. If such an attack is attempted, libsafe will log a
warning and terminate the targeted process. As with version 1.3, installation
is extremely easy and requires no knowledge of the system, applications,
exploits, or even libsafe itself. Also, because libsafe incurs relatively
little overhead, it can be used to protect all processes on a machine, thereby
potentially detecting instances of vulnerabilities that may yet be unknown.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Implementation}
\label{sec:implementation}

%interception
%   -- glibc code
%ra check
%   -- uint _libsafe_raVariableP(void *addr)
%span check
%   -- uint _libsafe_span_stack_frames(void *start_addr, void *end_addr)
%handling
%   -- _libsafe_die()

The implementation of format string vulnerability detection in libsafe version
2.0 borrows heavily from the basic detection mechanism in version 1.3. There
are three main steps in the detection mechanism:

\begin{Ventry}{Violation handling}
    \item[Interception]
    Libsafe executes its own version of selected vulnerable functions.
    \item[Safety check]
    Libsafe determines if the function can be safely executed.
    \item[Violation handling]
    If the function cannot be safely executed, libsafe executes warning and
    termination actions.
\end{Ventry}

%----------------------------------------------------------------------

\subsection{Interception}
\label{subsec:interception}

The basic idea behind libsafe is the interception of vulnerable functions by
safer alternatives that first check to make sure that the functions can be
safely executed based on their arguments. If the check passes, libsafe either
calls the original function or executes code that is functionally equivalent.
Otherwise, warnings are posted and the process is terminated.

Libsafe is able to intercept functions (i.e., substitute its alternatives in
place of the original functions) because it is implemented as a shared library
that is loaded into memory before the standard library (i.e., {\tt
/lib/libc.so}). For Linux systems, the run-time loader, {\tt ld.so}, is
responsible for loading the various program code and libraries into memory.
For programs that require the standard library, {\tt ld.so} loads this library
into memory and links all references to library functions in the program code
to the library functions. If libsafe is activated, {\tt ld.so} loads the
libsafe library into memory before the standard library. Because the libsafe
alternative functions have the same names as the original standard library
functions, {\tt ld.so} uses the libsafe functions in place of the standard
library functions.

Most of the libsafe functions perform a safety check and then call the original
function or a safer alternative (e.g., {\tt snprintf()} is called in place of
{\tt sprintf()}). However, two functions are treated differently: {\tt
\_IO\_vfprintf()} and {\tt \_IO\_vfscanf()}\footnote{{\tt \_IO\_vfprintf()} and
{\tt \_IO\_vfscanf()} are the core functions that all other {\tt *printf()} and
{\tt *scanf()} functions eventually call. Thus, intercepting these two core
functions effectively intercepts the entire family of {\tt *printf()} and {\tt
*scanf()} functions. Note: {\tt syslog()} also eventually calls {\tt
\_IO\_vfprintf()}.}. For {\tt \_IO\_vfprintf()} and {\tt \_IO\_vfscanf()}, the
original source code from libc-2.1.3-91 is incorporated directly into libsafe.
Libsafe needs the original source code because the safety checks for these two
functions require knowledge of local variables.


%----------------------------------------------------------------------

\subsection{Safety check}
\label{subsec:safety_check}

The safety checks are highly specific to each function. For
{\tt \_IO\_vfprintf()}, libsafe performs two checks:

\begin{Lentry}
    \item[Return address and frame pointer check]
    For each {\tt \%n} specifier, libsafe checks the associated pointer
    argument. Each such pointer argument is passed to {\tt
    \_libsafe\_raVariableP(void *addr)}, where {\tt addr} is the pointer
    argument. {\tt \_libsafe\_raVariableP(void *addr)} returns {\tt 1} only if
    it determines that {\tt addr} points to a return address or a frame pointer
    on the stack. Otherwise, it returns {\tt 0}, which means that {\tt addr}
    points to an address that is either not on the stack or which is on the
    stack, but which is not a return address or a frame pointer. If {\tt
    \_libsafe\_raVariableP()} returns {\tt 1}, then libsafe has found a
    violation.
    \item[Frame span check]
    The argument list for any function should always be contained within a
    single stack frame. Thus, attacks that attempt to probe the stack using
    statements such as {\tt printf("\%x \%x ...")} might require arguments that
    extend beyond the current stack frame. The {\tt
    \_libsafe\_span\_stack\_frames(void *start\_addr, void *end\_addr)}
    function returns {\tt 1} only if {\tt start\_addr} and {\tt end\_addr} are
    located in two different stack frames. If {\tt
    \_libsafe\_span\_stack\_frames()} returns {\tt 1}, then libsafe has found a
    violation.
\end{Lentry}

\begin{figure}[htbp]
\centerline{\psfig{figure=stack.eps,height=3.5in}}
\caption{Stack Frames}
\label{fig:stack_frames}
\end{figure}

To perform these two checks, libsafe determines the locations and sizes of the
frames on the stack. Figure~\ref{fig:stack_frames} illustrates the
organization of a process stack. The beginning of each stack frame is
indicated by the presence of a frame pointer that points back to the previous
stack frame. Libsafe finds each stack frame by starting at the top-most frame
and traversing the frame pointers until it finds the stack frame for {\tt
main()}. The top-most frame corresponds to a libsafe function. Within this
libsafe function, the frame pointer is found by using the gcc built-in function
{\tt \_\_builtin\_frame\_address(0)}. The return address back into the calling
function is located immediately before each frame pointer in push order (on
x86, in the word at the next higher address). This technique
works for most processes, with a few exceptions. Certain compilers may not
produce code that places frame pointers on the stack (e.g., {\tt gcc
-fomit-frame-pointer}), and some customized compilers may not locate return
addresses immediately next to the frame pointer (e.g., the StackGuard
compiler~\cite{stackguard98}).
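
The frame pointer walk and the two checks built on it can be sketched over a simulated stack. The code below is an illustration under stated assumptions, not the libsafe source: the stack is modelled as an array of words addressed by index, each saved frame pointer slot holds the index of the caller's slot, the return address occupies the next higher slot, and the sentinel {\tt END\_OF\_CHAIN} marks the frame of {\tt main()}. The function names are hypothetical analogues of {\tt \_libsafe\_raVariableP()} and {\tt \_libsafe\_span\_stack\_frames()}.

```c
#include <stddef.h>

/* Sentinel stored in the outermost saved frame pointer slot (assumption of
 * this sketch; a real stack walk would also bound-check every pointer). */
#define END_OF_CHAIN ((size_t)-1)

/* Returns 1 if `addr` is a saved frame pointer slot or the return address
 * slot next to it (fp + 1 in this model), and 0 otherwise. */
int is_ra_or_fp_slot(const size_t *stack, size_t top_fp, size_t addr)
{
    size_t fp;

    /* Follow the chain of saved frame pointers from the top-most frame. */
    for (fp = top_fp; fp != END_OF_CHAIN; fp = stack[fp]) {
        if (addr == fp || addr == fp + 1)
            return 1;
    }
    return 0;
}

/* Returns 1 if the range from `start` to `end` crosses a saved frame
 * pointer slot, i.e. the two addresses lie in different frames. */
int spans_stack_frames(const size_t *stack, size_t top_fp,
                       size_t start, size_t end)
{
    size_t fp;

    if (start > end) {          /* normalise so that start <= end */
        size_t tmp = start;
        start = end;
        end = tmp;
    }
    for (fp = top_fp; fp != END_OF_CHAIN; fp = stack[fp]) {
        if (start <= fp && fp < end)
            return 1;
    }
    return 0;
}
```

For a simulated stack with saved frame pointers at indices 2, 7 and 12, a {\tt \%n} target at index 8 is flagged by the first check, and an argument list running from index 4 to index 9 is flagged by the second because it crosses the frame pointer at index 7.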

%----------------------------------------------------------------------

\subsection{Violation handling}
\label{subsec:handling}

If libsafe finds a violation during a safety check, then it performs the
actions in Table~\ref{tab:actions}.

\begin{table}[htbp]
\begin{center}
\caption{Libsafe Actions After Finding a Violation}
\label{tab:actions}
\begin{tabular}{|l||c|c|} \hline
Action & Default & Optional? \\ \hline\hline
Terminate process
    & Off/On & Not optional \\ \hline
Add an entry to {\tt /var/log/secure} using {\tt syslog()}
    & On & Optional \\ \hline
Print a warning to {\tt stderr}
    & On & Not optional \\ \hline
Dump a hexadecimal version of the stack contents to a file
    & Off & Optional \\ \hline
Send email to a list of recipients
    & Off & Optional \\ \hline
Produce a core dump by calling {\tt abort()}
    & Off & Optional \\ \hline
\end{tabular}
\end{center}
\end{table}

The main libsafe action after detecting a violation is to terminate the
process. Data integrity after a violation cannot be assured, and therefore,
the safest course of action is to terminate the entire process. However, for
violations of the return address and frame pointer check, libsafe can
optionally allow the process to continue execution. This exception is based on
the assumption that programmers will almost never (or at least should never)
produce code that attempts to use the {\tt \%n} specifier to overwrite a return
address or frame pointer. In practice, most occurrences of such attacks result
from processing user input that unexpectedly contains the {\tt \%n} specifier.
In such instances, since the input is garbage, libsafe can usually allow the
process to continue to process the input as long as the {\tt \%n} specifier is
not permitted to write to memory.

%----------------------------------------------------------------------

\subsection{Notes}
\label{subsec:notes}

\begin{enumerate}
    \item Libsafe relies on the location of frame pointers on the stack to
    determine the location of stack frames and return addresses. Some programs
    have been compiled without code to embed frame pointers on the stack (e.g.,
    by using {\tt gcc -fomit-frame-pointer}). For such code, libsafe will
    automatically detect the absence of frame pointers on the stack and allow
    the program to execute normally. However, it will not be able to detect
    any exploits for such programs.
    \item Libsafe is linked with glibc and is incompatible with libc5. If you have
    a program that is linked with libc5, you will need to either obtain an
    updated version linked with glibc or recompile the source code yourself
    with glibc.
\end{enumerate}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Software Availability}
\label{sec:software_availability}

Libsafe version 2.0 has not yet been released to the general public. However,
we intend to release the software under the GNU Lesser General Public License
in the near future. Please contact Timothy Tsai (ttsai@avaya.com) if
you have any questions or are interested in evaluating the software.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{singlespace}
%\compress
\bibliography{whitepaper-20}
\end{singlespace}

\end{document}