ptmalloc2 - a multi-thread malloc implementation
================================================

Wolfram Gloger (wg@malloc.de)

15 Dec 2001


Introduction
============

This package is a modified version of Doug Lea's malloc-2.7.0
implementation (available separately from ftp://g.oswego.edu/pub/misc)
that I adapted for multiple threads, while trying to avoid lock
contention as much as possible.  Many thanks should go to Doug Lea
(dl@cs.oswego.edu) for the great original malloc implementation.

As part of the GNU C library, the source files are available under the
GNU Library General Public License (see the comments in the files).
But as part of this stand-alone package, the code is also available
under the (probably less restrictive) conditions described in the file
'COPYRIGHT'.  In any case, there is no warranty whatsoever for this
package.

The current distribution should be available from:

http://www.malloc.de/malloc/ptmalloc2.tar.gz

Please note that ptmalloc2 is currently still in the testing phase.
For example, support for non-ANSI compilers is currently not complete.
For an implementation with a somewhat more proven record, you may look
at ptmalloc.tar.gz, which is based on Doug Lea's malloc-2.6.x and is
integrated into GNU libc up to version glibc-2.2.x.


Compilation and usage
=====================

It should be possible to compile malloc.c on any UN*X-like system that
implements the sbrk(), mmap(), munmap() and mprotect() calls.  If
mmap() is not available, it is only possible to produce a
non-threadsafe implementation from the source file.  See the comments
in the source file for descriptions of the compile-time options.
Several thread interfaces are supported:

 o Posix threads (pthreads), compile with `-DUSE_PTHREADS=1'
   (and possibly with `-DUSE_TSD_DATA_HACK', see below)
 o Solaris threads, compile with `-DUSE_THR=1'
 o SGI sproc() threads, compile with `-DUSE_SPROC=1'
 o When compiling malloc.c as part of the GNU C library,
   i.e. when _LIBC is defined (no other defines necessary)
 o no threads, compile without any of the above definitions

The distributed Makefile includes several targets (e.g. `solaris' for
Solaris threads, but you probably want `posix' for recent Solaris
versions) which cause malloc.c to be compiled with the appropriate
flags.  The default is to compile for Posix threads.  Note that some
compilers need special flags for multi-threaded code, e.g. with
Solaris cc one should use:

% make posix SYS_FLAGS='-mt'

Some additional targets, ending in `-libc', are also provided in the
Makefile.  They link the test programs against the standard malloc
implementation in libc, so that its performance can be compared with
that of ptmalloc2.

A potential problem remains: If any of the system-specific functions
for getting/setting thread-specific data or for locking a mutex calls
one of the malloc-related functions internally, the implementation
cannot work at all due to infinite recursion.  One example seems to be
Solaris 2.4; a workaround for thr_getspecific() has been inserted into
the thread-m.h file.  I would like to hear if this problem occurs on
other systems, and whether similar workarounds could be applied.

For Posix threads, too, a similar optional hack has been integrated
(activated when defining USE_TSD_DATA_HACK) which depends on
`pthread_t' being convertible to an integral type (which is of course
not generally guaranteed).  USE_TSD_DATA_HACK is now the default
because I haven't yet found a non-glibc pthreads system where this
hack is _not_ needed.

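To illustrate why such a hack depends on `pthread_t' being convertible
to an integral type, here is a minimal sketch of thread-specific
storage that never calls malloc.  It is not the code from thread-m.h;
the table size and all `demo_' names are assumptions made for this
example.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_TSD_SLOTS 256  /* hypothetical table size */

/* One pointer slot per hash bucket, shared by all threads. */
static void *demo_tsd_table[DEMO_TSD_SLOTS];

/* Map the calling thread to a slot.  The cast is the non-portable
 * part: POSIX does not require pthread_t to be an arithmetic type. */
static size_t demo_tsd_slot(void)
{
    return (size_t)((unsigned long)pthread_self() % DEMO_TSD_SLOTS);
}

/* Store a per-thread pointer without calling any allocator. */
void demo_tsd_set(void *data)
{
    demo_tsd_table[demo_tsd_slot()] = data;
}

/* Fetch the pointer stored for the calling thread's slot. */
void *demo_tsd_get(void)
{
    return demo_tsd_table[demo_tsd_slot()];
}
```

Because neither function allocates memory, neither can recurse into
malloc.  The price is that two threads whose ids map to the same slot
see each other's value, so a scheme like this only suits data that is
safe to share by accident.
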
To use ptmalloc2 (i.e. when linking malloc.o into applications), no
special precautions are necessary.

On some systems, when overriding malloc and linking against shared
libraries, the link order becomes very important.  E.g., when linking
C++ programs on Solaris, don't rely on libC being included by default,
but instead put `-lthread' behind `-lC' on the command line:

  CC ... malloc.o -lC -lthread

This is because there are global constructors in libC that need
malloc/ptmalloc, which in turn needs the thread library to be
already initialized.

Debugging hooks
===============

All calls to malloc(), realloc(), free() and memalign() are routed
through the global function pointers __malloc_hook, __realloc_hook,
__free_hook and __memalign_hook if they are not NULL (see the malloc.h
header file for declarations of these pointers).  Therefore the malloc
implementation can be changed at runtime, if care is taken not to call
free() or realloc() on pointers obtained with a different
implementation than the one currently in effect.  (The easiest way to
guarantee this is to set up the hooks before any malloc call, e.g.
with a function pointed to by the global variable
__malloc_initialize_hook).

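The routing described above can be sketched with ordinary function
pointers.  The example below does not use the real hook variables;
the `demo_' names are hypothetical stand-ins, and the signatures are
simplified (check malloc.h for the actual declarations).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for __malloc_hook and __free_hook; these are
 * NOT the real ptmalloc2 symbols, only an illustration of the routing. */
static void *(*demo_malloc_hook)(size_t size) = NULL;
static void (*demo_free_hook)(void *ptr) = NULL;

/* Counters updated by the example hooks below. */
static unsigned long demo_malloc_calls = 0;
static unsigned long demo_free_calls = 0;

/* Entry points: use the hook if one is installed, else the default. */
void *demo_malloc(size_t size)
{
    if (demo_malloc_hook != NULL)
        return demo_malloc_hook(size);
    return malloc(size);
}

void demo_free(void *ptr)
{
    if (demo_free_hook != NULL) {
        demo_free_hook(ptr);
        return;
    }
    free(ptr);
}

/* Example hooks: count the call, then defer to the default allocator. */
static void *counting_malloc(size_t size)
{
    demo_malloc_calls++;
    return malloc(size);
}

static void counting_free(void *ptr)
{
    demo_free_calls++;
    free(ptr);
}

/* Install both hooks together, before any allocation is made, so that
 * every pointer is freed by the implementation that produced it. */
void demo_install_counting_hooks(void)
{
    demo_malloc_hook = counting_malloc;
    demo_free_hook = counting_free;
}
```

Installing the malloc and free hooks as a pair is what keeps pointers
from crossing between implementations; switching only one of them
after allocations exist is the situation the paragraph above warns
about.
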
A useful application of the hooks is built into ptmalloc2: The
implementation is usually very unforgiving with respect to misuse,
such as free()ing a pointer twice or free()ing a pointer not obtained
with malloc() (these will typically crash the application
immediately).  To debug in such situations, you can set the
environment variable `MALLOC_CHECK_' (note the trailing underscore).
Performance will suffer somewhat, but you will get more controlled
behaviour in the case of misuse.  If MALLOC_CHECK_=0, wrong free()s
will be silently ignored; if MALLOC_CHECK_=1, diagnostics will be
printed on stderr; and if MALLOC_CHECK_=2, abort() will be called on
any error.

You can now also tune other malloc parameters (normally adjusted via
mallopt() calls from the application) with environment variables:

    MALLOC_TRIM_THRESHOLD_    for deciding to shrink the heap (in bytes)

    MALLOC_TOP_PAD_           how much extra memory to allocate on
                              each system call (in bytes)

    MALLOC_MMAP_THRESHOLD_    min. size for chunks allocated via
                              mmap() (in bytes)

    MALLOC_MMAP_MAX_          max. number of mmapped regions to use

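For comparison, the same parameters can be set from inside the
application.  The sketch below assumes a malloc.h that defines
M_TRIM_THRESHOLD, M_TOP_PAD, M_MMAP_THRESHOLD and M_MMAP_MAX (as the
GNU C library's does) and that these correspond to the environment
variables listed above.  The values and the function name are
arbitrary illustrations, not recommendations.

```c
#include <malloc.h>

/* Apply example tuning values through mallopt().  Returns 1 if every
 * call succeeded and 0 otherwise; mallopt() itself returns nonzero on
 * success. */
int demo_tune_malloc(void)
{
    int ok = 1;

    /* Counterpart of MALLOC_TRIM_THRESHOLD_ (bytes). */
    ok &= (mallopt(M_TRIM_THRESHOLD, 256 * 1024) != 0);

    /* Counterpart of MALLOC_TOP_PAD_ (bytes). */
    ok &= (mallopt(M_TOP_PAD, 64 * 1024) != 0);

    /* Counterpart of MALLOC_MMAP_THRESHOLD_ (bytes). */
    ok &= (mallopt(M_MMAP_THRESHOLD, 512 * 1024) != 0);

    /* Counterpart of MALLOC_MMAP_MAX_ (number of regions). */
    ok &= (mallopt(M_MMAP_MAX, 1024) != 0);

    return ok;
}
```

Calling such a function early in main() has a similar effect to
exporting the variables before starting the program, but it needs a
recompile to change, which is the inconvenience the environment
variables remove.
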
Tests
=====

Two testing applications, t-test1 and t-test2, are included in this
source distribution.  Both perform pseudo-random sequences of
allocations/frees, and can be given numeric arguments (all arguments
are optional):

% t-test[12] <n-total> <n-parallel> <n-allocs> <size-max> <bins>

    n-total = total number of threads executed (default 10)
    n-parallel = number of threads running in parallel (2)
    n-allocs = number of malloc()'s / free()'s per thread (10000)
    size-max = max. size requested with malloc() in bytes (10000)
    bins = number of bins to maintain

The first test `t-test1' maintains a completely separate pool of
allocated bins for each thread, and should therefore show full
parallelism.  On the other hand, `t-test2' creates only a single pool
of bins, and each thread randomly allocates/frees any bin.  Some lock
contention is to be expected in this case, as the threads frequently
cross each other's arenas.

Performance results from t-test1 should be quite repeatable, while the
behaviour of t-test2 depends on scheduling variations.

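The idea of a pool of bins can be shown with a single-threaded sketch.
This is not the code of t-test1 or t-test2; the function, the struct
and the random number generator are assumptions chosen to keep the
example short and reproducible.

```c
#include <stdlib.h>
#include <string.h>

/* One slot of the pool: either empty (ptr == NULL) or holding a block. */
struct demo_bin {
    void *ptr;
    size_t size;
};

/* Small linear congruential generator, so that a given seed always
 * produces the same sequence and no global rand() state is touched. */
static unsigned demo_next(unsigned *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Perform n_ops pseudo-random operations on a pool of n_bins bins:
 * an empty bin is filled with a block of 1..size_max bytes, a full
 * bin is freed.  All remaining blocks are freed at the end.
 * Returns the number of successful malloc() calls for bin contents,
 * or -1 if the arguments are invalid or the pool cannot be created. */
long demo_run_bins(unsigned seed, long n_ops, size_t size_max, int n_bins)
{
    struct demo_bin *bins;
    long n_mallocs = 0;
    long i;
    int b;

    if (n_bins <= 0 || size_max == 0)
        return -1;
    bins = malloc((size_t)n_bins * sizeof *bins);
    if (bins == NULL)
        return -1;
    for (b = 0; b < n_bins; b++) {
        bins[b].ptr = NULL;
        bins[b].size = 0;
    }

    for (i = 0; i < n_ops; i++) {
        struct demo_bin *bin = &bins[demo_next(&seed) % (unsigned)n_bins];

        if (bin->ptr == NULL) {
            bin->size = 1 + demo_next(&seed) % size_max;
            bin->ptr = malloc(bin->size);
            if (bin->ptr != NULL) {
                memset(bin->ptr, 0xA5, bin->size);  /* touch the memory */
                n_mallocs++;
            }
        } else {
            free(bin->ptr);
            bin->ptr = NULL;
        }
    }

    for (b = 0; b < n_bins; b++)
        free(bins[b].ptr);  /* free(NULL) is a no-op */
    free(bins);
    return n_mallocs;
}
```

In these terms, giving every thread its own `bins' array corresponds to
the t-test1 arrangement, while letting all threads operate on one
shared array (with suitable locking around each bin) corresponds to
t-test2.
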
Conclusion
==========

I'm always interested in performance data and feedback; just send mail
to ptmalloc@malloc.de.

Good luck!