Member "pcre-8.44/doc/html/README.txt" (12 Feb 2020, 45484 Bytes) of package /linux/misc/pcre-8.44.tar.bz2:

    1 README file for PCRE (Perl-compatible regular expression library)
    2 -----------------------------------------------------------------
    3 
    4 NOTE: This set of files relates to PCRE releases that use the original API,
    5 with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
    6 first release of a new API, known as PCRE2, with release numbers starting at
    7 10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
    8 libraries (now called PCRE1) are still being maintained for bug fixes, but
    9 there will be no new development. New projects are advised to use the new PCRE2
   10 libraries.
   11 
   12 
   13 The latest release of PCRE1 is always available in three alternative formats
   14 from:
   15 
   16   https://ftp.pcre.org/pub/pcre/pcre-x.xx.tar.gz
   17   https://ftp.pcre.org/pub/pcre/pcre-x.xx.tar.bz2
   18   https://ftp.pcre.org/pub/pcre/pcre-x.xx.tar.zip
   19 
   20 
   21 There is a mailing list for discussion about the development of PCRE at
   22 pcre-dev@exim.org. You can access the archives and subscribe or manage your
   23 subscription here:
   24 
   25    https://lists.exim.org/mailman/listinfo/pcre-dev
   26 
   27 Please read the NEWS file if you are upgrading from a previous release.
   28 The contents of this README file are:
   29 
   30   The PCRE APIs
   31   Documentation for PCRE
   32   Contributions by users of PCRE
   33   Building PCRE on non-Unix-like systems
   34   Building PCRE without using autotools
   35   Building PCRE using autotools
   36   Retrieving configuration information
   37   Shared libraries
   38   Cross-compiling using autotools
   39   Using HP's ANSI C++ compiler (aCC)
   40   Compiling in Tru64 using native compilers
   41   Using Sun's compilers for Solaris
   42   Using PCRE from MySQL
   43   Making new tarballs
   44   Testing PCRE
   45   Character tables
   46   File manifest
   47 
   48 
   49 The PCRE APIs
   50 -------------
   51 
   52 PCRE is written in C, and it has its own API. There are three sets of
   53 functions, one for the 8-bit library, which processes strings of bytes, one for
   54 the 16-bit library, which processes strings of 16-bit values, and one for the
   55 32-bit library, which processes strings of 32-bit values. The distribution also
   56 includes a set of C++ wrapper functions (see the pcrecpp man page for details),
   57 courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
   58 C++. Other C++ wrappers have been created from time to time. See, for example:
   59 https://github.com/YasserAsmi/regexp, which aims to be simple and similar in
   60 style to the C API.
   61 
   62 The distribution also contains a set of C wrapper functions (again, just for
   63 the 8-bit library) that are based on the POSIX regular expression API (see the
   64 pcreposix man page). These end up in the library called libpcreposix. Note that
   65 this just provides a POSIX calling interface to PCRE; the regular expressions
   66 themselves still follow Perl syntax and semantics. The POSIX API is restricted,
   67 and does not give full access to all of PCRE's facilities.
   68 
   69 The header file for the POSIX-style functions is called pcreposix.h. The
   70 official POSIX name is regex.h, but I did not want to risk possible problems
   71 with existing files of that name by distributing it that way. To use PCRE with
   72 an existing program that uses the POSIX API, pcreposix.h will have to be
   73 renamed or pointed at by a link.
   74 
   75 If you are using the POSIX interface to PCRE and there is already a POSIX regex
   76 library installed on your system, as well as worrying about the regex.h header
   77 file (as mentioned above), you must also take care when linking programs to
   78 ensure that they link with PCRE's libpcreposix library. Otherwise they may pick
   79 up the POSIX functions of the same name from the other library.
   80 
   81 One way of avoiding this confusion is to compile PCRE with the addition of
   82 -Dregcomp=PCREregcomp (and similarly for the other POSIX functions) to the
   83 compiler flags (CFLAGS if you are using "configure" -- see below). This has the
   84 effect of renaming the functions so that the names no longer clash. Of course,
   85 you have to do the same thing for your applications, or write them using the
   86 new names.
   87 
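As a sketch of the renaming approach just described: pcreposix.h declares four
POSIX functions (regcomp, regexec, regerror, and regfree), so the -D flags can
be generated in a loop. The PCRE prefix follows the example above; the variable
name POSIX_RENAME_FLAGS and the -O2 setting are made up for this illustration.

```shell
#!/bin/sh
# Build one -D<name>=PCRE<name> flag for each POSIX function in pcreposix.h.
POSIX_RENAME_FLAGS=""
for fn in regcomp regexec regerror regfree; do
    POSIX_RENAME_FLAGS="$POSIX_RENAME_FLAGS -D$fn=PCRE$fn"
done
POSIX_RENAME_FLAGS="${POSIX_RENAME_FLAGS# }"   # drop the leading space

echo "$POSIX_RENAME_FLAGS"

# The same flags are needed twice: once when building PCRE, for example
#   CFLAGS="-O2 $POSIX_RENAME_FLAGS" ./configure
# and again when compiling each application that calls the POSIX functions.
```

Note that setting CFLAGS replaces the default compiler flags, which is why the
commented example supplies an optimization level explicitly.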
   88 
   89 Documentation for PCRE
   90 ----------------------
   91 
   92 If you install PCRE in the normal way on a Unix-like system, you will end up
   93 with a set of man pages whose names all start with "pcre". The one that is just
   94 called "pcre" lists all the others. In addition to these man pages, the PCRE
   95 documentation is supplied in two other forms:
   96 
   97   1. There are files called doc/pcre.txt, doc/pcregrep.txt, and
   98      doc/pcretest.txt in the source distribution. The first of these is a
   99      concatenation of the text forms of all the section 3 man pages except
  100      the listing of pcredemo.c and those that summarize individual functions.
  101      The other two are the text forms of the section 1 man pages for the
  102      pcregrep and pcretest commands. These text forms are provided for ease of
  103      scanning with text editors or similar tools. They are installed in
  104      <prefix>/share/doc/pcre, where <prefix> is the installation prefix
  105      (defaulting to /usr/local).
  106 
  107   2. A set of files containing all the documentation in HTML form, hyperlinked
  108      in various ways, and rooted in a file called index.html, is distributed in
  109      doc/html and installed in <prefix>/share/doc/pcre/html.
  110 
  111 Users of PCRE have contributed files containing the documentation for various
  112 releases in CHM format. These can be found in the Contrib directory of the FTP
  113 site (see next section).
  114 
  115 
  116 Contributions by users of PCRE
  117 ------------------------------
  118 
  119 You can find contributions from PCRE users in the directory
  120 
  121   ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Contrib
  122 
  123 There is a README file giving brief descriptions of what they are. Some are
  124 complete in themselves; others are pointers to URLs containing relevant files.
  125 Some of this material is likely to be well out-of-date. Several of the earlier
  126 contributions provided support for compiling PCRE on various flavours of
  127 Windows (I myself do not use Windows). Nowadays there is more Windows support
  128 in the standard distribution, so these contributions have been archived.
  129 
  130 A PCRE user maintains downloadable Windows binaries of the pcregrep and
  131 pcretest programs here:
  132 
  133   http://www.rexegg.com/pcregrep-pcretest.html
  134 
  135 
  136 Building PCRE on non-Unix-like systems
  137 --------------------------------------
  138 
  139 For a non-Unix-like system, please read the comments in the file
  140 NON-AUTOTOOLS-BUILD, though if your system supports the use of "configure" and
  141 "make" you may be able to build PCRE using autotools in the same way as for
  142 many Unix-like systems.
  143 
  144 PCRE can also be configured using the GUI facility provided by CMake's
  145 cmake-gui command. This creates Makefiles, solution files, etc. The file
  146 NON-AUTOTOOLS-BUILD has information about CMake.
  147 
  148 PCRE has been compiled on many different operating systems. It should be
  149 straightforward to build PCRE on any system that has a Standard C compiler and
  150 library, because it uses only Standard C functions.
  151 
  152 
  153 Building PCRE without using autotools
  154 -------------------------------------
  155 
  156 The use of autotools (in particular, libtool) is problematic in some
  157 environments, even some that are Unix or Unix-like. See the NON-AUTOTOOLS-BUILD
  158 file for ways of building PCRE without using autotools.
  159 
  160 
  161 Building PCRE using autotools
  162 -----------------------------
  163 
  164 If you are using HP's ANSI C++ compiler (aCC), please see the special note
  165 in the section entitled "Using HP's ANSI C++ compiler (aCC)" below.
  166 
  167 The following instructions assume the use of the widely used "configure; make;
  168 make install" (autotools) process.
  169 
  170 To build PCRE on a system that supports autotools, first run the "configure"
  171 command from the PCRE distribution directory, with your current directory set
  172 to the directory where you want the files to be created. This command is a
  173 standard GNU "autoconf" configuration script, for which generic instructions
  174 are supplied in the file INSTALL.
  175 
  176 Most commonly, people build PCRE within its own distribution directory, and in
  177 this case, on many systems, just running "./configure" is sufficient. However,
  178 the usual methods of changing standard defaults are available. For example:
  179 
  180 CFLAGS='-O2 -Wall' ./configure --prefix=/opt/local
  181 
  182 This command specifies that the C compiler should be run with the flags '-O2
  183 -Wall' instead of the default, and that "make install" should install PCRE
  184 under /opt/local instead of the default /usr/local.
  185 
  186 If you want to build in a different directory, just run "configure" with that
  187 directory as current. For example, suppose you have unpacked the PCRE source
  188 into /source/pcre/pcre-xxx, but you want to build it in /build/pcre/pcre-xxx:
  189 
  190 cd /build/pcre/pcre-xxx
  191 /source/pcre/pcre-xxx/configure
  192 
  193 PCRE is written in C and is normally compiled as a C library. However, it is
  194 possible to build it as a C++ library, though the provided building apparatus
  195 does not have any features to support this.
  196 
  197 There are some optional features that can be included or omitted from the PCRE
  198 library. They are also documented in the pcrebuild man page.
  199 
  200 . By default, both shared and static libraries are built. You can change this
  201   by adding one of these options to the "configure" command:
  202 
  203   --disable-shared
  204   --disable-static
  205 
  206   (See also "Shared libraries" below.)
  207 
  208 . By default, only the 8-bit library is built. If you add --enable-pcre16 to
  209   the "configure" command, the 16-bit library is also built. If you add
  210   --enable-pcre32 to the "configure" command, the 32-bit library is also built.
  211   If you want only the 16-bit or 32-bit library, use --disable-pcre8 to disable
  212   building the 8-bit library.
  213 
  214 . If you are building the 8-bit library and want to suppress the building of
  215   the C++ wrapper library, you can add --disable-cpp to the "configure"
  216   command. Otherwise, when "configure" is run without --disable-pcre8, it will
  217   try to find a C++ compiler and C++ header files, and if it succeeds, it will
  218   try to build the C++ wrapper.
  219 
  220 . If you want to include support for just-in-time compiling, which can give
  221   large performance improvements on certain platforms, add --enable-jit to the
  222   "configure" command. This support is available only for certain hardware
  223   architectures. If you try to enable it on an unsupported architecture, there
  224   will be a compile time error.
  225 
  226 . When JIT support is enabled, pcregrep automatically makes use of it, unless
  227   you add --disable-pcregrep-jit to the "configure" command.
  228 
  229 . If you want to make use of the support for UTF-8 Unicode character strings in
  230   the 8-bit library, or UTF-16 Unicode character strings in the 16-bit library,
  231   or UTF-32 Unicode character strings in the 32-bit library, you must add
  232   --enable-utf to the "configure" command. Without it, the code for handling
  233   UTF-8, UTF-16 and UTF-32 is not included in the relevant library. Even
  234   when --enable-utf is included, the use of a UTF encoding still has to be
  235   enabled by an option at run time. When PCRE is compiled with this option, its
  236   input can only be either ASCII or UTF-8/16/32, even when running on EBCDIC
  237   platforms. It is not possible to use both --enable-utf and --enable-ebcdic at
  238   the same time.
  239 
  240 . There are no separate options for enabling UTF-8, UTF-16 and UTF-32
  241   independently because that would allow ridiculous settings such as requesting
  242   UTF-16 support while building only the 8-bit library. However, the option
  243   --enable-utf8 is retained for backwards compatibility with earlier releases
  244   that did not support 16-bit or 32-bit character strings. It is synonymous with
  245   --enable-utf. It is not possible to configure one library with UTF support
  246   and the other without in the same configuration.
  247 
  248 . If, in addition to support for UTF-8/16/32 character strings, you want to
  249   include support for the \P, \p, and \X sequences that recognize Unicode
  250   character properties, you must add --enable-unicode-properties to the
  251   "configure" command. This adds about 30K to the size of the library (in the
  252   form of a property table); only the basic two-letter properties such as Lu
  253   are supported.
  254 
  255 . You can build PCRE to recognize either CR or LF or the sequence CRLF or any
  256   of the preceding, or any of the Unicode newline sequences as indicating the
  257   end of a line. Whatever you specify at build time is the default; the caller
  258   of PCRE can change the selection at run time. The default newline indicator
  259   is a single LF character (the Unix standard). You can specify the default
  260   newline indicator by adding --enable-newline-is-cr or --enable-newline-is-lf
  261   or --enable-newline-is-crlf or --enable-newline-is-anycrlf or
  262   --enable-newline-is-any to the "configure" command, respectively.
  263 
  264   If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
  265   the standard tests will fail, because the lines in the test files end with
  266   LF. Even if the files are edited to change the line endings, there are likely
  267   to be some failures. With --enable-newline-is-anycrlf or
  268   --enable-newline-is-any, many tests should succeed, but there may be some
  269   failures.
  270 
  271 . By default, the sequence \R in a pattern matches any Unicode line ending
  272   sequence. This is independent of the option specifying what PCRE considers to
  273   be the end of a line (see above). However, the caller of PCRE can restrict \R
  274   to match only CR, LF, or CRLF. You can make this the default by adding
  275   --enable-bsr-anycrlf to the "configure" command (bsr = "backslash R").
  276 
  277 . When called via the POSIX interface, PCRE uses malloc() to get additional
  278   storage for processing capturing parentheses if there are more than 10 of
  279   them in a pattern. You can increase this threshold by setting, for example,
  280 
  281   --with-posix-malloc-threshold=20
  282 
  283   on the "configure" command.
  284 
  285 . PCRE has a counter that limits the depth of nesting of parentheses in a
  286   pattern. This limits the amount of system stack that a pattern uses when it
  287   is compiled. The default is 250, but you can change it by setting, for
  288   example,
  289 
  290   --with-parens-nest-limit=500
  291 
  292 . PCRE has a counter that can be set to limit the amount of resources it uses
  293   when matching a pattern. If the limit is exceeded during a match, the match
  294   fails. The default is ten million. You can change the default by setting, for
  295   example,
  296 
  297   --with-match-limit=500000
  298 
  299   on the "configure" command. This is just the default; individual calls to
  300   pcre_exec() can supply their own value. There is more discussion on the
  301   pcreapi man page.
  302 
  303 . There is a separate counter that limits the depth of recursive function calls
  304   during a matching process. This also has a default of ten million, which is
  305   essentially "unlimited". You can change the default by setting, for example,
  306 
  307   --with-match-limit-recursion=500000
  308 
  309   Recursive function calls use up the runtime stack; running out of stack can
  310   cause programs to crash in strange ways. There is a discussion about stack
  311   sizes in the pcrestack man page.
  312 
  313 . The default maximum compiled pattern size is around 64K. You can increase
  314   this by adding --with-link-size=3 to the "configure" command. In the 8-bit
  315   library, PCRE then uses three bytes instead of two for offsets to different
  316   parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
  317   the same as --with-link-size=4, which (in both libraries) uses four-byte
  318   offsets. Increasing the internal link size reduces performance. In the 32-bit
  319   library, the only supported link size is 4.
  320 
  321 . You can build PCRE so that its internal match() function that is called from
  322   pcre_exec() does not call itself recursively. Instead, it uses memory blocks
  323   obtained from the heap via the special functions pcre_stack_malloc() and
  324   pcre_stack_free() to save data that would otherwise be saved on the stack. To
  325   build PCRE like this, use
  326 
  327   --disable-stack-for-recursion
  328 
  329   on the "configure" command. PCRE runs more slowly in this mode, but it may be
  330   necessary in environments with limited stack sizes. This applies only to the
  331   normal execution of the pcre_exec() function; if JIT support is being
  332   successfully used, it is not relevant. Equally, it does not apply to
  333   pcre_dfa_exec(), which does not use deeply nested recursion. There is a
  334   discussion about stack sizes in the pcrestack man page.
  335 
  336 . For speed, PCRE uses four tables for manipulating and identifying characters
  337   whose code point values are less than 256. By default, it uses a set of
  338   tables for ASCII encoding that is part of the distribution. If you specify
  339 
  340   --enable-rebuild-chartables
  341 
  342   a program called dftables is compiled and run in the default C locale when
  343   you run "make". It builds a source file called pcre_chartables.c. If you do
  344   not specify this option, pcre_chartables.c is created as a copy of
  345   pcre_chartables.c.dist. See "Character tables" below for further information.
  346 
  347 . It is possible to compile PCRE for use on systems that use EBCDIC as their
  348   character code (as opposed to ASCII/Unicode) by specifying
  349 
  350   --enable-ebcdic
  351 
  352   This automatically implies --enable-rebuild-chartables (see above). However,
  353   when PCRE is built this way, it always operates in EBCDIC. It cannot support
  354   both EBCDIC and UTF-8/16/32. There is a second option, --enable-ebcdic-nl25,
  355   which specifies that the code value for the EBCDIC NL character is 0x25
  356   instead of the default 0x15.
  357 
  358 . In environments where valgrind is installed, if you specify
  359 
  360   --enable-valgrind
  361 
  362   PCRE will use valgrind annotations to mark certain memory regions as
  363   unaddressable. This allows it to detect invalid memory accesses, and is
  364   mostly useful for debugging PCRE itself.
  365 
  366 . In environments where the gcc compiler is used and lcov version 1.6 or above
  367   is installed, if you specify
  368 
  369   --enable-coverage
  370 
  371   the build process implements a code coverage report for the test suite. The
  372   report is generated by running "make coverage". If ccache is installed on
  373   your system, it must be disabled when building PCRE for coverage reporting.
  374   You can do this by setting the environment variable CCACHE_DISABLE=1 before
  375   running "make" to build PCRE. There is more information about coverage
  376   reporting in the "pcrebuild" documentation.
  377 
  378 . The pcregrep program currently supports only 8-bit data files, and so
  379   requires the 8-bit PCRE library. It is possible to compile pcregrep to use
  380   libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by
  381   specifying one or both of
  382 
  383   --enable-pcregrep-libz
  384   --enable-pcregrep-libbz2
  385 
  386   Of course, the relevant libraries must be installed on your system.
  387 
  388 . The default size (in bytes) of the internal buffer used by pcregrep can be
  389   set by, for example:
  390 
  391   --with-pcregrep-bufsize=51200
  392 
  393   The value must be a plain integer. The default is 20480.
  394 
  395 . It is possible to compile pcretest so that it links with the libreadline
  396   or libedit libraries, by specifying, respectively,
  397 
  398   --enable-pcretest-libreadline or --enable-pcretest-libedit
  399 
  400   If this is done, when pcretest's input is from a terminal, it reads it using
  401   the readline() function. This provides line-editing and history facilities.
  402   Note that libreadline is GPL-licenced, so if you distribute a binary of
  403   pcretest linked in this way, there may be licensing issues. These can be
  404   avoided by linking with libedit (which has a BSD licence) instead.
  405 
  406   Enabling libreadline causes the -lreadline option to be added to the pcretest
  407   build. In many operating environments with a system-installed readline
  408   library this is sufficient. However, in some environments (e.g. if an
  409   unmodified distribution version of readline is in use), it may be necessary
  410   to specify something like LIBS="-lncurses" as well. This is because, to quote
  411   the readline INSTALL, "Readline uses the termcap functions, but does not link
  412   with the termcap or curses library itself, allowing applications which link
  413   with readline the to choose an appropriate library." If you get error
  414   messages about missing functions tgetstr, tgetent, tputs, tgetflag, or tgoto,
  415   this is the problem, and linking with the ncurses library should fix it.
  416 
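Several of the options listed above can be combined on one "configure" command.
The following is only a sketch: the particular selection, the /opt/local prefix,
and the variable name CONFIGURE_ARGS are illustrative choices, and --enable-jit
should be dropped on architectures that the JIT compiler does not support.

```shell
#!/bin/sh
# Collect the options one per line so that each can be commented.
CONFIGURE_ARGS="--prefix=/opt/local"
CONFIGURE_ARGS="$CONFIGURE_ARGS --enable-utf"                 # UTF-8/16/32 support
CONFIGURE_ARGS="$CONFIGURE_ARGS --enable-unicode-properties"  # \p, \P and \X
CONFIGURE_ARGS="$CONFIGURE_ARGS --enable-jit"                 # just-in-time compiling
CONFIGURE_ARGS="$CONFIGURE_ARGS --with-match-limit=500000"    # lower default match limit

if [ -x ./configure ]; then
    # Unquoted on purpose: the string must be split into separate arguments.
    ./configure $CONFIGURE_ARGS
else
    # Not in a PCRE source or build directory, so just show the command.
    echo "would run: ./configure $CONFIGURE_ARGS"
fi
```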
  417 The "configure" script builds the following files for the basic C library:
  418 
  419 . Makefile             the makefile that builds the library
  420 . config.h             build-time configuration options for the library
  421 . pcre.h               the public PCRE header file
  422 . pcre-config          script that shows the building settings such as CFLAGS
  423                          that were set for "configure"
  424 . libpcre.pc         ) data for the pkg-config command
  425 . libpcre16.pc       )
  426 . libpcre32.pc       )
  427 . libpcreposix.pc    )
  428 . libtool              script that builds shared and/or static libraries
  429 
  430 Versions of config.h and pcre.h are distributed in the PCRE tarballs under the
  431 names config.h.generic and pcre.h.generic. These are provided for those who
  432 have to build PCRE without using "configure" or CMake. If you use "configure"
  433 or CMake, the .generic versions are not used.
  434 
  435 When building the 8-bit library, if a C++ compiler is found, the following
  436 files are also built:
  437 
  438 . libpcrecpp.pc        data for the pkg-config command
  439 . pcrecpparg.h         header file for calling PCRE via the C++ wrapper
  440 . pcre_stringpiece.h   header for the C++ "stringpiece" functions
  441 
  442 The "configure" script also creates config.status, which is an executable
  443 script that can be run to recreate the configuration, and config.log, which
  444 contains compiler output from tests that "configure" runs.
  445 
  446 Once "configure" has run, you can run "make". This builds the libraries
  447 libpcre, libpcre16 and/or libpcre32, and a test program called pcretest. If you
  448 enabled JIT support with --enable-jit, a test program called pcre_jit_test is
  449 built as well.
  450 
  451 If the 8-bit library is built, libpcreposix and the pcregrep command are also
  452 built, and if a C++ compiler was found on your system, and you did not disable
  453 it with --disable-cpp, "make" builds the C++ wrapper library, which is called
  454 libpcrecpp, as well as some test programs called pcrecpp_unittest,
  455 pcre_scanner_unittest, and pcre_stringpiece_unittest.
  456 
  457 The command "make check" runs all the appropriate tests. Details of the PCRE
  458 tests are given below in a separate section of this document.
  459 
  460 You can use "make install" to install PCRE into live directories on your
  461 system. The following are installed (file names are all relative to the
  462 <prefix> that is set when "configure" is run):
  463 
  464   Commands (bin):
  465     pcretest
  466     pcregrep (if 8-bit support is enabled)
  467     pcre-config
  468 
  469   Libraries (lib):
  470     libpcre16     (if 16-bit support is enabled)
  471     libpcre32     (if 32-bit support is enabled)
  472     libpcre       (if 8-bit support is enabled)
  473     libpcreposix  (if 8-bit support is enabled)
  474     libpcrecpp    (if 8-bit and C++ support is enabled)
  475 
  476   Configuration information (lib/pkgconfig):
  477     libpcre16.pc
  478     libpcre32.pc
  479     libpcre.pc
  480     libpcreposix.pc
  481     libpcrecpp.pc (if C++ support is enabled)
  482 
  483   Header files (include):
  484     pcre.h
  485     pcreposix.h
  486     pcre_scanner.h      )
  487     pcre_stringpiece.h  ) if C++ support is enabled
  488     pcrecpp.h           )
  489     pcrecpparg.h        )
  490 
  491   Man pages (share/man/man{1,3}):
  492     pcregrep.1
  493     pcretest.1
  494     pcre-config.1
  495     pcre.3
  496     pcre*.3 (lots more pages, all starting "pcre")
  497 
  498   HTML documentation (share/doc/pcre/html):
  499     index.html
  500     *.html (lots more pages, hyperlinked from index.html)
  501 
  502   Text file documentation (share/doc/pcre):
  503     AUTHORS
  504     COPYING
  505     ChangeLog
  506     LICENCE
  507     NEWS
  508     README
  509     pcre.txt         (a concatenation of the man(3) pages)
  510     pcretest.txt     the pcretest man page
  511     pcregrep.txt     the pcregrep man page
  512     pcre-config.txt  the pcre-config man page
  513 
  514 If you want to remove PCRE from your system, you can run "make uninstall".
  515 This removes all the files that "make install" installed. However, it does not
  516 remove any directories, because these are often shared with other programs.
  517 
  518 
  519 Retrieving configuration information
  520 ------------------------------------
  521 
  522 Running "make install" installs the command pcre-config, which can be used to
  523 recall information about the PCRE configuration and installation. For example:
  524 
  525   pcre-config --version
  526 
  527 prints the version number, and
  528 
  529   pcre-config --libs
  530 
  531 outputs information about where the library is installed. This command can be
  532 included in makefiles for programs that use PCRE, saving the programmer from
  533 having to remember too many details.
  534 
  535 The pkg-config command is another system for saving and retrieving information
  536 about installed libraries. Instead of separate commands for each library, a
  537 single command is used. For example:
  538 
  539   pkg-config --cflags libpcre
  540 
  541 The data is held in *.pc files that are installed in a directory called
  542 <prefix>/lib/pkgconfig.
  543 
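A minimal sketch of using pcre-config from a build script follows. The file
name myprog.c is hypothetical, and falling back to a bare -lpcre when
pcre-config cannot be found is an assumption made for this example, not
something the PCRE distribution does. The --cflags option is not mentioned
above, but it is listed in the pcre-config man page.

```shell
#!/bin/sh
if command -v pcre-config >/dev/null 2>&1; then
    # Ask the installed PCRE for its compiler and linker settings.
    PCRE_CFLAGS=$(pcre-config --cflags)
    PCRE_LIBS=$(pcre-config --libs)
    echo "found PCRE version $(pcre-config --version)"
else
    # Assumed fallback: rely on the compiler's default search paths.
    PCRE_CFLAGS=""
    PCRE_LIBS="-lpcre"
    echo "pcre-config not found; guessing $PCRE_LIBS" >&2
fi

# Show the compile command rather than running it, because myprog.c is
# only a placeholder.
COMPILE_CMD="cc $PCRE_CFLAGS -o myprog myprog.c $PCRE_LIBS"
echo "$COMPILE_CMD"
```

In a makefile the same two pcre-config calls can be placed in the rules that
compile and link the program.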
  544 
  545 Shared libraries
  546 ----------------
  547 
  548 The default distribution builds PCRE as shared libraries and static libraries,
  549 as long as the operating system supports shared libraries. Shared library
  550 support relies on the "libtool" script which is built as part of the
  551 "configure" process.
  552 
  553 The libtool script is used to compile and link both shared and static
  554 libraries. They are placed in a subdirectory called .libs when they are newly
  555 built. The programs pcretest and pcregrep are built to use these uninstalled
  556 libraries (by means of wrapper scripts in the case of shared libraries). When
  557 you use "make install" to install shared libraries, pcregrep and pcretest are
  558 automatically re-built to use the newly installed shared libraries before being
  559 installed themselves. However, the versions left in the build directory still
  560 use the uninstalled libraries.
  561 
  562 To build PCRE using static libraries only you must use --disable-shared when
  563 configuring it. For example:
  564 
  565 ./configure --prefix=/usr/gnu --disable-shared
  566 
  567 Then run "make" in the usual way. Similarly, you can use --disable-static to
  568 build only shared libraries.
  569 
  570 
  571 Cross-compiling using autotools
  572 -------------------------------
  573 
  574 You can specify CC and CFLAGS in the normal way to the "configure" command, in
  575 order to cross-compile PCRE for some other host. However, you should NOT
  576 specify --enable-rebuild-chartables, because if you do, the dftables.c source
  577 file is compiled and run on the local host, in order to generate the inbuilt
  578 character tables (the pcre_chartables.c file). This will probably not work,
  579 because dftables.c needs to be compiled with the local compiler, not the cross
  580 compiler.
  581 
  582 When --enable-rebuild-chartables is not specified, pcre_chartables.c is created
  583 by making a copy of pcre_chartables.c.dist, which is a default set of tables
  584 that assumes ASCII code. Cross-compiling with the default tables should not be
  585 a problem.
  586 
  587 If you need to modify the character tables when cross-compiling, you should
  588 move pcre_chartables.c.dist out of the way, then compile dftables.c by hand and
  589 run it on the local host to make a new version of pcre_chartables.c.dist.
  590 Then when you cross-compile PCRE this new version of the tables will be used.
  591 
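As an illustration of cross-compiling with the default tables, the sketch below
passes a cross compiler to "configure". The target triplet and the variable
names are hypothetical; --host is the standard autoconf option that names the
system on which the compiled code will run.

```shell
#!/bin/sh
# Hypothetical target; substitute the prefix of your own cross toolchain.
TARGET=arm-linux-gnueabihf

# --enable-rebuild-chartables is deliberately absent, so pcre_chartables.c is
# copied from pcre_chartables.c.dist and dftables is never run.
CROSS_ARGS="--host=$TARGET CC=$TARGET-gcc"

if [ -x ./configure ]; then
    # Unquoted on purpose: the string must be split into separate arguments.
    ./configure $CROSS_ARGS
else
    # Not in a PCRE source or build directory, so just show the command.
    echo "would run: ./configure $CROSS_ARGS"
fi
```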
  592 
  593 Using HP's ANSI C++ compiler (aCC)
  594 ----------------------------------
  595 
  596 Unless C++ support is disabled by specifying the "--disable-cpp" option of the
  597 "configure" script, you must include the "-AA" option in the CXXFLAGS
  598 environment variable in order for the C++ components to compile correctly.
  599 
  600 Also, note that the aCC compiler on PA-RISC platforms may have a defect whereby
  601 needed libraries fail to get included when specifying the "-AA" compiler
  602 option. If you experience unresolved symbols when linking the C++ programs,
  603 use the workaround of specifying the following environment variable prior to
  604 running the "configure" script:
  605 
  606   CXXLDFLAGS="-lstd_v2 -lCsup_v2"
  607 
  608 
  609 Compiling in Tru64 using native compilers
  610 -----------------------------------------
  611 
  612 The following error may occur when compiling with native compilers in the Tru64
  613 operating system:
  614 
  615   CXX    libpcrecpp_la-pcrecpp.lo
  616 cxx: Error: /usr/lib/cmplrs/cxx/V7.1-006/include/cxx/iosfwd, line 58: #error
  617           directive: "cannot include iosfwd -- define __USE_STD_IOSTREAM to
  618           override default - see section 7.1.2 of the C++ Using Guide"
  619 #error "cannot include iosfwd -- define __USE_STD_IOSTREAM to override default
  620 - see section 7.1.2 of the C++ Using Guide"
  621 
  622 This may be followed by other errors, complaining that 'namespace "std" has no
  623 member'. The solution to this is to add the line
  624 
  625   #define __USE_STD_IOSTREAM 1
  626 
  627 to the config.h file.
  628 
  629 
  630 Using Sun's compilers for Solaris
  631 ---------------------------------
  632 
  633 A user reports that the following configurations work on Solaris 9 sparcv9 and
  634 Solaris 9 x86 (32-bit):
  635 
  636   Solaris 9 sparcv9: ./configure --disable-cpp CC=/bin/cc CFLAGS="-m64 -g"
  637   Solaris 9 x86:     ./configure --disable-cpp CC=/bin/cc CFLAGS="-g"
  638 
  639 
  640 Using PCRE from MySQL
  641 ---------------------
  642 
  643 On systems where both PCRE and MySQL are installed, it is possible to make use
  644 of PCRE from within MySQL, as an alternative to the built-in pattern matching.
  645 There is a web page that tells you how to do this:
  646 
  647   http://www.mysqludf.org/lib_mysqludf_preg/index.php
  648 
  649 
  650 Making new tarballs
  651 -------------------
  652 
  653 The command "make dist" creates three PCRE tarballs, in tar.gz, tar.bz2, and
  654 zip formats. The command "make distcheck" does the same, but then does a trial
  655 build of the new distribution to ensure that it works.
  656 
  657 If you have modified any of the man page sources in the doc directory, you
  658 should first run the PrepareRelease script before making a distribution. This
  659 script creates the .txt and HTML forms of the documentation from the man pages.
  660 
  661 
  662 Testing PCRE
  663 ------------
  664 
  665 To test the basic PCRE library on a Unix-like system, run the RunTest script.
  666 There is another script called RunGrepTest that tests the options of the
  667 pcregrep command. If the C++ wrapper library is built, three test programs
  668 called pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest
  669 are also built. When JIT support is enabled, another test program called
  670 pcre_jit_test is built.
  671 
  672 Both the scripts and all the program tests are run if you obey "make check" or
  673 "make test". For other environments, see the instructions in
  674 NON-AUTOTOOLS-BUILD.
  675 
  676 The RunTest script runs the pcretest test program (which is documented in its
  677 own man page) on each of the relevant testinput files in the testdata
  678 directory, and compares the output with the contents of the corresponding
  679 testoutput files. RunTest uses a file called testtry to hold the main output
  680 from pcretest. Other files whose names begin with "test" are used as working
  681 files in some tests.
  682 
  683 Some tests are relevant only when certain build-time options were selected. For
  684 example, the tests for UTF-8/16/32 support are run only if --enable-utf was
  685 used. RunTest outputs a comment when it skips a test.
  686 
  687 Many of the tests that are not skipped are run up to three times. The second
  688 run forces pcre_study() to be called for all patterns except for a few in some
  689 tests that are marked "never study" (see the pcretest program for how this is
  690 done). If JIT support is available, the non-DFA tests are run a third time,
  691 this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option.
  692 This testing can be suppressed by putting "nojit" on the RunTest command line.
  693 
  694 The entire set of tests is run once for each of the 8-bit, 16-bit and 32-bit
  695 libraries that are enabled. If you want to run just one set of tests, call
  696 RunTest with either the -8, -16 or -32 option.
  697 
  698 If valgrind is installed, you can run the tests under it by putting "valgrind"
  699 on the RunTest command line. To run pcretest on just one or more specific test
  700 files, give their numbers as arguments to RunTest, for example:
  701 
  702   RunTest 2 7 11
  703 
  704 You can also specify ranges of tests such as 3-6 or 3- (meaning 3 to the
  705 end), or a number preceded by ~ to exclude a test. For example:
  706 
  707   RunTest 3-15 ~10
  708 
  709 This runs tests 3 to 15, excluding test 10. Giving just ~13 runs all the tests
  710 except test 13. Whatever order the arguments are in, the tests are always run
  711 in numerical order.
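
The selection rules described above can be modelled in a few lines of C. This
is only a sketch of the behaviour as this README states it, not the logic of
the RunTest script itself. The name select_tests() and the limit of 26 tests
are assumptions made for illustration, and other RunTest arguments (such as
"valgrind", "nojit", or -8) are not handled.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TEST 26   /* assumption: tests are numbered 1 to 26, as in this README */

/* Sets selected[t] to 1 for every test t that would be run, else 0.
   Returns 0 on success, or -1 if an argument is malformed. */
int select_tests(int argc, const char *const argv[], int selected[MAX_TEST + 1])
{
  int excluded[MAX_TEST + 1] = {0};
  int any_included = 0;

  assert(argv != NULL && selected != NULL);
  for (int t = 0; t <= MAX_TEST; t++) selected[t] = 0;

  for (int i = 0; i < argc; i++) {
    const char *arg = argv[i];
    char *end;

    if (arg[0] == '~') {                      /* ~N excludes test N */
      long n = strtol(arg + 1, &end, 10);
      if (end == arg + 1 || *end != '\0' || n < 1 || n > MAX_TEST) return -1;
      excluded[n] = 1;
      continue;
    }

    long first = strtol(arg, &end, 10);       /* N, N-M, or N- */
    long last = first;
    if (end == arg || first < 1 || first > MAX_TEST) return -1;
    if (*end == '-') {
      const char *rest = end + 1;
      if (*rest == '\0') {
        last = MAX_TEST;                      /* "N-" means N to the end */
      } else {
        last = strtol(rest, &end, 10);
        if (end == rest || *end != '\0' || last < first || last > MAX_TEST)
          return -1;
      }
    } else if (*end != '\0') {
      return -1;
    }
    for (long t = first; t <= last; t++) selected[t] = 1;
    any_included = 1;
  }

  /* With no explicit inclusions, every test is a candidate. Exclusions are
     applied last, so the order of the arguments does not matter. */
  for (int t = 1; t <= MAX_TEST; t++) {
    if (!any_included) selected[t] = 1;
    if (excluded[t]) selected[t] = 0;
  }
  return 0;
}
```

Because the result is a table indexed by test number, walking it from 1 upwards
gives the numerical running order regardless of how the arguments were written.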
  712 
  713 You can also call RunTest with the single argument "list" to cause it to output
  714 a list of tests.
  715 
  716 The first test file can be fed directly into the perltest.pl script to check
  717 that Perl gives the same results. The only difference you should see is in the
  718 first few lines, where the Perl version is given instead of the PCRE version.
  719 
  720 The second set of tests checks pcre_fullinfo(), pcre_study(),
  721 pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error
  722 detection, and run-time flags that are specific to PCRE, as well as the POSIX
  723 wrapper API. It also uses the debugging flags to check some of the internals of
  724 pcre_compile().
  725 
  726 If you build PCRE with a locale setting that is not the standard C locale, the
  727 character tables may be different (see next paragraph). In some cases, this may
  728 cause failures in the second set of tests. For example, in a locale where the
  729 isprint() function yields TRUE for characters in the range 128-255, the use of
  730 [:isascii:] inside a character class defines a different set of characters, and
  731 this shows up in this test as a difference in the compiled code, which is being
  732 listed for checking. Where the comparison test output contains [\x00-\x7f] the
  733 test will contain [\x00-\xff], and similarly in some other cases. This is not a
  734 bug in PCRE.
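
If you want to find out whether a given locale is one of those that classify
code points in the range 128-255 as printable, a small check along the
following lines may help. It is a sketch only: the function name
count_high_printable() is made up for this example, and the result depends
entirely on the C library and the locales installed on your system.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>

/* Counts the code points in the range 128-255 for which isprint() is true in
   the named locale. Returns -1 if the locale cannot be set. Note that this
   changes the LC_CTYPE locale of the calling process. */
int count_high_printable(const char *locale_name)
{
  int count = 0;

  assert(locale_name != NULL);
  if (setlocale(LC_CTYPE, locale_name) == NULL) return -1;

  for (int c = 128; c < 256; c++) {
    if (isprint(c)) count++;
  }
  return count;
}
```

A result greater than zero for the locale in which PCRE was built suggests
that the differences described above may show up in the second set of tests.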
  735 
  736 The third set of tests checks pcre_maketables(), the facility for building a
  737 set of character tables for a specific locale and using them instead of the
  738 default tables. The tests make use of the "fr_FR" (French) locale. Before
  739 running the test, the script checks for the presence of this locale by running
  740 the "locale" command. If that command fails, or if it doesn't include "fr_FR"
  741 in the list of available locales, the third test cannot be run, and a comment
  742 is output to say why. If running this test produces instances of the error
  743 
  744   ** Failed to set locale "fr_FR"
  745 
  746 in the comparison output, it means that locale is not available on your system,
  747 despite being listed by "locale". This does not mean that PCRE is broken.
  748 
  749 [If you are trying to run this test on Windows, you may be able to get it to
  750 work by changing "fr_FR" to "french" everywhere it occurs. Alternatively, use
  751 RunTest.bat. The version of RunTest.bat included with PCRE 7.4 and above uses
  752 Windows versions of test 2. More info on using RunTest.bat is included in the
  753 document entitled NON-UNIX-USE.]
  754 
  755 The fourth test checks the UTF-8/16/32 support, and the fifth test checks error
  756 handling and internal UTF features of PCRE that are not relevant to Perl. The
  757 sixth and seventh tests do the same for Unicode character properties support.
  758 
  759 The eighth, ninth, and tenth tests check the pcre_dfa_exec() alternative
  760 matching function, in non-UTF-8/16/32 mode, UTF-8/16/32 mode, and UTF-8/16/32
  761 mode with Unicode property support, respectively.
  762 
  763 The eleventh test checks some internal offsets and code size features; it is
  764 run only when the default "link size" of 2 is set (in other cases the sizes
  765 change) and when Unicode property support is enabled.
  766 
  767 The twelfth test is run only when JIT support is available, and the thirteenth
  768 test is run only when JIT support is not available. They test some JIT-specific
  769 features such as information output from pcretest about JIT compilation.
  770 
  771 The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
  772 the seventeenth, eighteenth, and nineteenth tests are run only in 16/32-bit
  773 mode. These are tests that generate different output in the two modes. They are
  774 for general cases, UTF-8/16/32 support, and Unicode property support,
  775 respectively.
  776 
  777 The twentieth test is run only in 16/32-bit mode. It tests some specific
  778 16/32-bit features of the DFA matching engine.
  779 
  780 The twenty-first and twenty-second tests are run only in 16/32-bit mode, when
  781 the link size is set to 2 for the 16-bit library. They test reloading
  782 pre-compiled patterns.
  783 
  784 The twenty-third and twenty-fourth tests are run only in 16-bit mode. They are
  785 for general cases, and UTF-16 support, respectively.
  786 
  787 The twenty-fifth and twenty-sixth tests are run only in 32-bit mode. They are
  788 for general cases, and UTF-32 support, respectively.
  789 
  790 
  791 Character tables
  792 ----------------
  793 
  794 For speed, PCRE uses four tables for manipulating and identifying characters
  795 whose code point values are less than 256. The final argument of the
  796 pcre_compile() function is a pointer to a block of memory containing the
  797 concatenated tables. A call to pcre_maketables() can be used to generate a set
  798 of tables in the current locale. If the final argument for pcre_compile() is
  799 passed as NULL, a set of default tables that is built into the binary is used.
  800 
  801 The source file called pcre_chartables.c contains the default set of tables. By
  802 default, this is created as a copy of pcre_chartables.c.dist, which contains
  803 tables for ASCII coding. However, if --enable-rebuild-chartables is specified
  804 for ./configure, a different version of pcre_chartables.c is built by the
  805 program dftables (compiled from dftables.c), which uses the ANSI C character
  806 handling functions such as isalnum(), isalpha(), isupper(), islower(), etc. to
  807 build the table sources. This means that the default C locale which is set for
  808 your system will control the contents of these default tables. You can change
  809 the default tables by editing pcre_chartables.c and then re-building PCRE. If
  810 you do this, you should take care to ensure that the file does not get
  811 automatically re-generated. The best way to do this is to move
  812 pcre_chartables.c.dist out of the way and replace it with your customized
  813 tables.
  814 
  815 When the dftables program is run as a result of --enable-rebuild-chartables,
  816 it uses the default C locale that is set on your system. It does not pay
  817 attention to the LC_xxx environment variables. In other words, it uses the
  818 system's default locale rather than whatever the compiling user happens to have
  819 set. If you really do want to build a source set of character tables in a
  820 locale that is specified by the LC_xxx variables, you can run the dftables
  821 program by hand with the -L option. For example:
  822 
  823   ./dftables -L pcre_chartables.c.special
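
The distinction between the two cases comes from how a C program selects its
locale: every program starts in the "C" locale, and the LC_xxx variables are
consulted only if the program asks for that explicitly. The sketch below
illustrates the idea. The function name choose_table_locale() is hypothetical,
and the assumption that -L corresponds to a setlocale() call with an empty
string is an inference from the description above, not a statement about the
dftables source.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Selects the locale that would govern table generation and returns its name,
   or NULL if the environment names a locale that is not available.
   use_environment != 0 models running with -L (assumption). */
const char *choose_table_locale(int use_environment)
{
  /* "" asks the C library to consult LC_ALL, LC_CTYPE, LANG and so on;
     "C" is the locale in which every C program starts. */
  const char *name = setlocale(LC_ALL, use_environment ? "" : "C");

  /* Selecting "C" cannot fail on a conforming implementation. */
  assert(use_environment || name != NULL);
  return name;
}
```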
  824 
  825 The first two 256-byte tables provide lower casing and case flipping functions,
  826 respectively. The next table consists of three 32-byte bit maps which identify
  827 digits, "word" characters, and white space, respectively. These are used when
  828 building 32-byte bit maps that represent character classes for code points less
  829 than 256.
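
As an illustration of the layout just described, the following sketch builds
comparable tables with the standard character handling functions. It is not
the code used by dftables or pcre_maketables(): the struct, the function
names, and the choice of bit (c % 8) within byte (c / 8) for each bit map are
all assumptions made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef struct {
  unsigned char lower[256];   /* lower-cased form of each code point */
  unsigned char flip[256];    /* opposite-case form of each code point */
  unsigned char digit[32];    /* one bit per code point: decimal digits */
  unsigned char word[32];     /* one bit per code point: "word" characters */
  unsigned char space[32];    /* one bit per code point: white space */
} sketch_tables;

/* Fills the tables according to the locale that is currently in force. */
void build_sketch_tables(sketch_tables *t)
{
  assert(t != NULL);
  memset(t, 0, sizeof *t);

  for (int c = 0; c < 256; c++) {
    unsigned char bit = (unsigned char)(1u << (c % 8));

    t->lower[c] = (unsigned char)tolower(c);
    t->flip[c]  = (unsigned char)(islower(c) ? toupper(c) : tolower(c));

    if (isdigit(c))             t->digit[c / 8] |= bit;
    if (isalnum(c) || c == '_') t->word[c / 8]  |= bit;
    if (isspace(c))             t->space[c / 8] |= bit;
  }
}

/* Returns 1 if code point c (0-255) is a member of the bit map, else 0. */
int in_bitmap(const unsigned char map[32], int c)
{
  assert(c >= 0 && c < 256);
  return (map[c / 8] >> (c % 8)) & 1;
}
```

Thirty-two bytes give exactly 256 bits, one per code point, which is why a
membership test needs only one index, one shift, and one mask.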
  830 
  831 The final 256-byte table has bits indicating various character types, as
  832 follows:
  833 
  834     1   white space character
  835     2   letter
  836     4   decimal digit
  837     8   hexadecimal digit
  838    16   alphanumeric or '_'
  839   128   regular expression metacharacter or binary zero
  840 
  841 You should not alter the set of characters that contain the 128 bit, as that
  842 will cause PCRE to malfunction.
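
The bit values listed above can be combined as in the sketch below, which
builds a comparable 256-byte table from the standard character handling
functions. The enum names, the function name, and in particular the list of
metacharacters are hypothetical and chosen only for illustration; the real
list is the one in the PCRE sources and, as noted above, it should not be
altered.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum {
  SK_SPACE  = 1,     /* white space character */
  SK_LETTER = 2,     /* letter */
  SK_DIGIT  = 4,     /* decimal digit */
  SK_XDIGIT = 8,     /* hexadecimal digit */
  SK_WORD   = 16,    /* alphanumeric or '_' */
  SK_META   = 128    /* regular expression metacharacter or binary zero */
};

/* Hypothetical metacharacter list, for illustration only. */
static const char sk_meta_chars[] = "\\*+?{^.$|()[";

/* Fills types[0..255] according to the locale that is currently in force. */
void build_type_table(unsigned char types[256])
{
  assert(types != NULL);

  for (int c = 0; c < 256; c++) {
    unsigned char x = 0;

    if (isspace(c))             x |= SK_SPACE;
    if (isalpha(c))             x |= SK_LETTER;
    if (isdigit(c))             x |= SK_DIGIT;
    if (isxdigit(c))            x |= SK_XDIGIT;
    if (isalnum(c) || c == '_') x |= SK_WORD;

    /* strchr() also matches the terminating zero of the string, so binary
       zero receives the 128 bit as well. */
    if (strchr(sk_meta_chars, c) != NULL) x |= SK_META;

    types[c] = x;
  }
}
```

A single table lookup followed by a mask, for example (types[c] & SK_WORD),
then answers a classification question without calling any library function.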
  843 
  844 
  845 File manifest
  846 -------------
  847 
  848 The distribution should contain the files listed below. Where a file name is
  849 given as pcre[16|32]_xxx it means that there are three files, one with the name
  850 pcre_xxx, one with the name pcre16_xxx, and a third with the name pcre32_xxx.
  851 
  852 (A) Source files of the PCRE library functions and their headers:
  853 
  854   dftables.c              auxiliary program for building pcre_chartables.c
  855                           when --enable-rebuild-chartables is specified
  856 
  857   pcre_chartables.c.dist  a default set of character tables that assume ASCII
  858                           coding; used, unless --enable-rebuild-chartables is
  859                           specified, by copying to pcre[16]_chartables.c
  860 
  861   pcreposix.c                )
  862   pcre[16|32]_byte_order.c   )
  863   pcre[16|32]_compile.c      )
  864   pcre[16|32]_config.c       )
  865   pcre[16|32]_dfa_exec.c     )
  866   pcre[16|32]_exec.c         )
  867   pcre[16|32]_fullinfo.c     )
  868   pcre[16|32]_get.c          ) sources for the functions in the library,
  869   pcre[16|32]_globals.c      )   and some internal functions that they use
  870   pcre[16|32]_jit_compile.c  )
  871   pcre[16|32]_maketables.c   )
  872   pcre[16|32]_newline.c      )
  873   pcre[16|32]_refcount.c     )
  874   pcre[16|32]_string_utils.c )
  875   pcre[16|32]_study.c        )
  876   pcre[16|32]_tables.c       )
  877   pcre[16|32]_ucd.c          )
  878   pcre[16|32]_version.c      )
  879   pcre[16|32]_xclass.c       )
  880   pcre_ord2utf8.c            )
  881   pcre_valid_utf8.c          )
  882   pcre16_ord2utf16.c         )
  883   pcre16_utf16_utils.c       )
  884   pcre16_valid_utf16.c       )
  885   pcre32_utf32_utils.c       )
  886   pcre32_valid_utf32.c       )
  887 
  888   pcre[16|32]_printint.c     ) debugging function that is used by pcretest,
  889                              )   and can also be #included in pcre_compile()
  890 
  891   pcre.h.in               template for pcre.h when built by "configure"
  892   pcreposix.h             header for the external POSIX wrapper API
  893   pcre_internal.h         header for internal use
  894   sljit/*                 16 files that make up the JIT compiler
  895   ucp.h                   header for Unicode property handling
  896 
  897   config.h.in             template for config.h, which is built by "configure"
  898 
  899   pcrecpp.h               public header file for the C++ wrapper
  900   pcrecpparg.h.in         template for another C++ header file
  901   pcre_scanner.h          public header file for C++ scanner functions
  902   pcrecpp.cc              )
  903   pcre_scanner.cc         ) source for the C++ wrapper library
  904 
  905   pcre_stringpiece.h.in   template for pcre_stringpiece.h, the header for the
  906                             C++ stringpiece functions
  907   pcre_stringpiece.cc     source for the C++ stringpiece functions
  908 
  909 (B) Source files for programs that use PCRE:
  910 
  911   pcredemo.c              simple demonstration of coding calls to PCRE
  912   pcregrep.c              source of a grep utility that uses PCRE
  913   pcretest.c              comprehensive test program
  914 
  915 (C) Auxiliary files:
  916 
  917   132html                 script to turn "man" pages into HTML
  918   AUTHORS                 information about the author of PCRE
  919   ChangeLog               log of changes to the code
  920   CleanTxt                script to clean nroff output for txt man pages
  921   Detrail                 script to remove trailing spaces
  922   HACKING                 some notes about the internals of PCRE
  923   INSTALL                 generic installation instructions
  924   LICENCE                 conditions for the use of PCRE
  925   COPYING                 the same, using GNU's standard name
  926   Makefile.in             ) template for Unix Makefile, which is built by
  927                           )   "configure"
  928   Makefile.am             ) the automake input that was used to create
  929                           )   Makefile.in
  930   NEWS                    important changes in this release
  931   NON-UNIX-USE            the previous name for NON-AUTOTOOLS-BUILD
  932   NON-AUTOTOOLS-BUILD     notes on building PCRE without using autotools
  933   PrepareRelease          script to make preparations for "make dist"
  934   README                  this file
  935   RunTest                 a Unix shell script for running tests
  936   RunGrepTest             a Unix shell script for pcregrep tests
  937   aclocal.m4              m4 macros (generated by "aclocal")
  938   config.guess            ) files used by libtool,
  939   config.sub              )   used only when building a shared library
  940   configure               a configuring shell script (built by autoconf)
  941   configure.ac            ) the autoconf input that was used to build
  942                           )   "configure" and config.h
  943   depcomp                 ) script to find program dependencies, generated by
  944                           )   automake
  945   doc/*.3                 man page sources for PCRE
  946   doc/*.1                 man page sources for pcregrep and pcretest
  947   doc/index.html.src      the base HTML page
  948   doc/html/*              HTML documentation
  949   doc/pcre.txt            plain text version of the man pages
  950   doc/pcretest.txt        plain text documentation of test program
  951   doc/perltest.txt        plain text documentation of Perl test program
  952   install-sh              a shell script for installing files
  953   libpcre16.pc.in         template for libpcre16.pc for pkg-config
  954   libpcre32.pc.in         template for libpcre32.pc for pkg-config
  955   libpcre.pc.in           template for libpcre.pc for pkg-config
  956   libpcreposix.pc.in      template for libpcreposix.pc for pkg-config
  957   libpcrecpp.pc.in        template for libpcrecpp.pc for pkg-config
  958   ltmain.sh               file used to build a libtool script
  959   missing                 ) common stub for a few missing GNU programs while
  960                           )   installing, generated by automake
  961   mkinstalldirs           script for making install directories
  962   perltest.pl             Perl test program
  963   pcre-config.in          source of script which retains PCRE information
  964   pcre_jit_test.c         test program for the JIT compiler
  965   pcrecpp_unittest.cc          )
  966   pcre_scanner_unittest.cc     ) test programs for the C++ wrapper
  967   pcre_stringpiece_unittest.cc )
  968   testdata/testinput*     test data for main library tests
  969   testdata/testoutput*    expected test results
  970   testdata/grep*          input and output for pcregrep tests
  971   testdata/*              other supporting test files
  972 
  973 (D) Auxiliary files for cmake support:
  974 
  975   cmake/COPYING-CMAKE-SCRIPTS
  976   cmake/FindPackageHandleStandardArgs.cmake
  977   cmake/FindEditline.cmake
  978   cmake/FindReadline.cmake
  979   CMakeLists.txt
  980   config-cmake.h.in
  981 
  982 (E) Auxiliary files for VPASCAL:
  983 
  984   makevp.bat
  985   makevp_c.txt
  986   makevp_l.txt
  987   pcregexp.pas
  988 
  989 (F) Auxiliary files for building PCRE "by hand":
  990 
  991   pcre.h.generic          ) a version of the public PCRE header file
  992                           )   for use in non-"configure" environments
  993   config.h.generic        ) a version of config.h for use in non-"configure"
  994                           )   environments
  995 
  996 (G) Miscellaneous:
  997 
  998   RunTest.bat             a script for running tests under Windows
  999 
 1000 Philip Hazel
 1001 Email local part: ph10
 1002 Email domain: cam.ac.uk
 1003 Last updated: 12 February 2020