"Fossies" - the Fresh Open Source Software Archive

Member "xapian-core-1.4.14/docs/queryparser.html" (23 Nov 2019, 16661 Bytes) of package /linux/www/xapian-core-1.4.14.tar.xz:


    1 <?xml version="1.0" encoding="utf-8" ?>
    2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    3 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    4 <head>
    5 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    6 <meta name="generator" content="Docutils 0.15.2: http://docutils.sourceforge.net/" />
    7 <title>Xapian::QueryParser Syntax</title>
    8 <style type="text/css">
    9 
   10 /*
   11 :Author: David Goodger (goodger@python.org)
   12 :Id: $Id: html4css1.css 7952 2016-07-26 18:15:59Z milde $
   13 :Copyright: This stylesheet has been placed in the public domain.
   14 
   15 Default cascading style sheet for the HTML output of Docutils.
   16 
   17 See http://docutils.sf.net/docs/howto/html-stylesheets.html for how to
   18 customize this style sheet.
   19 */
   20 
   21 /* used to remove borders from tables and images */
   22 .borderless, table.borderless td, table.borderless th {
   23   border: 0 }
   24 
   25 table.borderless td, table.borderless th {
   26   /* Override padding for "table.docutils td" with "! important".
   27      The right padding separates the table cells. */
   28   padding: 0 0.5em 0 0 ! important }
   29 
   30 .first {
   31   /* Override more specific margin styles with "! important". */
   32   margin-top: 0 ! important }
   33 
   34 .last, .with-subtitle {
   35   margin-bottom: 0 ! important }
   36 
   37 .hidden {
   38   display: none }
   39 
   40 .subscript {
   41   vertical-align: sub;
   42   font-size: smaller }
   43 
   44 .superscript {
   45   vertical-align: super;
   46   font-size: smaller }
   47 
   48 a.toc-backref {
   49   text-decoration: none ;
   50   color: black }
   51 
   52 blockquote.epigraph {
   53   margin: 2em 5em ; }
   54 
   55 dl.docutils dd {
   56   margin-bottom: 0.5em }
   57 
   58 object[type="image/svg+xml"], object[type="application/x-shockwave-flash"] {
   59   overflow: hidden;
   60 }
   61 
   62 /* Uncomment (and remove this text!) to get bold-faced definition list terms
   63 dl.docutils dt {
   64   font-weight: bold }
   65 */
   66 
   67 div.abstract {
   68   margin: 2em 5em }
   69 
   70 div.abstract p.topic-title {
   71   font-weight: bold ;
   72   text-align: center }
   73 
   74 div.admonition, div.attention, div.caution, div.danger, div.error,
   75 div.hint, div.important, div.note, div.tip, div.warning {
   76   margin: 2em ;
   77   border: medium outset ;
   78   padding: 1em }
   79 
   80 div.admonition p.admonition-title, div.hint p.admonition-title,
   81 div.important p.admonition-title, div.note p.admonition-title,
   82 div.tip p.admonition-title {
   83   font-weight: bold ;
   84   font-family: sans-serif }
   85 
   86 div.attention p.admonition-title, div.caution p.admonition-title,
   87 div.danger p.admonition-title, div.error p.admonition-title,
   88 div.warning p.admonition-title, .code .error {
   89   color: red ;
   90   font-weight: bold ;
   91   font-family: sans-serif }
   92 
   93 /* Uncomment (and remove this text!) to get reduced vertical space in
   94    compound paragraphs.
   95 div.compound .compound-first, div.compound .compound-middle {
   96   margin-bottom: 0.5em }
   97 
   98 div.compound .compound-last, div.compound .compound-middle {
   99   margin-top: 0.5em }
  100 */
  101 
  102 div.dedication {
  103   margin: 2em 5em ;
  104   text-align: center ;
  105   font-style: italic }
  106 
  107 div.dedication p.topic-title {
  108   font-weight: bold ;
  109   font-style: normal }
  110 
  111 div.figure {
  112   margin-left: 2em ;
  113   margin-right: 2em }
  114 
  115 div.footer, div.header {
  116   clear: both;
  117   font-size: smaller }
  118 
  119 div.line-block {
  120   display: block ;
  121   margin-top: 1em ;
  122   margin-bottom: 1em }
  123 
  124 div.line-block div.line-block {
  125   margin-top: 0 ;
  126   margin-bottom: 0 ;
  127   margin-left: 1.5em }
  128 
  129 div.sidebar {
  130   margin: 0 0 0.5em 1em ;
  131   border: medium outset ;
  132   padding: 1em ;
  133   background-color: #ffffee ;
  134   width: 40% ;
  135   float: right ;
  136   clear: right }
  137 
  138 div.sidebar p.rubric {
  139   font-family: sans-serif ;
  140   font-size: medium }
  141 
  142 div.system-messages {
  143   margin: 5em }
  144 
  145 div.system-messages h1 {
  146   color: red }
  147 
  148 div.system-message {
  149   border: medium outset ;
  150   padding: 1em }
  151 
  152 div.system-message p.system-message-title {
  153   color: red ;
  154   font-weight: bold }
  155 
  156 div.topic {
  157   margin: 2em }
  158 
  159 h1.section-subtitle, h2.section-subtitle, h3.section-subtitle,
  160 h4.section-subtitle, h5.section-subtitle, h6.section-subtitle {
  161   margin-top: 0.4em }
  162 
  163 h1.title {
  164   text-align: center }
  165 
  166 h2.subtitle {
  167   text-align: center }
  168 
  169 hr.docutils {
  170   width: 75% }
  171 
  172 img.align-left, .figure.align-left, object.align-left, table.align-left {
  173   clear: left ;
  174   float: left ;
  175   margin-right: 1em }
  176 
  177 img.align-right, .figure.align-right, object.align-right, table.align-right {
  178   clear: right ;
  179   float: right ;
  180   margin-left: 1em }
  181 
  182 img.align-center, .figure.align-center, object.align-center {
  183   display: block;
  184   margin-left: auto;
  185   margin-right: auto;
  186 }
  187 
  188 table.align-center {
  189   margin-left: auto;
  190   margin-right: auto;
  191 }
  192 
  193 .align-left {
  194   text-align: left }
  195 
  196 .align-center {
  197   clear: both ;
  198   text-align: center }
  199 
  200 .align-right {
  201   text-align: right }
  202 
  203 /* reset inner alignment in figures */
  204 div.align-right {
  205   text-align: inherit }
  206 
  207 /* div.align-center * { */
  208 /*   text-align: left } */
  209 
  210 .align-top    {
  211   vertical-align: top }
  212 
  213 .align-middle {
  214   vertical-align: middle }
  215 
  216 .align-bottom {
  217   vertical-align: bottom }
  218 
  219 ol.simple, ul.simple {
  220   margin-bottom: 1em }
  221 
  222 ol.arabic {
  223   list-style: decimal }
  224 
  225 ol.loweralpha {
  226   list-style: lower-alpha }
  227 
  228 ol.upperalpha {
  229   list-style: upper-alpha }
  230 
  231 ol.lowerroman {
  232   list-style: lower-roman }
  233 
  234 ol.upperroman {
  235   list-style: upper-roman }
  236 
  237 p.attribution {
  238   text-align: right ;
  239   margin-left: 50% }
  240 
  241 p.caption {
  242   font-style: italic }
  243 
  244 p.credits {
  245   font-style: italic ;
  246   font-size: smaller }
  247 
  248 p.label {
  249   white-space: nowrap }
  250 
  251 p.rubric {
  252   font-weight: bold ;
  253   font-size: larger ;
  254   color: maroon ;
  255   text-align: center }
  256 
  257 p.sidebar-title {
  258   font-family: sans-serif ;
  259   font-weight: bold ;
  260   font-size: larger }
  261 
  262 p.sidebar-subtitle {
  263   font-family: sans-serif ;
  264   font-weight: bold }
  265 
  266 p.topic-title {
  267   font-weight: bold }
  268 
  269 pre.address {
  270   margin-bottom: 0 ;
  271   margin-top: 0 ;
  272   font: inherit }
  273 
  274 pre.literal-block, pre.doctest-block, pre.math, pre.code {
  275   margin-left: 2em ;
  276   margin-right: 2em }
  277 
  278 pre.code .ln { color: grey; } /* line numbers */
  279 pre.code, code { background-color: #eeeeee }
  280 pre.code .comment, code .comment { color: #5C6576 }
  281 pre.code .keyword, code .keyword { color: #3B0D06; font-weight: bold }
  282 pre.code .literal.string, code .literal.string { color: #0C5404 }
  283 pre.code .name.builtin, code .name.builtin { color: #352B84 }
  284 pre.code .deleted, code .deleted { background-color: #DEB0A1}
  285 pre.code .inserted, code .inserted { background-color: #A3D289}
  286 
  287 span.classifier {
  288   font-family: sans-serif ;
  289   font-style: oblique }
  290 
  291 span.classifier-delimiter {
  292   font-family: sans-serif ;
  293   font-weight: bold }
  294 
  295 span.interpreted {
  296   font-family: sans-serif }
  297 
  298 span.option {
  299   white-space: nowrap }
  300 
  301 span.pre {
  302   white-space: pre }
  303 
  304 span.problematic {
  305   color: red }
  306 
  307 span.section-subtitle {
  308   /* font-size relative to parent (h1..h6 element) */
  309   font-size: 80% }
  310 
  311 table.citation {
  312   border-left: solid 1px gray;
  313   margin-left: 1px }
  314 
  315 table.docinfo {
  316   margin: 2em 4em }
  317 
  318 table.docutils {
  319   margin-top: 0.5em ;
  320   margin-bottom: 0.5em }
  321 
  322 table.footnote {
  323   border-left: solid 1px black;
  324   margin-left: 1px }
  325 
  326 table.docutils td, table.docutils th,
  327 table.docinfo td, table.docinfo th {
  328   padding-left: 0.5em ;
  329   padding-right: 0.5em ;
  330   vertical-align: top }
  331 
  332 table.docutils th.field-name, table.docinfo th.docinfo-name {
  333   font-weight: bold ;
  334   text-align: left ;
  335   white-space: nowrap ;
  336   padding-left: 0 }
  337 
  338 /* "booktabs" style (no vertical lines) */
  339 table.docutils.booktabs {
  340   border: 0px;
  341   border-top: 2px solid;
  342   border-bottom: 2px solid;
  343   border-collapse: collapse;
  344 }
  345 table.docutils.booktabs * {
  346   border: 0px;
  347 }
  348 table.docutils.booktabs th {
  349   border-bottom: thin solid;
  350   text-align: left;
  351 }
  352 
  353 h1 tt.docutils, h2 tt.docutils, h3 tt.docutils,
  354 h4 tt.docutils, h5 tt.docutils, h6 tt.docutils {
  355   font-size: 100% }
  356 
  357 ul.auto-toc {
  358   list-style-type: none }
  359 
  360 </style>
  361 </head>
  362 <body>
  363 <div class="document" id="xapian-queryparser-syntax">
  364 <h1 class="title">Xapian::QueryParser Syntax</h1>
  365 
  366 <p>This document describes the query syntax supported by the
  367 Xapian::QueryParser class. The syntax is designed to be similar to that of
  368 other web-based search engines, so that users familiar with them don't have
  369 to learn a whole new syntax.</p>
  370 <div class="section" id="operators">
  371 <h1>Operators</h1>
  372 <div class="section" id="and">
  373 <h2>AND</h2>
  374 <p><em>expression</em> AND <em>expression</em> matches documents which are matched by
  375 both of the subexpressions.</p>
  376 </div>
  377 <div class="section" id="or">
  378 <h2>OR</h2>
  379 <p><em>expression</em> OR <em>expression</em> matches documents which are matched by
  380 either of the subexpressions.</p>
  381 </div>
  382 <div class="section" id="not">
  383 <h2>NOT</h2>
  384 <p><em>expression</em> NOT <em>expression</em> matches documents which are matched by
  385 only the first subexpression. This can also be written as <em>expression</em>
  386 AND NOT <em>expression</em>. If <tt class="docutils literal">FLAG_PURE_NOT</tt> is enabled, then
  387 NOT <em>expression</em> will match documents which don't match the
  388 subexpression.</p>
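      <p>As a minimal sketch of enabling this, assuming the Xapian C++ API and a
      hypothetical helper named <tt class="docutils literal">parse_pure_not</tt> (not part of Xapian),
      the flag can be combined with the default flags when parsing:</p>
      <pre class="literal-block">
      #include &lt;string&gt;
      #include &lt;xapian.h&gt;

      // Hypothetical helper: parse a query with FLAG_PURE_NOT added to the
      // default flags, so that a leading NOT is accepted.
      Xapian::Query parse_pure_not(const std::string&amp; query_string) {
          Xapian::QueryParser parser;
          unsigned flags = Xapian::QueryParser::FLAG_DEFAULT |
                           Xapian::QueryParser::FLAG_PURE_NOT;
          return parser.parse_query(query_string, flags);
      }
      </pre>
      <p>For example, <tt class="docutils literal"><span class="pre">parse_pure_not(&quot;NOT</span> spam&quot;)</tt> should then return a
      query rather than reporting a syntax error.</p>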
  389 </div>
  390 <div class="section" id="xor">
  391 <h2>XOR</h2>
  392 <p><em>expression</em> XOR <em>expression</em> matches documents which are matched by one
  393 or other of the subexpressions, but not both. XOR is probably a bit
  394 esoteric.</p>
  395 </div>
  396 <div class="section" id="bracketed-expressions">
  397 <h2>Bracketed expressions</h2>
  398 <p>You can control the precedence of the boolean operators using brackets.
  399 In the query <tt class="docutils literal">one OR two AND three</tt> the AND takes precedence, so this
  400 is the same as <tt class="docutils literal">one OR (two AND three)</tt>. You can override the
  401 precedence using <tt class="docutils literal">(one OR two) AND three</tt>.</p>
  402 <p>The default precedence from highest to lowest is:</p>
  403 <ul class="simple">
  404 <li>+, - (equal)</li>
  405 <li>AND, NOT (equal)</li>
  406 <li>XOR</li>
  407 <li>OR</li>
  408 </ul>
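      <p>One way to check how a query string has been grouped is to print the
      description of the parsed query. The following is a sketch only, assuming
      the Xapian C++ API; the helper name <tt class="docutils literal">describe_query</tt> is
      hypothetical:</p>
      <pre class="literal-block">
      #include &lt;string&gt;
      #include &lt;xapian.h&gt;

      // Hypothetical helper: parse with the default flags and return Xapian's
      // textual description of the resulting query tree.
      std::string describe_query(const std::string&amp; query_string) {
          Xapian::QueryParser parser;
          Xapian::Query query = parser.parse_query(query_string);
          return query.get_description();
      }
      </pre>
      <p>Comparing <tt class="docutils literal"><span class="pre">describe_query(&quot;one</span> OR two AND three&quot;)</tt> with
      <tt class="docutils literal"><span class="pre">describe_query(&quot;(one</span> OR two) AND three&quot;)</tt> shows the difference in
      grouping.</p>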
  409 </div>
  410 <div class="section" id="id1">
  411 <h2>'+' and '-'</h2>
  412 <p>A group of terms with some marked with + and - will match documents
  413 containing all of the + terms, but none of the - terms. Terms not marked
  414 with + or - contribute towards the document rankings. You can also use +
  415 and - on phrases and on bracketed expressions.</p>
  416 </div>
  417 <div class="section" id="near">
  418 <h2>NEAR</h2>
  419 <p><tt class="docutils literal">one NEAR two NEAR three</tt> matches documents containing those words
  420 within 10 words of each other. You can set the threshold to <em>n</em> by using
  421 <tt class="docutils literal">NEAR/n</tt> like so: <tt class="docutils literal">one NEAR/6 two</tt>.</p>
  422 </div>
  423 <div class="section" id="adj">
  424 <h2>ADJ</h2>
  425 <p><tt class="docutils literal">ADJ</tt> is like <tt class="docutils literal">NEAR</tt> but only matches if the words appear in the
  426 same order as in the query. So <tt class="docutils literal">one ADJ two ADJ three</tt> matches
  427 documents containing those three words in that order and within 10 words
  428 of each other. You can set the threshold to <em>n</em> by using <tt class="docutils literal">ADJ/n</tt> like
  429 so: <tt class="docutils literal">one ADJ/6 two</tt>.</p>
  430 </div>
  431 <div class="section" id="phrase-searches">
  432 <h2>Phrase searches</h2>
  433 <p>A phrase surrounded with double quotes (&quot;&quot;) matches documents containing
  434 that exact phrase. Hyphenated words are also treated as phrases, as are
  435 cases such as filenames and email addresses (e.g. <tt class="docutils literal">/etc/passwd</tt> or
  436 <tt class="docutils literal">president&#64;whitehouse.gov</tt>).</p>
  437 </div>
  438 <div class="section" id="searching-within-a-free-text-field">
  439 <h2>Searching within a free-text field</h2>
  440 <p>If the database has been indexed with prefixes on terms generated from
  441 certain free-text fields, you can set up a prefix map so that the user can
  442 search within those fields. For example <tt class="docutils literal">author:dickens title:shop</tt>
  443 might find documents by Dickens with &quot;shop&quot; in the title. You can also
  444 specify a prefix on a quoted phrase (e.g. <tt class="docutils literal"><span class="pre">author:&quot;charles</span> dickens&quot;</tt>)
  445 or on a bracketed subexpression (e.g. <tt class="docutils literal"><span class="pre">title:(mice</span> men)</tt>).</p>
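      <p>A prefix map of this kind might be set up as in the sketch below. It
      assumes the Xapian C++ API; the helper name <tt class="docutils literal">parse_fielded</tt> is
      hypothetical, and the term prefixes <tt class="docutils literal">A</tt> and <tt class="docutils literal">S</tt> are
      assumptions which must match whatever prefixes were used at index time:</p>
      <pre class="literal-block">
      #include &lt;string&gt;
      #include &lt;xapian.h&gt;

      // Hypothetical helper: map the user-visible field names onto term
      // prefixes before parsing.
      Xapian::Query parse_fielded(const std::string&amp; query_string) {
          Xapian::QueryParser parser;
          // Assumed prefixes: &quot;A&quot; for author terms, &quot;S&quot; for title terms.
          parser.add_prefix(&quot;author&quot;, &quot;A&quot;);
          parser.add_prefix(&quot;title&quot;, &quot;S&quot;);
          return parser.parse_query(query_string);
      }
      </pre>
      <p>With this mapping, <tt class="docutils literal">author:dickens</tt> is searched for as the
      prefixed term <tt class="docutils literal">Adickens</tt>.</p>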
  446 </div>
  447 <div class="section" id="searching-for-proper-names">
  448 <h2>Searching for proper names</h2>
  449 <p>If a query term is entered with a capitalised first letter, then it will
  450 be searched for unstemmed.</p>
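      <p>This behaviour only matters when stemming is enabled. The sketch below
      assumes the Xapian C++ API with an English stemmer and the
      <tt class="docutils literal">STEM_SOME</tt> strategy; the helper name
      <tt class="docutils literal">parse_with_stemming</tt> is hypothetical:</p>
      <pre class="literal-block">
      #include &lt;string&gt;
      #include &lt;xapian.h&gt;

      // Hypothetical helper: enable English stemming with STEM_SOME, which
      // leaves terms starting with a capital letter unstemmed.
      Xapian::Query parse_with_stemming(const std::string&amp; query_string) {
          Xapian::QueryParser parser;
          parser.set_stemmer(Xapian::Stem(&quot;en&quot;));
          parser.set_stemming_strategy(Xapian::QueryParser::STEM_SOME);
          return parser.parse_query(query_string);
      }
      </pre>
      <p>Under these assumptions, <tt class="docutils literal">shops</tt> is searched for as a stemmed term
      (marked with a <tt class="docutils literal">Z</tt> prefix), while <tt class="docutils literal">Dickens</tt> is searched for
      as the unstemmed term <tt class="docutils literal">dickens</tt>.</p>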
  451 </div>
  452 <div class="section" id="range-searches">
  453 <h2>Range searches</h2>
  454 <p>The QueryParser <a class="reference external" href="valueranges.html">can be configured to support
  455 range-searching</a> using document values.</p>
  456 <p>The syntax for a range search is <tt class="docutils literal"><span class="pre">start..end</span></tt> - for example,
  457 <tt class="docutils literal"><span class="pre">01/03/2007..04/04/2007</span></tt>, <tt class="docutils literal"><span class="pre">$10..100</span></tt>, <tt class="docutils literal"><span class="pre">5..10kg</span></tt>.</p>
  458 <p>Open-ended ranges are also supported - an empty start or end is
  459 interpreted as no limit, for example: <tt class="docutils literal"><span class="pre">..2010-06-17</span></tt>, <tt class="docutils literal">$10..</tt>,
  460 <tt class="docutils literal"><span class="pre">$..100</span></tt>, <tt class="docutils literal">..5kg</tt>.</p>
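      <p>As a sketch of such a configuration, assuming the Xapian C++ API: the
      helper name <tt class="docutils literal">parse_with_ranges</tt> is hypothetical, and the value slot
      numbers 0, 1 and 2 are assumptions which must match the slots the indexer
      stored dates, prices and weights in:</p>
      <pre class="literal-block">
      #include &lt;string&gt;
      #include &lt;xapian.h&gt;

      // Hypothetical helper: register one range processor per kind of range
      // and parse the query. The processors must outlive the parse_query()
      // call, which they do here as locals of the same function.
      Xapian::Query parse_with_ranges(const std::string&amp; query_string) {
          Xapian::QueryParser parser;
          Xapian::DateRangeProcessor date_proc(0);            // assumed slot 0
          Xapian::NumberRangeProcessor price_proc(1, &quot;$&quot;);    // assumed slot 1
          Xapian::NumberRangeProcessor weight_proc(2, &quot;kg&quot;,   // assumed slot 2
                                                   Xapian::RP_SUFFIX);
          parser.add_rangeprocessor(&amp;date_proc);
          parser.add_rangeprocessor(&amp;price_proc);
          parser.add_rangeprocessor(&amp;weight_proc);
          return parser.parse_query(query_string);
      }
      </pre>
      <p>Each processor is tried in the order it was added, so a query such as
      <tt class="docutils literal"><span class="pre">$10..100</span></tt> is passed over by the date processor and handled by the
      price processor.</p>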
  461 </div>
  462 <div class="section" id="synonyms">
  463 <h2>Synonyms</h2>
  464 <p>The QueryParser can be configured to support synonyms, which can either
  465 be used when explicitly specified (using the syntax <tt class="docutils literal">~term</tt>) or
  466 implicitly (synonyms will be used for all terms or groups of terms for
  467 which they have been specified).</p>
  468 </div>
  469 <div class="section" id="wildcards">
  470 <h2>Wildcards</h2>
  471 <p>The QueryParser supports using a trailing '*' wildcard, which matches
  472 any number of trailing characters, so <tt class="docutils literal">wildc*</tt> would match wildcard,
  473 wildcarded, wildcards, wildcat, wildcats, etc. This feature is disabled
  474 by default - pass <tt class="docutils literal"><span class="pre">Xapian::QueryParser::FLAG_WILDCARD</span></tt> in the flags
  475 argument of <tt class="docutils literal"><span class="pre">Xapian::QueryParser::parse_query(query_string,</span> flags)</tt> to
  476 enable it, and tell the QueryParser which database to expand wildcards
  477 from using the <tt class="docutils literal"><span class="pre">Xapian::QueryParser::set_database(database)</span></tt> method.</p>
  478 <p>You can limit the number of terms a wildcard will expand to by
  479 calling <tt class="docutils literal"><span class="pre">Xapian::QueryParser::set_max_expansion()</span></tt>.  This supports
  480 several different modes, and can also be used to limit expansion
  481 performed via <tt class="docutils literal">FLAG_PARTIAL</tt> - see the API documentation for
  482 details.  By default, there's no limit on wildcard expansion and
  483 <tt class="docutils literal">FLAG_PARTIAL</tt> expands to the most frequent 100 terms.</p>
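      <p>Putting these calls together might look like the sketch below. It
      assumes the Xapian C++ API; the helper name
      <tt class="docutils literal">parse_with_wildcards</tt> and the limit of 100 terms are illustrative
      choices rather than recommendations:</p>
      <pre class="literal-block">
      #include &lt;string&gt;
      #include &lt;xapian.h&gt;

      // Hypothetical helper: enable trailing-wildcard support and cap the
      // expansion at the 100 most frequent matching terms.
      Xapian::Query parse_with_wildcards(const Xapian::Database&amp; db,
                                         const std::string&amp; query_string) {
          Xapian::QueryParser parser;
          // Wildcards are expanded against the terms in this database.
          parser.set_database(db);
          parser.set_max_expansion(100,
                                   Xapian::Query::WILDCARD_LIMIT_MOST_FREQUENT,
                                   Xapian::QueryParser::FLAG_WILDCARD);
          unsigned flags = Xapian::QueryParser::FLAG_DEFAULT |
                           Xapian::QueryParser::FLAG_WILDCARD;
          return parser.parse_query(query_string, flags);
      }
      </pre>
      <p>A call such as <tt class="docutils literal">parse_with_wildcards(db, &quot;wildc*&quot;)</tt> then yields a
      query covering the indexed terms which start with <tt class="docutils literal">wildc</tt>.</p>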
  484 </div>
  485 <div class="section" id="partially-entered-query-matching">
  486 <h2>Partially entered query matching</h2>
  487 <p>The QueryParser also supports performing a search with a query which has
  488 only been partially entered. This is intended for use with &quot;incremental
  489 search&quot; systems, which don't wait for the user to finish typing their
  490 search before displaying an initial set of results. For example, in such
  491 a system a user would enter a search, and the system would display a new
  492 set of results after each letter, or whenever the user pauses for a
  493 short period of time (or some other similar strategy).</p>
  494 <p>The problem with this kind of search is that the last word in a
  495 partially entered query often has no semantic relation to the completed
  496 word. For example, a search for &quot;dynamic cat&quot; would return a quite
  497 different set of results to a search for &quot;dynamic categorisation&quot;. This
  498 causes the displayed results to flicker rapidly as each new
  499 character is entered. A much smoother result can be obtained if the
  500 final word is treated as having an implicit terminating wildcard, so
  501 that it matches all words starting with the entered characters - thus,
  502 as each letter is entered, the set of results displayed narrows down to
  503 the desired subject.</p>
  504 <p>A similar effect could be obtained simply by enabling the wildcard
  505 matching option, and appending a &quot;*&quot; character to each query string.
  506 However, this would be confused by searches which ended with punctuation
  507 or other characters.</p>
  508 <p>This feature is disabled by default - pass the
  509 <tt class="docutils literal"><span class="pre">Xapian::QueryParser::FLAG_PARTIAL</span></tt> flag in the flags argument of
  510 <tt class="docutils literal"><span class="pre">Xapian::QueryParser::parse_query(query_string,</span> flags)</tt> to enable it,
  511 and tell the QueryParser which database to expand wildcards from using
  512 the <tt class="docutils literal"><span class="pre">Xapian::QueryParser::set_database(database)</span></tt> method.</p>
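      <p>For an incremental search box, this might be wired up as in the
      following sketch, which assumes the Xapian C++ API; the helper name
      <tt class="docutils literal">parse_partial</tt> is hypothetical:</p>
      <pre class="literal-block">
      #include &lt;string&gt;
      #include &lt;xapian.h&gt;

      // Hypothetical helper: parse the text typed so far, treating the final
      // word as a prefix of the word the user is still entering.
      Xapian::Query parse_partial(const Xapian::Database&amp; db,
                                  const std::string&amp; typed_so_far) {
          Xapian::QueryParser parser;
          // The partial final word is expanded against this database.
          parser.set_database(db);
          unsigned flags = Xapian::QueryParser::FLAG_DEFAULT |
                           Xapian::QueryParser::FLAG_PARTIAL;
          return parser.parse_query(typed_so_far, flags);
      }
      </pre>
      <p>Calling <tt class="docutils literal">parse_partial(db, &quot;dynamic cat&quot;)</tt> after each keystroke lets
      the final word match indexed terms beginning with <tt class="docutils literal">cat</tt>, such as
      <tt class="docutils literal">categorisation</tt>.</p>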
  513 </div>
  514 </div>
  515 </div>
  516 </body>
  517 </html>