You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by ot...@apache.org on 2003/01/23 17:11:00 UTC

cvs commit: jakarta-lucene/docs queryparsersyntax.html

otis        2003/01/23 08:11:00

  Modified:    xdocs    queryparsersyntax.xml
               docs     queryparsersyntax.html
  Log:
  - Added useful info to Overview, added a section about Range Searches and a
    section about Field Grouping.  Fixed a few small gramatical errors.
  
  Revision  Changes    Path
  1.4       +55 -4     jakarta-lucene/xdocs/queryparsersyntax.xml
  
  Index: queryparsersyntax.xml
  ===================================================================
  RCS file: /home/cvs/jakarta-lucene/xdocs/queryparsersyntax.xml,v
  retrieving revision 1.3
  retrieving revision 1.4
  diff -u -r1.3 -r1.4
  --- queryparsersyntax.xml	16 May 2002 14:04:14 -0000	1.3
  +++ queryparsersyntax.xml	23 Jan 2003 16:10:59 -0000	1.4
  @@ -8,9 +8,37 @@
       </properties>
       <body>
           <section name="Overview">
  -            <p>Although Lucene provides the ability to create your own query's though its API, it also provides a rich query language through the QueryParser.</p>
  -            <p>This page provides syntax of Lucene's Query Parser, a lexer which interprets a string into a Lucene Query using JavaCC.</p>
  +            <p>Although Lucene provides the ability to create your own
  +            queries through its API, it also provides a rich query
  +            language through the Query Parser.</p> <p>This page
  +            provides syntax of Lucene's Query Parser, a lexer which
  +            interprets a string into a Lucene Query using JavaCC.</p>
  +            <p>
  +            Before choosing to use the provided Query Parser, please consider the following:
  +            <ol>
  +            <li>If you are programmatically generating a query string and then 
  +            parsing it with the query parser then you should seriously consider building 
  +            your queries directly with the query API.  In other words, the query
  +            parser is designed for human-entered text, not for program-generated 
  +            text.</li>
  +
  +            <li>Untokenized fields are best added directly to queries, and not 
  +            through the query parser.  If a field's values are generated programmatically 
  +            by the application, then so should query clauses for this field. 
  +            Analyzers, like the query parser, are designed to convert human-entered 
  +            text to terms.  Program-generated values, like dates, keywords, etc., 
  +            should be consistently program-generated.</li>
  +
  +            <li>In a query form, fields which are general text should use the query 
  +            parser.  All others, such as date ranges, keywords, etc. are better added 
  +            directly through the query API.  A field with a limit set of values, 
  +            that can be specified with a pull-down menu should not be added to a 
  +            query string which is subsequently parsed, but rather added as a 
  +            TermQuery clause.</li>
  +            </ol>
  +            </p>
           </section>
  +
           <section name="Terms">
           <p>A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.</p>
           <p>A Single Term is a single word such as "test" or "hello".</p>
  @@ -37,6 +65,7 @@
           </section>
           
           <section name="Term Modifiers">
  +
           <p>Lucene supports modifying query terms to provide a wide range of searching options.</p>
           
           <subsection name="Wildcard Searches">
  @@ -62,14 +91,27 @@
           <p>This search will find terms like foam and roams</p>
           <p>Note:Terms found by the fuzzy search will automatically get a boost factor of 0.2</p>
           </subsection>
  +
            
           <subsection name="Proximity Searches">
           <p>Lucene supports finding words are a within a specific distance away. To do a proximity search use the tilde, "~", symbol at the end of a Phrase. For example to search for a "apache" and "jakarta" within 10 words of each other in a document use the search: </p>
   
           <source>"jakarta apache"~10</source>
  -
           </subsection>
  +
           
  +        <subsection name="Range Searches">
  +            <p>Range Queries allow one to match documents whose field(s) values
  +            are between the lower and upper bound specified by the Range Query.
  +            Range Queries are inclusive (i.e. the query includes the specified lower and upper bound).
  +            Sorting is done lexicographically.</p>
  +            <source>mod_date:[20020101 TO 20030101]</source>
  +            <p>This will find documents whose mod_date fields have values between 20020101 and 20030101.
  +            Note that Range Queries are not reserved for date fields.  You could also use range queries with non-date fields:</p>
  +            <source>title:[Aida TO Carmen]</source>
  +            <p>This will find all documents whose titles are between Aida and Carmen.</p>
  +        </subsection>
  +
            
           <subsection name="Boosting a Term">
           <p>Lucene provides the relevance level of matching documents based on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.</p>
  @@ -82,11 +124,14 @@
           <p>This will make documents with the term jakarta appear more relevant. You can also boost Phrase Terms as in the example: </p>
   
           <source>"jakarta apache"^4 "jakarta lucene"</source>
  -        <p>By default, the boost factor is 1. Although, the boost factor must be positive, it can be less than 1 (i.e. .2)</p>
  +        <p>By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2)</p>
           </subsection>
  +
           </section>
   
  +
           <section name="Boolean operators">
  +
           <p>Boolean operators allow terms to be combined through logic operators.
           Lucene supports AND, "+", OR, NOT and "-" as Boolean operators(Note: Boolean operators must be ALL CAPS).</p>
   
  @@ -145,6 +190,12 @@
           <p>This eliminates any confusion and makes sure you that website must exist and either term jakarta or apache may exist.</p>
           </section>
           
  +        <section name="Field Grouping">
  +        <p>Lucene supports using parentheses to group multiple clauses to a single field.</p>
  +        <p>To search for a title that contains both the word "return" and the phrase "pink panther" use the query:</p>
  +        <source>title:(+return +"pink panther")</source>
  +        </section>
  +
           <section name="Escaping Special Characters">
           <p>Lucene supports escaping special characters that are part of the query syntax. The current list special characters are</p>
           <p>+ - &amp;&amp; || ! ( ) { } [ ] ^ " ~ * ? : \</p>
  
  
  
  1.15      +122 -3    jakarta-lucene/docs/queryparsersyntax.html
  
  Index: queryparsersyntax.html
  ===================================================================
  RCS file: /home/cvs/jakarta-lucene/docs/queryparsersyntax.html,v
  retrieving revision 1.14
  retrieving revision 1.15
  diff -u -r1.14 -r1.15
  --- queryparsersyntax.html	4 Jan 2003 17:19:16 -0000	1.14
  +++ queryparsersyntax.html	23 Jan 2003 16:10:59 -0000	1.15
  @@ -119,8 +119,36 @@
         </td></tr>
         <tr><td>
           <blockquote>
  -                                    <p>Although Lucene provides the ability to create your own query's though its API, it also provides a rich query language through the QueryParser.</p>
  -                                                <p>This page provides syntax of Lucene's Query Parser, a lexer which interprets a string into a Lucene Query using JavaCC.</p>
  +                                    <p>Although Lucene provides the ability to create your own
  +            queries through its API, it also provides a rich query
  +            language through the Query Parser.</p>
  +                                                <p>This page
  +            provides syntax of Lucene's Query Parser, a lexer which
  +            interprets a string into a Lucene Query using JavaCC.</p>
  +                                                <p>
  +            Before choosing to use the provided Query Parser, please consider the following:
  +            <ol>
  +            <li>If you are programmatically generating a query string and then 
  +            parsing it with the query parser then you should seriously consider building 
  +            your queries directly with the query API.  In other words, the query
  +            parser is designed for human-entered text, not for program-generated 
  +            text.</li>
  +
  +            <li>Untokenized fields are best added directly to queries, and not 
  +            through the query parser.  If a field's values are generated programmatically 
  +            by the application, then so should query clauses for this field. 
  +            Analyzers, like the query parser, are designed to convert human-entered 
  +            text to terms.  Program-generated values, like dates, keywords, etc., 
  +            should be consistently program-generated.</li>
  +
  +            <li>In a query form, fields which are general text should use the query 
  +            parser.  All others, such as date ranges, keywords, etc. are better added 
  +            directly through the query API.  A field with a limit set of values, 
  +            that can be specified with a pull-down menu should not be added to a 
  +            query string which is subsequently parsed, but rather added as a 
  +            TermQuery clause.</li>
  +            </ol>
  +            </p>
                               </blockquote>
           </p>
         </td></tr>
  @@ -377,6 +405,63 @@
                                                       <table border="0" cellspacing="0" cellpadding="2" width="100%">
         <tr><td bgcolor="#828DA6">
           <font color="#ffffff" face="arial,helvetica,sanserif">
  +          <a name="Range Searches"><strong>Range Searches</strong></a>
  +        </font>
  +      </td></tr>
  +      <tr><td>
  +        <blockquote>
  +                                    <p>Range Queries allow one to match documents whose field(s) values
  +            are between the lower and upper bound specified by the Range Query.
  +            Range Queries are inclusive (i.e. the query includes the specified lower and upper bound).
  +            Sorting is done lexicographically.</p>
  +                                                    <div align="left">
  +    <table cellspacing="4" cellpadding="0" border="0">
  +    <tr>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +    </tr>
  +    <tr>
  +      <td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#ffffff"><pre>mod_date:[20020101 TO 20030101]</pre></td>
  +      <td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +    </tr>
  +    <tr>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +    </tr>
  +    </table>
  +    </div>
  +                                                <p>This will find documents whose mod_date fields have values between 20020101 and 20030101.
  +            Note that Range Queries are not reserved for date fields.  You could also use range queries with non-date fields:</p>
  +                                                    <div align="left">
  +    <table cellspacing="4" cellpadding="0" border="0">
  +    <tr>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +    </tr>
  +    <tr>
  +      <td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#ffffff"><pre>title:[Aida TO Carmen]</pre></td>
  +      <td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +    </tr>
  +    <tr>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +    </tr>
  +    </table>
  +    </div>
  +                                                <p>This will find all documents whose titles are between Aida and Carmen.</p>
  +                            </blockquote>
  +      </td></tr>
  +      <tr><td><br/></td></tr>
  +    </table>
  +                                                    <table border="0" cellspacing="0" cellpadding="2" width="100%">
  +      <tr><td bgcolor="#828DA6">
  +        <font color="#ffffff" face="arial,helvetica,sanserif">
             <a name="Boosting a Term"><strong>Boosting a Term</strong></a>
           </font>
         </td></tr>
  @@ -444,7 +529,7 @@
       </tr>
       </table>
       </div>
  -                                                <p>By default, the boost factor is 1. Although, the boost factor must be positive, it can be less than 1 (i.e. .2)</p>
  +                                                <p>By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2)</p>
                               </blockquote>
         </td></tr>
         <tr><td><br/></td></tr>
  @@ -708,6 +793,40 @@
       </table>
       </div>
                                                   <p>This eliminates any confusion and makes sure you that website must exist and either term jakarta or apache may exist.</p>
  +                            </blockquote>
  +        </p>
  +      </td></tr>
  +      <tr><td><br/></td></tr>
  +    </table>
  +                                                <table border="0" cellspacing="0" cellpadding="2" width="100%">
  +      <tr><td bgcolor="#525D76">
  +        <font color="#ffffff" face="arial,helvetica,sanserif">
  +          <a name="Field Grouping"><strong>Field Grouping</strong></a>
  +        </font>
  +      </td></tr>
  +      <tr><td>
  +        <blockquote>
  +                                    <p>Lucene supports using parentheses to group multiple clauses to a single field.</p>
  +                                                <p>To search for a title that contains both the word "return" and the phrase "pink panther" use the query:</p>
  +                                                    <div align="left">
  +    <table cellspacing="4" cellpadding="0" border="0">
  +    <tr>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +    </tr>
  +    <tr>
  +      <td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#ffffff"><pre>title:(+return +&quot;pink panther&quot;)</pre></td>
  +      <td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +    </tr>
  +    <tr>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +      <td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
  +    </tr>
  +    </table>
  +    </div>
                               </blockquote>
           </p>
         </td></tr>
  
  
  

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>