Posted to commits@jackrabbit.apache.org by ch...@apache.org on 2015/09/09 11:35:49 UTC

svn commit: r1701956 - /jackrabbit/site/live/oak/docs/query/lucene.html

Author: chetanm
Date: Wed Sep  9 09:35:49 2015
New Revision: 1701956

URL: http://svn.apache.org/r1701956
Log:
OAK-3367 - Boosting fields not working as expected

Publish the updated doc

Modified:
    jackrabbit/site/live/oak/docs/query/lucene.html

Modified: jackrabbit/site/live/oak/docs/query/lucene.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/query/lucene.html?rev=1701956&r1=1701955&r2=1701956&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/query/lucene.html (original)
+++ jackrabbit/site/live/oak/docs/query/lucene.html Wed Sep  9 09:35:49 2015
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-09-08
+ | Generated by Apache Maven Doxia at 2015-09-09
  | Rendered using Apache Maven Fluido Skin 1.3.0
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150908" />
+    <meta name="Date-Revision-yyyymmdd" content="20150909" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Jackrabbit Oak - Lucene Index</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.3.0.min.css" />
@@ -210,7 +210,7 @@
         <ul class="breadcrumb">
                 
                     
-                  <li id="publishDate">Last Published: 2015-09-08</li>
+                  <li id="publishDate">Last Published: 2015-09-09</li>
                   <li class="divider">|</li> <li id="projectVersion">Version: 1.4-SNAPSHOT</li>
                       
                 
@@ -721,7 +721,7 @@
 <li><tt>jcr:content/metadata/.*</tt> - This property definition is  applicable for all properties of child node <i>jcr:content/metadata</i></li>
   </ul></dd>
 <dt>boost</dt>
-<dd>If the property is included in <tt>nodeScopeIndex</tt> then it defines the boost  done for the index value against the given property name.  <b>Boost currently does not work as expected due to <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3367">OAK-3367</a></b></dd>
+<dd>If the property is included in <tt>nodeScopeIndex</tt>, this defines the boost applied to the indexed value for the given property name. See <a href="#boost">Boost and Search Relevancy</a> for more details.</dd>
 <dt>index</dt>
 <dd>Determines if this property should be indexed. Mostly useful for fulltext  index where some properties need to be <i>excluded</i> from getting indexed.</dd>
 <dt>useInExcerpt</dt>
@@ -999,6 +999,42 @@
   - codec = &quot;Lucene46&quot;
 </pre></div>
 <p>Refer to <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-2853">OAK-2853</a> for details. Enabling the <tt>Lucene46</tt> codec would lead to smaller and compact indexes.</p>
+<p><a name="boost"></a></p></div>
+<div class="section">
+<h4>Boost and Search Relevancy<a name="Boost_and_Search_Relevancy"></a></h4>
+<p><tt>@since Oak 1.2.5</tt></p>
+<p>When fulltext indexing is enabled, Oak internally creates a fulltext field consisting of text extracted from various other fields, i.e. fields for which <tt>nodeScopeIndex</tt> is <tt>true</tt>. This allows a query like <tt>//*[jcr:contains(., 'foo')]</tt> to search across any indexable field containing foo (see <a class="externalLink" href="http://www.day.com/specs/jcr/1.0/6.6.5.2_jcr_contains_Function.html">contains function</a> for details).</p>
+<p>In certain cases it is desirable that nodes where the searched term is present in a specific property are ranked higher (come earlier in the search result) than nodes where the searched term is found in some other property.</p>
+<p>In such cases it should be possible to boost the text contributed by an individual property. For example, if a title field is boosted more than description, then nodes where the searched term is found in the title field come earlier in the search result.</p>
+<p>For that to work, ensure that for each property that needs to be preferred both <tt>nodeScopeIndex</tt> and <tt>analyzed</tt> are set to true. In addition, you can specify the <tt>boost</tt> property to give higher weight to values found in a specific property.</p>
+<p>Note that even without an explicit <tt>boost</tt>, just setting <tt>nodeScopeIndex</tt> and <tt>analyzed</tt> to true improves the search result due to the way <a class="externalLink" href="https://wiki.apache.org/lucene-java/LuceneFAQ#How_do_I_make_sure_that_a_match_in_a_document_title_has_greater_weight_than_a_match_in_a_document_body.3F">Lucene does scoring</a>. Internally Oak creates separate Lucene fields for those JCR properties and performs a search across all such fields. For more details refer to <a class="externalLink" href="https://issues.apache.org/jira/browse/OAK-3367">OAK-3367</a>.</p>
+
+<div class="source">
+<pre>  + indexRules
+    - jcr:primaryType = &quot;nt:unstructured&quot;
+    + app:Asset
+      + properties
+        - jcr:primaryType = &quot;nt:unstructured&quot;
+        + description
+          - nodeScopeIndex = true
+          - analyzed = true
+          - name = &quot;jcr:content/metadata/jcr:description&quot;
+        + title
+          - analyzed = true
+          - nodeScopeIndex = true
+          - name = &quot;jcr:content/metadata/jcr:title&quot;
+          - boost = 2.0
+</pre></div>
+<p>With the above index config, a search like</p>
+
+<div class="source">
+<pre>SELECT
+  *
+FROM [app:Asset] 
+WHERE 
+  CONTAINS(., 'Batman')
+</pre></div>
+<p>would return those nodes (of type app:Asset) first where <i>Batman</i> is found in <i>jcr:title</i>, while nodes where the search text is found in another field, such as aggregated content, would come later.</p>
 <p><a name="osgi-config"></a></p></div></div>
 <div class="section">
 <h3>LuceneIndexProvider Configuration<a name="LuceneIndexProvider_Configuration"></a></h3>