Posted to commits@chemistry.apache.org by bu...@apache.org on 2011/03/18 14:18:55 UTC

svn commit: r787182 - /websites/staging/chemistry/trunk/content/java/how-to/how-to-process-query.html

Author: buildbot
Date: Fri Mar 18 13:18:55 2011
New Revision: 787182

Log:
Staging update by buildbot

Modified:
    websites/staging/chemistry/trunk/content/java/how-to/how-to-process-query.html

Modified: websites/staging/chemistry/trunk/content/java/how-to/how-to-process-query.html
==============================================================================
--- websites/staging/chemistry/trunk/content/java/how-to/how-to-process-query.html (original)
+++ websites/staging/chemistry/trunk/content/java/how-to/how-to-process-query.html Fri Mar 18 13:18:55 2011
@@ -183,7 +183,178 @@ Apache Chemistry - Query Integration
            </td>
            <td height="100%">
              <!-- Content -->
-             <div class="wiki-content"></div>
+             <div class="wiki-content"><div class="toc">
+<ul>
+<li><a href="#opencmis_query_integration">OpenCMIS Query Integration</a><ul>
+<li><a href="#implement_query_in_the_discovery_service">Implement query in the discovery service</a></li>
+<li><a href="#use_built-in_antlr_and_antlr_cmisql_grammar">Use built-in ANTLR and ANTLR CMISQL grammar</a></li>
+<li><a href="#use_opencmis_cmsiql_grammar_and_integrate_into_antlr_query_walker">Use OpenCMIS CMISQL grammar and integrate into ANTLR query walker</a></li>
+<li><a href="#use_predefined_query_walker">Use predefined query walker</a></li>
+<li><a href="#using_queryobject">Using QueryObject</a></li>
+<li><a href="#processing_a_node_and_referencing_types_and_properties">Processing a node and referencing types and properties</a></li>
+<li><a href="#building_the_result_list">Building the result list</a></li>
+</ul>
+</li>
+</ul>
+</div>
+<h1 id="opencmis_query_integration">OpenCMIS Query Integration</h1>
+<p>The CMIS standard contains a powerful query language that supports full
+text and relational metadata query capabilities and is modeled along a
+subset of SQL. Many repositories will need to integrate with
+this query interface, and OpenCMIS provides support that makes such an integration
+easier. This article explains the hooks that are provided for this purpose.
+These hooks offer different levels of
+comfort and flexibility. OpenCMIS includes a query parser that uses ANTLR
+as its parsing engine. However, there is no strong dependency on ANTLR: if you
+prefer a different parsing tool, you can use it instead.</p>
+<p>There are four levels at which you can integrate query support:</p>
+<ol>
+<li>Implement query in the discovery service</li>
+<li>Use the built-in ANTLR and ANTLR CMISQL grammar</li>
+<li>Use OpenCMIS CMISQL grammar and integrate into ANTLR query walker</li>
+<li>Use the predefined query walker and implement the interface <code>IQueryConditionProcessor</code>.</li>
+</ol>
+<h2 id="implement_query_in_the_discovery_service">Implement query in the discovery service</h2>
+<p>The first option is to implement the <code>query()</code> method yourself, like any other
+service method. This gives you maximum flexibility, including the use of
+a parser tool of your choice and any extensions to the query grammar that you
+like. It is also the option with the highest implementation effort.</p>
+<h2 id="use_built-in_antlr_and_antlr_cmisql_grammar">Use built-in ANTLR and ANTLR CMISQL grammar</h2>
+<p>OpenCMIS comes with a built-in integration of ANTLR and provides a grammar
+file for CMISQL. You can reuse this grammar file, modify or extend it, and
+integrate query by using the ANTLR mechanisms for parsing and walking the
+abstract syntax tree. Please refer to the ANTLR documentation for further
+information. This is the right level to use if you need custom parse tree
+transformations or would like to extend the grammar with your own
+constructs. For demonstration purposes OpenCMIS provides a sample extended
+grammar.</p>
+<h2 id="use_opencmis_cmsiql_grammar_and_integrate_into_antlr_query_walker">Use OpenCMIS CMISQL grammar and integrate into ANTLR query walker</h2>
+<p>If the standard CMISQL grammar is sufficient for you, there is another level
+of integration. Many repositories share common tasks when processing
+queries: the columns of the SELECT part need to be evaluated and mapped to
+type and property definitions, the FROM part needs to be mapped to type
+definitions, and some parts of the WHERE part again refer to properties in
+types. In addition, all aliases defined in the statement need to be resolved
+and many validations have to be performed. OpenCMIS provides a class that performs
+these common tasks. You can make use of the resolved types, properties and
+aliases and walk the resulting abstract syntax tree (AST) to evaluate the
+query. You are free to walk the AST as many times as you need and in the
+order you prefer. The basic idea is that the SELECT and FROM parts are
+processed by OpenCMIS and you are responsible for the WHERE part. The
+CMIS InMemory server provides an example of this level of integration: for
+each object contained in the repository the tree is traversed to check
+whether the object matches the current query. You can take the InMemory code as an
+example if you decide to use this integration point.</p>
+<h2 id="use_predefined_query_walker">Use predefined query walker</h2>
+<p>For some repositories a simple one-pass query traversal is sufficient.
+This can be the case if, for example, your query needs to be translated to a
+SQL query statement. Because ANTLR has some complexity, OpenCMIS provides a
+predefined walker that does a simple one-pass depth-first traversal. If
+this is sufficient, this interface hides most of the complexity of ANTLR.
+All you have to do is implement a Java interface
+(<code>IQueryConditionProcessor</code>). You can refer to the unit tests for example
+code. The class <code>TestQueryProcessor</code> nested in the unit test <code>ProcessQueryTest</code>
+provides an example of such a walker. Some utility tasks, for example
+parsing literals like <code>"abc"</code> or <code>-123</code> into Java objects like <code>String</code> and <code>Integer</code>,
+are common to most implementations. They are therefore implemented in the abstract class
+<code>AbstractQueryConditionProcessor</code>, which declares all interface methods as
+abstract and provides default implementations for the common tasks. In most
+cases you will derive your implementation from
+<code>AbstractQueryConditionProcessor</code> rather than implement the interface directly.</p>
+<p>Note: There is currently no predefined walker for JOIN statements. If
+you need to support JOINs you have to build your own walker for this part
+as outlined in the previous section.</p>
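+<p>As a rough illustration of the one-pass depth-first idea described above, the following self-contained sketch walks a tiny condition tree and emits a SQL-like string through callbacks. The classes <code>Node</code>, <code>ConditionCallbacks</code>, <code>SqlBuilder</code> and <code>OnePassWalker</code> are hypothetical stand-ins invented for this example. They are not part of OpenCMIS: the real <code>IQueryConditionProcessor</code> works on ANTLR <code>Tree</code> nodes and has a different, larger set of methods.</p>

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an AST node (the real walker uses ANTLR Tree objects). */
final class Node {
    final String text;
    final List<Node> children;

    Node(String text, Node... children) {
        this.text = text;
        this.children = Arrays.asList(children);
    }
}

/** Hypothetical, heavily reduced callback interface; not the OpenCMIS interface. */
interface ConditionCallbacks {
    void onStartBinary(String operator);
    void onBetweenOperands(String operator);
    void onEndBinary(String operator);
    void onColumn(String name);
    void onLiteral(Object value);
}

/** Callback implementation that assembles a SQL-like WHERE expression. */
final class SqlBuilder implements ConditionCallbacks {
    private final StringBuilder sql = new StringBuilder();

    public void onStartBinary(String operator) { sql.append('('); }
    public void onBetweenOperands(String operator) { sql.append(' ').append(operator).append(' '); }
    public void onEndBinary(String operator) { sql.append(')'); }
    public void onColumn(String name) { sql.append(name); }

    public void onLiteral(Object value) {
        if (value instanceof String) {
            // Quote strings and double any embedded single quotes.
            sql.append('\'').append(((String) value).replace("'", "''")).append('\'');
        } else {
            sql.append(value);
        }
    }

    String getSql() { return sql.toString(); }
}

/** Visits every node exactly once, depth first, left operand before right operand. */
final class OnePassWalker {
    private static final List<String> BINARY = Arrays.asList("AND", "OR", "=", "<", ">");

    static void walk(Node node, ConditionCallbacks cb) {
        if (BINARY.contains(node.text) && node.children.size() == 2) {
            cb.onStartBinary(node.text);
            walk(node.children.get(0), cb);
            cb.onBetweenOperands(node.text);
            walk(node.children.get(1), cb);
            cb.onEndBinary(node.text);
        } else if (isLiteral(node.text)) {
            cb.onLiteral(parseLiteral(node.text));
        } else {
            cb.onColumn(node.text);
        }
    }

    static boolean isLiteral(String text) {
        return text.startsWith("'") || text.matches("-?\\d+");
    }

    /** Turns the token text 'abc' into a String and -123 into an Integer. */
    static Object parseLiteral(String text) {
        if (text.startsWith("'")) {
            return text.substring(1, text.length() - 1);
        }
        return Integer.valueOf(text);
    }
}
```

+<p>In this sketch the walker owns the traversal order and the callback object only accumulates output, which is why the caller can ask the callback object for the generated string once the walk has finished.</p>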
+<h2 id="using_queryobject">Using QueryObject</h2>
+<p>The class <code>QueryObject</code> provides all the basic functionality for resolving
+types and properties and performs common validation tasks. The <code>QueryObject</code>
+processes the <code>SELECT</code> and <code>FROM</code> parts as well as all property references from
+the <code>WHERE</code> part. It maintains a list of Java objects and interfaces that you
+can use to access the property and type definitions given your current
+position in the statement. For an example refer to the method <code>query()</code>
+of the class <code>StoreManagerImpl</code> in the InMemory server.
+To do its work, the <code>QueryObject</code> needs access to the types contained in your
+repository. For this purpose you pass a <code>TypeManager</code> implementation
+as the first constructor parameter. The second parameter is your query walker implementing
+<code>IQueryConditionProcessor</code>. Your code will typically look like this:</p>
+<div class="codehilite"><pre><span class="n">TypeManager</span> <span class="n">tm</span> <span class="o">=</span> <span class="k">new</span> <span class="n">MyTypeManager</span><span class="o">();</span> <span class="c1">// implements interface TypeManager</span>
+
+<span class="n">IQueryConditionProcessor</span> <span class="n">myWalker</span> <span class="o">=</span> <span class="k">new</span> <span class="n">MyWalker</span><span class="o">();</span>
+                         <span class="c1">// implements interface IQueryConditionProcessor</span>
+                         <span class="c1">// or extends AbstractQueryConditionProcessor</span>
+
+<span class="n">queryObj</span> <span class="o">=</span> <span class="k">new</span> <span class="n">QueryObject</span><span class="o">(</span><span class="n">tm</span><span class="o">,</span> <span class="n">myWalker</span><span class="o">);</span>
+</pre></div>
+
+
+<p><code>queryObj</code> will then process the statement and call the interface methods of
+your walker:</p>
+<div class="codehilite"><pre><span class="k">try</span> <span class="o">{</span>
+
+    <span class="n">CmisQueryWalker</span> <span class="n">walker</span> <span class="o">=</span> <span class="n">QueryObject</span><span class="o">.</span><span class="na">getWalker</span><span class="o">(</span><span class="n">statement</span><span class="o">);</span>
+    <span class="n">walker</span><span class="o">.</span><span class="na">query</span><span class="o">(</span><span class="n">queryObj</span><span class="o">);</span>
+
+<span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">RecognitionException</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
+    <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="s">&quot;Walking of statement failed with RecognitionException error:\n &quot;</span> <span class="o">+</span> <span class="n">e</span><span class="o">,</span> <span class="n">e</span><span class="o">);</span>
+<span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">Exception</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
+    <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="s">&quot;Walking of statement failed with exception:\n &quot;</span> <span class="o">+</span> <span class="n">e</span><span class="o">,</span> <span class="n">e</span><span class="o">);</span>
+<span class="o">}</span>
+</pre></div>
+
+
+<p>After this method returns you may for example ask your walker object
+<code>myWalker</code> for the generated SQL string.</p>
+<h2 id="processing_a_node_and_referencing_types_and_properties">Processing a node and referencing types and properties</h2>
+<p>While traversing the tree you will often need to access the property and
+type definitions that are referenced in the WHERE clause. The <code>QueryObject</code>
+provides the necessary information for resolving the references. For
+example the statement</p>
+<div class="codehilite"><pre>... WHERE x &lt; 123
+</pre></div>
+
+
+<p>will result in calling the method <code>onLessThan()</code> in your walker callback
+implementation:</p>
+<div class="codehilite"><pre><span class="kd">public</span> <span class="kt">void</span> <span class="nf">onLessThan</span><span class="o">(</span><span class="n">Tree</span> <span class="n">ltNode</span><span class="o">,</span> <span class="n">Tree</span> <span class="n">leftNode</span><span class="o">,</span> <span class="n">Tree</span> <span class="n">rightNode</span><span class="o">)</span> <span class="o">{</span>
+
+    <span class="c1">// right operand: a literal such as 123</span>
+    <span class="n">Object</span> <span class="n">rVal</span> <span class="o">=</span> <span class="n">onLiteral</span><span class="o">(</span><span class="n">rightNode</span><span class="o">);</span>
+
+    <span class="c1">// left operand: resolved through its token index</span>
+    <span class="n">CmisSelector</span> <span class="n">sel</span> <span class="o">=</span> <span class="n">queryObj</span><span class="o">.</span><span class="na">getColumnReference</span><span class="o">(</span><span class="n">leftNode</span>
+             <span class="o">.</span><span class="na">getTokenStartIndex</span><span class="o">());</span>
+
+    <span class="k">if</span> <span class="o">(</span><span class="kc">null</span> <span class="o">==</span> <span class="n">sel</span><span class="o">)</span>
+       <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="s">&quot;Unknown property query name &quot;</span> <span class="o">+</span>
+              <span class="n">leftNode</span><span class="o">.</span><span class="na">getChild</span><span class="o">(</span><span class="mi">0</span><span class="o">));</span>
+    <span class="k">else</span> <span class="nf">if</span> <span class="o">(!(</span><span class="n">sel</span> <span class="k">instanceof</span> <span class="n">ColumnReference</span><span class="o">))</span>
+       <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="s">&quot;Expected a property reference, not a function&quot;</span><span class="o">);</span>
+
+    <span class="n">ColumnReference</span> <span class="n">colRef</span> <span class="o">=</span> <span class="o">(</span><span class="n">ColumnReference</span><span class="o">)</span> <span class="n">sel</span><span class="o">;</span>
+    <span class="n">TypeDefinition</span> <span class="n">td</span> <span class="o">=</span> <span class="n">colRef</span><span class="o">.</span><span class="na">getTypeDefinition</span><span class="o">();</span>
+    <span class="n">PropertyDefinition</span> <span class="n">pd</span> <span class="o">=</span>
+       <span class="n">td</span><span class="o">.</span><span class="na">getPropertyDefinitions</span><span class="o">().</span><span class="na">get</span><span class="o">(</span><span class="n">colRef</span><span class="o">.</span><span class="na">getPropertyId</span><span class="o">());</span>
+
+<span class="o">}</span>
+</pre></div>
+
+
+<p>The right child node is a literal and you will get an <code>Integer</code> object with
+value 123. The left node is a reference to a property, and
+<code>getColumnReference()</code> will give you either a function (currently the only
+supported function is <code>SCORE()</code>) or a reference to a property in a type of
+your type system. The query object maintains several maps to resolve
+references. The key to the map is always the token index in the incoming
+token stream (an integer value). You can get the token index for each node
+by calling <code>getTokenStartIndex()</code> on the node.</p>
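+<p>The lookup scheme described above, a map keyed by the token index of the referencing node, can be pictured with the following minimal sketch. <code>ResolvedColumn</code> and <code>ReferenceTable</code> are hypothetical names made up for this illustration and do not reflect how <code>QueryObject</code> is actually implemented. One plausible benefit of keying by position rather than by name, stated here as an assumption, is that two occurrences of the same property name in one statement can resolve to different types.</p>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a resolved property reference. */
final class ResolvedColumn {
    final String typeQueryName;
    final String propertyId;

    ResolvedColumn(String typeQueryName, String propertyId) {
        this.typeQueryName = typeQueryName;
        this.propertyId = propertyId;
    }
}

/** Hypothetical lookup table keyed by the token start index of a node. */
final class ReferenceTable {
    private final Map<Integer, ResolvedColumn> byTokenIndex = new HashMap<>();

    /** Called while the statement is parsed and references are resolved. */
    void register(int tokenStartIndex, ResolvedColumn column) {
        byTokenIndex.put(tokenStartIndex, column);
    }

    /** Called later from a walker callback with the index of the current node. */
    ResolvedColumn resolve(int tokenStartIndex) {
        ResolvedColumn column = byTokenIndex.get(tokenStartIndex);
        if (column == null) {
            throw new IllegalArgumentException(
                    "No column reference at token index " + tokenStartIndex);
        }
        return column;
    }
}
```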
+<h2 id="building_the_result_list">Building the result list</h2>
+<p>After processing the query an <code>ObjectList</code> has to be returned containing the
+requested properties and function results. You can ask the query object for
+the requested information:</p>
+<div class="codehilite"><pre><span class="n">Map</span> <span class="n">props</span> <span class="o">=</span> <span class="n">queryObj</span><span class="o">.</span><span class="na">getRequestedProperties</span><span class="o">();</span>
+<span class="n">Map</span> <span class="n">funcs</span> <span class="o">=</span> <span class="n">queryObj</span><span class="o">.</span><span class="na">getRequestedFuncs</span><span class="o">();</span>
+</pre></div>
+
+
+<p>The key of each map is the query name; the value is the alias if one was
+used in the statement, or the query name otherwise.</p></div>
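+<p>As a hedged sketch of how such a map might be used, the following example builds one result row from a map of query names to aliases. <code>ResultRowSketch</code>, <code>buildRow()</code> and the plain <code>Map</code> used as a stand-in for a stored object are assumptions made for this illustration. A real implementation would fill the CMIS <code>ObjectList</code> structures instead.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of OpenCMIS. */
final class ResultRowSketch {

    /**
     * @param requested maps query name to alias (or to the query name itself
     *                  if the statement defined no alias)
     * @param stored    maps query name to the stored value of one object
     * @return the requested values keyed by the name the client asked for
     */
    static Map<String, Object> buildRow(Map<String, String> requested,
                                        Map<String, Object> stored) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : requested.entrySet()) {
            String queryName = entry.getKey();
            String outputName = entry.getValue();
            row.put(outputName, stored.get(queryName));
        }
        return row;
    }
}
```

+<p>Properties that were not requested are left out of the row, and aliased properties appear under their alias.</p>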
              <!-- Content -->
            </td>
           </tr>