Posted to dev@jena.apache.org by "Code Ferret (JIRA)" <ji...@apache.org> on 2019/07/10 19:40:00 UTC

[jira] [Commented] (JENA-1723) jena:text create OR's of Lucene fields

    [ https://issues.apache.org/jira/browse/JENA-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882386#comment-16882386 ] 

Code Ferret commented on JENA-1723:
-----------------------------------

I've made [PR 582|https://github.com/apache/jena/pull/582] for this issue.

The PR allows a list of properties in a {{text:query}}; these are then OR'd together in the Lucene query, as described above:

{code}
(?s ?sc ?lit ?graph ?prop) text:query ( skos:prefLabel skos:altLabel rdfs:label "some query" "highlight:" )
{code}

It also supports naming a list of properties for later use, as described above:

{code}
(?s ?sc ?lit ?graph ?prop) text:query ( ex:labels "some query" "highlight:" )
{code}
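
As a rough illustration of the expansion step, here is a small Python sketch. It is not the Java code in the PR: the names {{PROP_LISTS}} and {{expand_props}} are hypothetical, and dropping duplicate properties is my own assumption rather than documented behaviour.

```python
# Hypothetical stand-in for the assembler configuration:
# maps a property-list property to the indexed properties it names.
PROP_LISTS = {
    "ex:labels": ["skos:prefLabel", "skos:altLabel", "rdfs:label"],
}


def expand_props(props, prop_lists):
    """Expand any property-list properties into their underlying indexed properties.

    Properties that are not keys of prop_lists are treated as ordinary
    indexed properties and passed through unchanged. Order is preserved
    and duplicates are dropped (an assumption made for this sketch).
    """
    expanded = []
    for prop in props:
        for candidate in prop_lists.get(prop, [prop]):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# A named list mixed with an ordinary indexed property.
print(expand_props(["ex:labels", "rdfs:comment"], PROP_LISTS))
```

With a mapping like this, {{ex:labels}} and an explicit list of the same three properties yield the same set of properties to search.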

The PR includes detailed unit tests.

> jena:text create OR's of Lucene fields
> --------------------------------------
>
>                 Key: JENA-1723
>                 URL: https://issues.apache.org/jira/browse/JENA-1723
>             Project: Apache Jena
>          Issue Type: New Feature
>          Components: Jena
>    Affects Versions: Jena 3.13.0
>            Reporter: Code Ferret
>            Assignee: Code Ferret
>            Priority: Minor
>              Labels: pull-request-available
>
> h3. Motivation:
> With the current {{jena:text}} we often find that we have query patterns such as:
> {code}
> select ?foo where {
>   {
>      (?s ?sc ?lit) text:query ( rdfs:label "some query" "highlight:" ).
>   }
>   union
>   {
>     (?s ?sc ?lit) text:query ( skos:altLabel "some query" "highlight:" ).
>   }
>   union
>   { 
>     (?s ?sc ?lit) text:query ( skos:prefLabel "some query" "highlight:").
>   }
> }
> {code}
> Such patterns recur for various sets of RDF properties, each corresponding to some Lucene field.
> It can be more performant to _push_ the {{unions}} into the Lucene query by rewriting as:
> {code}
> (altLabel:"some query" OR prefLabel:"some query" OR label:"some query")
> {code}
> Then it's a single query with Lucene performing the {{unions}}.
> h3. Approach:
> We've implemented this by 
> 1. adding a new assembler feature in {{text:TextIndexLucene}}:
> {code}
> [] text:props (
>     text:propList [ text:propListProp  ex:labels ;
>          text:props ( skos:prefLabel skos:altLabel rdfs:label ) ]
> ) ;
> {code}
> This gives a single _Property_ id, e.g., {{ex:labels}}, to a list of properties;
> and
> 2. adding some syntax to the {{TextQueryPF}}:
> {code}
> (?s ?sc ?lit ?graph ?prop) text:query ( text:props ex:labels "some query" "highlight:" )
> {code}
> The fifth output arg, {{?prop}}, returns the specific property that matched. If the input args include {{text:props}} as the first argument, then a list of at least one property must precede the query string. These properties are either the usual Lucene indexed properties that occur in {{text:query}} or a property list property such as {{ex:labels}} above.
> When a list property is encountered it is expanded to the underlying list of indexed properties from the configuration.
> There may be any mix of indexed and property list properties following {{text:props}} in the input arg list:
> {code}
> (?s ?sc ?lit ?graph ?prop) text:query ( text:props ex:labels rdfs:comment "some query" "highlight:" )
> {code}
> which searches over the three properties listed in {{ex:labels}} and the property {{rdfs:comment}}.
> This functionality is implemented, including copious tests, and a PR can be issued after a bit of code cleanup.
> h3. Discussion:
> The use of {{text:props}} in the query form isn't strictly necessary; it was introduced as a way of indicating the intent to search over a list of properties.
> If the {{text:props}} _flag_ is removed from the implementation, then the feature will simply check whether each property is a list property or just an indexed property.
> With this modification the above queries would be written simply as:
> {code}
> (?s ?sc ?lit ?graph ?prop) text:query ( ex:labels "some query" "highlight:" )
> {code}
> or
> {code}
> (?s ?sc ?lit ?graph ?prop) text:query ( ex:labels rdfs:comment "some query" "highlight:" )
> {code}
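
The rewrite described under Motivation can be sketched as follows. This is an illustrative Python sketch, not Jena's implementation: the function name {{or_fields_query}} is hypothetical, the field names are the ones shown in the description, and the phrase quoting and escaping are assumptions made to reproduce the form of the Lucene query shown there.

```python
def or_fields_query(fields, query_string):
    """Build one Lucene query string that ORs the same phrase over several fields.

    fields       -- Lucene field names, already expanded from any property lists
    query_string -- the text to search for, treated here as a quoted phrase
    """
    if not fields:
        raise ValueError("at least one field is required")
    # Escape backslashes and double quotes so the phrase stays well-formed.
    escaped = query_string.replace("\\", "\\\\").replace('"', '\\"')
    clauses = [f'{field}:"{escaped}"' for field in fields]
    return "(" + " OR ".join(clauses) + ")"


# The three label fields from the Motivation section.
print(or_fields_query(["altLabel", "prefLabel", "label"], "some query"))
```

The point of the single string is that Lucene sees one query and performs the disjunction itself, rather than SPARQL issuing one Lucene query per {{union}} branch.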


