Posted to dev@jena.apache.org by "Vasyl Danyliuk (JIRA)" <ji...@apache.org> on 2018/12/04 09:55:00 UTC

[jira] [Commented] (JENA-1645) Poor performance with full text search (Lucene)

    [ https://issues.apache.org/jira/browse/JENA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708490#comment-16708490 ] 

Vasyl Danyliuk commented on JENA-1645:
--------------------------------------

I have written some code that uses the subject URI as an additional constraint. It works much faster in my case, but I am not sure whether it could cause problems in more general cases.
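
To illustrate the general idea (this is not the actual patch, only a minimal sketch), the text query could be combined with a required clause on the field that stores the subject URI. The class and method names below are hypothetical, the field names "uri" and "body" are assumptions, and the output uses the classic Lucene query parser syntax, where "+" marks a required clause.

```java
// Sketch only: class, method, and field names are assumptions for illustration,
// not part of Jena's TextQueryPF or any existing API.
final class SubjectConstraintSketch {

    /** Escapes backslashes and double quotes so the value can sit inside a quoted phrase. */
    static String quotePhrase(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /**
     * Combines the user's text query with a required match on the subject URI.
     * Both clauses are prefixed with "+", so a document must satisfy both.
     */
    static String constrainToSubject(String textQuery, String uriField, String subjectUri) {
        return "+(" + textQuery + ") +" + uriField + ":" + quotePhrase(subjectUri);
    }

    public static void main(String[] args) {
        // Prints: +(body:invoice) +uri:"http://example.org/mail/42"
        System.out.println(
            constrainToSubject("body:invoice", "uri", "http://example.org/mail/42"));
    }
}
```

Whether a clause like this actually matches depends on how the URI field is indexed and analyzed. If the field is stored as a single untokenized term, building an exact term clause programmatically may be more reliable than going through the query parser.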

> Poor performance with full text search (Lucene)
> -----------------------------------------------
>
>                 Key: JENA-1645
>                 URL: https://issues.apache.org/jira/browse/JENA-1645
>             Project: Apache Jena
>          Issue Type: Question
>          Components: Jena
>    Affects Versions: Jena 3.9.0
>            Reporter: Vasyl Danyliuk
>            Priority: Major
>
> Situation: half a million documents (emails, actually) indexed by Lucene; searching for emails by sender/receiver and some text.
> If the text filter is placed at the start of the SPARQL query, it executes once, but for very common words there are a lot of results (100 000+), which leads to poor performance; limiting the result count may end up with missed results.
> If the text search is placed as the last condition, it executes once for each already-found subject. That's completely OK, but the text search completely ignores the subject URI.
> I found two methods in the TextQueryPF class: variableSubject(...) for the first case and concreteSubject(...) for the second one.
> The question is: why can't the subject URI be used as a constraint in the text search?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)