Posted to java-user@lucene.apache.org by Mindaugas Žakšauskas <mi...@gmail.com> on 2014/01/06 12:33:43 UTC

Reusability of QueryParser

Hi,

I was wondering if a QueryParser can be reused (Lucene ver: 4.6.0)?
From my experiment it looks like it retains some state from the
previous query.

Isolated example:

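import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.KeywordTokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

public class Test {

    public static void main(String[] args) throws ParseException, IOException {
        // One analyzer and one parser, reused for q1 and q2.
        MyAnalyzer analyzer = new MyAnalyzer();
        QueryParser qp = new QueryParser(Version.LUCENE_46, "x", analyzer);
        Query q1 = qp.parse("foo:Moo");
        Query q2 = qp.parse("bar:Meh");
        System.out.println(q1);
        System.out.println(q2);
        // A fresh parser and a fresh analyzer for q3.
        Query q3 = new QueryParser(Version.LUCENE_46, "x", new MyAnalyzer())
                .parse("bar:Baz");
        System.out.println(q3);
    }

    private static final class MyAnalyzer extends Analyzer {
        @Override
        protected TokenStreamComponents createComponents(String field, Reader reader) {
            KeywordTokenizer source = new KeywordTokenizer(reader);
            if ("foo".equals(field)) {
                // "foo" keeps the original case.
                return new TokenStreamComponents(source,
                        new StandardFilter(Version.LUCENE_46, source));
            } else {
                // Every other field is lower-cased.
                return new TokenStreamComponents(source,
                        new LowerCaseFilter(Version.LUCENE_46, source));
            }
        }
    }
}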

Above prints:
foo:Moo
bar:Meh
bar:baz

A note on the code above: MyAnalyzer is a simple analyzer that behaves
slightly differently depending on the field: for the "foo" field it
uses StandardFilter, and for all other fields ("bar" in this case) it
uses LowerCaseFilter. So when parsing q2 in the main method, I expect
to get "bar:meh" but I get "bar:Meh". However, if I don't reuse the
QueryParser, I get "bar:baz" for the third query, which is the
correct behaviour.

Is this a bug in QueryParser, or am I missing something?

Regards,
Mindaugas

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Reusability of QueryParser

Posted by Mindaugas Žakšauskas <mi...@gmail.com>.
Hi,

This helped, thank you Uwe!

Regards,
Mindaugas

On Mon, Jan 6, 2014 at 12:48 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi,
>
> The problem is your Analyzer: because it returns different components depending on the field name, its constructor must pass the per-field reuse strategy to the superclass. By default (that is, with the no-arg constructor) Analyzer uses GLOBAL_REUSE_STRATEGY, which reuses one set of components for all fields. So create the Analyzer with PER_FIELD_REUSE_STRATEGY explicitly passed to super's constructor.
>
> With separate instances of Analyzer and QueryParser, nothing is reused (a new instance is created each time), which is why your third query comes out as expected. This also affects the indexer: once you index more than one document, the same will happen (it will reuse the components created for the first field name encountered).
>
> Uwe



RE: Reusability of QueryParser

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

The problem is your Analyzer: because it returns different components depending on the field name, its constructor must pass the per-field reuse strategy to the superclass. By default (that is, with the no-arg constructor) Analyzer uses GLOBAL_REUSE_STRATEGY, which reuses one set of components for all fields. So create the Analyzer with PER_FIELD_REUSE_STRATEGY explicitly passed to super's constructor.

With separate instances of Analyzer and QueryParser, nothing is reused (a new instance is created each time), which is why your third query comes out as expected. This also affects the indexer: once you index more than one document, the same will happen (it will reuse the components created for the first field name encountered).
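
To illustrate the difference between the two strategies, here is a minimal, self-contained sketch. It is a toy model and not Lucene code: ReuseStrategyModel, ToyAnalyzer, Reuse and analyze are made-up names, and the map only imitates the idea of caching previously built components. In the real MyAnalyzer the corresponding change would presumably be a constructor along the lines of MyAnalyzer() { super(PER_FIELD_REUSE_STRATEGY); }.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Toy model of analyzer component reuse; none of these types are Lucene classes. */
final class ReuseStrategyModel {

    /** Hypothetical stand-in for the two reuse strategies. */
    enum Reuse { GLOBAL, PER_FIELD }

    /** Hypothetical stand-in for an analyzer that builds a different chain per field. */
    static final class ToyAnalyzer {
        private final Reuse reuse;
        private final Map<String, UnaryOperator<String>> cache = new HashMap<>();

        ToyAnalyzer(Reuse reuse) {
            this.reuse = reuse;
        }

        /** Mirrors MyAnalyzer: "foo" keeps case, every other field is lower-cased. */
        private UnaryOperator<String> createComponents(String field) {
            if ("foo".equals(field)) {
                return term -> term;
            }
            return term -> term.toLowerCase(Locale.ROOT);
        }

        String analyze(String field, String term) {
            // GLOBAL shares one cache slot across all fields;
            // PER_FIELD keeps one slot per field name.
            String key = (reuse == Reuse.GLOBAL) ? "" : field;
            UnaryOperator<String> chain =
                    cache.computeIfAbsent(key, k -> createComponents(field));
            return field + ":" + chain.apply(term);
        }
    }

    public static void main(String[] args) {
        ToyAnalyzer global = new ToyAnalyzer(Reuse.GLOBAL);
        System.out.println(global.analyze("foo", "Moo"));   // foo:Moo
        System.out.println(global.analyze("bar", "Meh"));   // bar:Meh (chain built for "foo" is reused)

        ToyAnalyzer perField = new ToyAnalyzer(Reuse.PER_FIELD);
        System.out.println(perField.analyze("foo", "Moo")); // foo:Moo
        System.out.println(perField.analyze("bar", "Meh")); // bar:meh
    }
}
```

With the global key, the chain built for "foo" is served again for "bar", which matches the "bar:Meh" result from the original post; keyed per field, "bar" gets its own lower-casing chain.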

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

