Posted to pylucene-dev@lucene.apache.org by Daniel Rech <da...@nwebs.de> on 2010/08/17 19:39:10 UTC

setAllowLeadingWildcard and PythonMultiFieldQueryParser

 Hello,

I'd like to use the setAllowLeadingWildcard method with
PythonMultiFieldQueryParser but I always get a

lucene.JavaError: org.apache.lucene.queryParser.ParseException: Cannot
parse '*a': '*' or '?' not allowed as first character in WildcardQuery

It works fine with PythonQueryParser but not with
PythonMultiFieldQueryParser.
Some code to show what I mean:


from lucene import StandardAnalyzer, IndexSearcher, PythonQueryParser, \
    PythonMultiFieldQueryParser, BooleanClause, Version, initVM, \
    SimpleFSDirectory, File
initVM()
indexDir = "index-dir"
analyzer = StandardAnalyzer(Version.LUCENE_CURRENT)
searcher = IndexSearcher(SimpleFSDirectory(File(indexDir)), True)

querystring="*a"

parser = PythonQueryParser(Version.LUCENE_CURRENT, 'name', analyzer)
parser.setAllowLeadingWildcard(True)
query = parser.parse(querystring)
hits = searcher.search(query, 10)
print "PythonQueryParser works just fine:",hits

parser = PythonMultiFieldQueryParser(Version.LUCENE_CURRENT, ['name'],
    analyzer)
parser.setAllowLeadingWildcard(True)
query=parser.parse(Version.LUCENE_CURRENT, querystring, ['name'],
    [BooleanClause.Occur.SHOULD], analyzer)
# lucene.JavaError will occur
hits = searcher.search(query, 10)


Is it my code or is setAllowLeadingWildcard not working properly with
PythonMultiFieldQueryParser?

Thanks,
Daniel

Re: setAllowLeadingWildcard and PythonMultiFieldQueryParser

Posted by Christoph Burgmer <cb...@ira.uka.de>.
Hi, thanks for pointing out the issue here.

The misunderstanding with the static method actually started with the code 
used in test PythonMultiFieldQueryParserTestCase 
(test/test_PythonQueryParser.py) which calls the static method on an instance.

The provided workaround should solve our issue here, thanks again.

-Christoph (now finally subscribed to the correct pylucene list)

Re: setAllowLeadingWildcard and PythonMultiFieldQueryParser

Posted by Andi Vajda <va...@apache.org>.
On Tue, 17 Aug 2010, Daniel Rech wrote:

> I'd like to use the setAllowLeadingWildcard method with
> PythonMultiFieldQueryParser but I always get a
>
> lucene.JavaError: org.apache.lucene.queryParser.ParseException: Cannot
> parse '*a': '*' or '?' not allowed as first character in WildcardQuery
>
> It works fine with PythonQueryParser but not with
> PythonMultiFieldQueryParser.
> Some code to show what I mean:

You may have better luck asking your question on java-user@lucene.apache.org 
where more Lucene experts can answer this type of Lucene usage question.
(unless you found a bug with PyLucene, of course)

Andi..


Re: setAllowLeadingWildcard and PythonMultiFieldQueryParser

Posted by Daniel Rech <da...@nwebs.de>.
 Problem solved as described in
http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/201008.mbox/%3CD7DA5782-0280-44E3-A783-8FF631287E87@gmail.com%3E

Thanks for all your help!
Daniel

Re: setAllowLeadingWildcard and PythonMultiFieldQueryParser

Posted by Andi Vajda <va...@apache.org>.
On Thu, 19 Aug 2010, Aric Coady wrote:

> On Aug 18, 2010, at 10:13 PM, Andi Vajda wrote:
>> On Wed, 18 Aug 2010, Aric Coady wrote:
>>> #query = queryParser.parse(queryString)
>>> query = queryParser.parse(Version.LUCENE_CURRENT, queryString, fields,
>>>                         [BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD],
>>>                         analyzer)
>>>
>> Whenever there is a name conflict between a static and non-static method detected by JCC, the static method wrapper is renamed to be suffixed with a '_' and a warning is emitted by JCC.
>>
>> Does changing the code to use a parse_() method instead solve the problem ?
>> (it's late here and I haven't tried it myself)
>
> Ah, so there are a couple of different things going on here.
> MultiFieldQueryParser has only static parse methods, except that it also
> inherits QueryParser.parse.  Perhaps that's why JCC isn't supplying a
> parse_ method.

Yes, that's a probable limitation of the conflict/renaming logic in JCC.

>>>> lucene.MultiFieldQueryParser.parse
> <built-in method parse of type object at 0x10171d800>
>>>> lucene.MultiFieldQueryParser.parse_
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> AttributeError: type object 'MultiFieldQueryParser' has no attribute 'parse_'
>>>> lucene.QueryParser.parse
> <method 'parse' of 'QueryParser' objects>
>
> This gotcha has come up before: 
> http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/201007.mbox/%3cAANLkTinkHxsiQP7JljZ1Q0CY6cv03Y5uMyZvG8a5dtyM@mail.gmail.com%3e. 
> But as known limitations go, it's an easy workaround.  Just call 
> QueryParser.parse with the parser object as the first argument.

I had forgotten about this but yes, this should work.

> As for the wildcard issue, I was trying to point out that I don't think
> it's a PyLucene problem at all.  The example given was calling the static
> MultiFieldQueryParser.parse with a parser object, incorrectly expecting
> settings on the parser object to have an effect.  The fact that calling
> queryParser.parse(queryString) raises a TypeError is technically
> unrelated, although it probably adds to the confusion.

Hence my first inclination to send the user to check out the problem on 
java-user@lucene.apache.org.

Thanks for debugging this, I've been travelling and I'm not very responsive.
I'm back now.

Andi..

Re: setAllowLeadingWildcard and PythonMultiFieldQueryParser

Posted by Aric Coady <ar...@gmail.com>.
On Aug 18, 2010, at 10:13 PM, Andi Vajda wrote:
> On Wed, 18 Aug 2010, Aric Coady wrote:
>> #query = queryParser.parse(queryString)
>> query = queryParser.parse(Version.LUCENE_CURRENT, queryString, fields,
>>                         [BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD],
>>                         analyzer)
>> 
> Whenever there is a name conflict between a static and non-static method detected by JCC, the static method wrapper is renamed to be suffixed with a '_' and a warning is emitted by JCC.
> 
> Does changing the code to use a parse_() method instead solve the problem ?
> (it's late here and I haven't tried it myself)

Ah, so there are a couple of different things going on here.  MultiFieldQueryParser has only static parse methods, except that it also inherits QueryParser.parse.  Perhaps that's why JCC isn't supplying a parse_ method.

>>> lucene.MultiFieldQueryParser.parse
<built-in method parse of type object at 0x10171d800>
>>> lucene.MultiFieldQueryParser.parse_
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'MultiFieldQueryParser' has no attribute 'parse_'
>>> lucene.QueryParser.parse
<method 'parse' of 'QueryParser' objects>

This gotcha has come up before:  http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/201007.mbox/%3cAANLkTinkHxsiQP7JljZ1Q0CY6cv03Y5uMyZvG8a5dtyM@mail.gmail.com%3e.  But as known limitations go, it's an easy workaround.  Just call QueryParser.parse with the parser object as the first argument.

As for the wildcard issue, I was trying to point out that I don't think it's a PyLucene problem at all.  The example given was calling the static MultiFieldQueryParser.parse with a parser object, incorrectly expecting settings on the parser object to have an effect.  The fact that calling queryParser.parse(queryString) raises a TypeError is technically unrelated, although it probably adds to the confusion.
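
The workaround of calling QueryParser.parse with the parser object as the first argument can be illustrated in plain Python. This is only a sketch: BaseParser and MultiParser are hypothetical stand-ins, not the PyLucene classes, and the parsing logic is invented for illustration. It shows a static method shadowing an inherited instance method of the same name, and the base-class call that reaches the instance method anyway.

```python
# Hypothetical stand-ins; these are NOT the Lucene or PyLucene classes.
class BaseParser(object):
    def __init__(self):
        self.allow_leading_wildcard = False

    def setAllowLeadingWildcard(self, allow):
        self.allow_leading_wildcard = allow

    def parse(self, text):
        # Instance method: honours the setting stored on self.
        if text.startswith("*") and not self.allow_leading_wildcard:
            raise ValueError("leading wildcard not allowed")
        return "parsed(%s)" % text


class MultiParser(BaseParser):
    @staticmethod
    def parse(text, fields):
        # Static method with the same name: it shadows BaseParser.parse
        # and works on a fresh, default-configured parser.
        return BaseParser().parse(text)


parser = MultiParser()
parser.setAllowLeadingWildcard(True)

# Looking up parse on the instance finds the static method, so the
# one-argument call fails on the argument count.
try:
    parser.parse("*a")
    shadowed = False
except TypeError:
    shadowed = True

# Workaround: call the base-class method and pass the instance explicitly.
result = BaseParser.parse(parser, "*a")
```

In this model the explicit base-class call sees the setting made on the parser object, which is the effect the workaround is meant to achieve.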


Re: setAllowLeadingWildcard and PythonMultiFieldQueryParser

Posted by Andi Vajda <va...@apache.org>.
On Wed, 18 Aug 2010, Aric Coady wrote:

> Your Python example differs from the Java example.  The Java one uses the call that's commented out:
>
> #query = queryParser.parse(queryString)
> query = queryParser.parse(Version.LUCENE_CURRENT, queryString, fields,
>                          [BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD],
>                          analyzer)
>
> The problem is that the second form is a static method, so settings on the parser object won't affect anything.  Try it this way to reproduce the error:
>
> query = MultiFieldQueryParser.parse(Version.LUCENE_CURRENT, queryString, fields,
>                          [BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD],
>                          analyzer)
>
> You should see the same behavior in Java Lucene.

Whenever there is a name conflict between a static and non-static method 
detected by JCC, the static method wrapper is renamed to be suffixed with a 
'_' and a warning is emitted by JCC.

Does changing the code to use a parse_() method instead solve the problem ?
(it's late here and I haven't tried it myself)
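
A quick way to check which name a wrapper actually exposes is to look for the '_'-suffixed attribute first and fall back to the plain name. The helper below is a hypothetical sketch, not part of JCC or PyLucene, and the two classes are stand-ins that only imitate the two cases (renamed and not renamed).

```python
def resolve_static(cls, name):
    """Return cls.<name>_ if the class has it, otherwise cls.<name>."""
    renamed = getattr(cls, name + "_", None)
    return renamed if renamed is not None else getattr(cls, name)


# Stand-in for a wrapper where the static method was renamed on conflict.
class Renamed(object):
    def parse(self, text):
        return "instance:" + text

    @staticmethod
    def parse_(text):
        return "static:" + text


# Stand-in for a wrapper where the static method kept its plain name.
class NotRenamed(object):
    @staticmethod
    def parse(text):
        return "static:" + text


renamed_result = resolve_static(Renamed, "parse")("*a")
plain_result = resolve_static(NotRenamed, "parse")("*a")
```

With a real wrapper the same lookup would be applied to the wrapped class itself, for example by inspecting it with hasattr in an interactive session.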

Andi..

>
> On Aug 18, 2010, at 5:21 AM, Christoph Burgmer wrote:
>
>> Hi Andy,
>>
>> here are short examples in Java and Python that do the same thing. The Python
>> example raises the error mentioned by Daniel. I've also added a patch to
>> extend the current test cases to include tests for setAllowLeadingWildcard().
>>
>> -Christoph
>> <LeadingWildcardExample.java><LeadingWildcardExample.py><leadingwildcard_test.patch>
>

Re: setAllowLeadingWildcard and PythonMultiFieldQueryParser

Posted by Aric Coady <ar...@gmail.com>.
Your Python example differs from the Java example.  The Java one uses the call that's commented out:

#query = queryParser.parse(queryString)
query = queryParser.parse(Version.LUCENE_CURRENT, queryString, fields,
                          [BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD],
                          analyzer)

The problem is that the second form is a static method, so settings on the parser object won't affect anything.  Try it this way to reproduce the error:

query = MultiFieldQueryParser.parse(Version.LUCENE_CURRENT, queryString, fields,
                          [BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD],
                          analyzer)

You should see the same behavior in Java Lucene.
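
To see why settings on the parser object cannot reach a static method, here is a plain-Python sketch. SketchParser, ParseError, parse_with_settings and parse_static are hypothetical names invented for this illustration; they are not the Lucene or PyLucene API, and the query format is made up.

```python
# Hypothetical stand-ins; these are NOT the Lucene or PyLucene classes.
class ParseError(Exception):
    pass


class SketchParser(object):
    def __init__(self, fields):
        self.fields = list(fields)
        self.allow_leading_wildcard = False  # off unless explicitly enabled

    def setAllowLeadingWildcard(self, allow):
        self.allow_leading_wildcard = allow

    def parse_with_settings(self, text):
        """Instance method: consults the settings stored on self."""
        if text[:1] in ("*", "?") and not self.allow_leading_wildcard:
            raise ParseError("leading wildcard not allowed: %r" % text)
        return " ".join("%s:%s" % (f, text) for f in self.fields)

    @staticmethod
    def parse_static(text, fields):
        """Static method: builds a fresh default parser, so settings made
        on any existing instance are never consulted."""
        return SketchParser(fields).parse_with_settings(text)


parser = SketchParser(["name"])
parser.setAllowLeadingWildcard(True)

# The instance method honours the setting.
instance_result = parser.parse_with_settings("*a")

# Calling the static method through the instance changes nothing:
# it still runs without access to that instance's settings.
try:
    parser.parse_static("*a", ["name"])
    static_failed = False
except ParseError:
    static_failed = True
```

Calling a static method through an instance is legal in Python, which is part of what makes the mistake easy to overlook.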

On Aug 18, 2010, at 5:21 AM, Christoph Burgmer wrote:

> Hi Andy,
> 
> here are short examples in Java and Python that do the same thing. The Python
> example raises the error mentioned by Daniel. I've also added a patch to
> extend the current test cases to include tests for setAllowLeadingWildcard().
> 
> -Christoph
> <LeadingWildcardExample.java><LeadingWildcardExample.py><leadingwildcard_test.patch>


Re: setAllowLeadingWildcard and PythonMultiFieldQueryParser

Posted by Christoph Burgmer <cb...@ira.uka.de>.
Hi Andy,

here are short examples in Java and Python that do the same thing. The Python
example raises the error mentioned by Daniel. I've also added a patch to
extend the current test cases to include tests for setAllowLeadingWildcard().

-Christoph