Posted to solr-user@lucene.apache.org by Günter Hipler <vo...@gmail.com> on 2012/08/30 17:14:19 UTC

use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Hi all,

My query against the index is as follows (I left out some of the facet fields):
f.navBranchlib.facet.limit=1000&
facet=on&facet.mincount=1&
facet.limit=100&
bq=navBranchlib:A100^1000&
bq=navBranchlib:UFSW^1000&
start=0&q=+(+%2Bmitbestimmung++)+&
facet.field=navNetwork&
qt=sb-bbfull-01
-> qt refers to an edismax query-parser

I get a result for the navNetwork facets which looks like

<lst name="navNetwork">
<int name="ids">3810</int>
<int name="nebis">2732</int>
<int name="idsbb">1945</int>
</lst>

Using an fq parameter to drill down into the navNetwork facets:
facet=on&facet.mincount=1&
facet.limit=100&
q=(+(+%2Bmitbestimmung++)+)&
facet.field=navNetwork&
qt=sb-bbfull-01&
fq={!term+f%3DnavNetwork}nebis
delivers 2806 documents instead of the expected 2732.
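
As a minimal sketch (not part of the original request), the drill-down parameters above could be assembled and URL-encoded in Python as follows. The helper name drilldown_params is made up for illustration; the parameter names and values are the ones shown in this post.

```python
from urllib.parse import urlencode, parse_qs


def drilldown_params(term, facet_field, facet_value):
    """Return an encoded query string that filters on one facet value."""
    params = [
        ("q", "+" + term),                # mandatory search term
        ("facet", "on"),
        ("facet.mincount", "1"),
        ("facet.limit", "100"),
        ("facet.field", facet_field),
        ("qt", "sb-bbfull-01"),           # request handler named in this post
        # Local-params prefix: the term parser takes the value literally.
        ("fq", "{!term f=%s}%s" % (facet_field, facet_value)),
    ]
    return urlencode(params)


query_string = drilldown_params("mitbestimmung", "navNetwork", "nebis")
decoded = parse_qs(query_string)          # round-trip to check the encoding
```

Letting urlencode do the escaping avoids hand-encoding the braces and the equals sign inside the local-params prefix.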


A boolean query instead of the fq returns the correct result of 2732
documents:
facet=on&facet.mincount=1&
facet.limit=100&
q=%2Bmitbestimmung+%2BnavNetwork:nebis&
facet.field=navNetwork&
qt=sb-bbfull-01&



The behaviour is not consistent. Some of the facets return the correct
result, some do not.
What I can't say for sure: the behaviour was correct (if I'm not mistaken)
right after the whole index was newly created. Only after running
some updates did I get these results.
The application reflecting this behaviour is available under:
http://sb-tp1.swissbib.unibas.ch
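
One way to make the inconsistency measurable is to compare the facet count from the base query with numFound from the drill-down query. The following is only a sketch: it assumes both responses were fetched with wt=json and use Solr's default flat facet layout (value, count, value, count, ...). The function names are hypothetical, and the two sample dictionaries are hand-built from the numbers quoted in this post, not real server output.

```python
def facet_count(response, field, value):
    """Look up the count of one facet value in a parsed Solr JSON response."""
    flat = response["facet_counts"]["facet_fields"][field]
    counts = dict(zip(flat[0::2], flat[1::2]))   # pair up value/count entries
    return counts.get(value, 0)


def drilldown_mismatch(base_response, filtered_response, field, value):
    """Return numFound of the filtered query minus the facet count
    reported by the base query; 0 means the two are consistent."""
    expected = facet_count(base_response, field, value)
    found = filtered_response["response"]["numFound"]
    return found - expected


# Hand-built samples using the numbers from this post (assumption: JSON layout).
base = {"facet_counts": {"facet_fields": {
    "navNetwork": ["ids", 3810, "nebis", 2732, "idsbb", 1945]}}}
filtered = {"response": {"numFound": 2806}}

mismatch = drilldown_mismatch(base, filtered, "navNetwork", "nebis")
```

Running this check for every facet value would show which values are affected and by how much.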

We have been using Lucene/Solr since the end of last year and have regularly
deployed the various nightly builds.
The last version in which this error(?) did not appear is from March 2012. The
application using it is available under
http://baselbern.swissbib.ch
The target "books and more" uses the Lucene 4.0 March version. That
index is updated several times a day and uses the same
filter queries as the ones used with Lucene/Solr 4.0 alpha and beta.

My question:
- has something changed in the last versions or is this a bug?

Günter Hipler

Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by Jack Krupansky <ja...@basetechnology.com>.
And what happens if you use a straight Lucene query for the filter query? 
Change:

    fq={!term+f%3DnavNetwork}nebis

to

    fq=navNetwork:nebis

What field type is navNetwork? String or text?

-- Jack Krupansky



Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by "guenter.hipler@unibas.ch" <gu...@unibas.ch>.
Feedback on the patch:
I used build #85 (revision 1382192) to test the same use case (build
an initial index of 18 million documents, then run updates with around
200,000 documents).

Result: using fq to drill down into facets is now consistent! (Available
under http://sb-tp1.swissbib.unibas.ch)

Thanks for providing a quick patch!!

-Günter



-- 
Universität Basel
Universitätsbibliothek
Günter Hipler
Projekt SwissBib
Schoenbeinstrasse 18-20
4056 Basel, Schweiz
Tel.: + 41 (0)61 267 31 12 Fax: ++41 61 267 3103
E-Mail: guenter.hipler@unibas.ch
URL: www.swissbib.org / http://www.ub.unibas.ch/


Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by Erick Erickson <er...@gmail.com>.
Thank the guys who actually fixed it!

Thanks for bringing this up, and please let us know if Yonik's patch fixes
your problem....

Best
Erick


Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by "guenter.hipler@unibas.ch" <gu...@unibas.ch>.
Erick, thanks for your response!
Our use case is very straightforward and basic:
- no cloud infrastructure
- XMLUpdateRequest handler (transformed library bibliographic data
which is pushed by the post.jar component). Until two months ago I used
the SolrJ component for deletions, but because of the difficulties
I read about I changed back to the basic procedure with XML documents
- around 18 million documents, no distributed shards
- once the basic use case is stable and maintainable we will move on
to the fancier things ;-)

Yonik provided a patch (https://issues.apache.org/jira/browse/SOLR-3793)
yesterday morning. I'm going to run tests once it is part of the nightly
builds. As of now, if I'm not mistaken
(https://builds.apache.org/job/Solr-Artifacts-4.x/), the latest build
doesn't contain it.

Best wishes from Basel, Günter





Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by Erick Erickson <er...@gmail.com>.
Guenter:

Are you using SolrCloud or straight Solr? And were you updating in
batches (i.e. updating multiple docs at once from SolrJ by using the
server.add(doclist) form)?

There was a bug in this process that caused various docs to show up
in various shards differently. This has been fixed in 4x, any nightly
build should have the fix.

I'm absolutely grasping at straws here, but this was a weird case that
I happen to know about...

Hossman:
of course this all goes up in smoke if you can reproduce this with any
recent compilation of the code.

FWIW
Erick


Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by "guenter.hipler@unibas.ch" <gu...@unibas.ch>.
Hoss, I'm so happy you were able to reproduce the problem, because I was
quite worried about it!!

Let me know if I can help with testing it.
For the last two days I was busy migrating a bunch of hosts, which
should (hopefully) be finished today.
Then I will have the infrastructure for running tests again.

Günter





Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by Chris Hostetter <ho...@fucit.org>.
: Subject: Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Günter, This is definitely strange

The good news is, i can reproduce your problem. 
The bad news is, i can reproduce your problem - and i have no idea what's 
causing it.

I've opened SOLR-3793 to try to get to the bottom of this, and included 
some basic steps to demonstrate the bug using the Solr 4.0-BETA example 
data, but i'm really not sure what the problem might be...

https://issues.apache.org/jira/browse/SOLR-3793


-Hoss

Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by Günter Hipler <vo...@gmail.com>.
Over the weekend I ran more tests with the Lucene/Solr 4.0 version deployed
in March and with the latest Lucene 4.0 beta version.


My findings:

- The version deployed in March doesn't show the error I now come across
in 4.0 beta (the number of documents reported in the facet counts differs
from the actual number of documents returned by a subsequent drill-down
request using a filter query).
This is true even when a lot of updates have been run against the index.
At the moment this can be seen under
http://sb-tp1.swissbib.unibas.ch/ (e.g. with the term 'mitbestimmung'
and the facet value 'nebis', which I used for
all my tests).
As a note: because we have to migrate the OS of our servers the host might
be down in the course of the current week for one or two days.

- Using the latest Lucene/Solr beta version, the error occurs once updates
are committed against the index, as I described in my earlier messages.
When the index is freshly built the error doesn't occur (I ran
these tests on a host which is not publicly accessible).

From my point of view this is a severe bug in Lucene/Solr 4.0 beta because
filter queries are used very, very often!

I would be very happy if someone from the Solr core team could comment on it.

Thanks a lot for support!

Günter Hipler

2012/8/31 Günter Hipler <vo...@gmail.com>

>
> Hi,
>
> thanks for your responses!
>
> I made a more simple query with only one facet and without any boosting
> stuff so it should be easier to focus the problem
>
>
> facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+(+%2Bmitbestimmung++)+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true
> ->
> facet=on&
> facet.mincount=1&
> facet.limit=100&
> rows=0&
> start=0&
> q=+(+%2Bmitbestimmung++)+&
> facet.field=navNetwork&
> qt=only_queryfields_edismax&
> debugQuery=true
>
> facet counts say 2734 documents for nebis
> parsedQuery
> (+(+DisjunctionMaxQuery((title_series:mitbestimmung |
> title_uniform:mitbestimmung | authorfull:mitbestimmung |
> callnum:mitbestimmung | sfulltext:mitbestimmung | title_short:mitbestimmung
> | sbranchlib:mitbestimmung | bibid:mitbestimmung |
> sfullTextRemoteData:mitbestimmung | title_long:mitbestimmung |
> autnum:mitbestimmung | subfull:mitbestimmung |
> publplace:mitbestimmung))))/no_coord
> parsedQuery_toString
> +(+(title_series:mitbestimmung | title_uniform:mitbestimmung |
> authorfull:mitbestimmung | callnum:mitbestimmung | sfulltext:mitbestimmung
> | title_short:mitbestimmung | sbranchlib:mitbestimmung |
> bibid:mitbestimmung | sfullTextRemoteData:mitbestimmung |
> title_long:mitbestimmung | autnum:mitbestimmung | subfull:mitbestimmung |
> publplace:mitbestimmung))
>
>
>
> facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+(+%2Bmitbestimmung++)+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true&fq={!term+f%3DnavNetwork}nebis
> ->
> facet=on&facet.mincount=1&
> facet.limit=100&
> rows=0&
> start=0&
> q=+(+%2Bmitbestimmung++)+&
> facet.field=navNetwork&
> qt=only_queryfields_edismax&
> debugQuery=true&
> fq={!term+f%3DnavNetwork}nebis
>
> delivers 2871 (not the same as the number indicated in the base query)
> What is interesting:
> the facetcount of the second query itself shows the 'correct' number
> indicated in the base query (2734)
>
> parsedQuery and parsedQuery_ToString same as in base query
> @Jack: and is exactly the same for a filter query with fq=navNetwork:nebis
> we are using the term query parser to overcome problems with escaping
> special characters (as it is also described in the
> Solr Enterprise Search server book on page 189)
>
>
> Using the alternatives suggested by Hoss
>
> http://sb-s7.swissbib.unibas.ch:8080/solr/collection1/select?facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+%28+%2Bmitbestimmung++%29+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true&fq={!raw%20f=navNetwork}nebis
> and
>
> facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+(+%2Bmitbestimmung++)+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true&fq={!lucene}navNetwork:nebis
> don't change the result. The number of returned documents is higher than
> it should be related to the number of facets in the facet counts displayed
> in the base query
>
>
> the type we are using for navNetwork:
>  <field name="navNetwork" type="stdID" multiValued="true" stored="false" />
> <!-- text field type for IDs of all sorts and colors, generic usage
> (20.03.2012/osc) -->
>    <fieldType name="stdID" class="solr.TextField" sortMissingLast="true"
> omitNorms="true">
>       <analyzer>
>          <tokenizer class="solr.KeywordTokenizerFactory" />
>          <filter class="solr.LowerCaseFilterFactory" />
>          <filter class="solr.PatternReplaceFilterFactory"
>             pattern="^(\([a-z]+\))vtls0"
>             replacement="$10"
>             replace="all"
>          />
>          <filter class="solr.PatternReplaceFilterFactory"
>             pattern="[^\w]+"
>             replacement=""
>             replace="all"
>          />
>          <filter class="solr.TrimFilterFactory" />
>          <filter class="solr.LengthFilterFactory" min="2" max="100" />
>       </analyzer>
>    </fieldType>
>
>
> which in my opinion should be a common treatment for facet types
>
> the new requestHandler I'm using is quite simple (without any boosting and
> other stuff as it is done in the original one):
>    <requestHandler default="true" name="only_queryfields_edismax"
> class="solr.SearchHandler">
>       <lst name="defaults">
>         <!-- use the extended dismax query parser -->
>         <str name="defType">edismax</str>
>         <str name="echoParams">explicit</str>
>         <str name="qf">
>           title_long title_short title_uniform title_series authorfull
>           publplace subfull sfulltext sfullTextRemoteData syear bibid
>           sbranchlib callnum autnum
>         </str>
>       </lst>
>     </requestHandler>
>
>
> What I try to do as next as soon as possible:
> - I'm going to setup a new index with the Lucene 4.0 version from March
> (to be more exactly: it's version 4.0-2012-03-09_11-29-20)
> to see what are the results even in case of frequent updates
>
> - setup a 'new' index with Lucene beta4 (without any updates) and to test
> more thoroughly if I get the same not consistent results (as it is
> currently after updating the index)
>
>
> Thanks a lot for your support!
>
> Günter
>
>
>
>
>
> 2012/8/30 Chris Hostetter <ho...@fucit.org>
>
>>
>> The "q" and "bq" params have changed slightly between your first query and
>> the query where you add the "fq" param ... because of how "bq" is
>> additively added to the main query, it's possible this difference may
>> account for the behavior you are seeing -- double check the debugQuery
>> output for your main query between the two requests to see if they match
>> up.  Heck: you can try the second query w/o the "fq" and sanity check that
>> it still matches the same number of docs as the first query.
>>
>> If that's working fine, can you please give us more info about your
>> "navNetwork" field, how is it configured?
>>
>> if you could show us the debugQuery output and numFound for these simple
>> queries (no special requestHandler settings please) that would also be
>> helpful..
>>
>>         /select?q={!raw f=navNetwork}nebis
>>         /select?q={!term f=navNetwork}nebis
>>         /select?q={!lucene}navNetwork:nebis
>>
>>
>>
>> -Hoss
>
>
>
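
The three sanity-check queries Hoss asks for above differ only in the query parser named in the local-params prefix. A small sketch for generating them is shown here; the base URL is a placeholder for a local test instance (an assumption, not one of the hosts in this thread), and rows=0 plus debugQuery=true are added only because he asks for numFound and the debug output.

```python
from urllib.parse import urlencode

# One query per parser; the field and value are the ones from this thread.
PARSER_QUERIES = {
    "raw": "{!raw f=navNetwork}nebis",
    "term": "{!term f=navNetwork}nebis",
    "lucene": "{!lucene}navNetwork:nebis",
}


def sanity_check_urls(base_url):
    """Build one /select URL per query parser, without any filter query."""
    urls = {}
    for name, query in PARSER_QUERIES.items():
        params = urlencode({"q": query, "rows": 0, "debugQuery": "true"})
        urls[name] = base_url + "/select?" + params
    return urls


# Hypothetical base URL of a local Solr core.
urls = sanity_check_urls("http://localhost:8983/solr/collection1")
```

If all three URLs report the same numFound, the parsers agree on the term itself and the discrepancy has to come from how the filter is applied.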

Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by Günter Hipler <vo...@gmail.com>.
Hi,

thanks for your responses!

I built a simpler query with only one facet and without any boosting,
so it should be easier to focus on the problem:

facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+(+%2Bmitbestimmung++)+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true
->
facet=on&
facet.mincount=1&
facet.limit=100&
rows=0&
start=0&
q=+(+%2Bmitbestimmung++)+&
facet.field=navNetwork&
qt=only_queryfields_edismax&
debugQuery=true

facet counts say 2734 documents for nebis
parsedQuery
(+(+DisjunctionMaxQuery((title_series:mitbestimmung |
title_uniform:mitbestimmung | authorfull:mitbestimmung |
callnum:mitbestimmung | sfulltext:mitbestimmung | title_short:mitbestimmung
| sbranchlib:mitbestimmung | bibid:mitbestimmung |
sfullTextRemoteData:mitbestimmung | title_long:mitbestimmung |
autnum:mitbestimmung | subfull:mitbestimmung |
publplace:mitbestimmung))))/no_coord
parsedQuery_toString
+(+(title_series:mitbestimmung | title_uniform:mitbestimmung |
authorfull:mitbestimmung | callnum:mitbestimmung | sfulltext:mitbestimmung
| title_short:mitbestimmung | sbranchlib:mitbestimmung |
bibid:mitbestimmung | sfullTextRemoteData:mitbestimmung |
title_long:mitbestimmung | autnum:mitbestimmung | subfull:mitbestimmung |
publplace:mitbestimmung))


facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+(+%2Bmitbestimmung++)+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true&fq={!term+f%3DnavNetwork}nebis
->
facet=on&facet.mincount=1&
facet.limit=100&
rows=0&
start=0&
q=+(+%2Bmitbestimmung++)+&
facet.field=navNetwork&
qt=only_queryfields_edismax&
debugQuery=true&
fq={!term+f%3DnavNetwork}nebis

delivers 2871 documents (not the number indicated in the base query).
What is interesting:
the facet count in the second query's own response shows the 'correct' number
indicated in the base query (2734).

parsedQuery and parsedQuery_toString are the same as in the base query.
@Jack: the result is exactly the same for a filter query with fq=navNetwork:nebis.
We are using the term query parser to avoid problems with escaping
special characters (as also described in the
Solr Enterprise Search Server book on page 189).
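
To illustrate the escaping point, here is a rough sketch contrasting the two styles of filter query. The list of special characters is hand-written for this example and is an assumption, not copied from Solr; the facet value "A100 (old)" is invented to show the effect.

```python
import re

# Characters treated as special by the classic Lucene query syntax
# (hand-written list for illustration), plus whitespace.
_SPECIAL = re.compile(r'([\\+\-!():^\[\]"{}~*?|&;/\s])')


def lucene_fq(field, value):
    """Filter query for the default parser: every special char is escaped."""
    escaped = _SPECIAL.sub(r"\\\1", value)
    return "%s:%s" % (field, escaped)


def term_fq(field, value):
    """Filter query for the term parser: the value is passed through as-is."""
    return "{!term f=%s}%s" % (field, value)


with_escaping = lucene_fq("navBranchlib", "A100 (old)")
without_escaping = term_fq("navBranchlib", "A100 (old)")
```

With the term parser the client code does not need to know the query syntax at all, which is why it is convenient for facet values that come straight from the index.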


Using the alternatives suggested by Hoss
http://sb-s7.swissbib.unibas.ch:8080/solr/collection1/select?facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+%28+%2Bmitbestimmung++%29+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true&fq={!raw%20f=navNetwork}nebis
and
facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+(+%2Bmitbestimmung++)+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true&fq={!lucene}navNetwork:nebis
does not change the result. The number of returned documents is still
higher than it should be, judging by the facet counts displayed in the
base query.
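To make the mismatch easy to check for each variant, here is a minimal Python sketch that compares numFound with the facet count for the filtered value. It assumes the response was requested with wt=json and the default flat facet list ([term, count, term, count, ...]). The helper names are made up for illustration, and the sample dict is hand-built from the numbers reported above, not a captured response.

```python
def facet_counts(response, field):
    """Turn Solr's flat [term, count, term, count, ...] facet list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


def check_drilldown(response, field, value):
    """Return (numFound, facet count for value, difference) for one response."""
    num_found = response["response"]["numFound"]
    expected = facet_counts(response, field).get(value, 0)
    return num_found, expected, num_found - expected


# Hand-built sample using the numbers from this thread (assumption: JSON shape
# as produced by wt=json with the default flat facet list).
sample = {
    "response": {"numFound": 2871, "start": 0, "docs": []},
    "facet_counts": {"facet_fields": {"navNetwork": ["nebis", 2734]}},
}

num_found, expected, surplus = check_drilldown(sample, "navNetwork", "nebis")
print(f"numFound={num_found} facet={expected} surplus={surplus}")
```

With the numbers above, the surplus is 137 documents. Running the same check on the {!term}, {!raw} and {!lucene} responses would show whether the surplus is identical for all three.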


the type we are using for navNetwork:
 <field name="navNetwork" type="stdID" multiValued="true" stored="false" />
<!-- text field type for IDs of all sorts and colors, generic usage
(20.03.2012/osc) -->
   <fieldType name="stdID" class="solr.TextField" sortMissingLast="true"
omitNorms="true">
      <analyzer>
         <tokenizer class="solr.KeywordTokenizerFactory" />
         <filter class="solr.LowerCaseFilterFactory" />
         <filter class="solr.PatternReplaceFilterFactory"
            pattern="^(\([a-z]+\))vtls0"
            replacement="$10"
            replace="all"
         />
         <filter class="solr.PatternReplaceFilterFactory"
            pattern="[^\w]+"
            replacement=""
            replace="all"
         />
         <filter class="solr.TrimFilterFactory" />
         <filter class="solr.LengthFilterFactory" min="2" max="100" />
      </analyzer>
   </fieldType>


In my opinion this should be a common treatment for facet fields.
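For readers who want to see what the stdID chain does to a value, here is a rough Python approximation. It is only a sketch: the function name std_id and the sample ID are made up, Python's lower() and regex engine are not identical to Lucene's filters, and the replacement "$10" is read here as "group 1 followed by a literal 0", which is how I understand Java's replacement syntax when the pattern has only one group.

```python
import re


def std_id(value, min_len=2, max_len=100):
    """Approximate the stdID analyzer chain; returns None if the token is dropped."""
    # KeywordTokenizerFactory: the whole input is treated as a single token.
    token = value.lower()  # LowerCaseFilterFactory
    # First PatternReplaceFilterFactory: "(abc)vtls0..." becomes "(abc)0..."
    token = re.sub(r"^(\([a-z]+\))vtls0", r"\g<1>0", token)
    # Second PatternReplaceFilterFactory: drop everything that is not [A-Za-z0-9_]
    token = re.sub(r"[^\w]+", "", token, flags=re.ASCII)
    token = token.strip()  # TrimFilterFactory
    # LengthFilterFactory: keep only tokens of min_len..max_len characters
    if min_len <= len(token) <= max_len:
        return token
    return None


# "(IDSBB)vtls0012345" is a hypothetical ID, used only to exercise the pattern.
for raw in ["nebis", "NEBIS", "(IDSBB)vtls0012345", "A"]:
    print(repr(raw), "->", repr(std_id(raw)))
```

Under this sketch the literal nebis passes through the chain unchanged, so whether a query parser applies the analyzer or not would not be expected to alter the term being looked up.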

The new requestHandler I'm using is quite simple (without the boosting and
other settings used in the original one):
   <requestHandler default="true" name="only_queryfields_edismax"
class="solr.SearchHandler">
      <lst name="defaults">
        <!-- use the extended dismax query parser -->
        <str name="defType">edismax</str>
        <str name="echoParams">explicit</str>
        <str name="qf">
          title_long title_short title_uniform title_series authorfull
          publplace subfull sfulltext sfullTextRemoteData syear bibid
          sbranchlib callnum autnum
        </str>
      </lst>
    </requestHandler>


What I will try next, as soon as possible:
- Set up a new index with the Lucene 4.0 version from March
(to be exact: version 4.0-2012-03-09_11-29-20)
to see what the results are, even with frequent updates.

- Set up a 'new' index with Lucene beta4 (without any updates) and test
more thoroughly whether I get the same inconsistent results (as is
currently the case after updating the index).


Thanks a lot for your support!

Günter





Re: use of filter queries in Lucene/Solr Alpha40 and Beta4.0

Posted by Chris Hostetter <ho...@fucit.org>.
The "q" and "bq" params have changed slightly between your first query and
the query where you add the "fq" param ... because of how "bq" is
additively added to the main query, it's possible this difference may
account for the behavior you are seeing -- double check the debugQuery
output for your main query between the two requests to see if they match
up.  Heck: you can try the second query w/o the "fq" and sanity check that
it still matches the same number of docs as the first query.

If that's working fine, can you please give us more info about your 
"navNetwork" field, how is it configured?

If you could show us the debugQuery output and numFound for these simple
queries (no special requestHandler settings please) that would also be
helpful.

	/select?q={!raw f=navNetwork}nebis
	/select?q={!term f=navNetwork}nebis
	/select?q={!lucene}navNetwork:nebis
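The three requests need URL-encoding when sent from a script. A minimal Python sketch for building them follows. The base URL is a placeholder and not taken from this thread, and adding rows=0 and debugQuery=true is an assumption about what is wanted here (numFound plus debug output, no documents).

```python
from urllib.parse import urlencode

# Placeholder base URL; replace it with the real Solr core.
BASE = "http://localhost:8983/solr/collection1/select"

QUERIES = [
    "{!raw f=navNetwork}nebis",
    "{!term f=navNetwork}nebis",
    "{!lucene}navNetwork:nebis",
]


def debug_url(q, base=BASE):
    """Build a /select URL for one query with debug output and no documents."""
    params = {"q": q, "rows": 0, "debugQuery": "true"}
    return base + "?" + urlencode(params)


for q in QUERIES:
    print(debug_url(q))
```

urlencode takes care of the braces, the exclamation mark, the space and the colon, so the local-params syntax reaches Solr unchanged.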


: My query against an index is (I leaved out some of the facet fields)
: f.navBranchlib.facet.limit=1000&
: facet=on&facet.mincount=1&
: facet.limit=100&
: bq=navBranchlib:A100^1000&
: bq=navBranchlib:UFSW^1000&
: start=0&q=+(+%2Bmitbestimmung++)+&
: facet.field=navNetwork&
: qt=sb-bbfull-01
: -> qt refers to an edismax query-parser
: 
: I get a result for the navNetwork facets which looks like
: 
: <lst name="navNetwork">
: <int name="ids">3810</int>
: <int name="nebis">2732</int>
: <int name="idsbb">1945</int>
: </lst>
: 
: using a fq Parameter to drill down against the navNetwork facets
: facet=on&facet.mincount=1&
: facet.limit=100&
: q=(+(+%2Bmitbestimmung++)+)&
: facet.field=navNetwork&
: qt=sb-bbfull-01&
: fq={!term+f%3DnavNetwork}nebis
: delivers 2806 Documents - instead of the expected 2732
: 
: 
: A boolean query instead of the fq is providing the correct result of 2732
: documents
: facet=on&facet.mincount=1&
: facet.limit=100&
: %2Bmitbestimmung+%2BnavNetwork:nebis&
: facet.field=navNetwork&
: qt=sb-bbfull-01&
: 
: 
: 
: The behaviour is not consistent. Some of the facets provide the correct
: result, some not.
: What I can't say for sure: The behaviour was correct (if I'm not wrong)
: once the whole index was newly created. After running
: some updates I got these results.
: The application reflecting this behaviour is available under:
: http://sb-tp1.swissbib.unibas.ch
: 
: We are using Lucene/SOLR since the end of last year and deployed regularly
: the various nightly builds.
: The last version this error(?) didn't appear is from March 2012. The
: application using it is available under
: http://baselbern.swissbib.ch
: The target "books and more" is using the Lucene 4.0 march version. The
: index is being updated several times a day and uses the same
: filter queries as for Lucene/SOLR 4.0 beta and alpha.
: 
: My question:
: - has something changed in the last versions or is this a bug?
: 
: Günter Hipler
: 

-Hoss