Posted to solr-user@lucene.apache.org by "Manepalli, Kalyan" <Ka...@orbitz.com> on 2008/11/20 21:23:20 UTC
Filtering on blank fields
Hi,
I want to fetch only the documents that have a certain field.
For this I am using a filter query (fq) like this:
fq=rev.comments:[* TO *]
The rev.comments field is of type string.
The functionality works correctly, but I am seeing a performance
degradation:
Without the above fq, the QTime is around 300ms.
With the fq, the QTime jumps to 850ms.
Is there any known issue with range queries on string fields?
Is there a more efficient way to do this?
Any suggestions in this regard would be very helpful.
Thanks,
Kalyan Manepalli
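For anyone who wants to reproduce the comparison, here is a minimal sketch in Python of the two requests and of reading QTime from the response. The endpoint URL, the main query "hotel", and the use of wt=json are assumptions made for illustration; only the fq value comes from the message above.

```python
import json
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port, and core for a real setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_select_url(q, fq=None, rows=10):
    """Return a select URL for query q, optionally with one filter query."""
    params = [("q", q), ("rows", rows), ("wt", "json")]
    if fq is not None:
        params.append(("fq", fq))
    return SOLR_SELECT + "?" + urlencode(params)


def qtime_ms(response_text):
    """Read QTime (milliseconds) from a JSON Solr response body."""
    return json.loads(response_text)["responseHeader"]["QTime"]


# "hotel" is a placeholder main query, not taken from the thread.
without_fq = build_select_url("hotel")
with_fq = build_select_url("hotel", fq="rev.comments:[* TO *]")
```

Fetching both URLs several times and comparing qtime_ms() on the responses would separate the cost of the filter from ordinary variation between requests.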
RE: Filtering on blank fields
Posted by Lance Norskog <go...@gmail.com>.
The problem with a zero-length string "" is that it is also returned by:
field:[* TO *]. So you don't know if you're doing this right or not. For
those of us who cannot reindex at the drop of a hat, this is a big deal. We
went with -1.
Lance
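The point about zero-length strings can be illustrated with a toy model in Python. This is a sketch under the assumption that an empty string is indexed as a term like any other while a missing field contributes no term; it is not Lucene code, and the sample documents are invented.

```python
from collections import defaultdict


def build_index(docs):
    """Map each indexed value to the set of doc ids holding it.

    Documents whose field is None contribute no term at all.
    """
    index = defaultdict(set)
    for doc_id, value in docs.items():
        if value is not None:
            index[value].add(doc_id)
    return index


def open_range(index):
    """Toy stand-in for field:[* TO *]: union of the postings of every term."""
    matched = set()
    for postings in index.values():
        matched |= postings
    return matched


# Doc 2 has a zero-length string, doc 3 has no value for the field.
docs = {1: "great stay", 2: "", 3: None}
index = build_index(docs)
matched = open_range(index)
```

In this model the open range matches docs 1 and 2 but not doc 3, so a document indexed with "" looks the same as one that has a real value.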
RE: Filtering on blank fields
Posted by "Manepalli, Kalyan" <Ka...@orbitz.com>.
Hi Mike,
Thanks for the suggestion. I will test it out and post the results.
Thanks,
Kalyan Manepalli
Re: Filtering on blank fields
Posted by Mike Klaas <mi...@gmail.com>.
On 20-Nov-08, at 12:23 PM, Manepalli, Kalyan wrote:
> Hi,
>
> I want to fetch only the documents that have a certain field.
> For this I am using a filter query (fq) like this:
>
> fq=rev.comments:[* TO *]
>
> The rev.comments field is of type string.
>
> The functionality works correctly, but I am seeing a performance
> degradation:
>
> Without the above fq, the QTime is around 300ms.
> With the fq, the QTime jumps to 850ms.
>
> Is there any known issue with range queries on string fields?
> Is there a more efficient way to do this?
This is an inverted index at its worst, unfortunately: to find the
docs that have (or lack) any value in a field, the query has to
enumerate every distinct value of that field and collect the docs
containing it.
The solution is to store a token indicating that the field is empty,
such as "<nocomment>" (I think that "" works too). Then change your
fq to
fq=-rev.comments:"<nocomment>"
It should be much faster.
-Mike
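The difference between the two approaches can be sketched with a toy inverted index in Python. This is an illustrative model only, not Lucene internals: the function names, the sample data, and the "terms visited" counter are all assumptions, and real timings also depend on things such as filter caching.

```python
from collections import defaultdict

NO_COMMENT = "<nocomment>"  # placeholder token suggested in the message above


def build_index(comments, placeholder=None):
    """Map each term to the doc ids containing it.

    Docs without a comment are indexed under the placeholder if one is
    given; otherwise they contribute no term.
    """
    index = defaultdict(set)
    for doc_id, text in comments.items():
        term = text if text else placeholder
        if term is not None:
            index[term].add(doc_id)
    return index


def docs_with_field_by_range(index):
    """Toy [* TO *]: visit every term and union its postings."""
    visited = 0
    matched = set()
    for postings in index.values():
        visited += 1
        matched |= postings
    return matched, visited


def docs_with_field_by_placeholder(index, all_docs, placeholder):
    """Toy negative filter: one term lookup, then subtract from all docs."""
    return all_docs - index.get(placeholder, set()), 1


# Invented data: 1000 docs with distinct comments, 100 docs without one.
comments = {i: f"comment {i}" for i in range(1, 1001)}
comments.update({i: None for i in range(1001, 1101)})

plain = build_index(comments)
range_docs, range_terms = docs_with_field_by_range(plain)

tagged = build_index(comments, NO_COMMENT)
neg_docs, neg_terms = docs_with_field_by_placeholder(tagged, set(comments), NO_COMMENT)
```

Both approaches return the same 1000 docs here, but the range version visits one term per distinct comment while the placeholder version looks up a single term. The cost is that the placeholder has to be written at index time, so existing documents need reindexing.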