Posted to solr-user@lucene.apache.org by Viswa S <sv...@hotmail.com> on 2010/11/20 21:12:00 UTC

Empty value/string matching

Folks,

I am trying to query for documents that have no value in a field. I have tried the following constructs, and none of them seems to work on the Solr dev tip (as of 09/22) or the 1.4 builds:

1. (*:* AND -FieldName:[* TO *]) - returns no documents; parsedquery was "+MatchAllDocsQuery(*:*) -FieldName:[* TO *]"
2. -FieldName:[* TO *] - returns no documents; parsedquery was "-FieldName:[* TO *]"
3. FieldName:"" - returns no documents; parsedquery was empty (<str name="parsedquery"/>)

The field is of type string, and I am using the LuceneQParser. I also ran "FieldName:[* TO *]" to check whether documents with no terms are ignored; that did not seem to be the case, since the result set was everything.

Any help would be appreciated.

-Viswa
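
For anyone reproducing this, here is a minimal Python sketch that builds the request URLs for the query shapes discussed in this message. The base URL (host, port, path) and the helper name select_url are assumptions for illustration, not taken from the thread; debugQuery=on is included so that the response carries the parsedquery output quoted above. The sketch only builds URLs and does not contact a server.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port and path for your own install.
SOLR_SELECT = "http://localhost:8983/solr/select"

FIELD = "FieldName"


def select_url(query, debug=True):
    """Build a select URL for `query`; optionally request debug output."""
    params = {"q": query}
    if debug:
        # debugQuery makes the response include the parsed query.
        params["debugQuery"] = "on"
    return SOLR_SELECT + "?" + urlencode(params)


# The query shapes tried in the message above, plus the "has any value" check.
queries = {
    "missing (anchored)": "*:* -%s:[* TO *]" % FIELD,
    "missing (pure negative)": "-%s:[* TO *]" % FIELD,
    "zero-length value": '%s:""' % FIELD,
    "has any value": "%s:[* TO *]" % FIELD,
}

for label, q in queries.items():
    print(label, "->", select_url(q))
```

Comparing the hit count of the "has any value" URL with the total document count is a quick way to see whether any document really lacks the field.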

Re: Empty value/string matching

Posted by Lance Norskog <go...@gmail.com>.
If a string field has the value " " (a single space), that value has to
be searched for explicitly; fieldName:" " should work.
If there is a 0-length value in a string field, it might be found
with fieldName:"", but I have no experience with 0-length values. I
don't know whether this adds a value to the field or not:
"<field name="fieldName"></field>"

One way to find out is to make that field required in the schema. If
no value goes in, you'll get an error.

The facet output should list " " and "".


On Sat, Nov 20, 2010 at 2:38 PM, Viswa S <sv...@hotmail.com> wrote:
>
> Erick,
> Thanks for the quick response. The output i showed is on a test instance i created to simulate this issue. I intentionally tried to create documents with no values by creating xml nodes with "<field name="fieldName"></field>", but having values in the other fields in a document.
> Are you saying that there is no way have a field with no value?, with text fields they seem to make sense than for string?.
> You are right on fieldName:[* TO *] results, which basically returned all the documents which included the couple of documents in question.
> -Viswa
>> Date: Sat, 20 Nov 2010 17:20:53 -0500
>> Subject: Re: Empty value/string matching
>> From: erickerickson@gmail.com
>> To: solr-user@lucene.apache.org
>>
>> I don't think that's correct. The documents wouldn't be showing
>> up in the facets if they had no value for the field. So I think you're
>> being mislead by the printout from the faceting. Perhaps you
>> have unprintable characters in there or some such. Certainly the
>> name:" " is actually a value, admittedly just a space. As for the
>> other, I suspect something similar.
>>
>> What results do you get back when you just search for
>> FieldName:[* TO *]? I'm betting you get all the docs back,
>> but I've been very wrong before.
>>
>> Best
>> Erick
>>
>> On Sat, Nov 20, 2010 at 5:02 PM, Viswa S <sv...@hotmail.com> wrote:
>>
>> >
>> > Yes I do have a couple of documents with no values and one with an empty
>> > string. Find below the output of a facet on the fieldName.
>> > ThanksViswa
>> >
>> >
>> > <int name="">2</int><int name="CASTIGO.430">2</int><int
>> > name="GDOGPRODY.424">2</int><int name="QMAGIC.412">2</int><int name="
>> > ">1</int>
>> > > Date: Sat, 20 Nov 2010 15:29:06 -0500
>> > > Subject: Re: Empty value/string matching
>> > > From: erickerickson@gmail.com
>> > > To: solr-user@lucene.apache.org
>> > >
>> > > Are you absolutely sure your documents really don't have any values for
>> > > "FieldName"? Because your results are perfectly correct if every doc has
>> > a
>> > > value for "FieldName".
>> > >
>> > > Or are you saying there no such field as "FieldName"?
>> > >
>> > > Best
>> > > Erick
>> > >
>> > > On Sat, Nov 20, 2010 at 3:12 PM, Viswa S <sv...@hotmail.com> wrote:
>> > >
>> > > >
>> > > > Folks,Am trying to query documents which have no values present, I have
>> > > > used the following constructs and it doesn't seem to work on the solr
>> > dev
>> > > > tip (as of 09/22) or the 1.4 builds.1. (*:* AND -FieldName[* TO *]) -
>> > > > returns no documents, parsedquery was "+MatchAllDocsQuery(*:*)
>> > -FieldName:[*
>> > > > TO *]"2. -FieldName:[* TO *] -  returns no documents, parsedquery was
>> > > > "-FieldName:[* TO *]"3. FieldName:"" - returns no documents,
>> > parsedquery was
>> > > > empty (<str name="parsedquery"/>)The field is type string, using the
>> > > > LuceneQParser, I have also tried to see if "FieldName:[* TO *]" if the
>> > > > documents with no terms are ignored and didn't seem to be the case, the
>> > > > result set was everything.Any help would be appreciated.-Viswa
>> > > >
>> >
>> >
>



-- 
Lance Norskog
goksron@gmail.com

RE: Empty value/string matching

Posted by Viswa S <sv...@hotmail.com>.
Does anyone know why this would not be working in Solr? Just to recap, we are trying to exclude documents which have fields with missing values from the search results. I have tried the following, and none of it seems to work:
1. *:* -field:[* TO *]
2. -field:[* TO *]
3. field:""
The fields are typed either string or a custom type, and the query parser used is the LuceneQParser. The solution suggested below, using default values, does not work for our use case.
Thanks
Viswa

> From: bob.sandiford@sirsidynix.com
> To: solr-user@lucene.apache.org
> Date: Mon, 22 Nov 2010 08:35:22 -0700
> Subject: RE: Empty value/string matching
> 
> One possibility to consider - if you really need documents with specifically empty or non-defined values (if that's not an oxymoron :)), and you have control over the values you send into the indexing, you could set a special value that means 'no value'. We've done that in a similar vein, using something like '@@EMPTY@@' for a given field, meaning that the original document didn't actually have a value for that field.  I.E. it is something very unlikely to be a 'real' value - and then we can easily select on documents by querying for the field:@@EMPTY@@ instead of the negated form of the select...  However, we haven't considered things like what it does to index size.  It's relatively rare for us (that there not be a value), so our 'gut feel' is that it's not impacting the indexes very much size-wise or performance-wise.
> 
> Bob Sandiford | Lead Software Engineer | SirsiDynix
> P: 800.288.8020 X6943 | Bob.Sandiford@sirsidynix.com
> www.sirsidynix.com 
> 

RE: Empty value/string matching

Posted by Bob Sandiford <bo...@sirsidynix.com>.
One possibility to consider: if you really need to find documents with specifically empty or undefined values (if that's not an oxymoron :)), and you have control over the values you send into indexing, you could set a special value that means 'no value'. We've done something in a similar vein, using something like '@@EMPTY@@' for a given field to mean that the original document didn't actually have a value for that field. I.e., it is something very unlikely to be a 'real' value, and we can then easily select those documents by querying for field:@@EMPTY@@ instead of the negated form of the select. However, we haven't considered things like what it does to index size. It's relatively rare for us that there is no value, so our 'gut feel' is that it's not impacting the indexes very much size-wise or performance-wise.

Bob Sandiford | Lead Software Engineer | SirsiDynix
P: 800.288.8020 X6943 | Bob.Sandiford@sirsidynix.com
www.sirsidynix.com 
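
The sentinel approach described in this message could be sketched in Python roughly as follows. This is only an illustration under assumptions: the helper names with_sentinels and sentinel_query are hypothetical, documents are modelled as plain dicts before they are sent to indexing, and only absent, None, or zero-length values are replaced (a single space is left alone, since it is a real value).

```python
# The marker mentioned in the message above; any value that cannot occur
# in real data would do.
EMPTY_SENTINEL = "@@EMPTY@@"


def with_sentinels(doc, fields, sentinel=EMPTY_SENTINEL):
    """Return a copy of `doc` in which every field in `fields` that is
    absent, None, or a zero-length string is set to `sentinel`."""
    out = dict(doc)
    for field in fields:
        value = out.get(field)
        if value is None or value == "":
            out[field] = sentinel
    return out


def sentinel_query(field, sentinel=EMPTY_SENTINEL):
    """Build a positive query for documents that carry the sentinel."""
    return '%s:"%s"' % (field, sentinel)


docs = [
    {"id": "1", "fieldName": "CASTIGO.430"},
    {"id": "2", "fieldName": ""},
    {"id": "3"},
]
for doc in docs:
    print(with_sentinels(doc, ["fieldName"]))
print(sentinel_query("fieldName"))
```

The point of the design is that the "no value" case becomes an ordinary positive term query, so it no longer depends on how negated range queries are handled.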



RE: Empty value/string matching

Posted by Viswa S <sv...@hotmail.com>.
Erick,
Thanks for the quick response. The output I showed is from a test instance I created to simulate this issue. I intentionally tried to create documents with no values by creating XML nodes with "<field name="fieldName"></field>", while giving the other fields in the document values.
Are you saying that there is no way to have a field with no value? That seems to make more sense for text fields than for string fields.
You are right about the fieldName:[* TO *] results: the query returned all the documents, including the couple of documents in question.
-Viswa

Re: Empty value/string matching

Posted by Erick Erickson <er...@gmail.com>.
I don't think that's correct. The documents wouldn't be showing
up in the facets if they had no value for the field. So I think you're
being misled by the printout from the faceting. Perhaps you
have unprintable characters in there or some such. Certainly the
name:" " is actually a value, admittedly just a space. As for the
other, I suspect something similar.

What results do you get back when you just search for
FieldName:[* TO *]? I'm betting you get all the docs back,
but I've been very wrong before.

Best
Erick
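
One way to check the "unprintable characters" theory is to look at the raw field values outside of Solr. The following Python sketch is an assumption-laden illustration (the helper name suspicious_chars is hypothetical): it reports every whitespace, control, or format character in a value together with its code point and Unicode category.

```python
import unicodedata


def suspicious_chars(value):
    """Return (index, code point, category) for each character in `value`
    that is whitespace, a control character (Cc) or a format character (Cf)."""
    hits = []
    for i, ch in enumerate(value):
        cat = unicodedata.category(ch)
        if ch.isspace() or cat in ("Cc", "Cf"):
            hits.append((i, "U+%04X" % ord(ch), cat))
    return hits


# Sample values: an ordinary term, a single space, a zero-length string,
# and a value hiding a zero-width space.
for value in ["CASTIGO.430", " ", "", "a\u200bb"]:
    print(repr(value), suspicious_chars(value))
```

A value that prints as nothing but produces a non-empty list here is not really empty, which would explain why it shows up in the facets.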


RE: Empty value/string matching

Posted by Viswa S <sv...@hotmail.com>.
Yes, I do have a couple of documents with no values and one with an empty string. Below is the output of a facet on the fieldName.
Thanks
Viswa

<int name="">2</int>
<int name="CASTIGO.430">2</int>
<int name="GDOGPRODY.424">2</int>
<int name="QMAGIC.412">2</int>
<int name=" ">1</int>
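
Because a zero-length name and a single-space name look nearly identical in the raw XML, it can help to print the facet names with repr(). Here is a minimal Python sketch that parses the counts posted above; the wrapping <lst> element and the helper names facet_counts and describe are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Facet counts as posted above, wrapped in a root element so the fragment
# parses (the wrapper element is an assumption about the full response).
FACET_XML = (
    '<lst name="fieldName">'
    '<int name="">2</int><int name="CASTIGO.430">2</int>'
    '<int name="GDOGPRODY.424">2</int><int name="QMAGIC.412">2</int>'
    '<int name=" ">1</int>'
    '</lst>'
)


def facet_counts(xml_text):
    """Return (value, count) pairs for every <int> element in the fragment."""
    root = ET.fromstring(xml_text)
    return [(el.get("name"), int(el.text)) for el in root.iter("int")]


def describe(value):
    """Classify a facet value as zero-length, whitespace-only or ordinary."""
    if value == "":
        return "zero-length string"
    if value.strip() == "":
        return "whitespace only"
    return "ordinary value"


for name, count in facet_counts(FACET_XML):
    print(repr(name), count, describe(name))
```

In this output the first entry is an indexed zero-length value and the last is a single space; both are values present in the index rather than missing fields.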

Re: Empty value/string matching

Posted by Erick Erickson <er...@gmail.com>.
Are you absolutely sure your documents really don't have any values for
"FieldName"? Because your results are perfectly correct if every doc has a
value for "FieldName".

Or are you saying there is no such field as "FieldName"?

Best
Erick
