Posted to java-user@lucene.apache.org by testn <te...@doramail.com> on 2007/07/23 17:32:22 UTC

Search for null

Is it possible to search for documents in which a specified field doesn't exist,
or in which the field's value is null?


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Search for null

Posted by Erick Erickson <er...@gmail.com>.
Nobody can answer that question in general; you have to test in your
particular situation. Filters are very efficient to use once created, and
can be created once and used often.

Adding a special value to stand for an empty field is conceptually
simple, and the queries are straightforward.
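
To make the special-value idea concrete, here is a minimal sketch using a toy
in-memory inverted index rather than Lucene itself. The class name
SentinelIndex and the marker "__NULL__" are assumptions made up for this
illustration; in a real index you would pick a marker that can never occur as
data and index the field untokenized, as suggested elsewhere in this thread.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Toy inverted index (not Lucene): maps "field:term" to the doc ids containing it. */
final class SentinelIndex {
    // Hypothetical marker; choose a value that can never appear as real data.
    static final String MISSING = "__NULL__";

    private final Map<String, BitSet> postings = new HashMap<>();
    private int docCount = 0;

    /** Adds a document; a null or empty value is indexed under the sentinel term. */
    int addDocument(String field, String value) {
        int docId = docCount++;
        String term = (value == null || value.isEmpty()) ? MISSING : value;
        postings.computeIfAbsent(field + ":" + term, k -> new BitSet()).set(docId);
        return docId;
    }

    /** Exact-term lookup, comparable to a term query on an untokenized field. */
    BitSet search(String field, String term) {
        BitSet hits = postings.get(field + ":" + term);
        return hits == null ? new BitSet() : (BitSet) hits.clone();
    }
}
```

Finding the "null" documents is then an ordinary term lookup, for example
search("color", SentinelIndex.MISSING). The cost is that every writer has to
remember to add the marker at index time.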

Unless you can demonstrate that there are speed issues, which method
you choose is largely a matter of taste. Spending valuable programming
time improving query response time by 0.000001% is...er...less than
a good expenditure of time.

Best
Erick

On 7/24/07, testn <te...@doramail.com> wrote:
>
>
> Would it be more efficient to create an additional inverted field where I
> assign a value to that field only when the field I would like to search is
> NULL?
>
>
> daniel rosher wrote:
> >
> > Perhaps you can use a filter in the following way.
> >
> > -Create a filter (via QueryFilter) that contains all documents that
> > do not have null values for the field
> > -Flip the bits of the filter so that it now contains the documents that
> > have null values for the field
> > -Use the filter in conjunction with subsequent queries.
> >
> > This would also help with performance as filters are simply bitsets and
> > can cheaply be stored, generated once and used often.
> >
> > Dan
> >
> > On Mon, 2007-07-23 at 13:57 -0700, Jay Yu wrote:
> >> If you want performance, a better way might be to assign some special
> >> string/value (if it's easy to create) to the missing field of docs and
> >> index the field without tokenizing it. Then you may search for that
> >> special value to find the docs.
> >>
> >> Jay
> >>
> >> Les Fletcher wrote:
> >> > Does this particular range query have any significant performance issues?
> >> >
> >> > Les
> >> >
> >> > Erik Hatcher wrote:
> >> >>
> >> >> On Jul 23, 2007, at 11:32 AM, testn wrote:
> >> >>> Is it possible to search for the document that specified field
> >> >>> doesn't exist
> >> >>> or such field value is null?
> >> >>
> >> >> This is from Solr, so I'm not sure off the top of my head if this mojo
> >> >> applies by itself, but a search for -fieldname:[* TO *] will result in
> >> >> all documents that do not have the specified field.
> >> >>
> >> >>     Erik
> >> >>

Re: Search for null

Posted by testn <te...@doramail.com>.
Would it be more efficient to create an additional inverted field where I
assign a value to that field only when the field I would like to search is
NULL?



Re: Search for null

Posted by Yonik Seeley <yo...@apache.org>.
On 7/24/07, daniel rosher <da...@hotonline.com> wrote:
> Perhaps you can use a filter in the following way.
>
> -Create a filter (via QueryFilter) that would contain all document that
> do not have null values for the field
> -flip the bits of the filter so that it now contains documents that have
> null values for a field
> -Use the filter in conjunction with subsequent queries.

That's pretty much what Solr does with its filters.  A negative
filter like -inStock:true
is generated as its positive counterpart, and cached that way also
(generally smaller, and can satisfy both negative and positive
variants of the filter).
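
As a rough illustration of that idea (a toy model, not Solr's actual cache
code), the sketch below stores only the positive doc set and derives the
negative variant on demand. The class name PositiveFilterCache and the
leading "-" convention for negative clauses are assumptions for this example.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Toy filter cache (not Solr's implementation): stores only positive doc sets. */
final class PositiveFilterCache {
    private final Map<String, BitSet> cache = new HashMap<>();
    private final int maxDoc;

    PositiveFilterCache(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    /** Registers the doc set for a positive clause such as "inStock:true". */
    void put(String positiveClause, BitSet docs) {
        cache.put(positiveClause, (BitSet) docs.clone());
    }

    /** Resolves "clause" or "-clause" from the same cached positive set. */
    BitSet resolve(String clause) {
        boolean negative = clause.startsWith("-");
        String key = negative ? clause.substring(1) : clause;
        BitSet positive = cache.get(key);
        if (positive == null) {
            throw new IllegalArgumentException("no cached filter for " + key);
        }
        BitSet result = new BitSet(maxDoc);
        if (negative) {
            result.set(0, maxDoc);   // start from every document id
            result.andNot(positive); // then remove the positive matches
        } else {
            result.or(positive);
        }
        return result;
    }
}
```

One cache entry answers both inStock:true and -inStock:true, which is the
property described above.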

-Yonik



Re: Search for null

Posted by Daniel Noll <da...@nuix.com>.
On Thursday 26 July 2007 03:12:20 daniel rosher wrote:
> In this case you should look at the source for RangeFilter.java.
>
> Using this you could create your own filter using TermEnum and TermDocs
> to find all documents that had some value for the field.

That's certainly the way to do it for speed.

For the least code you can probably do...

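  // Exclude every document that has any term in "field" (any value sorts
  // at or after the empty string), leaving the docs with no value.
  BooleanFilter f = new BooleanFilter();
  f.add(new FilterClause(RangeFilter.More("field", ""),
                         BooleanClause.Occur.MUST_NOT));
  // Separate variable: a CachingWrapperFilter is not a BooleanFilter
  Filter cached = new CachingWrapperFilter(f);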

Daniel



Re: Search for null

Posted by daniel rosher <da...@hotonline.com>.
In this case you should look at the source for RangeFilter.java. 

Using this you could create your own filter using TermEnum and TermDocs
to find all documents that had some value for the field. 
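
The term walk can be sketched without Lucene as follows. This is a toy model
under stated assumptions: a sorted map stands in for the term dictionary
(what TermEnum iterates), each BitSet stands in for a posting list (what
TermDocs iterates), and the class name FieldPresence is made up for the
example.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

/** Toy model (not Lucene): one sorted term dictionary keyed by field + '\0' + term. */
final class FieldPresence {
    private final TreeMap<String, BitSet> termDict = new TreeMap<>();

    /** Records that docId contains the given term in the given field. */
    void add(String field, String term, int docId) {
        termDict.computeIfAbsent(field + '\0' + term, k -> new BitSet()).set(docId);
    }

    /** Docs with at least one term in the field: the union of that field's posting lists. */
    BitSet docsWithField(String field) {
        BitSet result = new BitSet();
        String prefix = field + '\0';
        // tailMap starts at the first term of the field, like seeking a term enumerator.
        for (Map.Entry<String, BitSet> e : termDict.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // walked past the last term of this field
            }
            result.or(e.getValue());
        }
        return result;
    }
}
```

The cost is proportional to the number of distinct terms in the field plus the
size of their posting lists, which is why it is worth building once and reusing.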

You would then flip this filter (perhaps write a FlipFilter.java that
takes an existing filter in its constructor, for reuse) to get all
documents that didn't have a value for this field (i.e. null values). 

Depending on the time it takes to generate these filters, you could then
cache this filter with CachingWrapperFilter for subsequent searches.

Dan

On Wed, 2007-07-25 at 08:57 -0700, Jay Yu wrote:
> what if I do not know all possible values of that field which is a 
> typical case in a free text search?


Re: Search for null

Posted by Jay Yu <yu...@AI.SRI.COM>.
What if I do not know all possible values of that field, which is
typical in a free-text search?



Re: Search for null

Posted by daniel rosher <da...@hotonline.com>.
You cannot search directly for fields that do not exist, which is what
you originally wanted to do. Instead, you can do something like:

-Establish the query that will select all non-null values

TermQuery tq1 = new TermQuery(new Term("field","value1"));
TermQuery tq2 = new TermQuery(new Term("field","value2"));
...
TermQuery tqn = new TermQuery(new Term("field","valuen"));
BooleanQuery query = new BooleanQuery();
query.add(tq1,BooleanClause.Occur.SHOULD);
query.add(tq2,BooleanClause.Occur.SHOULD);
...
query.add(tqn,BooleanClause.Occur.SHOULD);

OR perhaps a range query if your values are contiguous

Term start = new Term("field","198805");
Term end = new Term("field","198810");
Query query = new RangeQuery(start, end, true);

OR just use the QueryParser

QueryParser parser = new QueryParser("field", new StandardAnalyzer());
Query query = parser.parse(parseCriteria);

-Create the QueryFilter

QueryFilter queryFilter = new QueryFilter(query);

-flip the bits

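// Clone first, in case QueryFilter caches the BitSet it returns
final BitSet filterBitSet = (BitSet) queryFilter.bits(reader).clone();
// Flip up to maxDoc: BitSet.size() is allocated capacity, not the doc count
filterBitSet.flip(0, reader.maxDoc());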

Now you have a filter that contains the documents matching the opposite of
what the query specifies, and you can use it in subsequent queries.
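
One detail worth checking when flipping bits: java.util.BitSet.size() reports
allocated capacity rather than the number of documents, so the flip should be
bounded by the document count, and it is safer to flip a copy than a BitSet
that something else may have cached. A minimal stdlib-only sketch (the class
name BitFlip and the maxDoc parameter are assumptions for illustration):

```java
import java.util.BitSet;

final class BitFlip {
    /** Returns the complement of matches within [0, maxDoc) without touching the input. */
    static BitSet complement(BitSet matches, int maxDoc) {
        BitSet flipped = (BitSet) matches.clone(); // leave any cached original intact
        flipped.flip(0, maxDoc);                   // bound by document count, not size()
        return flipped;
    }
}
```

Note that the complement also sets bits for any deleted document ids below
maxDoc; whether that matters depends on how the filter is applied.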

Dan





Re: Search for null

Posted by Jay Yu <yu...@AI.SRI.COM>.

daniel rosher wrote:
> Perhaps you can use a filter in the following way.
> 
> -Create a filter (via QueryFilter) that would contain all document that
> do not have null values for the field
Interesting: what does the QueryFilter look like? Isn't building it just as
hard as finding out which docs have null values for the field?
I'd really like to know your trick here.
> -flip the bits of the filter so that it now contains documents that have
> null values for a field
> -Use the filter in conjunction with subsequent queries.
> 
> This would also help with performance as filters are simply bitsets and
> can cheaply be stored, generated once and used often.
> 
> Dan
> 
> On Mon, 2007-07-23 at 13:57 -0700, Jay Yu wrote:
>> If you want performance, a better way might be to assign some special 
>> string/value (if it's easy to create) to the missing field of docs and 
>> index the field without tokenizing it. Then you may search for that 
>> special value to find the docs.
>>
>> Jay
>>
>> Les Fletcher wrote:
>>> Does this particular range query have any significant performance issues?
>>>
>>> Les
>>>
>>> Erik Hatcher wrote:
>>>> On Jul 23, 2007, at 11:32 AM, testn wrote:
>>>>> Is it possible to search for the document that specified field 
>>>>> doesn't exist
>>>>> or such field value is null?
>>>> This is from Solr, so I'm not sure off the top of my head if this mojo 
>>>> applies by itself, but a search for -fieldname:[* TO *] will result in 
>>>> all documents that do not have the specified field.
>>>>
>>>>     Erik


Re: Search for null

Posted by daniel rosher <da...@hotonline.com>.
Perhaps you can use a filter in the following way.

- Create a filter (via QueryFilter) that contains all documents that
  do not have null values for the field.
- Flip the bits of the filter so that it now contains the documents that
  have null values for the field.
- Use the filter in conjunction with subsequent queries.

This would also help with performance, as filters are simply bitsets that
can be stored cheaply, generated once, and used often.
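
The three steps above can be sketched in plain Java. This is only an
illustration, not Lucene code: a java.util.BitSet stands in for the bits the
filter would produce, and the class and method names (MissingFieldBits,
docsMissingField, applyFilter) are made up for the example. How the initial
"has the field" set is built, for instance by wrapping a query that matches
every document containing the field, is assumed and left out.

```java
import java.util.BitSet;

// Sketch only: BitSet stands in for a filter's bits; all names are hypothetical.
final class MissingFieldBits {

    // Step 2: flip the "has a value" bits over [0, maxDoc) so the result
    // marks the documents that lack the field. The input is not modified.
    static BitSet docsMissingField(BitSet docsWithField, int maxDoc) {
        BitSet missing = (BitSet) docsWithField.clone();
        missing.flip(0, maxDoc);
        return missing;
    }

    // Step 3: keep only the query hits that are also set in the filter.
    static BitSet applyFilter(BitSet queryHits, BitSet filter) {
        BitSet result = (BitSet) queryHits.clone();
        result.and(filter);
        return result;
    }

    public static void main(String[] args) {
        int maxDoc = 6;

        // Step 1 (assumed to be done already): docs 0, 2 and 3 have the field.
        BitSet docsWithField = new BitSet(maxDoc);
        docsWithField.set(0);
        docsWithField.set(2);
        docsWithField.set(3);

        // Computed once, then reused for every later query.
        BitSet missing = docsMissingField(docsWithField, maxDoc);

        // Hits of some later query: docs 1, 2 and 5.
        BitSet hits = new BitSet(maxDoc);
        hits.set(1);
        hits.set(2);
        hits.set(5);

        System.out.println("missing field: " + missing);
        System.out.println("filtered hits: " + applyFilter(hits, missing));
    }
}
```

The flipped bitset is built once and each later query costs only one AND
against it, which is where the "generated once and used often" benefit comes
from.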

Dan

On Mon, 2007-07-23 at 13:57 -0700, Jay Yu wrote:
> If you want performance, a better way might be to assign some special 
> string/value (if it's easy to create) to the missing field of docs and 
> index the field without tokenizing it. Then you may search for that 
> special value to find the docs.
> 
> Jay
> 
> Les Fletcher wrote:
> > Does this particular range query have any significant performance issues?
> > 
> > Les
> > 
> > Erik Hatcher wrote:
> >>
> >> On Jul 23, 2007, at 11:32 AM, testn wrote:
> >>> Is it possible to search for the document that specified field 
> >>> doesn't exist
> >>> or such field value is null?
> >>
> >> This is from Solr, so I'm not sure off the top of my head if this mojo 
> >> applies by itself, but a search for -fieldname:[* TO *] will result in 
> >> all documents that do not have the specified field.
> >>
> >>     Erik


Re: Search for null

Posted by Jay Yu <yu...@AI.SRI.COM>.
If you want performance, a better way might be to assign some special
string/value (if one is easy to create) to the field in docs that lack it,
and to index that field without tokenizing it. You can then search for that
special value to find those docs.
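
A minimal sketch of that idea in plain Java, without Lucene: a tiny in-memory
map from exact (untokenized) field values to document ids. The class name
SentinelIndex, its methods, and the marker string "__NULL__" are all
assumptions made for the illustration; in a real index the marker has to be a
value that can never occur as genuine field content.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a toy exact-match index; all names here are hypothetical.
final class SentinelIndex {

    // Assumed marker for "no value"; must never appear as real data.
    static final String MISSING = "__NULL__";

    // Whole field value (not tokenized) -> ids of the docs holding it.
    private final Map<String, List<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Index one document's field, substituting the marker for a null value.
    int add(String fieldValue) {
        String term = (fieldValue == null) ? MISSING : fieldValue;
        int docId = nextDocId++;
        postings.computeIfAbsent(term, k -> new ArrayList<>()).add(docId);
        return docId;
    }

    // Exact-match lookup, comparable to a single-term query.
    List<Integer> search(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptyList());
    }

    // "Search for null" becomes an ordinary lookup of the marker.
    List<Integer> docsWithoutValue() {
        return search(MISSING);
    }
}
```

The cost is paid at index time: every document without the field has to be
given the marker, so existing documents would need to be re-indexed before
the lookup finds them.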

Jay

Les Fletcher wrote:
> Does this particular range query have any significant performance issues?
> 
> Les
> 
> Erik Hatcher wrote:
>>
>> On Jul 23, 2007, at 11:32 AM, testn wrote:
>>> Is it possible to search for the document that specified field 
>>> doesn't exist
>>> or such field value is null?
>>
>> This is from Solr, so I'm not sure off the top of my head if this mojo 
>> applies by itself, but a search for -fieldname:[* TO *] will result in 
>> all documents that do not have the specified field.
>>
>>     Erik


Re: Search for null

Posted by Les Fletcher <le...@affinitycircles.com>.
Does this particular range query have any significant performance issues?

Les

Erik Hatcher wrote:
>
> On Jul 23, 2007, at 11:32 AM, testn wrote:
>> Is it possible to search for the document that specified field 
>> doesn't exist
>> or such field value is null?
>
> This is from Solr, so I'm not sure off the top of my head if this mojo 
> applies by itself, but a search for -fieldname:[* TO *] will result in 
> all documents that do not have the specified field.
>
>     Erik


Re: Search for null

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jul 23, 2007, at 11:32 AM, testn wrote:
> Is it possible to search for the document that specified field  
> doesn't exist
> or such field value is null?

This is from Solr, so I'm not sure off the top of my head if this
mojo applies by itself, but a search for -fieldname:[* TO *] will
result in all documents that do not have the specified field.

	Erik

