Posted to java-user@lucene.apache.org by Perez <ar...@ethicist.net> on 2006/04/08 04:45:08 UTC

Exact date search doesn't work with 1.9.1?

Hi all,

I have a document with a date in it and I put it into a field like so:
DateTools.dateToString(theDate, Resolution.DAY), 
Field.Index.UN_TOKENIZED.
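
For reference, a minimal JDK-only sketch of the term text such a field would hold. It assumes that DateTools.dateToString with Resolution.DAY produces a yyyyMMdd string in GMT; the DayTerm class and dayTerm method are hypothetical names for illustration and do not call Lucene.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not part of Lucene: imitates the assumed yyyyMMdd (GMT)
// output of DateTools.dateToString(date, Resolution.DAY).
class DayTerm {
    static String dayTerm(Date date) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        // 2006-01-30T12:00:00Z expressed as epoch milliseconds.
        Date d = new Date(1138622400000L);
        System.out.println(dayTerm(d));
    }
}
```

Because the field is UN_TOKENIZED, the whole string would be stored as a single term, so an exact query has to produce exactly the same text to match.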

What I find is that a range query works:
[20060131 TO 20060601] and wildcard works e.g.
2006*
but exact matches do not work e.g.
20060130

Any ideas on how I am misusing the API?

This is 1.9.1.

tia,
-arturo


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Exact date search doesn't work with 1.9.1?

Posted by Daniel Naber <lu...@danielnaber.de>.

On Saturday 08 April 2006 04:45, Perez wrote:

> Any ideas on how I am misusing the API?

Please use Luke to check what's in the index and if that doesn't help, post 
a small self-contained code snippet here.

Regards
 Daniel

-- 
http://www.danielnaber.de


Re: Fetch Documents Without Retrieving All Fields

Posted by Chris Hostetter <ho...@fucit.org>.

first off, i only skimmed the url you posted. i may have missed the point,
but it appears to be a description of how to add restricted stored field
loading.

secondly...

: Of course there is no doubt that search in Lucene index is faster but
: sometimes retrieving the hitDocs is slower (for example, when we try to
: retrieve more than 10000 documents from hits). Maybe my scenario is a

...you most certainly do *NOT* want to use the Hits class if you are
accessing more than a few hundred of the "first" results (be it by score, or
by some sorting option).  Hits will re-execute your search many times as
you walk farther down the results (it is designed for a simplified use
case which you certainly do not meet).

you most likely want to use one of the "Expert" methods ... either TopDocs,
or TopFieldDocs if you are using sorting, or a HitCollector if you can get
away with it.
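
As a rough, Lucene-free illustration of the single-pass idea behind a collector: the sketch below keeps only the best N scores in a bounded min-heap while matches stream past once, so nothing is re-executed however far down the results you go. TopN, Scored, offer and docsBestFirst are hypothetical names for illustration only; the real TopDocs and HitCollector APIs differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for a single-pass hit collector; not a Lucene class.
class TopN {
    // A (docId, score) pair as a collector callback might receive it.
    static final class Scored {
        final int doc;
        final float score;
        Scored(int doc, float score) { this.doc = doc; this.score = score; }
    }

    private final int limit;
    // Min-heap on score: the weakest kept hit sits at the head and is evicted first.
    private final PriorityQueue<Scored> heap =
        new PriorityQueue<>(Comparator.comparingDouble((Scored s) -> s.score));

    TopN(int limit) { this.limit = limit; }

    // Called once per matching document, in whatever order the search visits them.
    void offer(int doc, float score) {
        if (limit <= 0) return;
        if (heap.size() < limit) {
            heap.add(new Scored(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll();
            heap.add(new Scored(doc, score));
        }
    }

    // Returns the kept doc ids, best score first.
    List<Integer> docsBestFirst() {
        List<Scored> kept = new ArrayList<>(heap);
        kept.sort((a, b) -> Float.compare(b.score, a.score));
        List<Integer> docs = new ArrayList<>();
        for (Scored s : kept) docs.add(s.doc);
        return docs;
    }
}
```

Memory stays proportional to N rather than to the total number of matches, which is the property that matters when walking past thousands of results.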


Third...

: I have integrated lucene search engine for one of our project in the
: company, there I have a book index with each document has more than 15
: fields to do specific search, but out of that after I do the search I
: just want to retrieve the value of one field named "DBID" which is the
: database table column id and for rendering in the frontend I retrieve
: the data from database. In this case I really don't require all the

1) if the only field you ever need *after* a search is the DBID field,
then make sure you aren't STOREing any fields but that one -- it will make
your index smaller.

2) your use case sounds like it could best be served by leveraging the
FieldCache -- as long as each document contains only one value for the
DBID field, and as long as you index the DBID field, you can use the
FieldCache for that field (along with a HitCollector, or
TopDocs/TopFieldDocs) to access the DBID of every doc much faster than you
can get the stored value.
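
A minimal sketch of the idea in point 2, assuming one DBID value per document: the values are held in an array indexed by document number, so resolving a hit is an array read rather than a stored-field load. DbidCache and dbidsFor are hypothetical names for illustration; this models the concept only and is not the Lucene FieldCache API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a per-field, per-document value cache; not Lucene's FieldCache.
class DbidCache {
    // dbidByDoc[docId] holds the single DBID value indexed for that document.
    private final String[] dbidByDoc;

    DbidCache(String[] dbidByDoc) {
        this.dbidByDoc = dbidByDoc.clone();
    }

    // Resolving a hit is a plain array read: no stored fields are loaded
    // and no per-hit document object is built.
    List<String> dbidsFor(int[] matchingDocs) {
        List<String> ids = new ArrayList<>(matchingDocs.length);
        for (int doc : matchingDocs) {
            ids.add(dbidByDoc[doc]);
        }
        return ids;
    }
}
```

The array is built once and reused across searches, which is why the one-value-per-document condition matters.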


I am 99.999% certain that using the FieldCache to get the one indexed
field value you want is going to be faster than any approach you might take
which relies on the stored field value -- with or without the
modifications described in that URL.  (and this way, you don't need to
manually patch your version of Lucene in a way that will be hard to
support in the future)



-Hoss



Fetch Documents Without Retrieving All Fields

Posted by Supriya Kumar Shyamal <su...@artnology.com>.

Hi All,

I found an interesting point mentioned in 
http://www.cs.cmu.edu/~shashank/htmlfiles/hacks/lucene.html by Shashank.

Of course there is no doubt that searching a Lucene index is fast, but 
sometimes retrieving the hit documents is slow (for example, when we try to 
retrieve more than 10000 documents from Hits). Maybe my scenario is a 
special case, but I think the option described in the article could be 
offered as optional functionality in a future release of Lucene.

A small example based on my integration of Lucene in our project:

I have integrated the Lucene search engine into one of our company's 
projects. There I have a book index in which each document has more than 15 
fields for specific searches, but after the search I only want to retrieve 
the value of one field named "DBID", which is the database table column id; 
for rendering in the frontend I retrieve the data from the database. In this 
case I really don't need all the field values from the document. Also, I 
sometimes get an OutOfMemoryError when I try to retrieve more than 10000 
documents at once.

After I implemented the above idea, performance improved a lot and I no 
longer get any OutOfMemoryError.

Just a small idea which is not entirely mine, but I feel it's a good option.

With Regards,
supriya

-- 
Kind regards
 
Supriya Kumar Shyamal

Software Developer
tel +49 (30) 443 50 99 -22
fax +49 (30) 443 50 99 -99
email supriya.shyamal@artnology.com
___________________________
artnology GmbH
Milastr. 4
10437 Berlin
___________________________

http://www.artnology.com



Re: Exact date search doesn't work with 1.9.1?

Posted by Erik Hatcher <er...@ehatchersolutions.com>.

On Apr 9, 2006, at 3:02 PM, Perez wrote:
> In article
> <00...@ehatchersolutions.com>,
>  Erik Hatcher <er...@ehatchersolutions.com> wrote:
>
>> This could be a case of your QueryParser analyzer eating a number.
>> Range queries and prefix/wildcard query terms are not analyzed.
>>
>> Besides the great suggestion to use Luke, also try a TermQuery if you
>> happen to be using QueryParser to create your Query currently.
>>
>> 	Erik
>>
>
> Actually, that's another problem I have.  I can't search for numbers at
> all no how.  But the analyzer is just PorterStemmer(Lowercase(Stop)).
>
> Can you tell me where the analyzer is eating numbers?

What is the tokenizer you're using deeper below the "Stop" you  
mentioned?  The StopAnalyzer itself uses a LetterTokenizer, so  
numbers get removed and only letters remain.
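
To make that concrete, here is a small JDK-only imitation of letter-based tokenizing, written under the assumption that every non-letter character acts as a separator. LetterSplit and tokens are hypothetical names for illustration; this is not Lucene's LetterTokenizer, only a model of the behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical JDK-only imitation of letter-based tokenizing; not Lucene's LetterTokenizer.
class LetterSplit {
    // Emits maximal runs of letters, lowercased; every non-letter
    // (digits included) is treated as a separator and dropped.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                cur.append(Character.toLowerCase(c));
            } else if (cur.length() > 0) {
                out.add(cur.toString());
                cur.setLength(0);
            }
        }
        if (cur.length() > 0) out.add(cur.toString());
        return out;
    }
}
```

Under this model a query string made only of digits, such as 20060130, yields no tokens at all, so there is nothing left to match against the index.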

Check out the "analysis paralysis" article I did at java.net and the  
AnalyzerDemo (or is it AnalysisDemo?) in the Lucene in Action  
codebase - it'll shed a lot of light on analyzers.

	Erik



Re: Exact date search doesn't work with 1.9.1?

Posted by Perez <ar...@ethicist.net>.

In article 
<00...@ehatchersolutions.com>,
 Erik Hatcher <er...@ehatchersolutions.com> wrote:

> This could be a case of your QueryParser analyzer eating a number.   
> Range queries and prefix/wildcard query terms are not analyzed.
> 
> Besides the great suggestion to use Luke, also try a TermQuery if you  
> happen to be using QueryParser to create your Query currently.
> 
> 	Erik
> 

Actually, that's another problem I have.  I can't search for numbers at 
all no how.  But the analyzer is just PorterStemmer(Lowercase(Stop)).

Can you tell me where the analyzer is eating numbers? 

tia,
arturo



Re: Exact date search doesn't work with 1.9.1?

Posted by Erik Hatcher <er...@ehatchersolutions.com>.

This could be a case of your QueryParser analyzer eating a number.   
Range queries and prefix/wildcard query terms are not analyzed.

Besides the great suggestion to use Luke, also try a TermQuery if you  
happen to be using QueryParser to create your Query currently.
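
A simplified model of why the range and wildcard forms can still succeed: they are compared directly against the raw term text in the index, with no analyzer in between. The sketch assumes terms are plain yyyyMMdd strings kept in sorted order; RawTermMatch, range and prefix are hypothetical names for illustration and are not Lucene's query classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical model of matching against raw index terms; not Lucene's query classes.
class RawTermMatch {
    // Inclusive range over sorted terms, compared as plain strings.
    static List<String> range(TreeSet<String> terms, String lower, String upper) {
        return new ArrayList<>(terms.subSet(lower, true, upper, true));
    }

    // Prefix match, as a trailing-wildcard query such as 2006* would need.
    static List<String> prefix(TreeSet<String> terms, String prefix) {
        List<String> out = new ArrayList<>();
        // Terms sharing the prefix are contiguous in sorted order, starting at the prefix itself.
        for (String t : terms.tailSet(prefix, true)) {
            if (!t.startsWith(prefix)) break;
            out.add(t);
        }
        return out;
    }
}
```

Since yyyyMMdd strings sort in date order, string comparison is enough for both forms; only the exact-match form depends on what the analyzer leaves of the query text.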

	Erik


