Posted to java-user@lucene.apache.org by Sanyi <ne...@yahoo.com> on 2005/02/13 13:09:01 UTC

DateFilter on UnStored field

Hi!

Does DateFilter work on fields indexed as UnStored?
Can I filter an UnStored field with values like "2004-11-05" ?

Regards,
Sanyi


		

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: DateFilter on UnStored field

Posted by PA <pe...@gmail.com>.
On Feb 13, 2005, at 13:09, Sanyi wrote:

> Does DateFilter work on fields indexed as UnStored?

Hmmm... never used DateFilter per se... but it should "just" work, like 
everything else :)

> Can I filter an UnStored field with values like "2004-11-05" ?

Sure.

Cheers

--
PA, Onnay Equitursay
http://alt.textdrive.com/




Re: DateFilter on UnStored field

Posted by Sanyi <ne...@yahoo.com>.
> DateField has a utility method to return a String:
> 
> 	DateField.timeToString(file.lastModified())
> 
> You'd use that String to pass to Field.UnStored.
> 
> I recommend, though, that you use a different format, such as the 
> YYYY-MM-DD format you're using.

Well, I read the dates from a database as YYYY-MM-DD strings.
So I need to know how to convert YYYY-MM-DD to the format that DateField.timeToString() returns.
Or do I have to convert YYYY-MM-DD to the format of file.lastModified(), which I can then pass to
DateField.timeToString()?
What is the easiest solution?

> Lucene's latest codebase (though not 1.4.x) includes RangeFilter, 
> which would do the trick for you.  If you want to stick with Lucene 
> 1.4.x, that's fine... just grab the code for that filter and use it as 
> a custom filter - it's compatible with 1.4.x.

So, why do you recommend RangeFilter over DateFilter?
Does it require less index data and/or perform better?
(I'm using 1.4.2)

> It depends on whether you instantiate a new filter for each search.  
> Building a filter requires scanning through the terms in the index to 
> build a BitSet for the documents that fall in that range.  Filters are 
> best used over multiple searches.

Put simply:
The user enters the search string in an HTML form, then I call my custom Lucene-based Java
class from the command line (the calling method may change to the PHP-to-Java bridge if it
turns out to suit my needs).
So every search is a whole new round: new HTML form post -> new command-line JVM call -> new
index searcher, etc...

The OS caches the index files pretty well (limited only by memory size, of course).

Will my implementation's performance drop a lot once I add DateFilter?

Regards,
Sanyi


		


Re: DateFilter on UnStored field

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 14, 2005, at 6:27 AM, Sanyi wrote:
>> However, DateFilter will not work on fields indexed as "2004-11-05".
>> DateFilter only works on fields that were indexed using the DateField.
>
> Well, can you post a short example here?
> Where I currently type "xxx.UnStored(..", can I simply type 
> "xxx.DateField(.." instead?
> Does it take strings like "2004-11-05"?

DateField has a utility method to return a String:

	DateField.timeToString(file.lastModified())

You'd use that String to pass to Field.UnStored.
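
An editorial sketch of the conversion asked about in this thread: turning a YYYY-MM-DD string into the epoch-millisecond long that DateField.timeToString() takes. It uses only the JDK. The class and method names (DayToMillis, toEpochMillis) are made up for illustration, and treating the date as midnight UTC is an assumption, not something the thread specifies.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper for illustration; not part of Lucene.
final class DayToMillis {

    // Parses an ISO date such as "2004-11-05" and returns milliseconds since
    // the epoch for midnight UTC on that day (the time zone is an assumption).
    static long toEpochMillis(String yyyyMmDd) {
        LocalDate day = LocalDate.parse(yyyyMmDd); // expects YYYY-MM-DD
        return day.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        long millis = toEpochMillis("2004-11-05");
        // With Lucene 1.4.x on the classpath, this long is the kind of value
        // that would be handed to DateField.timeToString(...).
        System.out.println(millis);
    }
}
```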

I recommend, though, that you use a different format, such as the 
YYYY-MM-DD format you're using.
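
One reason a YYYY-MM-DD string is convenient for range filtering: zero-padded dates compare as plain strings in the same order as the dates themselves, and index terms are ordered as strings. A minimal JDK-only sketch of that property follows; the class name and the sample dates are invented for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of Lucene.
final class SortableDates {

    // True when 'day' falls in [from, to], comparing the strings directly.
    // Only valid for zero-padded YYYY-MM-DD values.
    static boolean inRange(String day, String from, String to) {
        return day.compareTo(from) >= 0 && day.compareTo(to) <= 0;
    }

    // Returns a copy sorted as plain strings, which is also date order.
    static String[] sorted(String... days) {
        String[] copy = days.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        String[] days = sorted("2004-11-05", "2003-12-31", "2004-02-29", "2004-01-10");
        System.out.println(Arrays.toString(days));
        System.out.println(inRange("2004-02-29", "2004-01-01", "2004-11-05"));
    }
}
```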

>> One option is to use a QueryFilter instead, filtering with a
>> RangeQuery.
>
> I've read somewhere that classic range filtering can easily exceed the 
> maximum number of boolean
> query clauses. I need to filter a very large range of dates with day 
> accuracy and I don't want to
> increase the max. clause count to very high values. So, I decided to 
> use DateFilter which has no
> such problems AFAIK.

Right!
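
To put a number on the concern above: a range query of this era is expanded into one boolean clause per matching term, so with day-accurate terms the clause count can approach the number of days in the range. The sketch below counts those days using only the JDK. The limit of 1024 is, as far as I know, the default maximum clause count in Lucene 1.4; treat it as an assumption, along with the sample date range and the class name.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical estimate for illustration; not part of Lucene.
final class ClauseEstimate {

    // Assumed default clause limit; check the Lucene version in use.
    static final int ASSUMED_MAX_CLAUSES = 1024;

    // Upper bound on distinct day terms in [from, to], both ends inclusive.
    static long maxDayTerms(String from, String to) {
        return ChronoUnit.DAYS.between(LocalDate.parse(from), LocalDate.parse(to)) + 1;
    }

    public static void main(String[] args) {
        long terms = maxDayTerms("2000-01-01", "2004-11-05");
        System.out.println(terms + " possible day terms, assumed limit " + ASSUMED_MAX_CLAUSES);
        System.out.println("could exceed the limit: " + (terms > ASSUMED_MAX_CLAUSES));
    }
}
```

The real clause count depends on how many distinct dates actually occur in the index, so this is only a worst case.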

Lucene's latest codebase (though not 1.4.x) includes RangeFilter, 
which would do the trick for you.  If you want to stick with Lucene 
1.4.x, that's fine... just grab the code for that filter and use it as 
a custom filter - it's compatible with 1.4.x.

> How much impact does DateFilter have on search times?

It depends on whether you instantiate a new filter for each search.  
Building a filter requires scanning through the terms in the index to 
build a BitSet for the documents that fall in that range.  Filters are 
best used over multiple searches.
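
The cost described above can be sketched without Lucene. The toy index below maps each date term to the document numbers containing it; building the filter walks the terms in the range and sets one bit per matching document, and a small cache shows why reuse across searches pays off. Everything here (ToyRangeFilter, the TreeMap index, the cache key) is a hypothetical stand-in, not Lucene's actual Filter API.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a term-range filter; not Lucene code.
final class ToyRangeFilter {

    // term -> document numbers containing that term, kept sorted by term
    private final TreeMap<String, int[]> postings = new TreeMap<>();
    private final Map<String, BitSet> cache = new HashMap<>();
    int builds = 0; // how many times the terms were actually scanned

    void add(String term, int... docs) {
        postings.put(term, docs);
    }

    // Scans every term in [from, to] and marks the documents it appears in.
    private BitSet build(String from, String to) {
        builds++;
        BitSet bits = new BitSet();
        for (int[] docs : postings.subMap(from, true, to, true).values()) {
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // Returns the cached BitSet when the same range was requested before.
    BitSet bits(String from, String to) {
        return cache.computeIfAbsent(from + ".." + to, key -> build(from, to));
    }

    public static void main(String[] args) {
        ToyRangeFilter filter = new ToyRangeFilter();
        filter.add("2004-01-10", 0, 3);
        filter.add("2004-11-05", 1);
        filter.add("2005-02-13", 2);
        System.out.println(filter.bits("2004-01-01", "2004-12-31"));
        System.out.println(filter.bits("2004-01-01", "2004-12-31")); // served from the cache
        System.out.println("term scans: " + filter.builds);
    }
}
```

In a setup that starts a new JVM for every search, a cache like this starts empty each time, so every search would pay for the term scan.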

	Erik




tf-idf: showing the scores beside each hit

Posted by *Clodagh* <cl...@yahoo.com>.
Hi,

Is it possible to show a tf-idf score beside each hit?

E.g. I type in a word as a query, for example the word
"free", and each file containing the word "free" is listed, but I
would like the tf-idf score to appear beside it.

Like this:

0. file1.txt tf idf score = 2.16543

Is it possible?


	
		


Re: DateFilter on UnStored field

Posted by Sanyi <ne...@yahoo.com>.
> Following up on PA's reply.  Yes, DateFilter works on *indexed* values, 
> so whether a field is stored or not is irrelevant.

Great news, thanx!

> However, DateFilter will not work on fields indexed as "2004-11-05".  
> DateFilter only works on fields that were indexed using the DateField.

Well, can you post a short example here?
Where I currently type "xxx.UnStored(..", can I simply type "xxx.DateField(.." instead?
Does it take strings like "2004-11-05"?

> One option is to use a QueryFilter instead, filtering with a 
> RangeQuery.

I've read somewhere that classic range filtering can easily exceed the maximum number of boolean
query clauses. I need to filter a very large range of dates with day accuracy and I don't want to
increase the max. clause count to very high values. So, I decided to use DateFilter which has no
such problems AFAIK.

How much impact does DateFilter have on search times?

Regards,
Sanyi


		


Re: DateFilter on UnStored field

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Following up on PA's reply.  Yes, DateFilter works on *indexed* values, 
so whether a field is stored or not is irrelevant.

However, DateFilter will not work on fields indexed as "2004-11-05".  
DateFilter only works on fields that were indexed using the DateField.  
One option is to use a QueryFilter instead, filtering with a 
RangeQuery.

	Erik


On Feb 13, 2005, at 7:09 AM, Sanyi wrote:

> Hi!
>
> Does DateFilter work on fields indexed as UnStored?
> Can I filter an UnStored field with values like "2004-11-05" ?
>
> Regards,
> Sanyi

