Posted to java-user@lucene.apache.org by Michael Celona <mc...@criticalmention.com> on 2005/03/10 14:45:43 UTC

search performance

I have a large index that needs to yield very fast query times.  I sort by
date by default, since I am interested in the most recent documents.  If I
instead boosted each document's score in proportion to its date and dropped
the sort, would that improve search performance?  Thoughts?

 

Thanks,

Michael


RE: search performance

Posted by Michael Celona <mc...@criticalmention.com>.
My epoch values look like 1110816121, but they are stored as strings.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org





Re: search performance

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Mar 17, 2005, at 11:13 AM, Michael Celona wrote:
> Epoch is in seconds...

But you still haven't provided the *type* of epoch.  It's a Date?  a 
String?  What do the string values look like?

>  I am also forced to use a date filter on most of
> my searches... how bad is the performance hit of that?

Only testing will tell.  The cost of a filter is paid the first time it is
used (as long as you cache it and keep using the same IndexReader), so it's
not likely to be a factor over many queries.

	Erik
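
The caching point above can be sketched without any Lucene dependency.  This
is only an illustration of the idea, not Lucene's code: the class
CachedDateFilter, its generic "reader" key, and the computations() counter
are all hypothetical names made up for this example.  Lucene has its own
filter-caching classes (CachingWrapperFilter, for instance), which are the
thing to use in practice.  The sketch computes the set of allowed document
ids once per reader instance and reuses it afterwards:

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/**
 * Hypothetical sketch: caches one BitSet of allowed document ids per
 * reader instance, so the expensive computation runs once per reader.
 */
final class CachedDateFilter<R> {
    // Weak keys: an entry can be collected once its reader is no longer used.
    private final Map<R, BitSet> cache = new WeakHashMap<>();
    private final Function<R, BitSet> compute;
    private int computations = 0;

    CachedDateFilter(Function<R, BitSet> compute) {
        this.compute = compute;
    }

    /** Returns the cached bits for this reader, computing them on first use. */
    synchronized BitSet bits(R reader) {
        BitSet cached = cache.get(reader);
        if (cached == null) {
            cached = compute.apply(reader);
            computations++;
            cache.put(reader, cached);
        }
        return cached;
    }

    /** How many times the expensive computation actually ran. */
    synchronized int computations() {
        return computations;
    }
}

final class FilterCacheDemo {
    public static void main(String[] args) {
        long[] epochs = {1110000000L, 1110816121L, 1110900000L};
        long cutoff = 1110800000L;     // assumed lower bound of the date range
        Object reader = new Object();  // stand-in for a shared IndexReader

        CachedDateFilter<Object> filter = new CachedDateFilter<>(r -> {
            BitSet bits = new BitSet(epochs.length);
            for (int doc = 0; doc < epochs.length; doc++) {
                if (epochs[doc] >= cutoff) {
                    bits.set(doc);
                }
            }
            return bits;
        });

        filter.bits(reader);  // first query pays for the computation
        filter.bits(reader);  // later queries reuse the cached bits
        System.out.println("computations: " + filter.computations());
    }
}
```

Opening a new reader (a new key here) means the bits are computed again,
which is why reusing the same IndexReader matters.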



RE: search performance

Posted by Michael Celona <mc...@criticalmention.com>.
Epoch is in seconds.  I am also forced to use a date filter on most of my
searches.  How bad is the performance hit of that?



Re: search performance

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Is epoch a Date or a String?  If a String, what format is it?

Sorting by a Date keyword field means sorting by String value, which is a
fair bit more resource intensive than sorting numerically.

Try using a purely numeric field (still indexed as a String) whose values
can be represented as an int, and be sure to specify the sort type as int;
then see if that improves performance.  I'm pretty certain you'd still get
better performance by using a boost than a sort, though.

	Erik
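
To make the String-versus-int distinction concrete, here is a small sketch
in plain Java, with no Lucene calls.  The class EpochSortDemo and its two
methods are hypothetical names for this example only.  In Lucene itself the
sort type would, as far as I know, be requested through a SortField of type
SortField.INT; that call is left out so the sketch stays self-contained.
The string version compares character by character on every comparison; the
int version parses each value once and then compares plain ints.  The two
also disagree on order as soon as the values differ in length:

```java
import java.util.Arrays;
import java.util.Comparator;

final class EpochSortDemo {
    /** Newest first, comparing the stored strings character by character. */
    static String[] sortAsStrings(String[] epochs) {
        String[] copy = epochs.clone();
        Arrays.sort(copy, Comparator.reverseOrder());
        return copy;
    }

    /** Newest first, parsing each value to an int once and comparing ints. */
    static String[] sortAsInts(String[] epochs) {
        int[] parsed = new int[epochs.length];
        Integer[] order = new Integer[epochs.length];
        for (int i = 0; i < epochs.length; i++) {
            parsed[i] = Integer.parseInt(epochs[i]);
            order[i] = i;
        }
        // Sort positions by their parsed value, largest (most recent) first.
        Arrays.sort(order, (a, b) -> Integer.compare(parsed[b], parsed[a]));
        String[] sorted = new String[epochs.length];
        for (int i = 0; i < order.length; i++) {
            sorted[i] = epochs[order[i]];
        }
        return sorted;
    }

    public static void main(String[] args) {
        String[] epochs = {"1110816121", "999999999", "1110470743"};
        // prints [999999999, 1110816121, 1110470743]
        System.out.println(Arrays.toString(sortAsStrings(epochs)));
        // prints [1110816121, 1110470743, 999999999]
        System.out.println(Arrays.toString(sortAsInts(epochs)));
    }
}
```

An epoch in seconds such as 1110816121 fits in a Java int (maximum
2147483647), so an int sort is an option for this field; an epoch in
milliseconds would not fit.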



RE: search performance

Posted by Michael Celona <mc...@criticalmention.com>.
I am sorting against an epoch time stored in my index, added with:

contactDocument.add( Field.Keyword( "epoch_time", epoch ) );

Then I sort by this field.  My search time is on the order of 3 seconds on
an index of about 6 GB, using simple searches against a text field.  I was
hoping that using boosts would improve performance.  Do you think this will
make a big difference?

	Michael



Re: search performance

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
I've been effectively offline for a few days, so I'm not sure whether
anyone has replied on this thread yet.

Using boosts will definitely use fewer resources than sorting.  If you do
sort by date, be sure you're doing it numerically rather than
lexicographically.

	Erik
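
As a rough illustration of what "boosting in proportion to the date" could
mean, the sketch below maps an epoch in seconds linearly onto a boost
between 1.0 and a chosen maximum.  Nothing in the thread specifies this
formula: the class RecencyBoost, the method boostFor, the linear mapping,
and the oldest/newest/maxBoost parameters are all assumptions for this
example.  The result is the kind of value one would hand to an index-time
boost such as Document.setBoost(float); that call is not shown here.

```java
final class RecencyBoost {
    /**
     * Hypothetical helper: maps an epoch (in seconds) linearly onto
     * [1.0, maxBoost].  Values outside [oldest, newest] are clamped.
     */
    static float boostFor(long epochSeconds, long oldest, long newest, float maxBoost) {
        if (newest <= oldest) {
            return 1.0f;  // degenerate range: no recency information
        }
        long clamped = Math.max(oldest, Math.min(newest, epochSeconds));
        double fraction = (double) (clamped - oldest) / (double) (newest - oldest);
        return (float) (1.0 + fraction * (maxBoost - 1.0));
    }

    public static void main(String[] args) {
        long oldest = 1100000000L;  // assumed date of the oldest document
        long newest = 1110816121L;  // assumed date of the newest document
        System.out.println(boostFor(1110816121L, oldest, newest, 2.0f));
        System.out.println(boostFor(1105000000L, oldest, newest, 2.0f));
    }
}
```

Two caveats apply.  A boost only biases the relevance score toward recent
documents, so the results are not in strict date order the way a sort would
leave them.  Lucene also stores index-time boosts with limited precision,
so documents with nearby dates may end up with the same effective boost.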
