Posted to user@lucy.apache.org by Shahab Mohammed <sh...@gmail.com> on 2014/12/03 16:15:08 UTC
[lucy-user] Is there any benchmarking details about how fast is lucy indexing
Dear Lucy Users,
I would like to know if you can direct me to a page that has Lucy indexing
benchmarks.
I understand that benchmark results will depend on the hardware (CPU, RAM, etc.)
as well as the number of fields being indexed.
I would like to know the rate of indexing, i.e. how many MB/sec can be
indexed. If someone has done such benchmarking, please share the info with
me.
Thanks
Shahab
Re: [lucy-user] Is there any benchmarking details about how fast is
lucy indexing
Posted by Shahab Mohammed <sh...@gmail.com>.
Dear Nick
Thank you so much for your reply. This helps me a lot.
Kind Regards
Shahab
On Thu, Dec 4, 2014 at 5:35 AM, Nick Wellnhofer <we...@aevum.de> wrote:
> [...]
Re: [lucy-user] Is there any benchmarking details about how fast
is lucy indexing
Posted by Nick Wellnhofer <we...@aevum.de>.
On 03/12/2014 16:15, Shahab Mohammed wrote:
> I would like to know the rate of indexing, i.e. how many MB/sec can be
> indexed. If someone has done such benchmarking, please share the info with
> me.
This depends on a lot of factors like the schema and analysis chain you use,
the total size of your index, and the hardware. But if you want a ballpark
figure, I'd say about 1-2 MB/s.
Here is some data for one of our production systems running on a typical VPS:
Total fields: 3
Full text fields: 2
Highlightable fields: 2
Documents: 20,000
Raw input size: 35 MB
Index size: 80 MB
Analysis chain:
  StandardTokenizer
  Normalizer
  SnowballStopFilter
  SnowballStemmer
Total time to reindex: 30s
This includes the time to pull all of the data out of a PostgreSQL database,
prepare it for indexing, and some other unrelated operations which shouldn't
have a large impact.
Nick
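As a rough cross-check of the figures in Nick's reply, the throughput implied by his production numbers can be worked out directly. The sketch below is only arithmetic on the values quoted above; the helper name and the 1 MB = 1000 KB convention are assumptions made for illustration, not anything from Lucy itself.

```python
# Back-of-the-envelope check of the numbers quoted in the reply.
# throughput_mb_per_s is a hypothetical helper, not part of any Lucy API.

def throughput_mb_per_s(raw_input_mb: float, seconds: float) -> float:
    """Return throughput in MB of raw input per second of wall-clock time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return raw_input_mb / seconds

# Values taken from the reply.
RAW_INPUT_MB = 35
INDEX_SIZE_MB = 80
DOCUMENTS = 20_000
REINDEX_SECONDS = 30

mb_per_s = throughput_mb_per_s(RAW_INPUT_MB, REINDEX_SECONDS)
docs_per_s = DOCUMENTS / REINDEX_SECONDS
avg_doc_kb = RAW_INPUT_MB * 1000 / DOCUMENTS  # assumes 1 MB = 1000 KB
expansion = INDEX_SIZE_MB / RAW_INPUT_MB      # index size relative to raw input

print(f"Throughput:        {mb_per_s:.2f} MB/s")      # 1.17 MB/s
print(f"Documents per sec: {docs_per_s:.0f}")         # 667
print(f"Average document:  {avg_doc_kb:.2f} KB")      # 1.75 KB
print(f"Index / raw size:  {expansion:.1f}x")         # 2.3x
```

This comes out at roughly 1.17 MB/s, near the low end of the 1-2 MB/s ballpark. Since the 30 s also covers the PostgreSQL fetch and data preparation, it is best read as a lower bound on the indexing rate alone.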