Posted to java-user@lucene.apache.org by Humberto Rocha <hu...@gmail.com> on 2015/09/09 03:59:59 UTC

Improvement performance of my indexing with Lucene

Hi,

I need to improve the performance of my indexing with Lucene.

Is there any material (e.g., an article, book, or tutorial) that can be used for
this?

Could anyone help me, please?

Thanks a lot!

-- 
Humberto

Re: Improvement performance of my indexing with Lucene

Posted by Ian Lea <ia...@gmail.com>.
> Great! I will upgrade Lucene then.

Good start.

> I'm not using database.

Fine, but you must be getting your data from somewhere.  Maybe that is
blazingly fast, maybe it isn't.

> Are there some java samples code ?
>
> Samples with:
>
> 1. indexing documents in batches.

I think this means calling IndexWriter.commit() once every some-large-number
of docs rather than every some-small-number.
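A rough sketch of that batching idea, in case it helps. To keep it runnable without the Lucene jar, DocSink below is a hypothetical stand-in (not Lucene API) for the two IndexWriter calls involved, addDocument() and commit(); the batch size of 10 is an arbitrary assumption for the demo, and in practice you would pick something much larger and measure.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchCommitSketch {

    // Hypothetical stand-in for IndexWriter.addDocument(...) and
    // IndexWriter.commit(). This interface is NOT part of Lucene.
    interface DocSink {
        void addDocument(String doc);
        void commit();
    }

    // Adds every doc and commits once per batchSize docs, plus one final
    // commit for a trailing partial batch. Returns the number of commits.
    static int indexInBatches(Iterable<String> docs, DocSink sink, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be > 0");
        }
        int pending = 0;
        int commits = 0;
        for (String doc : docs) {
            sink.addDocument(doc);
            if (++pending == batchSize) {
                sink.commit();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {
            sink.commit();
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            docs.add("doc-" + i);
        }
        final int[] added = {0};
        DocSink sink = new DocSink() {
            @Override public void addDocument(String doc) { added[0]++; }
            @Override public void commit() { /* the expensive call in a real index */ }
        };
        int commits = indexInBatches(docs, sink, 10);
        System.out.println(added[0] + " docs, " + commits + " commits"); // 25 docs, 3 commits
    }
}
```

With a real IndexWriter the loop would look the same; only the sink changes.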

> 2. Multi-threaded indexing

I don't have examples, but pseudocode would look something like

 IndexWriter iw = whatever
 Thread t1 = whatever(iw, data-source-1)
 Thread t2 = whatever(iw, data-source-2)
 ...
 t1.start()
 t2.start()
 ...
 wait ...
 iw.close()
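Fleshed out a little, the pseudocode above might look like the following. This is only a sketch: SharedWriter is a hypothetical stand-in (not Lucene API) so that the example runs without the Lucene jar, and the in-memory lists stand in for whatever your data sources are. IndexWriter itself is documented as thread-safe, so a single instance can be shared across threads in the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadedIndexingSketch {

    // Hypothetical stand-in for a shared, thread-safe writer such as
    // IndexWriter. This interface is NOT part of Lucene.
    interface SharedWriter {
        void addDocument(String doc);
        void close();
    }

    // One task per data source, all feeding the same writer. The writer is
    // closed only after every task has finished (the "wait ..." step).
    static void indexAll(List<List<String>> sources, SharedWriter writer, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (List<String> source : sources) {
                tasks.add(() -> {
                    for (String doc : source) {
                        writer.addDocument(doc);
                    }
                    return null;
                });
            }
            // invokeAll blocks until all tasks complete; get() rethrows failures.
            for (Future<Void> done : pool.invokeAll(tasks)) {
                done.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("an indexing task failed", e.getCause());
        } finally {
            pool.shutdown();
            writer.close();
        }
    }

    public static void main(String[] args) {
        AtomicInteger added = new AtomicInteger();
        SharedWriter writer = new SharedWriter() {
            @Override public void addDocument(String doc) { added.incrementAndGet(); }
            @Override public void close() {
                System.out.println("closed after " + added.get() + " docs");
            }
        };
        indexAll(Arrays.asList(
                Arrays.asList("a1", "a2", "a3"),
                Arrays.asList("b1", "b2")), writer, 2);
    }
}
```

An executor is used here instead of raw Thread objects mainly so that a failure in any one task surfaces in the caller rather than being lost.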


--
Ian.



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Improvement performance of my indexing with Lucene

Posted by Humberto Rocha <hu...@gmail.com>.
Great! I will upgrade Lucene then.

I'm not using a database.

Is there any Java sample code?

Samples with:

1. Indexing documents in batches.
2. Multi-threaded indexing.

Thanks a lot.


On Wed, Sep 9, 2015 at 11:23 AM, Ian Lea <ia...@gmail.com> wrote:

> The link that I sent,
> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed is for Lucene,
> not Solr.  The second item on the list is to make sure you are using
> the latest version of lucene so that would be a good starting point.
>
>
> --
> Ian.


-- 
Humberto Rocha

Re: Improvement performance of my indexing with Lucene

Posted by Ian Lea <ia...@gmail.com>.
The link that I sent,
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed is for Lucene,
not Solr.  The second item on the list is to make sure you are using
the latest version of Lucene, so that would be a good starting point.


--
Ian.


On Wed, Sep 9, 2015 at 3:10 PM, Humberto Rocha <hu...@gmail.com> wrote:
> Thanks a lot !
>
> But do you know some links that helps implement these optimization options
> without the Solr (using only lucene) ?
>
> I am using lucene 4.9.
>
> More thanks.
>
> Humberto


Re: Improvement performance of my indexing with Lucene

Posted by Humberto Rocha <hu...@gmail.com>.
Thanks a lot!

But do you know of any links that help implement these optimization options
without Solr (using only Lucene)?

I am using Lucene 4.9.

More thanks.

Humberto


On Wed, Sep 9, 2015 at 5:23 AM, Ian Lea <ia...@gmail.com> wrote:

> See also http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>
> Also double check that it's Lucene that you should be concentrating
> on.  In my experience it's often the reading of the data from a
> database, if that's what you are doing, that is the bottleneck.
>
>
> --
> Ian.


-- 
Humberto Rocha

Re: Improvement performance of my indexing with Lucene

Posted by Ian Lea <ia...@gmail.com>.
See also http://wiki.apache.org/lucene-java/ImproveIndexingSpeed

Also double check that it's Lucene that you should be concentrating
on.  In my experience it's often the reading of the data from a
database, if that's what you are doing, that is the bottleneck.
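One rough way to check where the time goes is to time the read step and the index step separately. The sketch below assumes your data arrives through an Iterator and that indexing a record is a single call; both are assumptions for illustration, and the class and method names are made up here. With multiple threads or background merges the numbers will only be approximate.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Consumer;

final class BottleneckTimer {

    // Nanoseconds spent fetching records vs. handing them to the indexer.
    static final class Timings {
        long readNanos;
        long indexNanos;
        int docs;
    }

    // Times hasNext()/next() (reading) separately from indexer.accept()
    // (indexing) so the two costs can be compared.
    static <T> Timings measure(Iterator<T> source, Consumer<T> indexer) {
        Timings t = new Timings();
        while (true) {
            long start = System.nanoTime();
            if (!source.hasNext()) {
                t.readNanos += System.nanoTime() - start;
                break;
            }
            T record = source.next();
            long afterRead = System.nanoTime();
            indexer.accept(record);
            long afterIndex = System.nanoTime();
            t.readNanos += afterRead - start;
            t.indexNanos += afterIndex - afterRead;
            t.docs++;
        }
        return t;
    }

    public static void main(String[] args) {
        Timings t = measure(
                Arrays.asList("a", "b", "c").iterator(),
                doc -> { /* e.g. build a Document and add it to the writer */ });
        System.out.printf("docs=%d read=%d ms index=%d ms%n",
                t.docs, t.readNanos / 1_000_000, t.indexNanos / 1_000_000);
    }
}
```

If the read time dominates, tuning Lucene is unlikely to help much.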


--
Ian.


On Wed, Sep 9, 2015 at 6:07 AM, Modassar Ather <mo...@gmail.com> wrote:
> There are few things you can try to improve indexing performance.
>
> 1. Try indexing documents in batches.
> 2. You can try multi-threaded indexing. What I mean to say is feed the data
> using multiple threads to the indexer.
> 3. Analysis of memory utilization and GC tuning.
>
> Following are few links which has few details on Solr indexing performance.
> http://wiki.apache.org/solr/SolrPerformanceFactors
> https://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/
>
> Regards,
> Modassar


Re: Improvement performance of my indexing with Lucene

Posted by Modassar Ather <mo...@gmail.com>.
There are a few things you can try to improve indexing performance.

1. Try indexing documents in batches.
2. You can try multi-threaded indexing, that is, feed the data to the indexer
using multiple threads.
3. Analysis of memory utilization and GC tuning.
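For point 3, a minimal way to see how much garbage collection happens during an indexing run is to snapshot the JVM's GC counters before and after it. The sketch below uses only the standard java.lang.management API; the GcSnapshot class is made up for illustration, and it only measures, it does not tune anything.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcSnapshot {
    final long collections;      // total GC runs so far, across all collectors
    final long collectionMillis; // approximate total time spent in GC
    final long usedHeapBytes;    // heap in use when the snapshot was taken

    private GcSnapshot(long collections, long collectionMillis, long usedHeapBytes) {
        this.collections = collections;
        this.collectionMillis = collectionMillis;
        this.usedHeapBytes = usedHeapBytes;
    }

    static GcSnapshot take() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report a value.
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        Runtime rt = Runtime.getRuntime();
        return new GcSnapshot(count, millis, rt.totalMemory() - rt.freeMemory());
    }

    // Describes what changed between an earlier snapshot and this one.
    String since(GcSnapshot before) {
        return String.format("%d collections, %d ms in GC, heap now %d MB",
                collections - before.collections,
                collectionMillis - before.collectionMillis,
                usedHeapBytes / (1024 * 1024));
    }

    public static void main(String[] args) {
        GcSnapshot before = GcSnapshot.take();
        // ... run the indexing job here ...
        GcSnapshot after = GcSnapshot.take();
        System.out.println(after.since(before));
    }
}
```

If a large share of the run is spent in GC, that suggests looking at heap size and allocation in the feeding code before anything else.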

Following are a few links with some details on Solr indexing performance.
http://wiki.apache.org/solr/SolrPerformanceFactors
https://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/

Regards,
Modassar

On Wed, Sep 9, 2015 at 7:29 AM, Humberto Rocha <hu...@gmail.com> wrote:

> Hi,
>
> I need to improve the performance of my indexing with Lucene .
>
> Is there any material (eg, article, book , tutorial ) that can be used for
> this?
>
> Could anyone help me please ?
>
> Thanks a lot!
>
> --
> Humberto
>