Posted to java-user@lucene.apache.org by Mukul Ranjan <mr...@egain.com> on 2016/06/14 10:31:02 UTC

How to improve indexing performance for lucene 6

Hi,

I have 150k documents in a Lucene index folder. Rebuilding the index takes 30-35 minutes. We fetch the data from SQL Server.
I applied the parameters below when creating the IndexWriter instance:

IndexWriterConfig indexWriterConfig = new IndexWriterConfig(getAnalyzer(callerContext, languageId));
TieredMergePolicy tieredMergePolicy = new TieredMergePolicy();
tieredMergePolicy.setMaxMergeAtOnce(10);
tieredMergePolicy.setSegmentsPerTier(84);
tieredMergePolicy.setMaxMergedSegmentMB(500);
indexWriterConfig.setRAMBufferSizeMB(200.0);
indexWriterConfig.setUseCompoundFile(false);

Can we decrease the indexing time further? Could you point out the areas we should focus on to reduce it?

Thanks,
Mukul Ranjan

Re: How to improve indexing performance for lucene 6

Posted by Hans Lund <ha...@gmail.com>.
Hi Mukul

There is not much information in your question, so anything I say would be a
guess. Could you provide:

1) the time it takes to fetch the docs from SQL Server (without doing any
indexing)
2) the size of the documents
3) what kind of analysis is done
4) why you are creating this merge policy - is merging what you expect to be
the bottleneck?
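
One way to answer point 1 is to time the fetch and the indexing separately. Below is a minimal sketch of that idea in plain Java, not taken from the original setup: PhaseTimer, its methods, and the phase names are hypothetical, and the two lambdas in main are stand-ins for the real SQL Server read and the real analysis plus IndexWriter.addDocument call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper: accumulates wall-clock time per named phase. */
public class PhaseTimer {
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    /** Runs the work, adds its elapsed time to the phase total, returns its result. */
    public <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for a phase, or 0 if it was never timed. */
    public long nanos(String phase) {
        return nanosByPhase.getOrDefault(phase, 0L);
    }

    /** Throughput for a document count and an elapsed time in seconds. */
    public static double docsPerSecond(long docs, double seconds) {
        return docs / seconds;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        for (int i = 0; i < 1000; i++) {
            final int id = i;
            // Stand-in (assumption) for reading one row from SQL Server.
            String body = timer.time("fetch", () -> "document body " + id);
            // Stand-in (assumption) for analysis plus IndexWriter.addDocument.
            timer.time("index", () -> body.toLowerCase().split(" ").length);
        }
        System.out.printf("fetch: %d ms, index: %d ms%n",
                timer.nanos("fetch") / 1_000_000, timer.nanos("index") / 1_000_000);
        // The figures reported in the question: 150k documents in 30-35 minutes.
        System.out.printf("reported rate: %.1f to %.1f docs/s%n",
                docsPerSecond(150_000, 35 * 60), docsPerSecond(150_000, 30 * 60));
    }
}
```

The reported numbers work out to roughly 71 to 83 documents per second. If the "fetch" total accounts for most of the wall-clock time, changing the merge policy or RAM buffer is unlikely to help much.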

Regards
Hans Lund


