Posted to java-user@lucene.apache.org by Laxmilal Menariya <lm...@chambal.com> on 2009/08/10 09:23:17 UTC
Taking too much time in optimization
Hello everyone,
I have created a sample application that indexes file properties and have indexed
approximately 107K files.
I was getting an OutOfMemoryError after about 100K documents while indexing; the
cause turned out to be maxBufferedDocs=100000. After that I call the optimize()
method, which is taking too long, approximately 12 hours, and the index size is
more than 500 GB, which is too large.
I am using Lucene 2.4.0. Could someone please let me know what is wrong with
my configuration?
My configuration is:

    lucWriter = new IndexWriter("C:\\Laxmilal", new KeywordAnalyzer(), true);
    lucWriter.setMergeFactor(1000);
    lucWriter.setMaxMergeDocs(Integer.MAX_VALUE);
    lucWriter.setMaxBufferedDocs(100000);
--
Thanks,
Laxmilal Menariya
http://www.bucketexplorer.com/
http://www.sdbexplorer.com/
http://www.chambal.com/
Re: Taking too much time in optimization
Posted by Laxmilal Menariya <lm...@chambal.com>.
Thanks, I will try.
On Tue, Aug 11, 2009 at 6:08 AM, Otis Gospodnetic <
otis_gospodnetic@yahoo.com> wrote:
> Hi,
>
> That mergeFactor is too high. I suggest going back to default (10).
> maxBufferedDocs is an old and not very accurate setting (imagine what
> happens with the JVM heap if your indexer hits a SUPER LARGE document). Use
> setRamBufferSizeMB instead.
>
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
--
Thanks,
Laxmilal Menariya
http://www.bucketexplorer.com/
http://www.sdbexplorer.com/
http://www.chambal.com/
Re: Taking too much time in optimization
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi,
That mergeFactor is too high. I suggest going back to default (10).
maxBufferedDocs is an old and not very accurate setting (imagine what happens with the JVM heap if your indexer hits a SUPER LARGE document). Use setRamBufferSizeMB instead.
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
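In concrete terms, the suggested settings might look roughly like this against the Lucene 2.4 API (a sketch, not a tested configuration; the 64 MB buffer size is illustrative and should be tuned to the available heap):

```java
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class IndexerConfig {
    public static void main(String[] args) throws Exception {
        // Open the writer as before; MaxFieldLength.UNLIMITED indexes
        // whole field values without truncation.
        IndexWriter writer = new IndexWriter(
                "C:\\Laxmilal",
                new KeywordAnalyzer(),
                true,
                IndexWriter.MaxFieldLength.UNLIMITED);

        // Leave mergeFactor at the default of 10 rather than 1000, so
        // segments are merged gradually instead of accumulating until
        // optimize() has to do one enormous merge.
        writer.setMergeFactor(10);

        // Flush based on RAM usage instead of document count; this bounds
        // heap use even if a single very large document comes through.
        writer.setRAMBufferSizeMB(64.0); // illustrative value

        // ... addDocument() calls ...

        writer.close();
    }
}
```

With a RAM-based flush trigger, the writer flushes whenever its in-memory buffer reaches the threshold, regardless of how many documents that happens to be, which avoids the OutOfMemoryError that a fixed 100K-document buffer can cause.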