Posted to dev@lucenenet.apache.org by "Sérgio Araújo (JIRA)" <ji...@apache.org> on 2009/10/12 18:51:31 UTC
[jira] Created: (LUCENENET-195) IndexWriter.Optimize(); return an exception
IndexWriter.Optimize(); return an exception
-------------------------------------------
Key: LUCENENET-195
URL: https://issues.apache.org/jira/browse/LUCENENET-195
Project: Lucene.Net
Issue Type: Bug
Environment: Framework 1.1 .NET
Reporter: Sérgio Araújo
We have been using the Lucene search engine for a couple of months, and at first glance it seems to be a very good, high-performance engine.
We are using your "Lucene.net.dll" API, version 2.0.0.4.
We have an index of approximately 20 GB. News documents are added to the index every hour, and the index is optimized once a day at 9 pm.
For a couple of days everything ran fine, until one day the optimization call "writer.Optimize();" threw the following exception:
"Source array was not long enough. Check srcIndex and length, and the array's lower bounds."
Here you can find some parts of my code:
Document doc = null;
IndexWriter writer = null;
writer = new IndexWriter(strArticleIndexFolder, new StandardAnalyzer(), isNew);
writer.SetMergeFactor(1000);
writer.SetMaxMergeDocs(10000);
foreach (ArticleIndexFull objArticleIndex in lstArticleIndexFull)
{
    doc = new Document();
    doc.Add(new Field(O4kFreeSearchTag.ArticleLuceneId, objArticleIndex.ArticleIndexFullId.ToString(), Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
    doc.Add(new Field(O4kFreeSearchTag.ArticleId, objArticleIndex.ArticleId.ToString(), Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
    doc.Add(new Field(O4kFreeSearchTag.ProductionDate, FactoryBLL.ArticleIndex.ClearCharStream(AlphaNumeric.ConvertToString(objArticleIndex.ProductionDate.ToString("yyyyMMdd", System.Globalization.CultureInfo.GetCultureInfo("en-US")), String.Empty)), Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.YES));
    ....
    writer.AddDocument(doc);
}
// Optimize only when this run falls within the 9 pm hour.
if (System.DateTime.Now.Hour == 21)
{
    writer.Optimize();
}
writer.Close();
If we migrate to the latest available version, in this case 2.4.3, will my problem be fixed?
Is there any problem with my code?
We would appreciate your help.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
RE: [jira] Created: (LUCENENET-195) IndexWriter.Optimize(); return an exception
Posted by Michael Garski <mg...@myspace-inc.com>.
Sérgio,
What is the stack trace on that exception? That will help point to where in the optimize process the issue is occurring.
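One way to collect the stack trace requested above is to wrap the optimize call and keep the full exception text. The sketch below is only an illustration: `OptimizeDiagnostics` and `RunAndCapture` are hypothetical names, and it uses the `Action` delegate and lambdas, which require a newer framework than .NET 1.1. For reference, the reported message is the text .NET uses for the `ArgumentException` raised by `Array.Copy` when the requested length runs past the end of the source array; the stack trace would show which Lucene.Net call reached it.

```csharp
using System;

public static class OptimizeDiagnostics
{
    // Runs the supplied action and returns the full exception text,
    // or null when the action completes without throwing.
    public static string RunAndCapture(Action optimize)
    {
        try
        {
            optimize();
            return null;
        }
        catch (Exception ex)
        {
            // ex.ToString() includes the exception type, the message,
            // the stack trace, and any inner exceptions.
            return ex.ToString();
        }
    }
}
```

A caller could pass `() => writer.Optimize()` and write the returned text to the service log. Real code would probably rethrow after logging rather than swallow the error.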
I noticed similar behavior during an optimization on a large index on 1.9 & 2.0 only when term vectors were enabled. As I didn't really need term vectors I disabled them and then everything was fine. With version 2.3 and beyond I have not encountered any issues during an optimize when term vectors were enabled (we use them for faceting and a few other things). I'd suggest going with a newer version of Lucene.Net in a test environment to see if it is reproducible there.
Michael
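As a rough sketch of the workaround Michael describes, the fields can be added with term vectors switched off. This assumes the Lucene.Net 2.x assembly is referenced; `ArticleDocuments.Build` and the field names are hypothetical stand-ins for the reporter's own constants. It only mirrors the workaround and does not identify the cause of the exception.

```csharp
using Lucene.Net.Documents;

public static class ArticleDocuments
{
    // Hypothetical helper: builds a document whose fields store no term vectors.
    // Only the last constructor argument differs from the reporter's code.
    public static Document Build(string articleId, string productionDate)
    {
        Document doc = new Document();
        doc.Add(new Field("ArticleId", articleId,
            Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.NO));
        doc.Add(new Field("ProductionDate", productionDate,
            Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.NO));
        return doc;
    }
}
```

Documents already in the index keep their term vectors, so a test of this change would need a freshly built index.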
[jira] Commented: (LUCENENET-195) IndexWriter.Optimize(); return an exception
Posted by "Digy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENENET-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764761#action_12764761 ]
Digy commented on LUCENENET-195:
--------------------------------
>> if (System.DateTime.Now.Hour == 21)
Are you sure that your code does not call Optimize again (for example, at 21:00:01, 21:00:02, etc.) while another optimization is in progress?
DIGY
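The concern raised here is that the hour check is true for the whole 21:00 hour, so every run in that hour calls Optimize. A minimal sketch of one way to rule that out, assuming all runs happen in a single process: `DailyOptimizeGate` and `TryRun` are hypothetical names, and the gate uses only `System.Threading.Interlocked`, not any Lucene.Net API.

```csharp
using System;
using System.Threading;

public sealed class DailyOptimizeGate
{
    private int running;                              // 0 = idle, 1 = optimize in progress
    private DateTime lastRunDate = DateTime.MinValue; // date of the last successful run

    // Runs the action at most once per calendar day, only during the 21:00 hour,
    // and never while a previous call is still running.
    // Returns true if the action ran, false if it was skipped.
    public bool TryRun(DateTime now, Action optimize)
    {
        if (now.Hour != 21) return false;

        // Claim the gate atomically; give up if another call holds it.
        if (Interlocked.CompareExchange(ref running, 1, 0) != 0) return false;
        try
        {
            if (lastRunDate == now.Date) return false; // already optimized today
            optimize();
            lastRunDate = now.Date;                    // recorded only on success
            return true;
        }
        finally
        {
            Interlocked.Exchange(ref running, 0);      // release the gate
        }
    }
}
```

The indexing loop would then call `gate.TryRun(DateTime.Now, () => writer.Optimize())` in place of the bare hour check. The gate does not coordinate separate processes that open the same index.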
[jira] Closed: (LUCENENET-195) IndexWriter.Optimize(); return an exception
Posted by "George Aroush (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENENET-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
George Aroush closed LUCENENET-195.
-----------------------------------
Resolution: Invalid
Let's not abuse JIRA to discuss usage. If and when you find an issue, then start a JIRA discussion around it. Use the lucene-net-user@ mailing list to continue discussing this topic. Thanks.
[jira] Commented: (LUCENENET-195) IndexWriter.Optimize(); return an exception
Posted by "Digy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENENET-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765184#action_12765184 ]
Digy commented on LUCENENET-195:
--------------------------------
With so little info about your case, I can only suggest upgrading to 2.3.2 or to 2.4.0 (trunk) and trying again, since Lucene.Net >= 2.3 is supposed to be thread safe (you can run multiple indexing or searching threads on the same index).
DIGY
[jira] Commented: (LUCENENET-195) IndexWriter.Optimize(); return an exception
Posted by "Sérgio Araújo (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENENET-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765005#action_12765005 ]
Sérgio Araújo commented on LUCENENET-195:
-----------------------------------------
Thanks for your fast reply.
I'm sure that only one optimization process is called; this process is managed by a Windows service.
Sergio