Posted to user@lucenenet.apache.org by Ram <ra...@datamatics.com> on 2010/03/19 07:18:38 UTC

Index was outside the bounds of the array. for Large File

Hi,
 
I am trying to index a large data set made up of many files, using the following code:
 
if (fileName.EndsWith(".txt"))
{
    try
    {
        StreamReader fstr_in = new StreamReader(folderName + "\\" + fileName);
        String line = null;
        int counter = 0;

        while ((line = fstr_in.ReadLine()) != null)
        {
            try
            {
                counter++;
                String[] details = extractDetailsFromLine(line);
                String paragraph = details[0];
                String coords = details[1];

                Lucene.Net.Documents.Document doc = new Document();
                String name = fileName.Replace(".txt","");
                String[] firstname = name.Split(new char[] { '_' });
                String tifName = firstname[0] + ".tif";

                //doc.Add(Field.UnStored("filename", folderName + "\\" + tifName,
                //    Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NO));
                //doc.Add(Field.keyword("paragraph", paragraph,
                //    Lucene.Net.Documents.Field.Store.YES,
                //    Lucene.Net.Documents.Field.Index.TOKENIZED));
                //doc.Add(Field.Text("coords", coords, Lucene.Net.Documents.Field.Store.YES,
                //    Field.Index.NO));

                doc.Add(new Lucene.Net.Documents.Field("filename", folderName + "\\" + tifName,
                    Lucene.Net.Documents.Field.Store.YES,
                    Lucene.Net.Documents.Field.Index.NO));
                doc.Add(new Lucene.Net.Documents.Field("paragraph", paragraph,
                    Lucene.Net.Documents.Field.Store.NO,
                    Lucene.Net.Documents.Field.Index.TOKENIZED));
                doc.Add(new Lucene.Net.Documents.Field("coords", coords,
                    Lucene.Net.Documents.Field.Store.YES, Field.Index.NO));

                writer.AddDocument(doc);  // Throws exception
            }
            catch (System.IO.IOException ex) { }
        }
    }
    catch (System.IO.IOException ex)
    {
    }
}

 

 

However, Lucene throws an exception while indexing:

The exception is "Index was outside the bounds of the array."

Current DocCount is 178723

STACK TRACE

   at Lucene.Net.Index.DocumentsWriter.Abort(AbortException ae)
   at Lucene.Net.Index.DocumentsWriter.UpdateDocument(Document doc, Analyzer analyzer, Term delTerm)
   at Lucene.Net.Index.DocumentsWriter.AddDocument(Document doc, Analyzer analyzer)
   at Lucene.Net.Index.IndexWriter.AddDocument(Document doc, Analyzer analyzer)
   at Lucene.Net.Index.IndexWriter.AddDocument(Document doc)
   at Default2.indexDoc(IndexWriter writer, String folderName, String fileName) in d:\RamThakur\lucene_related\luceneExample\Default2.aspx.cs:line 120
   at Default2.indexFolder(String folder, String indexDir) in d:\RamThakur\lucene_related\luceneExample\Default2.aspx.cs:line 69
   at Default2.btnIndex_Click(Object sender, EventArgs e) in d:\RamThakur\lucene_related\luceneExample\Default2.aspx.cs:line 33
   at System.Web.UI.WebControls.Button.OnClick(EventArgs e)
   at System.Web.UI.WebControls.Button.RaisePostBackEvent(String eventArgument)
   at System.Web.UI.WebControls.Button.System.Web.UI.IPostBackEventHandler.RaisePostBackEvent(String eventArgument)
   at System.Web.UI.Page.RaisePostBackEvent(IPostBackEventHandler sourceControl, String eventArgument)
   at System.Web.UI.Page.RaisePostBackEvent(NameValueCollection postData)
   at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)

 

Any help in this regard is highly appreciated.

 

Thanks and Regards,
Ram K. Singh
Datamatics Global Services Limited
PE-Cell| Knowledge Centre , Andheri (East) | Mumbai 400 093.
Email  ram.singh@datamatics.com | | Tel : +91 22 6102 0116 | Mobile: +91
9869 74 9107 

 

 

 



Re: Index was outside the bounds of the array. for Large File

Posted by Shashi Kant <sk...@sloan.mit.edu>.
This was an issue with an older build of Lucene.Net. You might want to
upgrade to the latest build from Subversion.
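
Until an upgrade is possible, it may help to find out which input line triggers the failure. Note that IndexOutOfRangeException does not derive from IOException, so the empty catch blocks in the posted code would not intercept it. The following is only a sketch: FindFirstFailingLine, the addLine callback, and the tab-separated stand-in input are hypothetical names and assumptions, not part of the original code or of Lucene.Net.

```csharp
using System;
using System.IO;

static class IndexDiagnostics
{
    // Hypothetical helper: feeds each line from `reader` to `addLine`
    // (which would wrap the Document construction and writer.AddDocument call)
    // and returns the 1-based number of the first line that fails, or -1.
    public static int FindFirstFailingLine(TextReader reader, Action<string> addLine)
    {
        int lineNumber = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            try
            {
                addLine(line);
            }
            catch (IOException)
            {
                throw; // genuine I/O failures are not what this helper looks for
            }
            catch (Exception ex)
            {
                // IndexOutOfRangeException and other non-I/O errors end up here.
                Console.Error.WriteLine("Line {0} failed: {1}: {2}",
                    lineNumber, ex.GetType().Name, ex.Message);
                return lineNumber;
            }
        }
        return -1; // every line was processed without error
    }

    static void Main()
    {
        // Stand-in for the real indexing call (assumption: tab-separated fields).
        var input = new StringReader("para one\t1,2\nbroken line\npara three\t3,4");
        int failing = FindFirstFailingLine(input, l =>
        {
            string[] details = l.Split('\t');
            string coords = details[1]; // throws when the separator is missing
        });
        Console.WriteLine(failing); // prints 2
    }
}
```

In real use, the callback would contain the body of the posted while loop, so the reported line number points at the record being added when the exception is thrown.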

