Posted to dev@lucene.apache.org by Digy <di...@gmail.com> on 2008/07/07 23:21:12 UTC
Bug In IndexWriter.addDocument?
Hi all,
I am a Lucene.Net user. Because I need fast indexing in my current project, I
am trying Lucene 2.3.2, which I converted to .NET with IKVM (Lucene.Net is
currently at v2.1). To gain some speed, I reuse the same Document and Field
instances.
I use TokenStreams to set the values of the fields.
My problem is that I get a NullPointerException in addDocument:
Exception in thread "main" java.lang.NullPointerException
    at org.apache.lucene.store.IndexOutput.writeString(IndexOutput.java:99)
    at org.apache.lucene.index.FieldsWriter.writeField(FieldsWriter.java:127)
    at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.processField(DocumentsWriter.java:1418)
    at org.apache.lucene.index.DocumentsWriter$ThreadState.processDocument(DocumentsWriter.java:1121)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2442)
    at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2424)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1464)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1442)
    at MainClass.Test(MainClass.java:39)
    at MainClass.main(MainClass.java:10)
To reproduce the same problem in Java, I prepared a sample application (that
was hard, since this is only my second Java application; the first one was a
"Hello World" app).
Is something wrong with my application, or is this a bug in Lucene?
Thanks,
DIGY
SampleCode:
public class MainClass
{
    DummyTokenStream DummyTokenStream1 = new DummyTokenStream();
    DummyTokenStream DummyTokenStream2 = new DummyTokenStream();

    //use the same document&field instances for Indexing
    org.apache.lucene.document.Document Doc = new org.apache.lucene.document.Document();

    org.apache.lucene.document.Field Field1 = new org.apache.lucene.document.Field("Field1", "",
        org.apache.lucene.document.Field.Store.YES,
        org.apache.lucene.document.Field.Index.TOKENIZED);
    org.apache.lucene.document.Field Field2 = new org.apache.lucene.document.Field("Field2", "",
        org.apache.lucene.document.Field.Store.YES,
        org.apache.lucene.document.Field.Index.TOKENIZED);

    public MainClass()
    {
        Doc.add(Field1);
        Doc.add(Field2);
    }

    public void Index() throws
        org.apache.lucene.index.CorruptIndexException,
        org.apache.lucene.store.LockObtainFailedException,
        java.io.IOException
    {
        System.out.println("Index Started");
        org.apache.lucene.index.IndexWriter wr = new org.apache.lucene.index.IndexWriter("testindex",
            new org.apache.lucene.analysis.WhitespaceAnalyzer(), true);

        for (int i = 0; i < 100; i++)
        {
            PrepDoc();
            wr.addDocument(Doc);
        }
        wr.close();
        System.out.println("Index Completed");
    }

    void PrepDoc()
    {
        DummyTokenStream1.SetText("test1"); //Set a new Text to Token Stream
        Field1.setValue(DummyTokenStream1); //Set TokenStream to Field Value

        DummyTokenStream2.SetText("test2"); //Set a new Text to Token Stream
        Field2.setValue(DummyTokenStream2); //Set TokenStream to Field Value
    }

    public static void main(String[] args) throws
        org.apache.lucene.index.CorruptIndexException,
        org.apache.lucene.store.LockObtainFailedException,
        java.io.IOException
    {
        MainClass m = new MainClass();
        m.Index();
    }

    public class DummyTokenStream extends org.apache.lucene.analysis.TokenStream
    {
        String Text = "";
        boolean EndOfStream = false;
        org.apache.lucene.analysis.Token Token = new org.apache.lucene.analysis.Token();

        //return "Text" as the first token and null as the second
        public org.apache.lucene.analysis.Token next()
        {
            if (EndOfStream == false)
            {
                EndOfStream = true;

                Token.setTermText(Text);
                Token.setStartOffset(0);
                Token.setEndOffset(Text.length() - 1);
                Token.setTermLength(Text.length());
                return Token;
            }
            return null;
        }

        public void SetText(String Text)
        {
            EndOfStream = false;
            this.Text = Text;
        }
    }
}
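A side note on the offsets in DummyTokenStream: Lucene documents a token's end
offset as one greater than the position of the last character of the token, so
a token covering the whole text would normally end at Text.length() rather than
Text.length() - 1. This is unrelated to the NullPointerException. The sketch
below illustrates the half-open [start, end) convention without Lucene;
OffsetSketch, Span and tokenize are hypothetical names used only for
illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetSketch {
    /** Hypothetical holder for a term and its half-open [start, end) offsets. */
    static final class Span {
        final String term;
        final int start;
        final int end;

        Span(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /** Splits on whitespace and records offsets so that text.substring(start, end) equals term. */
    static List<Span> tokenize(String text) {
        List<Span> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Skip any run of whitespace before the next term.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Advance to the first character after the term.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start) {
                out.add(new Span(text.substring(start, i), start, i));
            }
        }
        return out;
    }
}
```

With this convention, a single token "test1" gets start 0 and end 5, which is
the length of the text.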
Re: Bug In IndexWriter.addDocument?
Posted by Ajay Lakhani <la...@googlemail.com>.
Dear Digy,
You cannot store the Field value when using a TokenStream, but you can store
the term vector.
To do this, create the Field instance like this:
Field Field1 = new Field("Field1", DummyTokenStream1, TermVector.YES);
Below is code that should work.
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.TermVector;
import org.apache.lucene.index.IndexWriter;

// Assumes DummyTokenStream from the original message is available as a
// top-level class.
public class Main2Class {
    Document Doc = new Document();
    DummyTokenStream DummyTokenStream1 = new DummyTokenStream();
    Field Field1 = new Field("Field1", DummyTokenStream1, TermVector.YES);
    DummyTokenStream DummyTokenStream2 = new DummyTokenStream();
    // Named "Field2" to match the original sample.
    Field Field2 = new Field("Field2", DummyTokenStream2, TermVector.YES);

    public void Index() throws Exception {
        Doc.add(Field1);
        Doc.add(Field2);
        IndexWriter wr = new IndexWriter("testindex", new WhitespaceAnalyzer(), true);
        for (int i = 0; i < 100; i++) {
            PrepDoc();
            wr.addDocument(Doc);
        }
        wr.close();
    }

    void PrepDoc() {
        DummyTokenStream1.SetText("test1");
        Field1.setValue(DummyTokenStream1);
        DummyTokenStream2.SetText("test2");
        Field2.setValue(DummyTokenStream2);
    }

    public static void main(String[] args) throws Exception {
        Main2Class m = new Main2Class();
        m.Index();
    }
}
Cheers
Ajay
Re: Bug In IndexWriter.addDocument?
Posted by Ajay Lakhani <la...@googlemail.com>.
Dear Digy,
To add to my earlier message: on reflection, I do not think this is a bug.
A TokenStream value is normally not stored.
If you change the field's store attribute to
org.apache.lucene.document.Field.Store.NO, there will be no issue.
Developers, any thoughts on this?
Cheers
Ajay
Re: Bug In IndexWriter.addDocument?
Posted by Ajay Lakhani <la...@googlemail.com>.
Dear Digy,
As of Lucene 2.3, there are new setValue(...) methods that let you change the
value of a Field. However, there seems to be an issue in
org.apache.lucene.index.FieldsWriter.writeField(...), which stores the string
value of the field; that value is null when the field holds a TokenStream.
FieldsWriter.writeField(...) needs to check whether the field data is a
String, a Reader or a TokenStream and then retrieve the respective value. I
shall patch this soon.
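One conservative reading of that proposal is to check the value type before
writing and fail with a descriptive error instead of a NullPointerException.
The sketch below is a Lucene-free model, not the actual FieldsWriter code:
StoredValueSketch, SketchField and storedText are hypothetical names, and an
Iterator<String> stands in for a TokenStream.

```java
import java.util.Arrays;
import java.util.Iterator;

final class StoredValueSketch {
    /** Hypothetical stand-in for a field whose value may be a String or a token source. */
    static final class SketchField {
        final String name;
        final Object value;   // a String, or an Iterator<String> standing in for a TokenStream
        final boolean stored;

        SketchField(String name, Object value, boolean stored) {
            this.name = name;
            this.value = value;
            this.stored = stored;
        }

        /** As described in the thread: null unless the value is a String. */
        String stringValue() {
            return value instanceof String ? (String) value : null;
        }
    }

    /** Returns the text to store, or null when the field is not stored. */
    static String storedText(SketchField f) {
        if (!f.stored) {
            return null; // nothing is written for unstored fields
        }
        String s = f.stringValue();
        if (s == null) {
            // Fail early with a clear message instead of a NullPointerException later.
            throw new IllegalArgumentException(
                "field '" + f.name + "' is stored but has no string value");
        }
        return s;
    }

    public static void main(String[] args) {
        Iterator<String> tokens = Arrays.asList("test1").iterator();
        System.out.println(storedText(new SketchField("Field1", "test1", true)));
        System.out.println(storedText(new SketchField("Field1", tokens, false)));
        try {
            storedText(new SketchField("Field1", tokens, true));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model a stored field with a token-source value is rejected up front,
while the same value on an unstored field passes through untouched.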
Is there a particular reason you are using a TokenStream? I suggest you set
the text value directly on the Field: Field1.setValue("xxx");
Moreover, it is best to create a single Document instance, add the Field
instances to it once, and hold on to those Field instances so that you can
reuse them by changing their values for each added document. After a document
is added, change the Field values directly (idField.setValue(...), etc.) and
then re-add the same Document instance. You cannot use the same Field instance
more than once within a Document, and you should not change a Field's value
until the Document containing that Field has been added to the index.
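The reuse pattern described above can be sketched without Lucene. Everything
here is a hypothetical model for illustration (ReuseSketch, MutableField, Doc,
ToyWriter, indexAll); the toy writer copies the field values when a document is
added, which is the assumption that makes changing them afterwards safe.

```java
import java.util.ArrayList;
import java.util.List;

final class ReuseSketch {
    /** Hypothetical mutable field, standing in for a Field with setValue(String). */
    static final class MutableField {
        final String name;
        private String value = "";

        MutableField(String name) {
            this.name = name;
        }

        void setValue(String value) {
            this.value = value;
        }
    }

    /** Hypothetical document: an ordered list of field instances. */
    static final class Doc {
        final List<MutableField> fields = new ArrayList<>();

        void add(MutableField f) {
            fields.add(f);
        }
    }

    /** Hypothetical writer that snapshots the field values at add time. */
    static final class ToyWriter {
        final List<String> stored = new ArrayList<>();

        void addDocument(Doc d) {
            StringBuilder sb = new StringBuilder();
            for (MutableField f : d.fields) {
                if (sb.length() > 0) {
                    sb.append(';');
                }
                sb.append(f.name).append('=').append(f.value);
            }
            stored.add(sb.toString());
        }
    }

    /** Indexes every row using one Doc and one field instance per field name. */
    static List<String> indexAll(String[][] rows) {
        MutableField id = new MutableField("id");
        MutableField body = new MutableField("body");
        Doc doc = new Doc();   // created once, outside the loop
        doc.add(id);
        doc.add(body);

        ToyWriter writer = new ToyWriter();
        for (String[] row : rows) {
            // Values change only between adds, never while a document is being added.
            id.setValue(row[0]);
            body.setValue(row[1]);
            writer.addDocument(doc);
        }
        return writer.stored;
    }
}
```

The point of the pattern is that object creation happens once, outside the
loop; only the values change from one added document to the next.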