Posted to java-user@lucene.apache.org by Günter Kukies <gu...@heuft.com> on 2003/10/30 08:40:46 UTC

Indexing txt-files

Hello,

I want to add a Text field to a Lucene Document. I checked the index with Luke, but a search on the contents field returns no results. test.txt is a plain ASCII file, and SimpleAnalyzer is used for both indexing and searching.

Here are the relevant code snippets:


File file = new File("/documents/test.txt");

addContent(document, new FileInputStream( file ));


private static void addContent(Document document, InputStream is) throws IOException {
    try {
        InputStreamReader input = new InputStreamReader(is);
        document.add(Field.Text("contents", input));
    }
    catch (Exception ex) {
        ex.printStackTrace();
    }
    finally {
        if (is != null) {
            is.close();
        }
    }
}


Thanks for your help

Günter

Re: Indexing txt-files

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
I'm not an expert on Readers/InputStreams, but it sounds like you're
dealing with a bug related to your usage of them and not Lucene.  Have
a look at my Lucene Intro article, where I use a FileReader.  Try a
simple test using something like that, eliminating as many variables as
you can.

	Erik



On Thursday, October 30, 2003, at 03:41  AM, Günter Kukies wrote:

> Yes, I know that it is indexed and the contents are not stored. That is
> what I want. But that means I should be able to search the index and get
> back the Lucene document as a hit, with all its other fields (date,
> file location, ...).
> My problem is that I don't get the Lucene document back. Maybe I need a
> buffered reader, or closing the reader is not allowed.
>
> Günter


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Indexing txt-files

Posted by Günter Kukies <gu...@heuft.com>.
Yes, I know that it is indexed and the contents are not stored. That is what
I want. But that means I should be able to search the index and get back the
Lucene document as a hit, with all its other fields (date, file location, ...).
My problem is that I don't get the Lucene document back. Maybe I need a
buffered reader, or closing the reader is not allowed.

Günter
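
On the question of whether the stream may be closed early: a Reader wrapped around an InputStream does not copy any data when it is constructed; it pulls bytes only when something reads from it. The sketch below uses only java.io (no Lucene) and assumes, as a hypothesis rather than a confirmed diagnosis, that the Reader given to the field is consumed some time after document.add() returns. The class and method names are made up for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;

// Hypothetical demo class; it only uses java.io and does not touch Lucene.
class ClosedStreamDemo {

    // Writes a small ASCII test file and returns it.
    static File writeTempFile(String text) throws IOException {
        File file = File.createTempFile("closed-stream-demo", ".txt");
        file.deleteOnExit();
        Writer out = new FileWriter(file);
        try {
            out.write(text);
        } finally {
            out.close();
        }
        return file;
    }

    // Reads everything the reader can deliver into a String.
    static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Mimics addContent(): wrap the stream, close the stream, and only
    // afterwards let a consumer read. Returns true if the late read fails.
    static boolean readFailsAfterClose(File file) throws IOException {
        InputStream is = new FileInputStream(file);
        Reader reader = new InputStreamReader(is);
        is.close(); // closed before anyone has read a single character
        try {
            drain(reader);
            return false;
        } catch (IOException expected) {
            return true;
        }
    }

    // Same wiring, but the reader is drained before the stream is closed.
    static String readBeforeClose(File file) throws IOException {
        InputStream is = new FileInputStream(file);
        try {
            return drain(new InputStreamReader(is));
        } finally {
            is.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = writeTempFile("hello lucene");
        System.out.println("late read fails: " + readFailsAfterClose(file));
        System.out.println("early read: " + readBeforeClose(file));
    }
}
```

If the assumption holds, the stream would have to stay open until the document has actually been written to the index, and the close would belong after that step rather than in addContent().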

----- Original Message -----
From: "Erik Hatcher" <er...@ehatchersolutions.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Thursday, October 30, 2003 9:17 AM
Subject: Re: Indexing txt-files


Field.Text(String, Reader) is an unstored field.  It is indexed, but
the contents are not stored in the index.

If you want the contents stored, use Field.Text(String,String)

Erik



Re: Indexing txt-files

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Field.Text(String, Reader) is an unstored field.  It is indexed, but 
the contents are not stored in the index.

If you want the contents stored, use Field.Text(String,String)

	Erik
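
If the stored variant is wanted, the file has to be read into a String first. A minimal sketch using only the JDK follows; the helper name readFile and the choice of US-ASCII (the original post describes test.txt as a plain ASCII file) are assumptions, and the Lucene call appears only in a comment because it is taken from the message above rather than exercised here.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of Lucene.
class FileText {

    // Reads the whole file into memory and decodes it with the given charset.
    static String readFile(Path path, Charset charset) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        return new String(bytes, charset);
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("file-text-demo", ".txt");
        Files.write(path, "simple ascii text".getBytes(StandardCharsets.US_ASCII));

        String contents = readFile(path, StandardCharsets.US_ASCII);
        // With Lucene on the classpath, the String would then be passed on as
        // described above: document.add(Field.Text("contents", contents));
        System.out.println(contents);

        Files.delete(path);
    }
}
```

This keeps the entire file in memory, so it suits small text files; for large files the Reader variant avoids that cost.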



Re: Indexing txt-files

Posted by Günter Kukies <gu...@heuft.com>.
Hi,

Is it possible to update the API doc to state that a BufferedReader is required?

Who is responsible for closing the InputStream? Does doc.add() close it?

Günter


----- Original Message -----
From: "Erik Hatcher" <er...@ehatchersolutions.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Thursday, October 30, 2003 1:19 PM
Subject: Re: Indexing txt-files


> Strange.  FileReader works fine in my java.net article code.


Re: Indexing txt-files

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Strange.  FileReader works fine in my java.net article code.


On Thursday, October 30, 2003, at 06:35  AM, Otis Gospodnetic wrote:

> This is from the Lucene demo, included in the Lucene distribution.
> Look at the FileDocument class:
>
>     // Add the contents of the file a field named "contents".  Use a Text
>     // field, specifying a Reader, so that the text of the file is tokenized.
>     // ?? why doesn't FileReader work here ??
>     FileInputStream is = new FileInputStream(f);
>     Reader reader = new BufferedReader(new InputStreamReader(is));
>     doc.add(Field.Text("contents", reader));
>
> Otis
>




Re: Indexing txt-files

Posted by Otis Gospodnetic <ot...@yahoo.com>.
This is from the Lucene demo, included in the Lucene distribution.
Look at the FileDocument class:

    // Add the contents of the file a field named "contents".  Use a Text
    // field, specifying a Reader, so that the text of the file is tokenized.
    // ?? why doesn't FileReader work here ??
    FileInputStream is = new FileInputStream(f);
    Reader reader = new BufferedReader(new InputStreamReader(is));
    doc.add(Field.Text("contents", reader));

Otis
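
As far as plain java.io goes, FileReader is a convenience class that decodes a file with the default character encoding, which is also what InputStreamReader does when wrapped around a FileInputStream without an explicit charset. The two wirings should therefore deliver the same characters. The sketch below checks that outside Lucene; the class name, method names, and temp file are illustrative, and it says nothing about why the demo comment reports that FileReader did not work.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.Writer;

// Hypothetical comparison class; it only uses java.io.
class ReaderWiring {

    // Writes a small ASCII test file and returns it.
    static File writeTempFile(String text) throws IOException {
        File file = File.createTempFile("reader-wiring", ".txt");
        file.deleteOnExit();
        Writer out = new FileWriter(file);
        try {
            out.write(text);
        } finally {
            out.close();
        }
        return file;
    }

    // Reads the reader to the end, then closes it (and whatever it wraps).
    static String drain(Reader reader) throws IOException {
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } finally {
            reader.close();
        }
    }

    // The wiring from the java.net article mentioned in this thread.
    static String viaFileReader(File file) throws IOException {
        return drain(new FileReader(file));
    }

    // The wiring used in the FileDocument snippet above.
    static String viaBufferedStreamReader(File file) throws IOException {
        FileInputStream is = new FileInputStream(file);
        return drain(new BufferedReader(new InputStreamReader(is)));
    }

    public static void main(String[] args) throws IOException {
        File file = writeTempFile("plain ascii text\nsecond line\n");
        boolean same = viaFileReader(file).equals(viaBufferedStreamReader(file));
        System.out.println("same characters: " + same);
    }
}
```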




Re: Indexing txt-files

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Also, referring to my article may help - the code is designed to index 
text files:

	http://today.java.net/pub/a/today/2003/07/30/LuceneIntro.html

