Posted to java-user@lucene.apache.org by Sanket Paranjape <sa...@gmail.com> on 2013/09/18 14:50:30 UTC

Document not searchable after IndexWriter.updateDocument

Hi,

I wrote some simple code to update a Lucene document with new values.

Code Snippet:

    // Look up the existing document by its product code.
    Term term = new Term("PRODUCT_CODE", productCode);
    TermQuery query = new TermQuery(term);
    TopDocs productDoc = this.searcher.search(query, 1);

    // Load the stored fields of the first hit.
    int docNum = productDoc.scoreDocs[0].doc;
    Document doc = searcher.getIndexReader().document(docNum);

    // Replace the price and write the loaded document back.
    doc.removeField("PRICE");
    doc.add(new LongField("PRICE", Long.valueOf(price), Field.Store.YES));
    this.writer.updateDocument(term, doc);


As per the docs, updateDocument deletes the existing document and
adds a new one. This works as expected.

The problem is that if I search by product code, I can find the
document in Luke. But if I use a BooleanQuery that combines a range
query, the product code and a few other queries, the document is not found.

I am using Lucene 4.4 with faceted search.


Re: Document not searchable after IndexWriter.updateDocument

Posted by Sanket Paranjape <sa...@gmail.com>.
Hi Uwe,

Thanks for explaining.

Our system previously used Lucene 2.4, where this was possible.

Anyway, I will implement it correctly, as you suggested.

On 18-09-2013 07:41 PM, Uwe Schindler wrote:
> Hi,
>
> the problem is that a document retrieved by IndexReader.document() only contains stored fields and no indexed fields (they are no longer accessible from the index). Also, the field types only carry the "stored" attribute, so when reindexing with IndexWriter you just create a document with stored fields but no indexed fields.
>
> Because of this common error, in Lucene 5.0 it is no longer possible to do this; the API will prevent it. IndexReader.document() will return a different class than IndexWriter accepts (StoredDocument vs. IndexDocument).
>
> To implement this correctly:
> - Make sure all fields are stored (or use a special field that holds a representation of the whole document in its original form, like XML or JSON). This is how e.g. ElasticSearch handles this.
> - Get the stored field(s) from IndexReader
> - Create a *new* document instance where you reconstruct the document including all stored/indexed flags (e.g. from the special XML/JSON field).
> - Index the new instance
>
> Uwe


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: Document not searchable after IndexWriter.updateDocument

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

the problem is that a document retrieved by IndexReader.document() only contains stored fields and no indexed fields (they are no longer accessible from the index). Also, the field types only carry the "stored" attribute, so when reindexing with IndexWriter you just create a document with stored fields but no indexed fields.

Because of this common error, in Lucene 5.0 it is no longer possible to do this; the API will prevent it. IndexReader.document() will return a different class than IndexWriter accepts (StoredDocument vs. IndexDocument).

To implement this correctly:
- Make sure all fields are stored (or use a special field that holds a representation of the whole document in its original form, like XML or JSON). This is how e.g. ElasticSearch handles this.
- Get the stored field(s) from IndexReader
- Create a *new* document instance where you reconstruct the document including all stored/indexed flags (e.g. from the special XML/JSON field).
- Index the new instance
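
The steps above could look roughly like the following. This is only a minimal sketch, assuming Lucene 4.4 with lucene-core and lucene-analyzers-common on the classpath and an index where every field is stored. The class name PriceUpdateSketch, the helpers buildProduct, updatePrice and runDemo, and the CATEGORY field are hypothetical names chosen for illustration; only PRODUCT_CODE and PRICE come from the original snippet.

```java
import java.io.IOException;

import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class PriceUpdateSketch {

    // Single place that knows how each field is indexed; used for inserts and updates.
    static Document buildProduct(String code, String category, long price) {
        Document doc = new Document();
        doc.add(new StringField("PRODUCT_CODE", code, Field.Store.YES)); // indexed as one token
        doc.add(new StringField("CATEGORY", category, Field.Store.YES)); // hypothetical extra field
        doc.add(new LongField("PRICE", price, Field.Store.YES));         // indexed for range queries
        return doc;
    }

    // Reads the stored values, then indexes a brand-new Document built from them.
    static void updatePrice(IndexWriter writer, IndexSearcher searcher,
                            String code, long newPrice) throws IOException {
        Term id = new Term("PRODUCT_CODE", code);
        TopDocs hits = searcher.search(new TermQuery(id), 1);
        if (hits.totalHits == 0) {
            return; // nothing to update
        }
        Document stored = searcher.doc(hits.scoreDocs[0].doc);
        // Do not pass 'stored' back to the writer; rebuild with explicit field types.
        Document fresh = buildProduct(stored.get("PRODUCT_CODE"), stored.get("CATEGORY"), newPrice);
        writer.updateDocument(id, fresh);
    }

    // Indexes one product, updates its price, and returns the hit count of a combined query.
    static int runDemo() throws IOException {
        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir,
                new IndexWriterConfig(Version.LUCENE_44, new KeywordAnalyzer()));
        writer.addDocument(buildProduct("P-1", "books", 500L));
        writer.commit();

        DirectoryReader reader = DirectoryReader.open(dir);
        updatePrice(writer, new IndexSearcher(reader), "P-1", 250L);
        writer.commit();
        reader.close();

        // Open a new reader so the update is visible to the search.
        reader = DirectoryReader.open(dir);
        BooleanQuery query = new BooleanQuery();
        query.add(new TermQuery(new Term("PRODUCT_CODE", "P-1")), BooleanClause.Occur.MUST);
        query.add(NumericRangeQuery.newLongRange("PRICE", 0L, 300L, true, true),
                BooleanClause.Occur.MUST);
        int found = new IndexSearcher(reader).search(query, 1).totalHits;
        reader.close();
        writer.close();
        return found;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("hits: " + runDemo());
    }
}
```

Keeping the field definitions in one builder method means the update path cannot drift from the way documents were originally indexed. If some fields are not stored, the values would have to come from the original data source or from a stored XML/JSON copy of the document instead.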

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de




