Posted to java-user@lucene.apache.org by Kelvin Tan <ke...@relevanz.com> on 2002/10/31 08:31:44 UTC

Deleting fields from a Document

There is currently no way to delete fields from a Document. I 
wondered whether this was evil in any way, but looking at the source 
of Document.java, I found no evidence that it is. 

Document maintains a linked list of Fields. It would not be 
difficult to delete a random Field, albeit a little inefficient.

I need to delete fields because my index has been inadvertently 
corrupted by fields with bad values from the application. Attempts to 
add correct values for these fields don't solve the problem because 
the "bad" field still exists. One possible solution is to create a new 
document, enumerate all the fields of the old document, and add only 
the ones I want. I don't have a huge problem with that, but I also 
wonder whether field deletion is truly taboo.
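
For illustration, here is a rough sketch of the copy-and-filter workaround 
described above. To keep it runnable without Lucene on the classpath, Fld 
and Doc are hypothetical stand-ins for Lucene's Field and Document (they 
are not the real classes), and copyWithout is a made-up helper name. With 
the real API the loop would walk the old Document's fields instead.

```java
import java.util.ArrayList;
import java.util.List;

class FieldCopySketch {
    // Hypothetical stand-in for a stored, String-valued Lucene Field.
    static final class Fld {
        final String name;
        final String value;
        Fld(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    // Hypothetical stand-in for a Lucene Document: an ordered list of fields.
    static final class Doc {
        private final List<Fld> fields = new ArrayList<>();

        void add(Fld f) {
            fields.add(f);
        }

        List<Fld> fields() {
            return fields;
        }

        // Returns the value of the first field with this name, or null if absent.
        String get(String name) {
            for (Fld f : fields) {
                if (f.name.equals(name)) {
                    return f.value;
                }
            }
            return null;
        }
    }

    // Builds a new document holding every field of 'old' except those named 'badName'.
    // The old document is left untouched.
    static Doc copyWithout(Doc old, String badName) {
        Doc fresh = new Doc();
        for (Fld f : old.fields()) {
            if (!f.name.equals(badName)) {
                fresh.add(f);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        Doc old = new Doc();
        old.add(new Fld("title", "Lucene in practice"));
        old.add(new Fld("price", "not-a-number")); // the "bad" field

        Doc fresh = copyWithout(old, "price");
        fresh.add(new Fld("price", "19.95")); // add the corrected value

        System.out.println(fresh.get("title"));
        System.out.println(fresh.get("price"));
    }
}
```

The sketch only covers the in-memory side. The copy would still have to 
replace the old document in the index.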

Maybe someone can shed some light here?

Regards,
Kelvin


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: Deleting fields from a Document

Posted by Doug Cutting <cu...@lucene.com>.
Kelvin Tan wrote:
> Does an in-memory Field guarantee access to its name and value? Say I 
> retrieve a Field from a Document A, and add it to a new Document B. 
> Before writing B to the index, I delete A. Would B still contain the 
> Field? If so, does it work for both String-based and Reader-based 
> values?

Readers can only be consumed once and they are never stored.  So a 
retrieved document will never have a Reader-valued field, and if a 
Document instance with a Reader-valued field is added to two indexes 
then this field will in effect only be added to the first index.
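
The "consumed once" behaviour is a property of java.io.Reader itself, so 
it can be shown without Lucene. A small sketch (ReaderOnce and drain are 
illustrative names, not Lucene API): draining the same Reader a second 
time yields nothing, which is roughly what a second index would see.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReaderOnce {
    // Reads everything that is still available from the reader.
    static String drain(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Reader body = new StringReader("hello lucene");
        String first = drain(body);   // consumes the whole stream
        String second = drain(body);  // the stream is already at its end
        System.out.println("first=[" + first + "]");   // prints first=[hello lucene]
        System.out.println("second=[" + second + "]"); // prints second=[]
    }
}
```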

Other than that, document instances may be added to multiple indexes. 
Deleting a document from an index does not alter a retrieved document 
instance.  Note however that retrieved document instances will not 
contain unstored fields.

Doug





Re: Deleting fields from a Document

Posted by Kelvin Tan <ke...@relevanz.com>.
This brings me to a related discussion:  in-memory and index Field 
representations.

Does an in-memory Field guarantee access to its name and value? Say I 
retrieve a Field from a Document A, and add it to a new Document B. 
Before writing B to the index, I delete A. Would B still contain the 
Field? If so, does it work for both String-based and Reader-based 
values?

Regards,
Kelvin


On Mon, 04 Nov 2002 10:40:40 -0800, Doug Cutting said:
>Kelvin Tan wrote:
>>Document maintains a linked list of Fields. It would not be
>>difficult to delete a random Field, albeit a little inefficient.
>
>That would delete it from the in-memory representation, but, once it
>has been indexed, there is no easy way to remove a field value from
>a document other than to delete the document and re-add it.
>
>Doug






Re: Deleting fields from a Document

Posted by Doug Cutting <cu...@lucene.com>.
Kelvin Tan wrote:
> Document maintains a linked list of Fields. It would not be 
> difficult to delete a random Field, albeit a little inefficient.

That would delete it from the in-memory representation, but, once it has 
been indexed, there is no easy way to remove a field value from a 
document other than to delete the document and re-add it.

Doug

