Posted to java-user@lucene.apache.org by Paul Libbrecht <pa...@activemath.org> on 2005/07/28 11:37:32 UTC

updating an index... with existing documents ?

hi,

My current task is to update an index by adding a flag field to some 
documents.
The only way I can see to do this is the following:
- search for the documents in question, store them, filter them
- modify the documents accordingly
- delete the original documents from the index
- add the modified documents back

However, from my experiments, I fear that unstored fields will be 
lost along the way. I do not, of course, want to re-run the analysis 
process here (which is complex because of the XML nature of the back end).
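
A minimal sketch of why the fear is justified, assuming a toy in-memory
index. ToyIndex and all of its methods are hypothetical names invented for
this illustration; this is not the Lucene API. The point it models is that a
retrieved document carries only its stored field values, so re-adding it
drops the postings of any field that was indexed but not stored.

```java
import java.util.*;

// Toy model only: NOT the Lucene API, just enough to show the effect.
public class ToyIndex {
    // What a search hit hands back: stored field values only.
    private final Map<Integer, Map<String, String>> stored = new HashMap<>();
    // Inverted index: "field:term" -> ids of the documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int nextId = 0;

    /** Adds a document; fields named in unstored are indexed but not kept verbatim. */
    public int add(Map<String, String> fields, Set<String> unstored) {
        int id = nextId++;
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            for (String term : e.getValue().toLowerCase().split("\\s+")) {
                postings.computeIfAbsent(e.getKey() + ":" + term, k -> new TreeSet<>()).add(id);
            }
            if (!unstored.contains(e.getKey())) kept.put(e.getKey(), e.getValue());
        }
        stored.put(id, kept);
        return id;
    }

    /** Returns a copy of the stored fields, which is all that can be read back. */
    public Map<String, String> get(int id) {
        return new LinkedHashMap<>(stored.get(id));
    }

    public void delete(int id) {
        stored.remove(id);
        for (Set<Integer> ids : postings.values()) ids.remove(id);
    }

    public Set<Integer> search(String field, String term) {
        return new TreeSet<>(postings.getOrDefault(field + ":" + term.toLowerCase(),
                Collections.emptySet()));
    }

    /** The cycle from the list above: fetch, modify, delete, put back. */
    public int addFlag(int id, String flagField, String flagValue) {
        Map<String, String> doc = get(id);
        doc.put(flagField, flagValue);
        delete(id);
        return add(doc, Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "doc-1");
        doc.put("body", "complex xml derived text");
        int id = index.add(doc, Collections.singleton("body")); // body is unstored
        System.out.println("body:xml before update: " + index.search("body", "xml"));
        index.addFlag(id, "flag", "reviewed");
        System.out.println("body:xml after update:  " + index.search("body", "xml"));
        System.out.println("flag:reviewed:          " + index.search("flag", "reviewed"));
    }
}
```

In this model the flagged document can still be found through its stored
"id" field and the new "flag" field, but no longer through "body".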

Will these fields indeed be lost?
Is there no way to "copy the token streams" (since they are stored in 
the index in some way anyway)?

thanks

paul


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: updating an index... with existing documents ?

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jul 28, 2005, at 8:36 AM, Paul Libbrecht wrote:
> Dare I ask whether this implies that the fields are stored?

I don't quite understand.  The "reconstruct" feature of Luke (and  
thus the code you can borrow from it) does not require that fields be  
stored: it pulls the indexed terms from the index and puts them back  
together.  Again, this is risky business, depending on the analysis  
process that you originally used to index a document.
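
A rough sketch of the idea of putting indexed terms back together, assuming
a toy positional index. ReconstructSketch, index and reconstruct are
hypothetical names for this illustration; this is neither Luke's code nor the
Lucene API. It walks every term, keeps those that occur in the requested
document, and orders them by their recorded positions.

```java
import java.util.*;

// Toy positional index: NOT Luke's or Lucene's actual code.
public class ReconstructSketch {
    // term -> (docId -> positions of that term in the document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    /** Indexes already-analyzed terms, recording the position of each one. */
    public void index(int docId, List<String> terms) {
        for (int pos = 0; pos < terms.size(); pos++) {
            postings.computeIfAbsent(terms.get(pos), t -> new HashMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    /** Collects every term that occurs in docId and orders the terms by position. */
    public List<String> reconstruct(int docId) {
        SortedMap<Integer, String> byPosition = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, List<Integer>>> e : postings.entrySet()) {
            List<Integer> positions = e.getValue().get(docId);
            if (positions == null) continue; // term does not occur in this document
            for (int pos : positions) byPosition.put(pos, e.getKey());
        }
        return new ArrayList<>(byPosition.values());
    }

    public static void main(String[] args) {
        ReconstructSketch idx = new ReconstructSketch();
        // The terms are what an analyzer might have produced, not the original text.
        idx.index(0, Arrays.asList("updat", "index", "exist", "document"));
        idx.index(1, Arrays.asList("index", "flag", "field"));
        System.out.println(idx.reconstruct(0));
        System.out.println(idx.reconstruct(1));
    }
}
```

What comes back is the sequence of analyzed terms, not the original field
text, which is where the risk mentioned above comes from.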

Give Luke a try to see it first-hand.

     Erik


Re: updating an index... with existing documents ?

Posted by Paul Libbrecht <pa...@activemath.org>.
Dare I ask whether this implies that the fields are stored?

thanks

paul


On 28 Jul 2005, at 14:26, Erik Hatcher wrote:
> It is possible to reconstruct a document from the index, but it is a 
> potentially lossy proposition, since stemming and other manglings 
> might have gone on.  Look at Luke and see how it does it (you can 
> "reconstruct and edit" a document from its UI).


Re: updating an index... with existing documents ?

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Paul,

It is possible to reconstruct a document from the index, but it is a  
potentially lossy proposition, since stemming and other manglings  
might have gone on.  Look at Luke and see how it does it (you can  
"reconstruct and edit" a document from its UI).

     Erik
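
To make the "lossy" point concrete, here is a small sketch assuming a
made-up analyzer. LossyAnalysisSketch, its stop-word list and its crude
suffix stripping are all hypothetical and only stand in for whatever analysis
chain was really used. Two different input texts end up as the same terms, so
the original wording cannot be recovered from the terms alone.

```java
import java.util.*;

// Hypothetical toy analyzer: lowercases, drops a few stop words,
// and strips a trailing "ing" or "s". Real analyzers differ.
public class LossyAnalysisSketch {
    private static final Set<String> STOP =
            new HashSet<>(Arrays.asList("the", "a", "an", "of"));

    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.toLowerCase().split("[^a-z0-9]+")) {
            if (raw.isEmpty() || STOP.contains(raw)) continue; // stop words vanish
            terms.add(stem(raw));
        }
        return terms;
    }

    // Crude suffix stripping, standing in for a real stemmer.
    static String stem(String word) {
        if (word.length() > 5 && word.endsWith("ing")) {
            return word.substring(0, word.length() - 3);
        }
        if (word.length() > 3 && word.endsWith("s")) {
            return word.substring(0, word.length() - 1);
        }
        return word;
    }

    public static void main(String[] args) {
        List<String> a = analyze("Updating the Documents");
        List<String> b = analyze("updat document");
        // Case, stop words and suffixes are gone in both results.
        System.out.println(a + " vs " + b + " identical=" + a.equals(b));
    }
}
```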
