Posted to java-user@lucene.apache.org by Albert Vila Puig <av...@imente.com> on 2003/10/31 09:53:01 UTC

Remove a token from a field

Hi,

    Is there a way to remove a token from a document field entry? For
example, I've got an UnStored field in my index and I want to remove a
token from this field without doing a delete and add of the document
(because I'm inserting the documents by date and I don't want to lose
that sort).

    Is there an easy way to do that? Has anybody already started
implementing it? Any suggestions on whether I can do it in an efficient
way? Maybe with another deletable file per field?

   Any help will be appreciated.

Thanks

Albert


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Remove a token from a field

Posted by Albert Vila Puig <av...@imente.com>.
I know there is no way to update a document without doing a delete/add.
What I'm asking is whether this feature could feasibly be implemented in
an efficient way.

Thanks

Erik Hatcher wrote:

> On Friday, October 31, 2003, at 03:53  AM, Albert Vila Puig wrote:
>
>> Hi,
>>
>>    Is there a way to remove a token from a document field entry? For
>> example, I've got an UnStored field in my index and I want to remove a
>> token from this field without doing a delete and add of the document
>> (because I'm inserting the documents by date and I don't want to lose
>> that sort).
>>
>>    Is there an easy way to do that? Has anybody already started
>> implementing it? Any suggestions on whether I can do it in an efficient
>> way? Maybe with another deletable file per field?
>
>
> Presently there is no way to "update" a document without doing a  
> delete/add.
>
> The sorting issue is an interesting one.  By design, of course, 
> Lucene  is meant to sort by score - period.  There has been an 
> interesting  implementation of a custom searcher posted about this 
> topic in the  recent past:
>
>     http://cvs.sourceforge.net/viewcvs.py/weblucene/weblucene/webapp/WEB-INF/src/org/apache/lucene/search/IndexOrderSearcher.java?rev=1.2&view=auto
>
> Accessing a document during searching to pull the contents of a field
> would be a performance hit, but the search method shown there could be
> adapted to do so and sort by a particular field during searching.
>
> I tend to suggest that date sorting is something that should be done 
> on  a data set culled from hits after searching is complete rather 
> than  during the searching operation itself.  There are games that 
> could be  played with boosts and perhaps a custom Similarity 
> implementation that  might be able to pull off date sorting somehow 
> too - I'll add this to  my list of interesting things to try out for fun.
>
>     Erik




Re: Remove a token from a field

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Friday, October 31, 2003, at 03:53  AM, Albert Vila Puig wrote:
> Hi,
>
>    Is there a way to remove a token from a document field entry? For
> example, I've got an UnStored field in my index and I want to remove a
> token from this field without doing a delete and add of the document
> (because I'm inserting the documents by date and I don't want to lose
> that sort).
>
>    Is there an easy way to do that? Has anybody already started
> implementing it? Any suggestions on whether I can do it in an efficient
> way? Maybe with another deletable file per field?

Presently there is no way to "update" a document without doing a  
delete/add.
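
To make the consequence of delete/add concrete, here is a minimal sketch
using a toy in-memory index. It is not Lucene's API: the ToyIndex class and
all of its method names are hypothetical, and the only assumption it models
is that internal document ids follow insertion order. Because an UnStored
field's text is not kept in the index, the caller has to supply the new
token list itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory index, NOT Lucene's API: internal ids follow insertion order. */
final class ToyIndex {
    private final List<String> keys = new ArrayList<>();          // application key per slot
    private final List<List<String>> tokens = new ArrayList<>();  // indexed tokens per slot
    private final List<Boolean> deleted = new ArrayList<>();      // deletion mark per slot

    /** Appends a document and returns its internal id (its position in index order). */
    int add(String key, List<String> docTokens) {
        keys.add(key);
        tokens.add(new ArrayList<>(docTokens));
        deleted.add(Boolean.FALSE);
        return keys.size() - 1;
    }

    /** Marks every slot holding this key as deleted; slots are never reused. */
    void delete(String key) {
        for (int i = 0; i < keys.size(); i++) {
            if (keys.get(i).equals(key)) {
                deleted.set(i, Boolean.TRUE);
            }
        }
    }

    /** "Updates" a document the only way available here: delete, then add again. */
    int replace(String key, List<String> newTokens) {
        delete(key);
        return add(key, newTokens);
    }

    /** Keys of live documents in index (insertion) order. */
    List<String> liveKeysInIndexOrder() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < keys.size(); i++) {
            if (!deleted.get(i)) {
                live.add(keys.get(i));
            }
        }
        return live;
    }

    /** True if a live document with this key was indexed with the given token. */
    boolean matches(String key, String token) {
        for (int i = 0; i < keys.size(); i++) {
            if (!deleted.get(i) && keys.get(i).equals(key) && tokens.get(i).contains(token)) {
                return true;
            }
        }
        return false;
    }
}
```

In this model, replacing the first of two documents moves it behind the
second one in index order, which is the loss of insertion order described
in the question.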

The sorting issue is an interesting one.  By design, of course, Lucene  
is meant to sort by score - period.  There has been an interesting  
implementation of a custom searcher posted about this topic in the  
recent past:

	http://cvs.sourceforge.net/viewcvs.py/weblucene/weblucene/webapp/WEB-INF/src/org/apache/lucene/search/IndexOrderSearcher.java?rev=1.2&view=auto

Accessing a document during searching to pull the contents of a field
would be a performance hit, but the search method shown there could be
adapted to do so and sort by a particular field during searching.

I tend to suggest that date sorting is something that should be done on  
a data set culled from hits after searching is complete rather than  
during the searching operation itself.  There are games that could be  
played with boosts and perhaps a custom Similarity implementation that  
might be able to pull off date sorting somehow too - I'll add this to  
my list of interesting things to try out for fun.
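
The post-search approach can be sketched in plain Java. This is only an
illustration under stated assumptions: the Hit class, its fields, and the
newestFirst method are hypothetical names rather than Lucene types, and
each hit is assumed to carry a date string in yyyyMMdd form (which sorts
chronologically as text) copied out of the index after the search finished.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sorts already-collected search hits by date, outside the search itself. */
final class DateSortSketch {

    /** Hypothetical value object holding what the application copied from each hit. */
    static final class Hit {
        final String id;
        final float score;
        final String date; // assumed yyyyMMdd, so string order equals date order

        Hit(String id, float score, String date) {
            this.id = id;
            this.score = score;
            this.date = date;
        }
    }

    /** Returns a new list ordered newest first; the input list is left untouched. */
    static List<Hit> newestFirst(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        // List.sort is stable, so hits sharing a date keep their incoming (score) order.
        sorted.sort(Comparator.comparing((Hit h) -> h.date).reversed());
        return sorted;
    }
}
```

Since the sort runs on the culled result set only, its cost depends on the
number of hits kept, not on the size of the index.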

	Erik

