Posted to java-user@lucene.apache.org by Albert Vila Puig <av...@imente.com> on 2003/10/31 09:53:01 UTC
Remove a token from a field
Hi,
Is there a way to remove a token from a document field entry? For
example, I've got an UnStored field in my index and I want to remove a
token from this field without doing a delete and add of the document
(because I'm inserting the documents by date and I don't want to lose
that sort order).
Is there an easy way to do that? Has anybody already started
implementing it? Any suggestions on whether it can be done efficiently?
Maybe with another deletable file per field?
Any help will be appreciated.
Thanks
Albert
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: Remove a token from a field
Posted by Albert Vila Puig <av...@imente.com>.
I know there is no way to update a document without doing a delete/add.
What I'm asking is whether this feature could be implemented
efficiently.
Thanks
Erik Hatcher wrote:
> On Friday, October 31, 2003, at 03:53 AM, Albert Vila Puig wrote:
>
>> Hi,
>>
>> Is there a way to remove a token from a document field entry?
>> For example, I've got an UnStored field in my index and I want to
>> remove a token from this field without doing a delete and add of
>> the document (because I'm inserting the documents by date and I
>> don't want to lose that sort order).
>>
>> Is there an easy way to do that? Has anybody already started
>> implementing it? Any suggestions on whether it can be done
>> efficiently? Maybe with another deletable file per field?
>
>
> Presently there is no way to "update" a document without doing a
> delete/add.
>
> The sorting issue is an interesting one. By design, of course,
> Lucene is meant to sort by score - period. There has been an
> interesting implementation of a custom searcher posted about this
> topic in the recent past:
>
> http://cvs.sourceforge.net/viewcvs.py/weblucene/weblucene/webapp/WEB-INF/src/org/apache/lucene/search/IndexOrderSearcher.java?rev=1.2&view=auto
>
> Accessing the contents of a document during searching to pull the
> contents of a field would be a performance hit, but the search
> method shown could be adapted to do so and sort by a particular
> field during searching.
>
> I tend to suggest that date sorting is something that should be done
> on a data set culled from hits after searching is complete rather
> than during the searching operation itself. There are games that
> could be played with boosts and perhaps a custom Similarity
> implementation that might be able to pull off date sorting somehow
> too - I'll add this to my list of interesting things to try out for fun.
>
> Erik
Re: Remove a token from a field
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Friday, October 31, 2003, at 03:53 AM, Albert Vila Puig wrote:
> Hi,
>
> Is there a way to remove a token from a document field entry? For
> example, I've got an UnStored field in my index and I want to remove a
> token from this field without doing a delete and add of the document
> (because I'm inserting the documents by date and I don't want to lose
> that sort order).
>
> Is there an easy way to do that? Has anybody already started
> implementing it? Any suggestions on whether it can be done efficiently?
> Maybe with another deletable file per field?
Presently there is no way to "update" a document without doing a
delete/add.
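
To make the consequence for insertion order concrete, here is a toy
sketch in plain Java. It is not Lucene's API: the class
AppendOnlyIndexSketch and its methods are made up for illustration, and
the only assumption it encodes is that an "update" is a delete followed
by an append, so the changed document ends up last in index order.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only, NOT Lucene's API. It mimics an append-only index in which
// a document's position (its "index order") is fixed when it is added.
class AppendOnlyIndexSketch {
    private final List<String> docs = new ArrayList<>();
    private final BitSet deleted = new BitSet();

    // Adding always appends, so the new document gets the highest position.
    int add(String content) {
        docs.add(content);
        return docs.size() - 1;
    }

    // Deleting only marks the slot; nothing is rewritten in place.
    void delete(int docId) {
        deleted.set(docId);
    }

    // "Update" = mark the old copy deleted, then append the changed copy.
    int update(int docId, String newContent) {
        delete(docId);
        return add(newContent);
    }

    // Live (non-deleted) documents, in index order.
    List<String> liveDocsInIndexOrder() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                live.add(docs.get(i));
            }
        }
        return live;
    }

    public static void main(String[] args) {
        AppendOnlyIndexSketch index = new AppendOnlyIndexSketch();
        int oldest = index.add("2003-10-29 alpha beta");
        index.add("2003-10-30 gamma");
        index.add("2003-10-31 delta");

        // Remove the token "beta" from the oldest document via delete/add.
        index.update(oldest, "2003-10-29 alpha");

        // Prints: [2003-10-30 gamma, 2003-10-31 delta, 2003-10-29 alpha]
        System.out.println(index.liveDocsInIndexOrder());
    }
}
```

The oldest document now comes last, which is exactly the loss of
insertion-by-date order described in the question.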
The sorting issue is an interesting one. By design, of course, Lucene
is meant to sort by score - period. There has been an interesting
implementation of a custom searcher posted about this topic in the
recent past:
http://cvs.sourceforge.net/viewcvs.py/weblucene/weblucene/webapp/WEB-INF/src/org/apache/lucene/search/IndexOrderSearcher.java?rev=1.2&view=auto
Accessing the contents of a document during searching to pull the
contents of a field would be a performance hit, but the search method
shown could be adapted to do so and sort by a particular field during
searching.
I tend to suggest that date sorting is something that should be done on
a data set culled from hits after searching is complete rather than
during the searching operation itself. There are games that could be
played with boosts and perhaps a custom Similarity implementation that
might be able to pull off date sorting somehow too - I'll add this to
my list of interesting things to try out for fun.
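
The post-search date sort suggested above might look roughly like the
following plain-Java sketch. Result is a hypothetical stand-in for
whatever is copied out of the hits (it is not a Lucene class), and the
date is assumed to be kept as a yyyyMMdd string so that string order
matches date order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class DateSortAfterSearch {
    // Hypothetical stand-in for one search result; not a Lucene class.
    static final class Result {
        final String title;
        final float score;
        final String date; // assumed yyyyMMdd, so string order == date order

        Result(String title, float score, String date) {
            this.title = title;
            this.score = score;
            this.date = date;
        }
    }

    // Returns a new list ordered newest first. The sort is stable, so
    // results sharing a date keep their incoming (score) order.
    static List<Result> newestFirst(List<Result> hits) {
        List<Result> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparing((Result r) -> r.date).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Pretend these came back from a search, ordered by score.
        List<Result> hits = Arrays.asList(
                new Result("Release notes", 0.91f, "20031029"),
                new Result("Sorting FAQ", 0.73f, "20031030"),
                new Result("Indexing tips", 0.55f, "20031031"));

        // Prints:
        // 20031031 Indexing tips
        // 20031030 Sorting FAQ
        // 20031029 Release notes
        for (Result r : newestFirst(hits)) {
            System.out.println(r.date + " " + r.title);
        }
    }
}
```

Because the sort runs only over the results already collected, the
search itself never has to load field contents per document.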
Erik