Posted to solr-user@lucene.apache.org by Gabriele Kahlout <ga...@mysimpatico.com> on 2011/07/09 11:11:09 UTC

Can I delete the stored value?

I've stored the contents of some pages that I no longer need. How can I now
delete the stored content without re-crawling the pages (i.e. using
updateDocument)? I cannot just remove the field, since I still want the
field to be indexed; I just don't want to store anything with it.
My understanding is that field.setValue("") won't do, since that would
affect the indexed value as well.

-- 
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains "[LON]" or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
< Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with "X".
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).

Re: Can I delete the stored value?

Posted by Simon Willnauer <si...@googlemail.com>.
On Mon, Jul 11, 2011 at 8:28 AM, Andrzej Bialecki <ab...@getopt.org> wrote:
> On 7/10/11 2:33 PM, Simon Willnauer wrote:
>>
>> Currently there is no easy way to do this. I would need to think how
>> you can force the index to drop those so the answer here is no you
>> can't!
>>
>> simon
>>
>> On Sat, Jul 9, 2011 at 11:11 AM, Gabriele Kahlout
>> <ga...@mysimpatico.com>  wrote:
>>>
>>> I've stored the contents of some pages I no longer need. How can I now
>>> delete the stored content without re-crawling the pages (i.e. using
>>> updateDocument ). I cannot just remove the field, since I still want the
>>> field to be indexed, I just don't want to store something with it.
>>> My understanding is that field.setValue("") won't do since that should
>>> affect the indexed value as well.
>
> You could pump the content of your index through a FilterIndexReader - i.e.
> implement a subclass of FilterIndexReader that removes stored fields under
> some conditions, and then use IndexWriter.addIndexes with this reader.
>
> See LUCENE-1812 for another practical application of this concept.

Good call, Andrzej. To make this work I think you need to use Lucene
directly, so make sure you are on the right version.
simon
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>

Re: Can I delete the stored value?

Posted by Andrzej Bialecki <ab...@getopt.org>.
On 7/10/11 2:33 PM, Simon Willnauer wrote:
> Currently there is no easy way to do this. I would need to think how
> you can force the index to drop those so the answer here is no you
> can't!
>
> simon
>
> On Sat, Jul 9, 2011 at 11:11 AM, Gabriele Kahlout
> <ga...@mysimpatico.com>  wrote:
>> I've stored the contents of some pages I no longer need. How can I now
>> delete the stored content without re-crawling the pages (i.e. using
>> updateDocument ). I cannot just remove the field, since I still want the
>> field to be indexed, I just don't want to store something with it.
>> My understanding is that field.setValue("") won't do since that should
>> affect the indexed value as well.

You could pump the content of your index through a FilterIndexReader - 
i.e. implement a subclass of FilterIndexReader that removes stored 
fields under some conditions, and then use IndexWriter.addIndexes with 
this reader.

See LUCENE-1812 for another practical application of this concept.
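
To illustrate the idea without depending on any particular Lucene version,
here is a minimal, library-free sketch of the pattern. It is not Lucene code:
DocReader, MemoryReader, StripStoredFieldReader and copyStoredFields are
hypothetical stand-ins, and the "url" and "content" field names are assumed
for the example. In Lucene the wrapper would be a FilterIndexReader subclass
and the copy step would be IndexWriter.addIndexes; the exact methods to
override depend on the version you are on.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StoredFieldStripDemo {

    /** Hypothetical, simplified view of an index: stored fields plus a term lookup. */
    public interface DocReader {
        int maxDoc();
        Map<String, String> storedFields(int docId); // what a search result displays
        List<Integer> docsMatching(String term);     // what a query runs against
    }

    /** Toy in-memory "index"; the term lookup is built once from the original text. */
    public static final class MemoryReader implements DocReader {
        private final List<Map<String, String>> stored = new ArrayList<>();
        private final Map<String, List<Integer>> postings = new LinkedHashMap<>();

        public void add(String url, String content) {
            int docId = stored.size();
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("url", url);
            fields.put("content", content);
            stored.add(fields);
            for (String term : content.toLowerCase().split("\\s+")) {
                List<Integer> ids = postings.computeIfAbsent(term, t -> new ArrayList<>());
                // Record each document at most once per term.
                if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) {
                    ids.add(docId);
                }
            }
        }

        public int maxDoc() { return stored.size(); }

        public Map<String, String> storedFields(int docId) {
            return new LinkedHashMap<>(stored.get(docId)); // defensive copy
        }

        public List<Integer> docsMatching(String term) {
            return new ArrayList<>(postings.getOrDefault(term, new ArrayList<>()));
        }
    }

    /** Filtering wrapper: passes everything through except one stored field. */
    public static final class StripStoredFieldReader implements DocReader {
        private final DocReader in;
        private final String fieldToStrip;

        public StripStoredFieldReader(DocReader in, String fieldToStrip) {
            this.in = in;
            this.fieldToStrip = fieldToStrip;
        }

        public int maxDoc() { return in.maxDoc(); }

        // Indexed data is delegated unchanged, so searches behave as before.
        public List<Integer> docsMatching(String term) { return in.docsMatching(term); }

        // Stored data is filtered: the unwanted field never reaches the caller.
        public Map<String, String> storedFields(int docId) {
            Map<String, String> fields = new LinkedHashMap<>(in.storedFields(docId));
            fields.remove(fieldToStrip);
            return fields;
        }
    }

    /** Stand-in for the copy step: a new index only sees what the reader exposes. */
    public static List<Map<String, String>> copyStoredFields(DocReader reader) {
        List<Map<String, String>> out = new ArrayList<>();
        for (int i = 0; i < reader.maxDoc(); i++) {
            out.add(reader.storedFields(i));
        }
        return out;
    }

    public static void main(String[] args) {
        MemoryReader original = new MemoryReader();
        original.add("http://example.com/a", "solr stores page content");
        DocReader filtered = new StripStoredFieldReader(original, "content");
        System.out.println(copyStoredFields(filtered));    // [{url=http://example.com/a}]
        System.out.println(filtered.docsMatching("solr")); // [0]
    }
}
```

The point of the design is that the original index is never modified: the
wrapper only changes what a copy of the index gets to see, so the stored
value disappears from the new index while the indexed terms are carried over.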

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Can I delete the stored value?

Posted by Simon Willnauer <si...@googlemail.com>.
Currently there is no easy way to do this. I would need to think about how
you could force the index to drop those values, so for now the answer is:
no, you can't!

simon

On Sat, Jul 9, 2011 at 11:11 AM, Gabriele Kahlout
<ga...@mysimpatico.com> wrote:
> I've stored the contents of some pages I no longer need. How can I now
> delete the stored content without re-crawling the pages (i.e. using
> updateDocument ). I cannot just remove the field, since I still want the
> field to be indexed, I just don't want to store something with it.
> My understanding is that field.setValue("") won't do since that should
> affect the indexed value as well.