Posted to dev@lucene.apache.org by Erick Erickson <er...@gmail.com> on 2014/03/06 03:40:46 UTC

abusing Doc values and updating fields....

I had an odd thought and wondered if there's any possibility of
abusing DocValues to make it work. And remember I have very little
real clue about DocValues implementations....

We've seen requests to add a field to the Solr schema or change a
field value, something akin to "update table set col1=foo where
col2=bar" or "add a new field to Solr documents and populate it". It
seems possible to write something that uses DocValues to actually do
something like this. I'm completely fuzzy on what that would look
like, whether one could do this on fields that weren't already
DocValues="true" fields, etc.

But it would be nifty if we could. And what a GSoC project if it makes
any kind of sense!

Erick@TiredEnoughThatAnySuggestionSeemsPossible

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: abusing Doc values and updating fields....

Posted by Shai Erera <se...@gmail.com>.
Hi Erick,

Since 4.6.0, Lucene has supported updating existing NumericDocValues fields
in-place, without re-indexing the documents. But it currently only
supports changing existing fields' values, and does not allow adding new
fields to the index. If you want to make schema changes to the index you
can e.g. add the index to itself through a FilterReader that makes the
schema changes, such as adding a new (even if empty) field. I think that
now that we're gen'ing FieldInfos this isn't a technical limitation so much
as a matter of policy: it's fine to add new documents with new
fields (that was always possible with Lucene), but if you want to alter the
schema of the index for existing documents, by either adding or removing
fields, you should go the FilterReader route.
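
To illustrate the semantics described above, here is a toy model in plain
Java. It is not Lucene code and does not reflect how Lucene stores or
updates DocValues internally. The class NumericColumnSketch and its methods
are hypothetical names made up for this sketch. It only mimics the visible
behaviour: a per-document numeric column can be overwritten for every
document matching a term (roughly "update table set col1=foo where
col2=bar"), while an update to a field that has no column yet is rejected.
In Lucene itself the corresponding entry point is
IndexWriter.updateNumericDocValue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: names and storage layout are assumptions, not Lucene's. */
final class NumericColumnSketch {
    // docId -> indexed terms (field -> value), standing in for postings.
    private final List<Map<String, String>> docTerms = new ArrayList<>();
    // field -> one value slot per document, standing in for a numeric DV column.
    private final Map<String, List<Long>> columns = new HashMap<>();

    /** Adding a new document may introduce new fields; that is always allowed. */
    int addDocument(Map<String, String> terms, Map<String, Long> numericValues) {
        int docId = docTerms.size();
        docTerms.add(new HashMap<>(terms));
        for (Map.Entry<String, Long> e : numericValues.entrySet()) {
            List<Long> col = columns.computeIfAbsent(e.getKey(), k -> new ArrayList<>());
            while (col.size() < docId) {
                col.add(null); // earlier documents have no value for a new field
            }
            col.add(e.getValue());
        }
        for (List<Long> col : columns.values()) {
            while (col.size() <= docId) {
                col.add(null); // this document has no value for that field
            }
        }
        return docId;
    }

    /** Overwrites dvField for every document matching the term; returns the match count. */
    int updateNumericValue(String termField, String termValue, String dvField, long newValue) {
        List<Long> col = columns.get(dvField);
        if (col == null) {
            // Mirrors the restriction above: only existing fields can be updated.
            throw new IllegalArgumentException("no existing numeric field: " + dvField);
        }
        int updated = 0;
        for (int docId = 0; docId < docTerms.size(); docId++) {
            if (termValue.equals(docTerms.get(docId).get(termField))) {
                col.set(docId, newValue);
                updated++;
            }
        }
        return updated;
    }

    /** Returns the current value, or null if the document or field has none. */
    Long get(int docId, String dvField) {
        List<Long> col = columns.get(dvField);
        return col == null ? null : col.get(docId);
    }

    public static void main(String[] args) {
        NumericColumnSketch index = new NumericColumnSketch();
        int doc = index.addDocument(
                Collections.singletonMap("col2", "bar"),
                Collections.singletonMap("col1", 1L));
        index.updateNumericValue("col2", "bar", "col1", 42L);
        System.out.println(index.get(doc, "col1"));
    }
}
```

Nothing is re-indexed in the sketch: the terms of each document are left
untouched and only the value slot in the column changes, which is the
property that makes this kind of update cheap.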

As for other DV types, we can definitely follow the same approach (and I
even planned to, just work took me elsewhere for a while :)). Now that the
infrastructure is in place, adding support for e.g. BinaryDV (which are
very similar to NDV) is a matter of following the trail of NDV updates,
adding tests etc. I don't think we'll need to alter how the current
updates are handled.

Handling other field types (i.e. postings) is more involved and will
require a different approach (à la the StackedSegments approach we started
in LUCENE-4258).

Shai

