Posted to java-user@lucene.apache.org by Artem Redkin <ar...@yandex-team.ru> on 2015/02/27 13:23:06 UTC

FieldValueFilter and non-DocValues fields

Hello.

After upgrading to 5.0.0, FieldValueFilter no longer works for fields that are not in DocValues. I have large indexes (around half a billion documents each) and I do not want to duplicate data too much. If I add some fields to DocValues, each index will grow from 400GB to 1.3TB with no apparent benefit: those fields are not used for faceting or sorting, only as “flags” in search (though I have to return them to the user as they are, as integers).

Can you please help me with two questions:
1. Is there any alternative to FieldValueFilter for finding documents that have a field present? (For now I use NumericRangeFilter.newIntRange(fieldName, Integer.MIN_VALUE, Integer.MAX_VALUE, true, true).)
2. Can one use DocValues effectively instead of Stored Fields to show found documents? Or should I use UninvertingReader for fields that are not in DocValues?

Thanks!

-- 
Artem Redkin
artemredkin@yandex-team.ru


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: FieldValueFilter and non-DocValues fields

Posted by Adrien Grand <jp...@gmail.com>.
If you do not need sorting or faceting, then indeed doc values are not needed.

You can get back the old behaviour by using UninvertingReader (see
LUCENE-5666 for more background). But, as before, this will load a lot
of data into memory...

Note that FieldValueFilter is very slow (with or without doc values).
An alternative would be to index the field names of your documents and
then use simple term queries against this field to filter those that
have a value.
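
To make the "index the field names" idea concrete, here is a minimal
sketch in plain Java. It is a toy model, not Lucene code: the class
FieldNamesSketch and its methods are hypothetical names made up for
illustration. In Lucene the same effect would presumably come from adding
one extra un-analyzed term per present field name to each document and
filtering with a term query on that extra field.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for an inverted index over field names (hypothetical, not a Lucene class).
class FieldNamesSketch {
    // One posting list per field name: bit N is set when document N has that field.
    private final Map<String, BitSet> docsByFieldName = new HashMap<>();
    private int nextDocId = 0;

    // "Index time": record the name of every field the document carries.
    // Returns the id assigned to the document (0, 1, 2, ... in insertion order).
    int addDocument(Map<String, Integer> fields) {
        int docId = nextDocId++;
        for (String fieldName : fields.keySet()) {
            docsByFieldName.computeIfAbsent(fieldName, k -> new BitSet()).set(docId);
        }
        return docId;
    }

    // "Search time": a single lookup by field name, with no scan over field values.
    // Returns a copy so callers cannot modify the stored posting list.
    BitSet docsWithField(String fieldName) {
        BitSet docs = docsByFieldName.get(fieldName);
        return docs == null ? new BitSet() : (BitSet) docs.clone();
    }
}
```

The cost is one small extra term per field per document, which should be
far cheaper than duplicating the values themselves into doc values.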


-- 
Adrien
