Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2014/05/17 06:52:14 UTC

[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers

    [ https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000666#comment-14000666 ] 

Robert Muir commented on LUCENE-5618:
-------------------------------------

This looks good to me. My only concern (which can be a follow-up issue) is that we should try to simplify much of the logic in SegmentReader.initDocValuesProducers.

In general I think SegmentReader needs a cleanup to ensure things are fast.

This logic is now more complex than before because of back-compat handling, and it involves multiple full passes over the fieldinfos/dv fields. In general we should really try to avoid this.

It's now quite difficult to see what is happening in the common case (no updates for a segment) via the initDocValuesProducers/SegmentDocValues/getNumericXXX codepaths.

On that issue we should clean up other inefficiencies while we are there: e.g. we also want to reduce the overhead in SR.getNumericDocValues. For example, today this does two hash lookups, when the method could just try 'dvFields' first and optimize for the common case.
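
To make the lookup-ordering point concrete, here is a rough sketch. It is not the actual SegmentReader code: the class, both maps, the method names, and the lookup counter are hypothetical stand-ins, and it assumes the current path consults field metadata before the per-field cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration only; not Lucene's SegmentReader. */
class DvLookupSketch {
    // Stand-in for per-field metadata (field name -> field number).
    private final Map<String, Integer> fieldInfos = new HashMap<>();
    // Stand-in for the 'dvFields' cache of already-loaded values.
    private final Map<String, long[]> dvFields = new HashMap<>();
    // Counts Map.get() calls so the two orderings can be compared.
    int lookups = 0;

    void addField(String name, int number) {
        fieldInfos.put(name, number);
    }

    // Assumed current shape: metadata first, then the cache.
    // Costs two hash lookups even when the values are already cached.
    long[] getNumericTwoLookups(String field) {
        lookups++;
        Integer number = fieldInfos.get(field);
        if (number == null) {
            return null;
        }
        lookups++;
        long[] values = dvFields.get(field);
        if (values == null) {
            values = load(number);
            dvFields.put(field, values);
        }
        return values;
    }

    // Suggested shape: try the cache first, fall back to metadata on a miss.
    // A cache hit costs a single hash lookup.
    long[] getNumericCacheFirst(String field) {
        lookups++;
        long[] values = dvFields.get(field);
        if (values != null) {
            return values;
        }
        lookups++;
        Integer number = fieldInfos.get(field);
        if (number == null) {
            return null;
        }
        values = load(number);
        dvFields.put(field, values);
        return values;
    }

    // Placeholder for reading values from a producer.
    private long[] load(int number) {
        return new long[] { number };
    }
}
```

The first access to a field still pays for both lookups in either ordering; only repeated accesses to an already-loaded field get cheaper.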

But let's fix the bugs first; this approach looks good to me. Long-term we should also investigate refactoring the livedocs format, maybe to use this "files" approach recorded in the commit. Currently the LiveDocs codec API is really horrible, and really it's just an updatable 1-bit numeric docvalues.
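
As a rough illustration of the "updatable 1-bit numeric docvalues" view of live docs, here is a standalone sketch. The class and its methods are hypothetical and do not correspond to any existing codec API.

```java
import java.util.BitSet;

/** Hypothetical sketch: live docs exposed as a 1-bit numeric value per document. */
class LiveDocsAsNumericSketch {
    private final BitSet live;
    private final int maxDoc;

    LiveDocsAsNumericSketch(int maxDoc) {
        this.maxDoc = maxDoc;
        this.live = new BitSet(maxDoc);
        this.live.set(0, maxDoc); // every document starts out live
    }

    // Numeric view: 1 for a live document, 0 for a deleted one.
    long get(int docID) {
        return live.get(docID) ? 1L : 0L;
    }

    // A delete is just an update of the document's value to 0.
    void update(int docID, long value) {
        live.set(docID, value != 0L);
    }

    int maxDoc() {
        return maxDoc;
    }
}
```

Under this view, deleting a document is the same operation as a numeric docvalues update that sets its value to 0.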

> DocValues updates send wrong fieldinfos to codec producers
> ----------------------------------------------------------
>
>                 Key: LUCENE-5618
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5618
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Assignee: Shai Erera
>            Priority: Blocker
>             Fix For: 4.9
>
>         Attachments: LUCENE-5618.patch, LUCENE-5618.patch
>
>
> Spinoff from LUCENE-5616.
> See the example there: docvalues readers get a fieldinfos, but it doesn't contain the correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write "batches" of fields in updates but just have only one field per gen?
> This removes many-to-many relationships and would make things easy to understand.


