Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2011/06/30 17:51:28 UTC

[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field

    [ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057894#comment-13057894 ] 

Michael McCandless commented on LUCENE-3216:
--------------------------------------------

Looks great!

So this means that if you use the default StandardCodec, 3 fields
store doc values, and the "main" CFS is off but the doc values CFS is
on, you'll see a cfs file holding the 3-6 sub-files that your
docvalues created, right?
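
To make the "one cfs file holding several sub-files" idea concrete,
here is a toy sketch in plain Java. It is not Lucene's actual CFS
format or API: the class ToyCompoundFile, its methods, and the file
names are all hypothetical, and it only shows sub-files being appended
to one blob with a name -> (offset, length) table.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: NOT Lucene's CFS format or API. All names are hypothetical.
public class ToyCompoundFile {
    // Sub-file name -> {offset, length} into the packed data.
    private final Map<String, int[]> entries = new LinkedHashMap<>();
    // All sub-file contents, appended back to back.
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();

    /** Appends one sub-file to the blob and records where it lives. */
    public void add(String name, byte[] contents) {
        if (entries.containsKey(name)) {
            throw new IllegalArgumentException("duplicate sub-file: " + name);
        }
        entries.put(name, new int[] { data.size(), contents.length });
        data.write(contents, 0, contents.length);
    }

    /** Returns a copy of the bytes of one sub-file. */
    public byte[] read(String name) {
        int[] entry = entries.get(name);
        if (entry == null) {
            throw new IllegalArgumentException("no such sub-file: " + name);
        }
        byte[] all = data.toByteArray();
        return Arrays.copyOfRange(all, entry[0], entry[0] + entry[1]);
    }

    /** Number of sub-files packed into this compound file. */
    public int subFileCount() {
        return entries.size();
    }

    public static void main(String[] args) {
        ToyCompoundFile cfs = new ToyCompoundFile();
        // Made-up names standing in for per-field docvalues sub-files.
        cfs.add("field1.dat", "aaa".getBytes(StandardCharsets.UTF_8));
        cfs.add("field2.dat", "bb".getBytes(StandardCharsets.UTF_8));
        cfs.add("field3.dat", "c".getBytes(StandardCharsets.UTF_8));
        System.out.println("sub-files packed: " + cfs.subFileCount());
    }
}
```

In this model, each codec that writes docvalues would own one such
container per segment, which is the situation the next paragraph asks
about.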

But e.g. if some fields use another codec, then that codec will have
its own CFS for any fields it has with docvalues (this is the TODO)?
That seems fine for starters.

I like CodecConfig, but I'm not sure it should hold things specific
to only 1 codec, like the Pulsing cutoff?  The other settings seem
more widely applicable... though I guess even the terms cache size is
not used by every codec, but enough of them use it to justify having
it in CodecConfig, I think?

CodecConfig needs @experimental?

For the nested test... couldn't you createCompoundOutput directly from
an opened CompoundFileDirectory?  (Vs creating externally & copying
in).


> Store DocValues per segment instead of per field
> ------------------------------------------------
>
>                 Key: LUCENE-3216
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3216
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch
>
>
> Currently we store docvalues per field, which results in at least one file per field that uses docvalues (or at most two per field per segment, depending on the impl.). Instead, we should by default try to pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment.
