Posted to issues@lucene.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/02/05 17:36:00 UTC

[jira] [Commented] (LUCENE-9147) Move the stored fields index off-heap

    [ https://issues.apache.org/jira/browse/LUCENE-9147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030849#comment-17030849 ] 

ASF subversion and git services commented on LUCENE-9147:
---------------------------------------------------------

Commit 136dcbdbbced7c2d32b4d244ca99ace2c59baee8 in lucene-solr's branch refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=136dcbd ]

LUCENE-9147: Move the stored fields index off-heap. (#1179)

This replaces the index of stored fields and term vectors with two
`DirectMonotonic` arrays. `DirectMonotonicWriter` needs to know the number
of values to write up-front, so incoming doc IDs and file pointers are buffered
on disk in temporary files that never get fsynced but do have index headers
and footers, so that any corruption in these files cannot propagate to the
index.
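
To illustrate the buffering idea, here is a minimal sketch in plain Java. It
does not use Lucene's classes: `TempBufferSketch`, `buffer` and `readBack` are
hypothetical names, and a CRC32 footer stands in for the index headers and
footers mentioned above. Values are appended to a temporary file as they
arrive; the count only becomes known when the file is read back, which is
the point where a writer that needs the count up-front could be created.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;

public class TempBufferSketch {

    /** Appends values to a temporary file, followed by a CRC32 footer. Never fsyncs. */
    static Path buffer(long[] incoming) throws IOException {
        Path tmp = Files.createTempFile("fields-index", ".tmp");
        CheckedOutputStream checked = new CheckedOutputStream(
                new BufferedOutputStream(Files.newOutputStream(tmp)), new CRC32());
        try (DataOutputStream out = new DataOutputStream(checked)) {
            for (long v : incoming) {
                out.writeLong(v);
            }
            // Capture the checksum of the payload before writing the footer itself.
            long crc = checked.getChecksum().getValue();
            out.writeLong(crc);
        }
        return tmp;
    }

    /** Reads the values back and fails if the footer does not match the payload. */
    static long[] readBack(Path tmp) throws IOException {
        // Only now is the number of values known (assumption: fixed 8 bytes per value).
        int count = (int) ((Files.size(tmp) - Long.BYTES) / Long.BYTES);
        long[] values = new long[count];
        CheckedInputStream checked = new CheckedInputStream(
                new BufferedInputStream(Files.newInputStream(tmp)), new CRC32());
        try (DataInputStream in = new DataInputStream(checked)) {
            for (int i = 0; i < count; i++) {
                values[i] = in.readLong();
            }
            long actual = checked.getChecksum().getValue();
            long expected = in.readLong();
            if (actual != expected) {
                throw new IOException("corrupt temporary file: checksum mismatch");
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = buffer(new long[] {0L, 128L, 300L, 301L});
        long[] values = readBack(tmp);
        System.out.println("buffered " + values.length + " values");
        Files.delete(tmp);
    }
}
```

Because the checksum is verified before the values are used, a damaged
temporary file causes an error instead of silently feeding bad values into
the final index.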

`DirectMonotonicReader` gets a specialized `binarySearch` implementation that
leverages the metadata to avoid going to the `IndexInput` as much as
possible. In the common case, it only goes to a single
sub `DirectReader`, which, combined with the block size of 1k values, helps
bound the number of page faults to 2.
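
The two-level search can be sketched roughly as follows. This is a simplified
illustration in plain Java, not Lucene's actual implementation:
`BlockedBinarySearchSketch`, `blockMins` and `dataReads` are hypothetical
names, a `long[]` stands in for the off-heap data, and the per-block metadata
is reduced to the first value of each block. The first phase only looks at
the on-heap metadata to pick one block; the second phase searches inside that
single block, so reads of the underlying data stay within 1024 consecutive
values.

```java
public class BlockedBinarySearchSketch {
    static final int BLOCK_SHIFT = 10;              // 1024 values per block
    static final int BLOCK_SIZE = 1 << BLOCK_SHIFT;

    private final long[] data;      // stands in for values stored off-heap
    private final long[] blockMins; // on-heap metadata: first value of each block
    int dataReads = 0;              // counts accesses to the "off-heap" data

    BlockedBinarySearchSketch(long[] sortedValues) {
        this.data = sortedValues;
        int numBlocks = (sortedValues.length + BLOCK_SIZE - 1) >>> BLOCK_SHIFT;
        this.blockMins = new long[numBlocks];
        for (int b = 0; b < numBlocks; b++) {
            blockMins[b] = sortedValues[b << BLOCK_SHIFT];
        }
    }

    private long readData(int index) {
        dataReads++;
        return data[index];
    }

    /** Returns the index of key, or -(insertionPoint) - 1 if it is absent. */
    int binarySearch(long key) {
        if (data.length == 0) {
            return -1;
        }
        // Phase 1: find the last block whose first value is <= key, metadata only.
        int lo = 0, hi = blockMins.length - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (blockMins[mid] <= key) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        // Phase 2: ordinary binary search restricted to that single block.
        int from = lo << BLOCK_SHIFT;
        int to = Math.min(from + BLOCK_SIZE, data.length) - 1;
        while (from <= to) {
            int mid = (from + to) >>> 1;
            long v = readData(mid);
            if (v < key) {
                from = mid + 1;
            } else if (v > key) {
                to = mid - 1;
            } else {
                return mid;
            }
        }
        return -(from + 1);
    }

    public static void main(String[] args) {
        long[] values = new long[5000];
        for (int i = 0; i < values.length; i++) {
            values[i] = 3L * i;
        }
        BlockedBinarySearchSketch index = new BlockedBinarySearchSketch(values);
        System.out.println("index of 3000: " + index.binarySearch(3000));
        System.out.println("data reads: " + index.dataReads);
    }
}
```

The return convention mirrors `java.util.Arrays.binarySearch`, so a negative
result still tells the caller where the key would be inserted.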


> Move the stored fields index off-heap
> -------------------------------------
>
>                 Key: LUCENE-9147
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9147
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Priority: Minor
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Now that the terms index is off-heap by default, it's almost embarrassing that many indices spend most of their memory usage on the stored fields index or the term vectors index, which are much less performance-sensitive than the terms index. Should we move them off-heap too?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)