Posted to issues@lucene.apache.org by "Greg Miller (Jira)" <ji...@apache.org> on 2021/09/04 12:42:00 UTC

[jira] [Commented] (LUCENE-9969) DirectoryTaxonomyReader.taxoArray uses a large amount of memory, causing the system to crash with OOM

    [ https://issues.apache.org/jira/browse/LUCENE-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409960#comment-17409960 ] 

Greg Miller commented on LUCENE-9969:
-------------------------------------

Interesting ideas! For the {{parents}} array, we _might_ even consider accessing the index structure currently in place as a first pass. While it's not as efficient as doc values, parents are stored using a single position entry per ordinal in a single postings list. As long as we can pre-sort the ordinals for which we need to look up parents (as you suggest), we could access this postings list directly and read each ordinal's position entry to grab its parent. That lets us use the current index format to try this out quickly.
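
To make the access pattern concrete, here is a rough, self-contained sketch of the "pre-sort, then walk forward" idea. It does not use Lucene at all: {{ForwardCursor}}, {{ArrayCursor}} and {{lookupParents}} are hypothetical names made up for illustration, and the int array only stands in for the parent postings list. In Lucene itself I'd expect the cursor role to be played by a {{PostingsEnum}} (using {{advance}} and {{nextPosition}}), but that mapping is an assumption here, not something I've tested.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ParentLookupSketch {

  /** Hypothetical stand-in for a postings cursor that can only move forward. */
  public interface ForwardCursor {
    /** Moves to the entry for {@code target}; returns it, or Integer.MAX_VALUE if exhausted. */
    int advance(int target);

    /** Returns the value stored for the current entry (here: the parent ordinal). */
    int position();
  }

  /** Simulated postings list: index = ordinal, value = parent ordinal. */
  public static final class ArrayCursor implements ForwardCursor {
    private final int[] parents;
    private int current = -1;

    public ArrayCursor(int[] parents) {
      this.parents = parents;
    }

    @Override
    public int advance(int target) {
      if (target < current) {
        throw new IllegalStateException("cursor cannot move backwards");
      }
      current = target;
      return current < parents.length ? current : Integer.MAX_VALUE;
    }

    @Override
    public int position() {
      return parents[current];
    }
  }

  /**
   * Looks up the parent of each requested ordinal with one forward pass.
   * The result is in the same order as the input ordinals.
   */
  public static int[] lookupParents(ForwardCursor cursor, int[] ordinals) {
    // Sort indexes by ordinal so the cursor only ever moves forward,
    // while remembering where each answer belongs in the output.
    Integer[] order = new Integer[ordinals.length];
    for (int i = 0; i < order.length; i++) {
      order[i] = i;
    }
    Arrays.sort(order, Comparator.comparingInt((Integer i) -> ordinals[i]));

    int[] result = new int[ordinals.length];
    for (int idx : order) {
      int ord = ordinals[idx];
      if (cursor.advance(ord) != ord) {
        throw new IllegalArgumentException("no parent entry for ordinal " + ord);
      }
      result[idx] = cursor.position();
    }
    return result;
  }
}
```

The point of sorting indexes rather than the ordinals themselves is that callers get results back in the order they asked for, while the cursor still sees a strictly non-decreasing sequence and never needs to be reset.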

> DirectoryTaxonomyReader.taxoArray uses a large amount of memory, causing the system to crash with OOM
> ------------------------------------------------
>
>                 Key: LUCENE-9969
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9969
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 6.6.2
>            Reporter: FengFeng Cheng
>            Priority: Trivial
>         Attachments: image-2021-05-24-13-43-43-289.png
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
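> First, the data volume is very large. The JVM has 90 GB of memory, but TaxonomyIndexArrays takes up almost half of it.
> !image-2021-05-24-13-43-43-289.png!
> Is there a better way to use TaxonomyReader, or are there any other optimizations?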


