Posted to issues@lucene.apache.org by "Kathleen Hilston (Jira)" <ji...@apache.org> on 2021/05/17 14:54:00 UTC
[jira] [Commented] (LUCENE-9961) Lucene 8 causing app server threads to hang due to high rate of network usage

    [ https://issues.apache.org/jira/browse/LUCENE-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346207#comment-17346207 ] 

Kathleen Hilston commented on LUCENE-9961:
------------------------------------------

I would like to highlight that this issue was also sent to the mailing list as a question, and we received the following reply from

Robert Muir <[rcmuir@gmail.com|mailto:rcmuir@gmail.com]>

{color:#0747a6}Don't use filesystems such as NFS (that is what EFS is) with lucene! This is really bad design, and it is the root cause of your issue.{color}

 

We have a follow-up question:

We find the ability to host indexes in a central location valuable, and we need to understand why this is considered bad design.

 

Please advise.

> Lucene 8 causing app server threads to hang due to high rate of network usage
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-9961
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9961
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Kathleen Hilston
>            Priority: Major
>         Attachments: image-2021-05-17-10-47-42-245.png, image-2021-05-17-10-48-35-739.png
>
>
> *Issue*: Lucene 8 causing app server threads to hang due to high rate of network usage.
>  
> *Further details*: We recently migrated from Lucene 7.5.0 to Lucene 8.6.3 and have encountered severe performance issues since the upgrade.  Our Lucene index contains multilingual terms, is large, and is hosted on network file storage (EFS at AWS).  Our Lucene queries construct many Boolean term queries, and we suspect that the off-heap FST introduced with Lucene 8 could be the root cause.  The specific issue is that, when a user searches for any given term, the Tomcat server thread hangs while reading bytes from an unexpectedly large inbound flow of data from the Lucene index on network storage.  We have seen inbound data flows ranging from 5% to 45% of the total index size for a single search, primarily when searching for a term in a different language.  This issue does not occur with Lucene 7.
>  
> Here is a typical call stack highlighting the point of contention in the Tomcat threads when we encounter this performance issue:
>  
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:432)
> org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:421)
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:574)
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:445)
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:658)
> org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:330)
> org.apache.lucene.search.Weight.bulkScorer(Weight.java:181)
> org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
> org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
> org.apache.lucene.search.Weight.scorerSupplier(Weight.java:147)
> org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:115)
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.impacts(SegmentTermsEnum.java:1017)
> org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.impacts(Lucene84PostingsReader.java:272)
> org.apache.lucene.codecs.lucene84.Lucene84PostingsReader$BlockImpactsDocsEnum.<init>(Lucene84PostingsReader.java:1061)
> org.apache.lucene.codecs.lucene84.Lucene84SkipReader.init(Lucene84SkipReader.java:103)
> org.apache.lucene.codecs.MultiLevelSkipListReader.init(MultiLevelSkipListReader.java:208)
> org.apache.lucene.codecs.MultiLevelSkipListReader.loadSkipLevels(MultiLevelSkipListReader.java:229)
> org.apache.lucene.store.DataInput.readVLong(DataInput.java:190)
> org.apache.lucene.store.DataInput.readVLong(DataInput.java:205)
> org.apache.lucene.store.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:80)
> org.apache.lucene.store.ByteBufferGuard.getByte(ByteBufferGuard.java:99)
>  
> While researching this, we found the Lucene Jira issue LUCENE-8635 (referenced in [https://www.elastic.co/blog/whats-new-in-lucene-8], section ‘Moving the terms dictionary off-heap’).  Would this help with the issue?
>  
> Please advise.
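
For readers following the call stack quoted above: its last frames (DataInput.readVLong, ByteBufferIndexInput.readByte, ByteBufferGuard.getByte) show a variable-length long being decoded one byte at a time from a memory-mapped buffer. The sketch below is not Lucene's code. It is a minimal illustration using only java.nio, and the class and method names (MappedVLongSketch, writeVLong, readVLong, roundTripThroughMappedFile) are hypothetical. It assumes the usual variable-length layout of 7 payload bits per byte, low-order bits first, with the high bit marking continuation. On a mapped file, each get() that touches a page not yet resident in memory causes a page fault, and on NFS-style storage that fault turns into a network read. Whether this accounts for the data volumes reported here is not established in this thread.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedVLongSketch {

    // Encode a non-negative long: 7 bits per byte, low-order bits first,
    // high bit set on every byte except the last (assumed layout).
    public static void writeVLong(ByteBuffer out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.put((byte) ((value & 0x7FL) | 0x80L));
            value >>>= 7;
        }
        out.put((byte) value);
    }

    // Decode byte by byte. Against a MappedByteBuffer, each get() may block
    // on a page fault if the page is not already in memory.
    public static long readVLong(ByteBuffer in) {
        long result = 0;
        int shift = 0;
        byte b;
        do {
            b = in.get();
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    // Write one encoded value to a temporary file, map it read-only,
    // and decode it again through the mapping.
    public static long roundTripThroughMappedFile(long value) throws IOException {
        Path file = Files.createTempFile("vlong-sketch", ".bin");
        file.toFile().deleteOnExit();

        ByteBuffer encoded = ByteBuffer.allocate(10);
        writeVLong(encoded, value);
        encoded.flip();
        byte[] bytes = new byte[encoded.remaining()];
        encoded.get(bytes);
        Files.write(file, bytes);

        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return readVLong(mapped);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTripThroughMappedFile(300L)); // prints 300
    }
}
```

On a local disk the page faults behind those get() calls are served from the page cache or the local device; on a network file system each one can cost a network round trip, which would be consistent with a thread appearing to hang inside readByte.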


