Posted to java-user@lucene.apache.org by Tamer Gur <tg...@ebi.ac.uk> on 2017/07/12 10:29:12 UTC

stucked indexing process

Hi all,

we are having an issue in our indexing pipeline: from time to time our 
indexing processes get stuck. The following text & picture are from jvisualvm 
and it seems the process is waiting in the 
sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext() method all the 
time. We are using Lucene 5.4.1 and Java 1.8.0_65-b17.

What can be the reason for this?

Many Thanks

Tamer

text version

" 
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init>()","100.0","73509067","73509067","3"
" 
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init>()","100.0","73509067","73509067","3"
" 
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addCategory()","100.0","73509067","73509067","3"
" 
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.internalAddCategory()","100.0","73509067","73509067","3"
" 
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addCategoryDocument()","100.0","73509067","73509067","3"
" 
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.getTaxoArrays()","100.0","73509067","73509067","3"
" 
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.initReaderManager()","100.0","73509067","73509067","3"
" 
org.apache.lucene.index.ReaderManager.<init>()","100.0","73509067","73509067","3"
" 
org.apache.lucene.index.DirectoryReader.open()","100.0","73509067","73509067","3"
" 
org.apache.lucene.index.IndexWriter.getReader()","100.0","73509067","73509067","3"
" 
org.apache.lucene.index.IndexWriter.maybeMerge()","100.0","73509067","73509067","3"
" 
org.apache.lucene.index.ConcurrentMergeScheduler.merge()","100.0","73509067","73509067","3"
" 
org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults()","100.0","73509067","73509067","3"
" org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
" org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
" 
org.apache.lucene.util.IOUtils.spinsLinux()","100.0","73509067","73509067","3"
" 
org.apache.lucene.util.IOUtils.getFileStore()","100.0","73509067","73509067","3"
" 
sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext()","100.0","73509067","73509067","3"

image version


Re: stucked indexing process

Posted by Tamer Gur <tg...@ebi.ac.uk>.
thanks a lot for the "hack" and the jstack suggestion, Uwe, I will try them.

Unfortunately we are on the NFS mount since we don't have any other choice.

Also, possibly related: in the cluster (computing farm) we index several 
datasets of different sizes in parallel, and most of them are indexed without 
problems. The ones that have been getting stuck recently are always the ones 
we allocate only 1 CPU, since they are very small and easy to index and we 
request fewer CPUs to use our cluster efficiently. I will also increase the 
CPU allocation for these datasets to 2 and see if it helps.

thanks again
Tamer

On 12/07/2017 16:27, Uwe Schindler wrote:
> Hi Tamer,
>
> Actually you can skip the check with a “hack”:
>
> You can override the check by enforcing number of threads and number of merges at same time by setting this on your own config when using ConcurrentMergeScheduler:
>
> https://goo.gl/5QJpMh
>
> If you override maxThreadCount and maxMergeCount in the CMS instance, the check is not executed. You may pass the CMS config using IndexWriterConfig.
>
> In addition to find out the real issue, we need more information: The problem I had was that I cannot say where it exactly stopped for your case, because your “stack” trace-like output had no line numbers. It would be better to run “jstack <pid>” on command line when it hangs to see the line number in Lucene code. There are 2 places that might hang: Listing the mount points on its own or listing their block device properties inside “/sys/block” folder.
>
> Just to be sure: I hope you don’t place your indexes on NFS mounts? Aren’t you?
>
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>> -----Original Message-----
>> From: Tamer Gur [mailto:tgur@ebi.ac.uk]
>> Sent: Wednesday, July 12, 2017 4:57 PM
>> To: java-user@lucene.apache.org; Uwe Schindler <uw...@thetaphi.de>
>> Subject: Re: stucked indexing process
>>
>> thanks Uwe for reply. we are indexing data in a cluster where there are
>> many mount points so it is possible that one them has issue or slowness
>> when this check first tried but now when i execute "mount" it is
>> responding all the mount points.
>>
>> I was wondering is there any configuration to skip this SSD check?
>>
>> Tamer
>>
>> On 12/07/2017 14:15, Uwe Schindler wrote:
>>> Hi,
>>>
>>> to figure out if you system is using an SSD drive for the index
>>> directory, the merge scheduler has to get the underlying mount point
>>> of the index directory. As there is no direct lookup for that, it
>>> needs to list all mount points in the system with a Java7 FS function.
>>> And that seems to hang for some reason. Could it be that you have a
>>> mount (like NFS or CIFS) that no longer responds?
>>>
>>> Just list all with “cat /proc/mounts” or the “mount” command and check
>>> if any of them is stuck or no longer responding.
>>>
>>> Uwe
>>>
>>> -----
>>>
>>> Uwe Schindler
>>>
>>> Achterdiek 19, D-28357 Bremen
>>>
>>> http://www.thetaphi.de <http://www.thetaphi.de/>
>>>
>>> eMail: uwe@thetaphi.de
>>>
>>> *From:*Tamer Gur [mailto:tgur@ebi.ac.uk]
>>> *Sent:* Wednesday, July 12, 2017 12:29 PM
>>> *To:* java-user@lucene.apache.org
>>> *Subject:* stucked indexing process
>>>
>>> Hi all,
>>>
>>> we are having an issue in our indexing pipeline time to time our
>>> indexing process are stucked. Following text&picture is from jvisualvm
>>> and it seems process is waiting at
>>> sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext() method all the
>>> time. we are using lucene 5.4.1 and java 1.8.0_65-b17.
>>>
>>> what can be the reason of this?
>>>
>>> Many Thanks
>>>
>>> Tamer
>>>
>>> text version
>>>
>>> "
>>>
>> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init
>>> ()","100.0","73509067","73509067","3"
>>> "
>>>
>> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init
>>> ()","100.0","73509067","73509067","3"
>>> "
>>>
>> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addC
>> ategory()","100.0","73509067","73509067","3"
>>> "
>>>
>> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.inter
>> nalAddCategory()","100.0","73509067","73509067","3"
>>> "
>>>
>> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addC
>> ategoryDocument()","100.0","73509067","73509067","3"
>>> "
>>>
>> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.getT
>> axoArrays()","100.0","73509067","73509067","3"
>>> "
>>>
>> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.initR
>> eaderManager()","100.0","73509067","73509067","3"
>>> "
>>>
>> org.apache.lucene.index.ReaderManager.<init>()","100.0","73509067","7350
>> 9067","3"
>>> "
>>>
>> org.apache.lucene.index.DirectoryReader.open()","100.0","73509067","7350
>> 9067","3"
>>> "
>>>
>> org.apache.lucene.index.IndexWriter.getReader()","100.0","73509067","735
>> 09067","3"
>>> "
>>>
>> org.apache.lucene.index.IndexWriter.maybeMerge()","100.0","73509067","7
>> 3509067","3"
>>> "
>>>
>> org.apache.lucene.index.ConcurrentMergeScheduler.merge()","100.0","7350
>> 9067","73509067","3"
>>> "
>>>
>> org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults()",
>> "100.0","73509067","73509067","3"
>>> "
>>> org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
>>> "
>>> org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
>>> "
>>>
>> org.apache.lucene.util.IOUtils.spinsLinux()","100.0","73509067","73509067",
>> "3"
>>> "
>>>
>> org.apache.lucene.util.IOUtils.getFileStore()","100.0","73509067","73509067"
>> ,"3"
>>> "
>>>
>> sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext()","100.0","73509067","
>> 73509067","3"
>>> image version
>>>
>
>




RE: stucked indexing process

Posted by "peterbasutkar@gmail.com" <pe...@gmail.com>.
Hi,

I am from the same team as Tamer, who initiated this thread.

We are indexing documents with Apache Lucene using several parallel
indexing pipelines (Java processes) that write to an NFS-mounted directory.
All of them follow the same code and workflow, and most of the pipelines
succeed without any issue, but a few indexing pipelines remain idle, yet in
the RUNNABLE state, forever; we looked at the thread dump as well, and it is
not moving at all.
If anyone has faced this issue and found a solution, please share it with me.

Note: We run our parallel indexing jobs (Java processes) on an LSF cluster
and launch them with dynamic resources such as CPU and memory, but the
indexing of each individual Lucene index is served by a single host.

Thread dump :
2021-05-10 09:26:22
Full thread dump OpenJDK 64-Bit Server VM (11.0.4+11 mixed mode):

Threads class SMR info:
_java_thread_list=0x00002b9174000df0, length=14, elements={
0x00002b90b8012000, 0x00002b90ba0b5000, 0x00002b90ba0b9000,
0x00002b90ba0cc000,
0x00002b90ba0ce000, 0x00002b90ba0d0000, 0x00002b90ba0d2000,
0x00002b90ba130000,
0x00002b90ba144000, 0x00002b90ba807800, 0x00002b90ba817000,
0x00002b9140001000,
0x00002b9168019800, 0x00002b916801e800
}

"main" #1 prio=5 os_prio=0 cpu=17492.51ms elapsed=24411.44s
tid=0x00002b90b8012000 nid=0x600f1 runnable  [0x00002b90b423a000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.fs.UnixNativeDispatcher.stat0(java.base@11.0.4/Native Method)
	at
sun.nio.fs.UnixNativeDispatcher.stat(java.base@11.0.4/UnixNativeDispatcher.java:291)
	at
sun.nio.fs.UnixFileAttributes.get(java.base@11.0.4/UnixFileAttributes.java:70)
	at sun.nio.fs.UnixFileStore.devFor(java.base@11.0.4/UnixFileStore.java:57)
	at sun.nio.fs.UnixFileStore.<init>(java.base@11.0.4/UnixFileStore.java:72)
	at
sun.nio.fs.LinuxFileStore.<init>(java.base@11.0.4/LinuxFileStore.java:53)
	at
sun.nio.fs.LinuxFileSystem.getFileStore(java.base@11.0.4/LinuxFileSystem.java:112)
	at
sun.nio.fs.UnixFileSystem$FileStoreIterator.readNext(java.base@11.0.4/UnixFileSystem.java:212)
	at
sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext(java.base@11.0.4/UnixFileSystem.java:223)
	- locked <0x00000007e75ab7e0> (a
sun.nio.fs.UnixFileSystem$FileStoreIterator)
	at org.apache.lucene.util.IOUtils.getFileStore(IOUtils.java:595)
	at org.apache.lucene.util.IOUtils.spinsLinux(IOUtils.java:539)
	at org.apache.lucene.util.IOUtils.spins(IOUtils.java:528)
	at org.apache.lucene.util.IOUtils.spins(IOUtils.java:503)
	at
org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults(ConcurrentMergeScheduler.java:412)
	- locked <0x00000007e7146348> (a
org.apache.lucene.index.ConcurrentMergeScheduler)
	at
org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:501)
	- locked <0x00000007e7146348> (a
org.apache.lucene.index.ConcurrentMergeScheduler)
	at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2158)
	at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:548)
	at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:116)
	at org.apache.lucene.index.ReaderManager.<init>(ReaderManager.java:72)
	at
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.initReaderManager(DirectoryTaxonomyWriter.java:279)
	- locked <0x00000007e70bbff8> (a
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter)
	at
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.getTaxoArrays(DirectoryTaxonomyWriter.java:749)
	- locked <0x00000007e70bbff8> (a
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter)
	at
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addCategoryDocument(DirectoryTaxonomyWriter.java:508)
	at
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.internalAddCategory(DirectoryTaxonomyWriter.java:462)
	at
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addCategory(DirectoryTaxonomyWriter.java:429)
	- locked <0x00000007e70bbff8> (a
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter)
	at
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init>(DirectoryTaxonomyWriter.java:209)
	at
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init>(DirectoryTaxonomyWriter.java:293)
	at
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init>(DirectoryTaxonomyWriter.java:309)
	at
uk.ac.ebi.ebinocle.indexer.steps.substeps.IndexStepIndexing.createTaxonomyWriter(IndexStepIndexing.java:304)
	at
uk.ac.ebi.ebinocle.indexer.steps.substeps.IndexStepIndexing.prepareIndexWriter(IndexStepIndexing.java:217)
	- locked <0x00000007ebf96b70> (a
uk.ac.ebi.ebinocle.indexer.steps.substeps.IndexStepIndexing)
	at
uk.ac.ebi.ebinocle.indexer.steps.substeps.IndexStepIndexing.prepareIndexWriter(IndexStepIndexing.java:206)
	- locked <0x00000007ebf96b70> (a
uk.ac.ebi.ebinocle.indexer.steps.substeps.IndexStepIndexing)
	at
uk.ac.ebi.ebinocle.indexer.steps.substeps.IndexStepIndexing.index(IndexStepIndexing.java:133)
	at
uk.ac.ebi.ebinocle.indexer.steps.substeps.IndexStepIndexing.process(IndexStepIndexing.java:80)
	at
uk.ac.ebi.ebinocle.indexer.steps.substeps.IndexStepIndexing.process(IndexStepIndexing.java:61)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.lambda$pipe$0(Pipeline.java:28)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline$$Lambda$202/0x000000080041f840.process(Unknown
Source)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.lambda$pipe$0(Pipeline.java:28)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline$$Lambda$202/0x000000080041f840.process(Unknown
Source)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.lambda$pipe$0(Pipeline.java:28)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline$$Lambda$202/0x000000080041f840.process(Unknown
Source)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.lambda$pipe$0(Pipeline.java:28)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline$$Lambda$202/0x000000080041f840.process(Unknown
Source)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.execute(Pipeline.java:32)
	at uk.ac.ebi.ebinocle.indexer.steps.IndexStep.process(IndexStep.java:54)
	at uk.ac.ebi.ebinocle.indexer.steps.IndexStep.process(IndexStep.java:39)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.lambda$pipe$0(Pipeline.java:28)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline$$Lambda$202/0x000000080041f840.process(Unknown
Source)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.lambda$pipe$0(Pipeline.java:28)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline$$Lambda$202/0x000000080041f840.process(Unknown
Source)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.lambda$pipe$0(Pipeline.java:28)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline$$Lambda$202/0x000000080041f840.process(Unknown
Source)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.lambda$pipe$0(Pipeline.java:28)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline$$Lambda$202/0x000000080041f840.process(Unknown
Source)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.lambda$pipe$0(Pipeline.java:28)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline$$Lambda$202/0x000000080041f840.process(Unknown
Source)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.Pipeline.execute(Pipeline.java:32)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.LsfIndexerJob.run(LsfIndexerJob.java:110)
	at
org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:800)
	at
org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:784)
	at
org.springframework.boot.SpringApplication.run(SpringApplication.java:338)
	at
uk.ac.ebi.ebinocle.indexer.indexingjob.LsfIndexerJob.main(LsfIndexerJob.java:67)

   Locked ownable synchronizers:
	- None



Regards,
Prasad



RE: stucked indexing process

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

this looks like an issue in Solr or in how you use Solr. Could it be that you
are reloading the cores all the time? The mentioned "IOUtils.spins()" should
only be called when the index is opened and the IndexWriter is initialized by
Solr, so it is unlikely that you have any concurrency there.

There might be one problem: if you have a stuck mount point in your system
(like another NFS mount) that hangs, it might happen that Lucene's code also
hangs, as it inspects the mount points for SSD / spinning disks when starting
up the IndexWriter. So please make sure that "mount" does not hang and that
all the mount points respond (e.g. there are no hanging NFS mounts blocking
Lucene from inspecting mounts).

This is also a different issue than the one mentioned before, because you
don't use NFS, it's a local disk, right?

One workaround may be to explicitly tell ConcurrentMergeScheduler to use the
SSD or spinning-disk default settings in your solrconfig.xml:

    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
      <bool name="defaultMaxMergesAndThreads">true</bool>
    </mergeScheduler>

Use "true" for spinning disks and "false" for SSDs. This prevents the
auto-detection from running.
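
For applications that drive Lucene directly rather than through Solr, the same
effect can presumably be achieved in code via
ConcurrentMergeScheduler.setDefaultMaxMergesAndThreads(boolean spins); a small
sketch, assuming an existing IndexWriterConfig named "indexWriterConfig":

    // Skip the auto-detection and apply Lucene's built-in defaults directly:
    // true = spinning-disk defaults, false = SSD defaults.
    ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();
    cms.setDefaultMaxMergesAndThreads(false);
    indexWriterConfig.setMergeScheduler(cms);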

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Sachin909 <sa...@gmail.com>
> Sent: Wednesday, October 14, 2020 9:43 AM
> To: java-user@lucene.apache.org
> Subject: RE: stucked indexing process
> 
> Hi Uwe,
> 
> I have observed the similer issue with my application.
> 
> Application stack:
> 
> "coreLoadExecutor-4-thread-1" #86 prio=5 os_prio=0 tid=0x00007fbb1c364800
> *nid=0x1616* runnable [0x00007fbaa96ef000]
>    java.lang.Thread.State: RUNNABLE
> 	at sun.nio.fs.UnixNativeDispatcher.stat0(Native Method)
> 	at
sun.nio.fs.UnixNativeDispatcher.stat(UnixNativeDispatcher.java:286)
> 	at sun.nio.fs.UnixFileAttributes.get(UnixFileAttributes.java:70)
> 	at sun.nio.fs.UnixFileStore.devFor(UnixFileStore.java:55)
> 	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:70)
> 	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:48)
> 	at sun.nio.fs.LinuxFileSystem.getFileStore(LinuxFileSystem.java:112)
> 	at
>
sun.nio.fs.UnixFileSystem$FileStoreIterator.readNext(UnixFileSystem.java:213
)
> 	at
>
sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext(UnixFileSystem.java:224)
> 	- locked <0x00000000996864f8> (a
> sun.nio.fs.UnixFileSystem$FileStoreIterator)
> 	at org.apache.lucene.util.IOUtils.getFileStore(IOUtils.java:543)
> 	at org.apache.lucene.util.IOUtils.spinsLinux(IOUtils.java:487)
> 	at org.apache.lucene.util.IOUtils.spins(IOUtils.java:476)
> 	at org.apache.lucene.util.IOUtils.spins(IOUtils.java:451)
> 	at
> *org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults(Con
> currentMergeScheduler.java:376)*
> 	- locked <0x0000000099686598> (a
> org.apache.lucene.index.ConcurrentMergeScheduler)
> 	at
> org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeS
> cheduler.java:464)
> 	- locked <0x0000000099686598> (a
> org.apache.lucene.index.ConcurrentMergeScheduler)
> 	at
> org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:2444)
> 	at
> org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1131)
> 	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1175)
> 	at
> org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:291)
> 	at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:716)
> 	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:899)
> 	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:816)
> 	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:890)
> 	at
> org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:542)
> 	at
> org.apache.solr.core.CoreContainer$$Lambda$34/209767675.call(Unknown
> Source)
> 	at
>
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(I
> nstrumentedExecutorService.java:197)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lam
> bda$execute$0(ExecutorUtil.java:229)
> 	at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$La
> mbda$35/1998024988.run(Unknown
> Source)
> 	at
>
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1
> 149)
> 	at
>
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:
> 624)
> 	at java.lang.Thread.run(Thread.java:748)
> 
> 
> Mount:
> /dev/mapper/appvg-lv_apps    /apps   index location has sufficient
> (100Gb+)disk free space.
> 
> 
> starce command:
> 
> In the strace command I have observed above thread were active and
> performing the operation till some point after that it doesn't do any
> activity,
> 
> Jstack *nid=0x1616* = 5654
> 
> 	Line 54534: 5654  18:53:53.224879 open("/proc/mounts",
> O_RDONLY|O_CLOEXEC
> <unfinished ...>
> 	Line 54536: 5654  18:53:53.226100 <... open resumed> ) = 2231
> <0.001205>
> 	Line 54538: 5654  18:53:53.226206 fstat(2231,  <unfinished ...>
> 	Line 54540: 5654  18:53:53.226255 <... fstat resumed>
> {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 <0.000039>
> 
> ....
> 
> 	Line 55026: 5654  18:53:53.247991
> stat("/net/icgudadmna1p/ctodata",
> <unfinished ...>
> 
> << thread doesn't show any activity after /net/...>>>
> 
> Could you please advice what could be the possible cause.
> 
> Java : Azule JDK1.8
> org.apache.lucene-lucene-core-2.4.1.jar
> 
> 
> 




RE: stucked indexing process

Posted by Sachin909 <sa...@gmail.com>.
Hi Uwe,

I have observed a similar issue with my application.

Application stack: 

"coreLoadExecutor-4-thread-1" #86 prio=5 os_prio=0 tid=0x00007fbb1c364800
*nid=0x1616* runnable [0x00007fbaa96ef000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.fs.UnixNativeDispatcher.stat0(Native Method)
	at sun.nio.fs.UnixNativeDispatcher.stat(UnixNativeDispatcher.java:286)
	at sun.nio.fs.UnixFileAttributes.get(UnixFileAttributes.java:70)
	at sun.nio.fs.UnixFileStore.devFor(UnixFileStore.java:55)
	at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:70)
	at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:48)
	at sun.nio.fs.LinuxFileSystem.getFileStore(LinuxFileSystem.java:112)
	at
sun.nio.fs.UnixFileSystem$FileStoreIterator.readNext(UnixFileSystem.java:213)
	at
sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext(UnixFileSystem.java:224)
	- locked <0x00000000996864f8> (a
sun.nio.fs.UnixFileSystem$FileStoreIterator)
	at org.apache.lucene.util.IOUtils.getFileStore(IOUtils.java:543)
	at org.apache.lucene.util.IOUtils.spinsLinux(IOUtils.java:487)
	at org.apache.lucene.util.IOUtils.spins(IOUtils.java:476)
	at org.apache.lucene.util.IOUtils.spins(IOUtils.java:451)
	at
*org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults(ConcurrentMergeScheduler.java:376)*
	- locked <0x0000000099686598> (a
org.apache.lucene.index.ConcurrentMergeScheduler)
	at
org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:464)
	- locked <0x0000000099686598> (a
org.apache.lucene.index.ConcurrentMergeScheduler)
	at org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:2444)
	at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1131)
	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1175)
	at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:291)
	at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:716)
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:899)
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:816)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:890)
	at org.apache.solr.core.CoreContainer.lambda$load$3(CoreContainer.java:542)
	at org.apache.solr.core.CoreContainer$$Lambda$34/209767675.call(Unknown
Source)
	at
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
	at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$35/1998024988.run(Unknown
Source)
	at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)


Mount:
/dev/mapper/appvg-lv_apps mounted on /apps; the index location has sufficient
(100 GB+) free disk space.


strace command:

In the strace output I observed that the above thread was active and
performing operations up to a certain point; after that it shows no further
activity.

jstack *nid=0x1616* = 5654 (decimal), matching the LWP in the strace output below:

	Line 54534: 5654  18:53:53.224879 open("/proc/mounts", O_RDONLY|O_CLOEXEC
<unfinished ...>
	Line 54536: 5654  18:53:53.226100 <... open resumed> ) = 2231 <0.001205>
	Line 54538: 5654  18:53:53.226206 fstat(2231,  <unfinished ...>
	Line 54540: 5654  18:53:53.226255 <... fstat resumed>
{st_mode=S_IFREG|0444, st_size=0, ...}) = 0 <0.000039>

....

	Line 55026: 5654  18:53:53.247991 stat("/net/icgudadmna1p/ctodata", 
<unfinished ...>

<< the thread doesn't show any activity after the stat() on /net/... >>

Could you please advise what the possible cause could be?

Java: Azul JDK 1.8
org.apache.lucene-lucene-core-2.4.1.jar





RE: stucked indexing process

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Tamer,

Actually you can skip the check with a “hack”:

You can override the check by enforcing number of threads and number of merges at same time by setting this on your own config when using ConcurrentMergeScheduler:

https://goo.gl/5QJpMh

If you override maxThreadCount and maxMergeCount in the CMS instance, the check is not executed. You may pass the CMS config using IndexWriterConfig.
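
A minimal sketch of what that could look like in plain Lucene code (the merge
and thread counts and the index path below are placeholders, not
recommendations):

    import java.nio.file.Paths;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.ConcurrentMergeScheduler;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class ExplicitMergeSchedulerExample {
        public static void main(String[] args) throws Exception {
            // With explicit limits set, the merge scheduler skips its SSD
            // auto-detection (IOUtils.spins()), i.e. the mount-point listing
            // that hangs in the traces above.
            ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();
            cms.setMaxMergesAndThreads(6, 1); // maxMergeCount, maxThreadCount (placeholder values)

            IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
            iwc.setMergeScheduler(cms);

            Directory dir = FSDirectory.open(Paths.get("/path/to/index")); // placeholder path
            try (IndexWriter writer = new IndexWriter(dir, iwc)) {
                // ... add documents, commit, etc.
            }
        }
    }

Note that in the stack traces in this thread the hang occurs inside the
IndexWriter that DirectoryTaxonomyWriter creates internally; to apply the same
setting there one would presumably have to subclass DirectoryTaxonomyWriter and
override its protected createIndexWriterConfig(OpenMode) hook.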

In addition, to find out the real issue we need more information: the problem is that I cannot say where exactly it stopped in your case, because your “stack” trace-like output had no line numbers. It would be better to run “jstack <pid>” on the command line when it hangs, to see the line numbers in the Lucene code. There are 2 places that might hang: listing the mount points itself, or listing their block device properties inside the “/sys/block” folder.

Just to be sure: I hope you don’t place your indexes on NFS mounts, do you?

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Tamer Gur [mailto:tgur@ebi.ac.uk]
> Sent: Wednesday, July 12, 2017 4:57 PM
> To: java-user@lucene.apache.org; Uwe Schindler <uw...@thetaphi.de>
> Subject: Re: stucked indexing process
> 
> thanks Uwe for reply. we are indexing data in a cluster where there are
> many mount points so it is possible that one them has issue or slowness
> when this check first tried but now when i execute "mount" it is
> responding all the mount points.
> 
> I was wondering is there any configuration to skip this SSD check?
> 
> Tamer
> 
> On 12/07/2017 14:15, Uwe Schindler wrote:
> >
> > Hi,
> >
> > to figure out if you system is using an SSD drive for the index
> > directory, the merge scheduler has to get the underlying mount point
> > of the index directory. As there is no direct lookup for that, it
> > needs to list all mount points in the system with a Java7 FS function.
> > And that seems to hang for some reason. Could it be that you have a
> > mount (like NFS or CIFS) that no longer responds?
> >
> > Just list all with “cat /proc/mounts” or the “mount” command and check
> > if any of them is stuck or no longer responding.
> >
> > Uwe
> >
> > -----
> >
> > Uwe Schindler
> >
> > Achterdiek 19, D-28357 Bremen
> >
> > http://www.thetaphi.de <http://www.thetaphi.de/>
> >
> > eMail: uwe@thetaphi.de
> >
> > *From:*Tamer Gur [mailto:tgur@ebi.ac.uk]
> > *Sent:* Wednesday, July 12, 2017 12:29 PM
> > *To:* java-user@lucene.apache.org
> > *Subject:* stucked indexing process
> >
> > Hi all,
> >
> > we are having an issue in our indexing pipeline time to time our
> > indexing process are stucked. Following text&picture is from jvisualvm
> > and it seems process is waiting at
> > sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext() method all the
> > time. we are using lucene 5.4.1 and java 1.8.0_65-b17.
> >
> > what can be the reason of this?
> >
> > Many Thanks
> >
> > Tamer
> >
> > text version
> >
> > "
> >
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init
> >()","100.0","73509067","73509067","3"
> > "
> >
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init
> >()","100.0","73509067","73509067","3"
> > "
> >
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addC
> ategory()","100.0","73509067","73509067","3"
> > "
> >
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.inter
> nalAddCategory()","100.0","73509067","73509067","3"
> > "
> >
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addC
> ategoryDocument()","100.0","73509067","73509067","3"
> > "
> >
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.getT
> axoArrays()","100.0","73509067","73509067","3"
> > "
> >
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.initR
> eaderManager()","100.0","73509067","73509067","3"
> > "
> >
> org.apache.lucene.index.ReaderManager.<init>()","100.0","73509067","7350
> 9067","3"
> > "
> >
> org.apache.lucene.index.DirectoryReader.open()","100.0","73509067","7350
> 9067","3"
> > "
> >
> org.apache.lucene.index.IndexWriter.getReader()","100.0","73509067","735
> 09067","3"
> > "
> >
> org.apache.lucene.index.IndexWriter.maybeMerge()","100.0","73509067","7
> 3509067","3"
> > "
> >
> org.apache.lucene.index.ConcurrentMergeScheduler.merge()","100.0","7350
> 9067","73509067","3"
> > "
> >
> org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults()",
> "100.0","73509067","73509067","3"
> > "
> > org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
> > "
> > org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
> > "
> >
> org.apache.lucene.util.IOUtils.spinsLinux()","100.0","73509067","73509067",
> "3"
> > "
> >
> org.apache.lucene.util.IOUtils.getFileStore()","100.0","73509067","73509067"
> ,"3"
> > "
> >
> sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext()","100.0","73509067","
> 73509067","3"
> >
> > image version
> >





Re: stucked indexing process

Posted by Tamer Gur <tg...@ebi.ac.uk>.
thanks Uwe for the reply. We are indexing data on a cluster where there are 
many mount points, so it is possible that one of them had an issue or was slow 
when this check was first tried, but now when I execute "mount" all the mount 
points respond.

I was wondering, is there any configuration to skip this SSD check?

Tamer

On 12/07/2017 14:15, Uwe Schindler wrote:
>
> Hi,
>
> to figure out if you system is using an SSD drive for the index 
> directory, the merge scheduler has to get the underlying mount point 
> of the index directory. As there is no direct lookup for that, it 
> needs to list all mount points in the system with a Java7 FS function. 
> And that seems to hang for some reason. Could it be that you have a 
> mount (like NFS or CIFS) that no longer responds?
>
> Just list all with “cat /proc/mounts” or the “mount” command and check 
> if any of them is stuck or no longer responding.
>
> Uwe
>
> -----
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> http://www.thetaphi.de <http://www.thetaphi.de/>
>
> eMail: uwe@thetaphi.de
>
> *From:*Tamer Gur [mailto:tgur@ebi.ac.uk]
> *Sent:* Wednesday, July 12, 2017 12:29 PM
> *To:* java-user@lucene.apache.org
> *Subject:* stucked indexing process
>
> Hi all,
>
> we are having an issue in our indexing pipeline time to time our 
> indexing process are stucked. Following text&picture is from jvisualvm 
> and it seems process is waiting at 
> sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext() method all the 
> time. we are using lucene 5.4.1 and java 1.8.0_65-b17.
>
> what can be the reason of this?
>
> Many Thanks
>
> Tamer
>
> text version
>
> " 
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init>()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init>()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addCategory()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.internalAddCategory()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addCategoryDocument()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.getTaxoArrays()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.initReaderManager()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.index.ReaderManager.<init>()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.index.DirectoryReader.open()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.index.IndexWriter.getReader()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.index.IndexWriter.maybeMerge()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.index.ConcurrentMergeScheduler.merge()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.util.IOUtils.spinsLinux()","100.0","73509067","73509067","3"
> " 
> org.apache.lucene.util.IOUtils.getFileStore()","100.0","73509067","73509067","3"
> " 
> sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext()","100.0","73509067","73509067","3"
>
> image version
>


RE: stucked indexing process

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

 

to figure out if your system is using an SSD drive for the index directory, the merge scheduler has to get the underlying mount point of the index directory. As there is no direct lookup for that, it needs to list all mount points in the system with a Java 7 FS function. And that seems to hang for some reason. Could it be that you have a mount (like NFS or CIFS) that no longer responds?

 

Just list all with “cat /proc/mounts” or the “mount” command and check if any of them is stuck or no longer responding.
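
As a quick way to reproduce this outside the indexing job, one could presumably
call the same detection code directly and see whether it hangs; a tiny
diagnostic sketch (the index path is a placeholder):

    import java.nio.file.Paths;
    import org.apache.lucene.util.IOUtils;

    public class SpinsCheck {
        public static void main(String[] args) throws Exception {
            // Runs the same spinning-disk detection that ConcurrentMergeScheduler
            // performs when initializing its defaults. If this call never returns,
            // one of the system's mount points is not responding.
            boolean spins = IOUtils.spins(Paths.get(args.length > 0 ? args[0] : "/path/to/index"));
            System.out.println("spinning disk detected: " + spins);
        }
    }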

 

Uwe

 

-----

Uwe Schindler

Achterdiek 19, D-28357 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Tamer Gur [mailto:tgur@ebi.ac.uk] 
Sent: Wednesday, July 12, 2017 12:29 PM
To: java-user@lucene.apache.org
Subject: stucked indexing process

 

Hi all,

we are having an issue in our indexing pipeline time to time our indexing process are stucked. Following text&picture is from jvisualvm and it seems process is waiting at sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext() method all the time. we are using lucene 5.4.1 and java  1.8.0_65-b17.

what can be the reason of this? 

Many Thanks

Tamer

text version

"           org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init>()","100.0","73509067","73509067","3"
"            org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.<init>()","100.0","73509067","73509067","3"
"             org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addCategory()","100.0","73509067","73509067","3"
"              org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.internalAddCategory()","100.0","73509067","73509067","3"
"               org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.addCategoryDocument()","100.0","73509067","73509067","3"
"                org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.getTaxoArrays()","100.0","73509067","73509067","3"
"                 org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter.initReaderManager()","100.0","73509067","73509067","3"
"                  org.apache.lucene.index.ReaderManager.<init>()","100.0","73509067","73509067","3"
"                   org.apache.lucene.index.DirectoryReader.open()","100.0","73509067","73509067","3"
"                    org.apache.lucene.index.IndexWriter.getReader()","100.0","73509067","73509067","3"
"                     org.apache.lucene.index.IndexWriter.maybeMerge()","100.0","73509067","73509067","3"
"                      org.apache.lucene.index.ConcurrentMergeScheduler.merge()","100.0","73509067","73509067","3"
"                       org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults()","100.0","73509067","73509067","3"
"                        org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
"                         org.apache.lucene.util.IOUtils.spins()","100.0","73509067","73509067","3"
"                          org.apache.lucene.util.IOUtils.spinsLinux()","100.0","73509067","73509067","3"
"                           org.apache.lucene.util.IOUtils.getFileStore()","100.0","73509067","73509067","3"
"                            sun.nio.fs.UnixFileSystem$FileStoreIterator.hasNext()","100.0","73509067","73509067","3"

image version