Posted to dev@lucene.apache.org by "Matthew Byng-Maddick (JIRA)" <ji...@apache.org> on 2016/06/14 11:02:01 UTC

[jira] [Commented] (SOLR-7113) Multiple calls to UpdateLog#init is not thread safe with respect to the HDFS FileSystem client object usage.

    [ https://issues.apache.org/jira/browse/SOLR-7113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15329314#comment-15329314 ] 

Matthew Byng-Maddick commented on SOLR-7113:
--------------------------------------------

I'm very confused about this. We're seeing tlogs being held open (and, in particular, holding open datanode transceivers) on Solr running on HDFS.

Using the GitHub version of the commit (because I know how to link to it): https://github.com/apache/lucene-solr/commit/f2c9067e59b81b3dea7903315431babcd2506167#diff-c796f1f2f2f362c18bd89a85688fbebfR295 we see the following lines:
{code}
tlog = ntlog

if (tlog != ntlog) {
{code}

When is that if condition ever true? As quoted, tlog has just been assigned ntlog, so the comparison can never succeed. What was this if condition supposed to do? This does appear to be one part of a reasonable explanation as to why the old rotated tlogs are being held open by the Solr HDFS client.
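
To illustrate why the quoted order looks suspicious, here is a minimal, self-contained sketch. It is not the Solr code: TlogSwapSketch, FakeLog, assignThenCompare and compareThenAssign are hypothetical names, and the "compare first, close the old log, then assign" variant is only my guess at one possible intent.

```java
import java.io.Closeable;

class TlogSwapSketch {
    /** Hypothetical stand-in for a transaction log that holds an open stream. */
    static final class FakeLog implements Closeable {
        final String name;
        boolean closed = false;

        FakeLog(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    private FakeLog tlog;

    TlogSwapSketch(FakeLog initial) {
        this.tlog = initial;
    }

    /** Mirrors the quoted order: assign, then compare. Returns true if the branch ran. */
    boolean assignThenCompare(FakeLog ntlog) {
        tlog = ntlog;
        if (tlog != ntlog) { // both names now refer to the same object, so this is false
            return true;
        }
        return false;
    }

    /** Assumed intent: compare first, close the old log if it differs, then assign. */
    boolean compareThenAssign(FakeLog ntlog) {
        FakeLog old = tlog;
        boolean replaced = old != ntlog;
        if (replaced && old != null) {
            old.close(); // release whatever the old log was holding open
        }
        tlog = ntlog;
        return replaced;
    }

    public static void main(String[] args) {
        FakeLog a = new FakeLog("tlog.1");
        FakeLog b = new FakeLog("tlog.2");
        TlogSwapSketch first = new TlogSwapSketch(a);
        System.out.println(first.assignThenCompare(b)); // false: branch never runs
        System.out.println(a.closed);                   // false: old log is left open

        FakeLog c = new FakeLog("tlog.3");
        FakeLog d = new FakeLog("tlog.4");
        TlogSwapSketch second = new TlogSwapSketch(c);
        System.out.println(second.compareThenAssign(d)); // true: the logs differ
        System.out.println(c.closed);                    // true: old log was closed
    }
}
```

In the first variant the reference to the old log is overwritten before anything can act on it, which would be consistent with old tlogs staying open. Whether the real code was meant to close, decref, or do something else in that branch is not something I can tell from the diff.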

> Multiple calls to UpdateLog#init is not thread safe with respect to the HDFS FileSystem client object usage.
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7113
>                 URL: https://issues.apache.org/jira/browse/SOLR-7113
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Vamsee Yarlagadda
>            Assignee: Mark Miller
>             Fix For: 5.1, 6.0
>
>         Attachments: SOLR-7113.patch
>
>
> I noticed this issue while doing some heavy indexing into Solr (700K docs per minute).
> Solr log errors:
> {code}
> 15:42:47
> ERROR
> HdfsTransactionLog
> Exception closing tlog.
> java.io.IOException: Filesystem closed
> 	at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:765)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1898)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1859)
> 	at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
> 	at org.apache.solr.update.HdfsTransactionLog.close(HdfsTransactionLog.java:303)
> 	at org.apache.solr.update.TransactionLog.decref(TransactionLog.java:504)
> 	at org.apache.solr.update.UpdateLog.addOldLog(UpdateLog.java:335)
> 	at org.apache.solr.update.UpdateLog.postCommit(UpdateLog.java:628)
> 	at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:600)
> 	at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> 15:42:47
> ERROR
> CommitTracker
> auto commit error...:org.apache.solr.common.SolrException: java.io.IOException: Filesystem closed
> auto commit error...:org.apache.solr.common.SolrException: java.io.IOException: Filesystem closed
> {code}


