Posted to notifications@asterixdb.apache.org by "Young-Seok Kim (JIRA)" <ji...@apache.org> on 2015/10/20 18:29:27 UTC

[jira] [Commented] (ASTERIXDB-1144) FeedMetaStoreNodePushable.close() call hangs

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965343#comment-14965343 ] 

Young-Seok Kim commented on ASTERIXDB-1144:
-------------------------------------------

This issue can be fixed by putting the isFinished() check and the wait() call in the same synchronized block.
Then, if isFinished() returns false, the calling thread calls wait() while still holding the coreOperator monitor, so frameEvent() cannot issue notifyAll() between the check and the wait, and the notification is not lost.
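
A minimal sketch of that pattern follows. It is not the AsterixDB code: FinishedWaitSketch, awaitFinished(), signalFinished() and runDemo() are hypothetical names, and a plain boolean field stands in for inputSideHandler's finished flag. The sketch also sets the flag inside the synchronized block on the notifying side, which is a choice made here for simplicity rather than something the fix above requires.

```java
// Sketch only: every name in this class is a hypothetical stand-in,
// not an actual AsterixDB class or method.
class FinishedWaitSketch {
    // Plays the role of coreOperator: the monitor both sides synchronize on.
    private final Object coreOperator = new Object();
    // Plays the role of the finished flag held by inputSideHandler.
    private boolean finished = false;

    // Waiting side, as in close(): the check and the wait share one lock.
    void awaitFinished() throws InterruptedException {
        synchronized (coreOperator) {
            while (!finished) {
                // wait() releases the monitor and reacquires it on wakeup,
                // so the notifier cannot slip in between check and wait.
                coreOperator.wait();
            }
        }
    }

    // Notifying side, as in frameEvent() for FINISHED_PROCESSING.
    void signalFinished() {
        synchronized (coreOperator) {
            finished = true;
            coreOperator.notifyAll();
        }
    }

    // Starts a waiter, signals it, and reports whether the waiter returned
    // within five seconds.
    static boolean runDemo() {
        FinishedWaitSketch sketch = new FinishedWaitSketch();
        Thread closer = new Thread(() -> {
            try {
                sketch.awaitFinished();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        closer.start();
        sketch.signalFinished();
        try {
            closer.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !closer.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("closer returned: " + runDemo());
    }
}
```

Whichever thread gets the monitor first, the waiter either sees the flag already set and skips wait(), or is already waiting when notifyAll() runs.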

> FeedMetaStoreNodePushable.close() call hangs
> --------------------------------------------
>
>                 Key: ASTERIXDB-1144
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1144
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Young-Seok Kim
>            Assignee: Abdullah Alamoudi
>            Priority: Critical
>
> Feed job hangs in FeedMetaStoreNodePushable.close() call as shown in the following jstack trace:
> "org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:1:0:7:0:0" daemon prio=10 tid=0x00007fac6005c000 nid=0x4310 in Object.wait() [0x00007facd74f3000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:503)
>         at org.apache.asterix.metadata.feeds.FeedMetaStoreNodePushable.close(FeedMetaStoreNodePushable.java:195)
>         - locked <0x0000000677fd9e80> (a org.apache.asterix.common.dataflow.AsterixLSMInsertDeleteOperatorNodePushable)
>         at org.apache.hyracks.algebricks.runtime.operators.std.StreamProjectRuntimeFactory$1.close(StreamProjectRuntimeFactory.java:140)
>         at org.apache.hyracks.algebricks.runtime.operators.std.AssignRuntimeFactory$1.close(AssignRuntimeFactory.java:220)
>         at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$2.close(AlgebricksMetaOperatorDescriptor.java:145)
>         at org.apache.asterix.metadata.feeds.FeedMetaNodePushable.close(FeedMetaNodePushable.java:174)
>         at org.apache.hyracks.storage.am.common.dataflow.IndexInsertUpdateDeleteOperatorNodePushable.close(IndexInsertUpdateDeleteOperatorNodePushable.java:153)
>         at org.apache.asterix.metadata.feeds.FeedMetaStoreNodePushable.close(FeedMetaStoreNodePushable.java:200)
>         at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:349)
>         at org.apache.hyracks.control.nc.Task.run(Task.java:290)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> The hang seems to be caused by a bug in the wait/notify code.
> More specifically, the FrameEventCallback.frameEvent() method issues the notification in the following code snippet:
> ------------------------------------------------------
>             case FINISHED_PROCESSING:
>                 inputSideHandler.setFinished(true);
>                 synchronized (coreOperator) {
>                     coreOperator.notifyAll();
>                 }
> ------------------------------------------------------
> The FeedMetaStoreNodePushable.close() method waits for the notification in the following code snippet:
> ------------------------------------------------------
>                 while (!inputSideHandler.isFinished()) {
>                     synchronized (coreOperator) {
>                         coreOperator.wait();
>                     }
>                 }
> ------------------------------------------------------
> Suppose the thread calling close() has just called isFinished() and received false, and is then descheduled by the OS before it enters the synchronized block.
> If the thread calling frameEvent() then calls setFinished(true) and coreOperator.notifyAll(), the notification can be lost. In other words, no thread is waiting on coreOperator yet, so the notification never reaches the thread calling close().
> When that thread resumes, it calls wait() and may hang, as shown in the jstack trace above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)