Posted to notifications@iotdb.apache.org by "EJTTianyu (Jira)" <ji...@apache.org> on 2020/04/30 03:26:00 UTC

[jira] [Commented] (IOTDB-398) IoTDBMergeTest Problems in CI

    [ https://issues.apache.org/jira/browse/IOTDB-398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096073#comment-17096073 ] 

EJTTianyu commented on IOTDB-398:
---------------------------------

Hi,

I am working on this issue. I found that IoTDBMergeTest gets stuck because, in extreme cases, a concurrent query and merge can deadlock. The deadlock occurs under the following conditions:
 # For each queried measurement, an incoming query first obtains the readLock of the StorageGroupProcessor's mergeLock. It then takes the readLock of each TsFile it needs and releases the mergeLock's readLock (note that the TsFile's readLock stays held after the mergeLock's readLock is released).
 # While cleaning up merge resources, the merge takes the writeLock of the StorageGroupProcessor's mergeLock and then tries to obtain the file's writeLock. Because the file's readLock from the previous step is still held, the writeLock cannot be obtained.
 # When the query from the first step moves on to the next measurement, it requests the mergeLock's readLock of the StorageGroupProcessor again. Because the merge already holds the mergeLock's writeLock from the previous step, the two form a loop wait.
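
The three conditions above can be sketched with two plain java.util.concurrent locks. This is only a minimal illustration, not IoTDB code: the class name, the two lock variables, the latches and the 200 ms timeouts are all assumptions made for the sketch. Timed tryLock calls are used instead of lock() so that the program reports the loop wait rather than hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class MergeQueryDeadlockSketch {

  /** Returns true when both the query and the merge fail to get their second lock. */
  public static boolean reproduce() throws InterruptedException {
    // Hypothetical stand-ins for the StorageGroupProcessor's mergeLock and one TsFile's lock.
    ReentrantReadWriteLock mergeLock = new ReentrantReadWriteLock();
    ReentrantReadWriteLock fileLock = new ReentrantReadWriteLock();
    CountDownLatch fileReadHeld = new CountDownLatch(1);
    CountDownLatch mergeWriteHeld = new CountDownLatch(1);
    CountDownLatch bothTried = new CountDownLatch(2);
    boolean[] blocked = new boolean[2]; // [0] = query, [1] = merge

    Thread query = new Thread(() -> {
      // Condition 1: lock for the first measurement, then keep only the file's readLock.
      mergeLock.readLock().lock();
      fileLock.readLock().lock();
      mergeLock.readLock().unlock();
      fileReadHeld.countDown();
      try {
        mergeWriteHeld.await();
        // Condition 3: the next measurement needs the mergeLock's readLock again.
        boolean ok = mergeLock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
        blocked[0] = !ok;
        if (ok) {
          mergeLock.readLock().unlock();
        }
        bothTried.countDown();
        bothTried.await(); // keep the file's readLock until the merge has tried too
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        fileLock.readLock().unlock();
      }
    });

    Thread merge = new Thread(() -> {
      try {
        fileReadHeld.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      // Condition 2: cleanup takes the mergeLock's writeLock, then wants the file's writeLock.
      mergeLock.writeLock().lock();
      mergeWriteHeld.countDown();
      try {
        boolean ok = fileLock.writeLock().tryLock(200, TimeUnit.MILLISECONDS);
        blocked[1] = !ok;
        if (ok) {
          fileLock.writeLock().unlock();
        }
        bothTried.countDown();
        bothTried.await(); // keep the mergeLock's writeLock until the query has tried too
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        mergeLock.writeLock().unlock();
      }
    });

    query.start();
    merge.start();
    query.join();
    merge.join();
    return blocked[0] && blocked[1];
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("loop wait reproduced: " + reproduce());
  }
}
```

With untimed lock() calls in place of the two tryLock calls, neither thread would ever return, which matches the stuck test described above.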

For example, suppose a storage group named root.SG1 contains measurements s0 and s1. Earlier writes have generated a seq file xxx-1-0.tsfile and an unseq file xxx-2-0.tsfile, both of which contain data for s0 and s1. A query (select * from root.SG1) and a merge are executed simultaneously. The merge combines the seq and unseq files into a new seq file named xxx-1-1.tsfile.
 # The query for s0 adds a readLock to xxx-1-1.tsfile and holds it until the whole query ends.
 # The merge process wants to clean up the merge resources, so it holds the mergeLock's writeLock and applies for the file's writeLock.
 # The query for s1 requests the mergeLock's readLock of the StorageGroupProcessor. However, the mergeLock's writeLock is held by the merge process from the previous step, so the two processes deadlock.

There are two solutions to this deadlock problem:
 # Change the lock mechanism of the query
 # Break the deadlock loop waiting condition

Here, I used the second solution. The PR has been submitted.
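
As a rough illustration of what breaking the loop wait can look like in general, the sketch below lets the cleanup side give up the mergeLock's writeLock whenever the file's writeLock is not available, then retry after a short pause. This is an assumption made for illustration only: the comment does not say how the submitted PR breaks the loop, and the class name, method name and parameters here are hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class MergeCleanupBackoffSketch {

  /**
   * Runs cleanup while holding both write locks.
   * Returns the attempt number that succeeded, or -1 if every attempt failed.
   */
  public static int cleanUp(ReentrantReadWriteLock mergeLock,
                            ReentrantReadWriteLock fileLock,
                            Runnable cleanup,
                            int maxAttempts,
                            long waitMillis) throws InterruptedException {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      mergeLock.writeLock().lock();
      try {
        // Wait only a bounded time for the file's writeLock.
        if (fileLock.writeLock().tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
          try {
            cleanup.run();
            return attempt;
          } finally {
            fileLock.writeLock().unlock();
          }
        }
      } finally {
        // Always release the mergeLock's writeLock, so a query that is waiting
        // for the mergeLock's readLock can finish and release the file's readLock.
        mergeLock.writeLock().unlock();
      }
      Thread.sleep(waitMillis); // back off before the next attempt
    }
    return -1;
  }
}
```

The merge side never waits indefinitely for the file while holding the mergeLock, so the query can always make progress. The cost is that cleanup may be delayed or, after maxAttempts, skipped and left to the caller to reschedule.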

> IoTDBMergeTest Problems in CI
> -----------------------------
>
>                 Key: IOTDB-398
>                 URL: https://issues.apache.org/jira/browse/IOTDB-398
>             Project: Apache IoTDB
>          Issue Type: Bug
>            Reporter: Xiangdong Huang
>            Assignee: EJTTianyu
>            Priority: Major
>         Attachments: IoTDBLoadExternalTsfileTest_fillCache.log
>
>
> Hi,
> If you check Travis's status, you will find that many tests may be blocked on the master branch, which needs more attention.
> I have reproduced one on my Mac (with OpenJDK 11) and attached the jstack logs.
>  
> Note that the attachment is just one case. I suspect IoTDBMergeTest also has some problems.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)