Posted to issues@drill.apache.org by "Parth Chandra (JIRA)" <ji...@apache.org> on 2015/03/21 00:11:40 UTC

[jira] [Commented] (DRILL-2509) Drill Client threading issue with m_pendingRequests

    [ https://issues.apache.org/jira/browse/DRILL-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372296#comment-14372296 ] 

Parth Chandra commented on DRILL-2509:
--------------------------------------

I think the way to address this is to protect m_pendingRequests with its own mutex. The real issue is that, as part of handling a response, the read handler sends back an ack _synchronously_. The locking is broken because acquiring the mutex, which is what protecting m_pendingRequests correctly requires, causes a deadlock.
Longer term (post release 1.0) we can spend some time looking at how to clean up the state management.
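
A minimal sketch of the "own mutex" idea, not the actual Drill client code. The class name PendingRequestCounter and its methods are hypothetical, and std::mutex is assumed here purely for illustration. The point is that the dedicated lock guards only the counter and is released before any I/O such as the synchronous ack, so it cannot take part in the deadlock described above.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for the client's request bookkeeping.
// The mutex guards m_pendingRequests and nothing else.
class PendingRequestCounter {
public:
    // Returns true only for the caller that moved the count from 0 to 1,
    // i.e. the single caller that should start reading results.
    bool increment() {
        std::lock_guard<std::mutex> guard(m_mutex);
        return ++m_pendingRequests == 1;
    }

    // Returns true if requests are still pending after this one completes.
    bool decrement() {
        std::lock_guard<std::mutex> guard(m_mutex);
        assert(m_pendingRequests > 0);  // completing a request that was never counted is a bug
        --m_pendingRequests;
        return m_pendingRequests > 0;
    }

    std::size_t count() const {
        std::lock_guard<std::mutex> guard(m_mutex);
        return m_pendingRequests;
    }

private:
    mutable std::mutex m_mutex;         // dedicated lock, never held across I/O
    std::size_t m_pendingRequests = 0;
};
```

Because each method returns its answer while the lock is held, a caller can act on that answer (send an ack, schedule a read) after the lock has been dropped, without re-reading the counter.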

> Drill Client threading issue with m_pendingRequests
> ---------------------------------------------------
>
>                 Key: DRILL-2509
>                 URL: https://issues.apache.org/jira/browse/DRILL-2509
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Client - C++
>    Affects Versions: 0.7.0
>            Reporter: Norris Lee
>            Assignee: Norris Lee
>             Fix For: 0.9.0
>
>
> 1.       (Thread 1) 1st query receives the last record batch (no data; it contains only the query state, e.g. QueryResult_QueryState_COMPLETED) and calls processQueryResult. At this moment, it grabs the lock. Assume m_pendingRequests = 1.
> 2.       (Thread 1) processQueryStatusResult is called and drops m_pendingRequests to 0. The lock is still held at this point.
> 3.       (Thread 1) It returns from processQueryStatusResult, immediately followed by returning from processQueryResult. At this point the lock is released.
> 4.       (Thread 2) SubmitQuery for the next query sees the lock has been freed so it swoops in and grabs the lock. It bumps m_pendingRequests up to 1 and sets sendResusts=true since it sees that m_pendingRequests was previously 0, causing it to call getNextResult.
> 5.       (Thread 1) After returning from processQueryResult, it sees that m_pendingRequests is now set to 1 so it calls getNextResult.
> 6.       So now 2 threads end up calling getNextResult. For whatever reason, the server then sends a record batch with no data, has_rpc_type = false, and rpc_type = 0, leading to an ERR_QRY_INVRPCTYPE error being thrown.
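
The interleaving in steps 1-6 can be replayed deterministically on a single thread. The sketch below is an illustration under assumptions, not the Drill implementation: ClientState, submitQuery, finishQuery, racyFollowUp, safeFollowUp and replay are all hypothetical names, and readsStarted simply counts how many times a getNextResult-style read would be started. The racy variant re-reads the counter after the lock has been released; the safe variant uses the decision that was made while the lock was held.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical model of the shared client state.
struct ClientState {
    std::mutex lock;
    int pendingRequests = 0;
    std::atomic<int> readsStarted{0};  // number of getNextResult-style calls
};

// Thread 2 path (step 4): start a read only on the 0 -> 1 transition.
void submitQuery(ClientState& s) {
    bool startRead;
    {
        std::lock_guard<std::mutex> g(s.lock);
        startRead = (++s.pendingRequests == 1);
    }
    if (startRead) ++s.readsStarted;
}

// Thread 1 path (steps 1-3): handle the final batch under the lock and
// return whether more reads were needed at the moment the lock was held.
bool finishQuery(ClientState& s) {
    std::lock_guard<std::mutex> g(s.lock);
    --s.pendingRequests;
    return s.pendingRequests > 0;
}

// Racy step 5: looks at the counter again after the lock was released.
void racyFollowUp(ClientState& s) {
    bool more;
    {
        std::lock_guard<std::mutex> g(s.lock);
        more = s.pendingRequests > 0;
    }
    if (more) ++s.readsStarted;
}

// Safe step 5: acts only on the decision taken under the lock.
void safeFollowUp(ClientState& s, bool moreAtRelease) {
    if (moreAtRelease) ++s.readsStarted;
}

// Replays steps 1-5 in the order given in the report.
int replay(bool racy) {
    ClientState s;
    s.pendingRequests = 1;         // step 1: one query in flight
    bool more = finishQuery(s);    // steps 2-3: counter drops to 0
    assert(!more);                 // nothing was pending when the lock was released
    submitQuery(s);                // step 4: second query slips in
    if (racy) racyFollowUp(s); else safeFollowUp(s, more);  // step 5
    return s.readsStarted.load();
}
```

In this model replay(true) starts two reads for a single pending query, which matches step 6, while replay(false) starts exactly one.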


