Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/07/05 23:05:11 UTC

[jira] [Commented] (DRILL-4647) C++ client is not propagating a connection failed error when a drillbit goes down

    [ https://issues.apache.org/jira/browse/DRILL-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363430#comment-15363430 ] 

ASF GitHub Bot commented on DRILL-4647:
---------------------------------------

Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/493#discussion_r69653159
  
    --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp ---
    @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t
     
         uint64_t coordId;
         DrillClientQueryResult* pQuery=NULL;
    +    connectionStatus_t cStatus=CONN_SUCCESS;
         {
             boost::lock_guard<boost::mutex> prLock(this->m_prMutex);
             boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex);
             coordId = this->getNextCoordinationId();
             OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query);
    -        sendSync(out_msg);
     
    +        // Create the result object and register the listener before we send the query
    +        // because sometimes the caller is not checking the status of the submitQuery call.
    +        // This way, the broadcast error call will cause the results listener to be called
    +        // with a COMM_ERROR status.
             pQuery = new DrillClientQueryResult(this, coordId, plan);
             pQuery->registerListener(l, lCtx);
    -        bool sendRequest=false;
             this->m_queryIds[coordId]=pQuery;
     
    -        DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG)  << "Sent query request. " << "[" << m_connectedHost << "]"  << "Coordination id = " << coordId << std::endl;)
    -        DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG)  << "Sent query " <<  "Coordination id = " << coordId << " query: " << plan << std::endl;)
    +        connectionStatus_t cStatus=sendSync(out_msg);
    +        if(cStatus == CONN_SUCCESS){
    +            bool sendRequest=false;
     
    -        if(m_pendingRequests++==0){
    -            sendRequest=true;
    -        }else{
    -            DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;)
    -            DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;)
    -        }
    -        if(sendRequest){
    -            DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = "
    -                << m_pendingRequests << std::endl;)
    -            getNextResult(); // async wait for results
    +            DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG)  << "Sent query request. " << "[" << m_connectedHost << "]"  << "Coordination id = " << coordId << std::endl;)
    +                DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG)  << "Sent query " <<  "Coordination id = " << coordId << " query: " << plan << std::endl;)
    +
    +                if(m_pendingRequests++==0){
    +                    sendRequest=true;
    +                }else{
    +                    DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;)
    +                        DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;)
    +                }
    +            if(sendRequest){
    +                DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = "
    +                        << m_pendingRequests << std::endl;)
    +                    getNextResult(); // async wait for results
    +            }
             }
    +
    +    }
    +    if(cStatus!=CONN_SUCCESS){
    --- End diff --
    
    This block fails only the current query. How are the other pending queries notified? Through a heartbeat failure?


> C++ client is not propagating a connection failed error when a drillbit goes down
> ---------------------------------------------------------------------------------
>
>                 Key: DRILL-4647
>                 URL: https://issues.apache.org/jira/browse/DRILL-4647
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Parth Chandra
>
> When a drillbit goes down, there are two conditions under which the client does not propagate the error back to the application:
> 1) The application is in a submitQuery call: the ODBC driver expects the error to be reported through the query results listener, which has not yet been registered at the point the error is encountered.
> 2) A submitQuery call succeeded but the query never reached the drillbit because it was shut down. In this case the application has a handle to a query and is listening for results that will never arrive. The heartbeat mechanism detects the failure but does not propagate the error to the query results listener.
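
The two conditions above can be illustrated with a minimal, standalone sketch. It is not the Drill client code: QueryRegistry, Status, ResultsListener, submit, broadcastError and pendingCount are hypothetical names chosen for illustration, and the send function is a stand-in for the real network send. The sketch assumes the approach described in the diff comments: register the listener before sending, and on a send failure broadcast the error to every pending listener.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical status codes; not the real Drill client enums.
enum class Status { Success, CommError };

// Hypothetical listener: receives the final status for one query.
using ResultsListener = std::function<void(std::uint64_t coordId, Status status)>;

class QueryRegistry {
public:
    // sendFn stands in for the real network send (an assumption of this sketch).
    explicit QueryRegistry(std::function<Status()> sendFn) : send_(std::move(sendFn)) {
        assert(send_ && "sendFn must be callable");
    }

    // Register the listener first, then send. If the send fails, every
    // pending listener, including the one just registered, is notified.
    Status submit(ResultsListener listener) {
        Status status;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            const std::uint64_t coordId = nextId_++;
            pending_[coordId] = std::move(listener);
            status = send_();
        }
        if (status != Status::Success) {
            broadcastError(Status::CommError);
        }
        return status;
    }

    // Notify and drop every pending query. A heartbeat failure handler
    // could call this as well.
    void broadcastError(Status status) {
        std::map<std::uint64_t, ResultsListener> failed;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            failed.swap(pending_);
        }
        // Callbacks run outside the lock so a listener may call back in.
        for (auto& entry : failed) {
            entry.second(entry.first, status);
        }
    }

    std::size_t pendingCount() {
        std::lock_guard<std::mutex> lock(mutex_);
        return pending_.size();
    }

private:
    std::function<Status()> send_;
    std::mutex mutex_;
    std::map<std::uint64_t, ResultsListener> pending_;
    std::uint64_t nextId_ = 0;
};
```

In this sketch a caller that ignores the return value of submit still learns about the failure through its listener, and a failed send also fails queries that were submitted earlier and are still waiting for results.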


