Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/07/05 23:05:11 UTC
[jira] [Commented] (DRILL-4647) C++ client is not propagating a connection failed error when a drillbit goes down
[ https://issues.apache.org/jira/browse/DRILL-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363430#comment-15363430 ]
ASF GitHub Bot commented on DRILL-4647:
---------------------------------------
Github user sudheeshkatkam commented on a diff in the pull request:
https://github.com/apache/drill/pull/493#discussion_r69653159
--- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp ---
@@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t
uint64_t coordId;
DrillClientQueryResult* pQuery=NULL;
+ connectionStatus_t cStatus=CONN_SUCCESS;
{
boost::lock_guard<boost::mutex> prLock(this->m_prMutex);
boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex);
coordId = this->getNextCoordinationId();
OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query);
- sendSync(out_msg);
+ // Create the result object and register the listener before we send the query
+ // because sometimes the caller is not checking the status of the submitQuery call.
+ // This way, the broadcast error call will cause the results listener to be called
+ // with a COMM_ERROR status.
pQuery = new DrillClientQueryResult(this, coordId, plan);
pQuery->registerListener(l, lCtx);
- bool sendRequest=false;
this->m_queryIds[coordId]=pQuery;
- DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl;)
- DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl;)
+ connectionStatus_t cStatus=sendSync(out_msg);
+ if(cStatus == CONN_SUCCESS){
+ bool sendRequest=false;
- if(m_pendingRequests++==0){
- sendRequest=true;
- }else{
- DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;)
- DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;)
- }
- if(sendRequest){
- DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = "
- << m_pendingRequests << std::endl;)
- getNextResult(); // async wait for results
+ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl;)
+ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl;)
+
+ if(m_pendingRequests++==0){
+ sendRequest=true;
+ }else{
+ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;)
+ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;)
+ }
+ if(sendRequest){
+ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = "
+ << m_pendingRequests << std::endl;)
+ getNextResult(); // async wait for results
+ }
}
+
+ }
+ if(cStatus!=CONN_SUCCESS){
--- End diff --
This block fails only this query. How are the other queries notified? Heartbeat failure?
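The ordering change described in the diff comment above (register the results listener before sending, so that a failed send still reaches the listener) can be illustrated with a minimal sketch. This is not the Drill client code: `MiniClient`, `Status`, `Listener`, `submit`, and `pendingCount` are hypothetical stand-ins, the transport is simulated by a callback, and `std::mutex` is used in place of `boost::mutex`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical stand-ins for illustration; not the real Drill client types.
enum class Status { Success, CommError };
using Listener = std::function<void(Status)>;

class MiniClient {
public:
    // `send` simulates the transport and returns false when the connection is down.
    explicit MiniClient(std::function<bool()> send) : send_(std::move(send)) {
        assert(send_ && "transport callback must be set");
    }

    // Registers the listener first, then sends. If the send fails, the
    // listener is invoked with CommError, so a caller that ignores the
    // return value still learns about the failure.
    Status submit(Listener l) {
        Listener failed;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            const std::uint64_t id = nextId_++;
            pending_[id] = std::move(l);          // register before sending
            if (send_()) {
                return Status::Success;           // listener stays registered
            }
            failed = std::move(pending_[id]);     // send failed: unregister
            pending_.erase(id);
        }
        if (failed) {
            failed(Status::CommError);            // notify outside the lock
        }
        return Status::CommError;
    }

    std::size_t pendingCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return pending_.size();
    }

private:
    std::function<bool()> send_;
    mutable std::mutex mutex_;
    std::uint64_t nextId_ = 0;
    std::map<std::uint64_t, Listener> pending_;
};
```

As in the diff, this sketch fails only the query whose send failed; queries that are already pending are left untouched and would need a separate notification path.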
> C++ client is not propagating a connection failed error when a drillbit goes down
> ---------------------------------------------------------------------------------
>
> Key: DRILL-4647
> URL: https://issues.apache.org/jira/browse/DRILL-4647
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Parth Chandra
>
> When a drillbit goes down, there are two conditions under which the client does not propagate the error back to the application:
> 1) The application is in a submitQuery call: the ODBC driver expects the error to be reported through the query results listener, which has not yet been registered at the point the error is encountered.
> 2) A submitQuery call succeeded, but the query never reached the drillbit because the drillbit was shut down. In this case the application has a handle to a query and is listening for results that will never arrive. The heartbeat mechanism detects the failure but does not propagate the error to the query results listener.
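For the second condition, one possible shape of the fix is for the heartbeat failure path to broadcast a communication error to every pending results listener. The sketch below only illustrates that idea under stated assumptions; it is not the actual Drill implementation. `PendingQueries`, `QueryStatus`, `ResultsListener`, `add`, and `broadcastError` are hypothetical names, and `std::mutex` stands in for `boost::mutex`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical stand-ins for illustration; not the real Drill client types.
enum class QueryStatus { Ok, CommError };
using ResultsListener = std::function<void(QueryStatus)>;

class PendingQueries {
public:
    // Registers a listener for a newly submitted query and returns its id.
    std::uint64_t add(ResultsListener l) {
        assert(l && "listener must be callable");
        std::lock_guard<std::mutex> lock(mutex_);
        const std::uint64_t id = nextId_++;
        listeners_[id] = std::move(l);
        return id;
    }

    // Intended to be called when a heartbeat fails: every pending listener
    // is notified once with CommError and then dropped. Returns the number
    // of listeners that were notified.
    std::size_t broadcastError() {
        std::map<std::uint64_t, ResultsListener> drained;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            drained.swap(listeners_);             // take ownership under the lock
        }
        for (auto& entry : drained) {
            entry.second(QueryStatus::CommError); // call back outside the lock
        }
        return drained.size();
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return listeners_.size();
    }

private:
    mutable std::mutex mutex_;
    std::uint64_t nextId_ = 0;
    std::map<std::uint64_t, ResultsListener> listeners_;
};
```

Swapping the map out under the lock and invoking the callbacks afterwards avoids holding the mutex while application code runs, and a second heartbeat failure cannot notify the same listener twice.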
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)