Posted to issues-all@impala.apache.org by "Abhishek Rawat (Jira)" <ji...@apache.org> on 2023/04/04 22:22:00 UTC

[jira] [Created] (IMPALA-12039) Potential Race condition between executor group deletion and admission controller

Abhishek Rawat created IMPALA-12039:
---------------------------------------

             Summary: Potential Race condition between executor group deletion and admission controller
                 Key: IMPALA-12039
                 URL: https://issues.apache.org/jira/browse/IMPALA-12039
             Project: IMPALA
          Issue Type: Improvement
            Reporter: Abhishek Rawat


IMPALA-11891 added support for deleting an executor group once it becomes empty. However, this introduces a race condition: a query that arrives around the time of the deletion can be admitted to the just-deleted executor group, and the query then fails.
{code:java}
I0330 06:05:25.600728  9398 admission-controller.cc:1941] 3c4f9069df52951e:0b97d92800000000] Trying to admit id=3c4f9069df52951e:0b97d92800000000 in pool_name=root.default executor_group_name=root.default-group-000 per_host_mem_estimate=192.22 MB dedicated_coord_mem_estimate=100.03 MB max_requests=-1 max_queued=200 max_mem=48828.12 GB is_trivial_query=false

I0330 06:05:25.600769  9398 admission-controller.cc:1950] 3c4f9069df52951e:0b97d92800000000] Stats: agg_num_running=0, agg_num_queued=0, agg_mem_reserved=0,  local_host(local_mem_admitted=0, local_trivial_running=0, num_admitted_running=0, num_queued=0, backend_mem_reserved=0, topN_query_stats: queries=[7345a69a7cf74870:36a8543f00000000], total_mem_consumed=0; pool_level_stats: num_running=1, min=0, max=0, pool_total_mem=0, average_per_query=0)

I0330 06:05:25.600816  9398 admission-controller.cc:1300] 3c4f9069df52951e:0b97d92800000000] Admitting query id=3c4f9069df52951e:0b97d92800000000

I0330 06:05:25.600883  9398 impala-server.cc:2231] 3c4f9069df52951e:0b97d92800000000] Registering query locations

I0330 06:05:25.600898  9398 coordinator.cc:151] 3c4f9069df52951e:0b97d92800000000] Exec() query_id=3c4f9069df52951e:0b97d92800000000 stmt=select count(*) from test_a9a41a5.t where id + random() < sleep(10000)

I0330 06:05:25.601054  9398 coordinator.cc:476] 3c4f9069df52951e:0b97d92800000000] starting execution on 2 backends for query_id=3c4f9069df52951e:0b97d92800000000

I0330 06:05:25.601359   124 control-service.cc:148] 3c4f9069df52951e:0b97d92800000000] ExecQueryFInstances(): query_id=3c4f9069df52951e:0b97d92800000000 coord=coordinator-0.coordinator-int.impala-1680155570-trh7.svc.cluster.local:27000 #instances=1

I0330 06:05:25.601604   117 kudu-status-util.h:55] Exec() rpc failed: Network error: Client connection negotiation failed: client connection to 192.168.112.16:27010: connect: Connection refused (error 111)

E0330 06:05:25.601706   117 coordinator-backend-state.cc:190] ExecQueryFInstances rpc query_id=3c4f9069df52951e:0b97d92800000000 failed: Exec() rpc failed: Network error: Client connection negotiation failed: client connection to 192.168.112.16:27010: connect: Connection refused (error 111) {code}
In the past, the empty executor group would have been unhealthy, and the admission controller would have queued the incoming query.
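
The difference between the two behaviors can be illustrated with a small, deterministic Python sketch. It is not Impala code. The function names, the host names executor-0 and executor-1, and the idea that admission works from a view of the group captured before the deletion are all assumptions made for illustration; only the group name is taken from the log above.

```python
import copy

GROUP = "root.default-group-000"  # group name taken from the log above


def remove_executor(groups, group_name, host, delete_empty_groups):
    """Drop a host from a group; optionally delete the group once it is empty."""
    groups[group_name].discard(host)
    if delete_empty_groups and not groups[group_name]:
        del groups[group_name]


def try_admit(view, group_name, reachable_hosts):
    """Hypothetical admission decision based on a view of the executor groups."""
    if group_name not in view:
        return "NO_GROUP"
    executors = view[group_name]
    if not executors:
        # An empty group is treated as unhealthy, so the query waits.
        return "QUEUED"
    if not executors <= reachable_hosts:
        # The query was scheduled on hosts that no longer accept connections.
        return "EXEC_RPC_FAILED"
    return "RUNNING"


def simulate(delete_empty_groups, admit_on_stale_view):
    """Shut down every executor, then try to admit one query."""
    groups = {GROUP: {"executor-0", "executor-1"}}  # hypothetical hosts
    reachable = {"executor-0", "executor-1"}
    # Assumption: admission may act on a view captured before the shutdown.
    stale_view = copy.deepcopy(groups)
    for host in sorted(reachable):
        reachable.discard(host)
        remove_executor(groups, GROUP, host, delete_empty_groups)
    view = stale_view if admit_on_stale_view else groups
    return try_admit(view, GROUP, reachable)


if __name__ == "__main__":
    print("empty group kept:          ", simulate(False, False))
    print("group deleted, stale view: ", simulate(True, True))
    print("group deleted, fresh view: ", simulate(True, False))
```

In this sketch, keeping the empty group leads to the query being queued, while admitting against a view that predates the deletion leads to the failed Exec() RPC pattern seen in the log. Whether the real race arises from a stale view in exactly this way is an assumption, not something the report confirms.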

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
