Posted to issues@impala.apache.org by "Attila Jeges (Jira)" <ji...@apache.org> on 2020/02/25 14:49:00 UTC
[jira] [Created] (IMPALA-9421) Metadata operations are slow in
impala-shell when using hs2-http with LDAP auth.
Attila Jeges created IMPALA-9421:
------------------------------------
Summary: Metadata operations are slow in impala-shell when using hs2-http with LDAP auth.
Key: IMPALA-9421
URL: https://issues.apache.org/jira/browse/IMPALA-9421
Project: IMPALA
Issue Type: Improvement
Components: Clients
Affects Versions: Impala 3.4.0
Reporter: Attila Jeges
A "show databases" operation takes 3-4 seconds in impala-shell when connecting to a CDW Azure environment:
{code:java}
$ impala-shell.sh --protocol='hs2-http' --ssl -i "coordinator-attilaj-test-impala-vw.env-q52cn6.dwx.workload-dev.cloudera.com:443" -u csso_attilaj -l
impala-shell> show databases;
+------------------------+----------------------------------------------+
| name | comment |
+------------------------+----------------------------------------------+
| _impala_builtins | System database for Impala builtin functions |
| airline_ontime_orc | |
| airline_ontime_parquet | |
| default | Default Hive database |
+------------------------+----------------------------------------------+
Fetched 4 row(s) in 3.66s
{code}
The impala-coordinator logs show that multiple new connections are set up and authenticated:
{code:java}
I0225 14:15:48.976776 317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58588>
I0225 14:15:48.976878 320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58588>
I0225 14:15:48.976912 320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58588>
I0225 14:15:48.977216 115929 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:48.989554 115929 authentication.cc:273] LDAP bind successful
I0225 14:15:48.989639 115929 impala-hs2-server.cc:1085] PingImpalaHS2Service(): request=TPingImpalaHS2ServiceReq {
01: sessionHandle (struct) = TSessionHandle {
01: sessionId (struct) = THandleIdentifier {
01: guid (string) = "#\x8f\xdf\x01\xd7\xd6Bv\xa5\xec\xcd\x17Q\xb9q\x93",
02: secret (string) = "\xd6\xaaO\v\xedXE!\x89}x\xbds\x1f\xe1\xf0",
},
},
}
I0225 14:15:50.152348 317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58596>
I0225 14:15:50.152446 321 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58596>
I0225 14:15:50.152493 321 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58596>
I0225 14:15:50.152722 115930 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:50.163576 115930 authentication.cc:273] LDAP bind successful
I0225 14:15:50.163733 115930 impala-hs2-server.cc:442] ExecuteStatement(): request=TExecuteStatementReq {
01: sessionHandle (struct) = TSessionHandle {
01: sessionId (struct) = THandleIdentifier {
01: guid (string) = "#\x8f\xdf\x01\xd7\xd6Bv\xa5\xec\xcd\x17Q\xb9q\x93",
02: secret (string) = "\xd6\xaaO\v\xedXE!\x89}x\xbds\x1f\xe1\xf0",
},
},
02: statement (string) = "show databases",
03: confOverlay (map) = map<string,string>[1] {
"CLIENT_IDENTIFIER" -> "Impala Shell v3.4.0-SNAPSHOT (cad1561) built on Fri Feb 14 14:15:26 CET 2020",
},
04: runAsync (bool) = true,
}
I0225 14:15:50.163775 115930 impala-hs2-server.cc:230] TExecuteStatementReq: TExecuteStatementReq {
01: sessionHandle (struct) = TSessionHandle {
01: sessionId (struct) = THandleIdentifier {
01: guid (string) = "#\x8f\xdf\x01\xd7\xd6Bv\xa5\xec\xcd\x17Q\xb9q\x93",
02: secret (string) = "\xd6\xaaO\v\xedXE!\x89}x\xbds\x1f\xe1\xf0",
},
},
02: statement (string) = "show databases",
03: confOverlay (map) = map<string,string>[1] {
"CLIENT_IDENTIFIER" -> "Impala Shell v3.4.0-SNAPSHOT (cad1561) built on Fri Feb 14 14:15:26 CET 2020",
},
04: runAsync (bool) = true,
}
I0225 14:15:50.173715 115930 impala-hs2-server.cc:268] TClientRequest.queryOptions: TQueryOptions {
01: abort_on_error (bool) = false,
02: max_errors (i32) = 100,
03: disable_codegen (bool) = false,
04: batch_size (i32) = 0,
05: num_nodes (i32) = 0,
06: max_scan_range_length (i64) = 0,
07: num_scanner_threads (i32) = 0,
11: debug_action (string) = "",
12: mem_limit (i64) = 0,
15: hbase_caching (i32) = 0,
16: hbase_cache_blocks (bool) = false,
17: parquet_file_size (i64) = 0,
18: explain_level (i32) = 1,
19: sync_ddl (bool) = false,
24: disable_outermost_topn (bool) = false,
26: query_timeout_s (i32) = 0,
28: appx_count_distinct (bool) = false,
29: disable_unsafe_spills (bool) = false,
31: exec_single_node_rows_threshold (i32) = 100,
32: optimize_partition_key_scans (bool) = false,
33: replica_preference (i32) = 0,
34: schedule_random_replica (bool) = false,
36: disable_streaming_preaggregations (bool) = false,
37: runtime_filter_mode (i32) = 2,
38: runtime_bloom_filter_size (i32) = 1048576,
39: runtime_filter_wait_time_ms (i32) = 0,
40: disable_row_runtime_filtering (bool) = false,
41: max_num_runtime_filters (i32) = 10,
42: parquet_annotate_strings_utf8 (bool) = false,
43: parquet_fallback_schema_resolution (i32) = 0,
45: s3_skip_insert_staging (bool) = true,
46: runtime_filter_min_size (i32) = 1048576,
47: runtime_filter_max_size (i32) = 16777216,
48: prefetch_mode (i32) = 1,
49: strict_mode (bool) = false,
50: scratch_limit (i64) = -1,
51: enable_expr_rewrites (bool) = true,
52: decimal_v2 (bool) = true,
53: parquet_dictionary_filtering (bool) = true,
54: parquet_array_resolution (i32) = 0,
55: parquet_read_statistics (bool) = true,
56: default_join_distribution_mode (i32) = 0,
57: disable_codegen_rows_threshold (i32) = 50000,
58: default_spillable_buffer_size (i64) = 2097152,
59: min_spillable_buffer_size (i64) = 65536,
60: max_row_size (i64) = 524288,
61: idle_session_timeout (i32) = 900,
62: compute_stats_min_sample_size (i64) = 1073741824,
63: exec_time_limit_s (i32) = 0,
64: shuffle_distinct_exprs (bool) = true,
65: max_mem_estimate_for_admission (i64) = 0,
66: thread_reservation_limit (i32) = 3000,
67: thread_reservation_aggregate_limit (i32) = 0,
68: kudu_read_mode (i32) = 0,
69: allow_erasure_coded_files (bool) = false,
70: timezone (string) = "",
71: scan_bytes_limit (i64) = 0,
72: cpu_limit_s (i64) = 0,
73: topn_bytes_limit (i64) = 536870912,
74: client_identifier (string) = "Impala Shell v3.4.0-SNAPSHOT (cad1561) built on Fri Feb 14 14:15:26 CET 2020",
75: resource_trace_ratio (double) = 0,
76: num_remote_executor_candidates (i32) = 3,
77: num_rows_produced_limit (i64) = 0,
78: planner_testcase_mode (bool) = false,
79: default_file_format (i32) = 4,
80: parquet_timestamp_type (i32) = 0,
81: parquet_read_page_index (bool) = true,
82: parquet_write_page_index (bool) = true,
84: disable_hdfs_num_rows_estimate (bool) = false,
86: spool_query_results (bool) = true,
87: default_transactional_type (i32) = 1,
88: statement_expression_limit (i32) = 250000,
89: max_statement_length_bytes (i32) = 16777216,
90: disable_data_cache (bool) = false,
91: max_result_spooling_mem (i64) = 104857600,
92: max_spilled_result_spooling_mem (i64) = 1073741824,
93: disable_hbase_num_rows_estimate (bool) = false,
94: fetch_rows_timeout_ms (i64) = 10000,
95: now_string (string) = "",
96: parquet_object_store_split_size (i64) = 268435456,
97: mem_limit_executors (i64) = 0,
98: broadcast_bytes_limit (i64) = 34359738368,
}
I0225 14:15:50.173835 115930 impala-server.cc:987] Found local timezone "UTC".
I0225 14:15:50.177309 115930 impala-server.cc:1042] 4f44d29479adfa14:508106ff00000000] Registered query query_id=4f44d29479adfa14:508106ff00000000 session_id=7642d6d701df8f23:9371b95117cdeca5
I0225 14:15:50.177577 115930 Frontend.java:1499] 4f44d29479adfa14:508106ff00000000] Analyzing query: show databases db: default
I0225 14:15:50.177830 115930 BaseAuthorizationChecker.java:110] 4f44d29479adfa14:508106ff00000000] Authorization check took 0 ms
I0225 14:15:50.177906 115930 Frontend.java:1541] 4f44d29479adfa14:508106ff00000000] Analysis and authorization finished.
I0225 14:15:50.182478 115930 impala-server.cc:1080] Query 4f44d29479adfa14:508106ff00000000 has idle timeout of 10m
I0225 14:15:50.182540 115930 impala-hs2-server.cc:512] ExecuteStatement(): return_val=TExecuteStatementResp {
01: status (struct) = TStatus {
01: statusCode (i32) = 0,
},
02: operationHandle (struct) = TOperationHandle {
01: operationId (struct) = THandleIdentifier {
01: guid (string) = "\x14\xfa\xady\x94\xd2DO\x00\x00\x00\x00\xff\x06\x81P",
02: secret (string) = "\xd6\xaaO\v\xedXE!\x89}x\xbds\x1f\xe1\xf0",
},
02: operationType (i32) = 0,
03: hasResultSet (bool) = true,
},
}
I0225 14:15:51.934399 317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58622>
I0225 14:15:51.934571 320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58622>
I0225 14:15:51.934634 320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58622>
I0225 14:15:51.934870 115940 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:51.945902 115940 authentication.cc:273] LDAP bind successful
I0225 14:15:51.945957 115940 impala-hs2-server.cc:812] GetResultSetMetadata(): query_id=4f44d29479adfa14:508106ff00000000
I0225 14:15:51.946015 115940 impala-hs2-server.cc:847] GetResultSetMetadata(): return_val=TGetResultSetMetadataResp {
01: status (struct) = TStatus {
01: statusCode (i32) = 0,
},
02: schema (struct) = TTableSchema {
01: columns (list) = list<struct>[2] {
[0] = TColumnDesc {
01: columnName (string) = "name",
02: typeDesc (struct) = TTypeDesc {
01: types (list) = list<struct>[1] {
[0] = TTypeEntry {
01: primitiveEntry (struct) = TPrimitiveTypeEntry {
01: type (i32) = 7,
},
},
},
},
03: position (i32) = 0,
},
[1] = TColumnDesc {
01: columnName (string) = "comment",
02: typeDesc (struct) = TTypeDesc {
01: types (list) = list<struct>[1] {
[0] = TTypeEntry {
01: primitiveEntry (struct) = TPrimitiveTypeEntry {
01: type (i32) = 7,
},
},
},
},
03: position (i32) = 1,
},
},
},
}
I0225 14:15:53.537967 317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58628>
I0225 14:15:53.538059 321 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58628>
I0225 14:15:53.538092 321 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58628>
I0225 14:15:53.538578 115941 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:53.550732 115941 authentication.cc:273] LDAP bind successful
I0225 14:15:54.959165 115929 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:54.970517 115929 authentication.cc:273] LDAP bind successful
I0225 14:15:56.381584 317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58658>
I0225 14:15:56.381669 320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58658>
I0225 14:15:56.381703 320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58658>
I0225 14:15:56.381961 115943 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:16:01.398638 115943 authentication.cc:273] LDAP bind successful
I0225 14:16:02.591567 317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58718>
I0225 14:16:02.591709 321 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58718>
I0225 14:16:02.591747 321 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58718>
I0225 14:16:02.592200 115965 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:16:02.603652 115965 authentication.cc:273] LDAP bind successful
I0225 14:16:02.603735 115965 impala-hs2-server.cc:778] CloseOperation(): query_id=4f44d29479adfa14:508106ff00000000
I0225 14:16:02.603758 115965 impala-server.cc:1121] UnregisterQuery(): query_id=4f44d29479adfa14:508106ff00000000
I0225 14:16:02.603766 115965 impala-server.cc:1223] Cancel(): query_id=4f44d29479adfa14:508106ff00000000
I0225 14:16:04.448861 317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58748>
I0225 14:16:04.449045 320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58748>
I0225 14:16:04.449076 320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58748>
I0225 14:16:04.449290 115968 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:16:04.460290 115968 authentication.cc:273] LDAP bind successful
I0225 14:16:20.212851 115930 impala-server.cc:1957] Connection d943451a077f71e4:0a2eb062b208c388 from client 127.0.0.1:58596 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:21.965914 115940 impala-server.cc:1957] Connection ff4ec2f4f3931c3d:792ae752a9916293 from client 127.0.0.1:58622 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:23.581163 115941 impala-server.cc:1957] Connection 4540176beb3e990c:37df4bffa25515b6 from client 127.0.0.1:58628 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:25.000741 115929 impala-server.cc:1957] Connection 974e59d0132ae6a3:6309b762ba77c190 from client 127.0.0.1:58588 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:31.428939 115943 impala-server.cc:1957] Connection 1542d06b26eabcf1:566956f904ba1e80 from client 127.0.0.1:58658 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:32.634624 115965 impala-server.cc:1957] Connection 6e4770d6a982ecd5:d4e40ecc6d1327b3 from client 127.0.0.1:58718 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:34.471824 115968 impala-server.cc:1957] Connection 6643d4e11d526c8d:b6772779e91bcc8b from client 127.0.0.1:58748 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
{code}
It looks like a new connection is set up and authenticated against LDAP for each RPC call, which adds overhead to every request.
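To quantify this from a coordinator log, a small script along the following lines may help. It is only a sketch: it assumes the glog line layout seen in the excerpt above (severity and date, time, thread id, file:line, message), and the names `summarize` and `SAMPLE` are made up for illustration. It counts "New connection" lines and pairs each "Trying simple LDAP bind" line with the next "LDAP bind successful" line on the same thread.

```python
import re

# Assumed glog layout: I<MMDD> <HH:MM:SS.uuuuuu> <thread id> <file:line>] <message>
LINE_RE = re.compile(
    r"^I\d{4} (\d{2}):(\d{2}):(\d{2}\.\d{6})\s+(\d+) \S+\] (.*)$"
)


def summarize(log_text):
    """Return (new_connection_count, list of LDAP bind durations in seconds)."""
    new_connections = 0
    bind_started = {}    # thread id -> time the pending bind started
    bind_durations = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # e.g. continuation lines of multi-line Thrift dumps
        hours, minutes, seconds, thread_id, message = m.groups()
        # Seconds since midnight; assumes the log does not cross midnight.
        t = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
        if message.startswith("New connection to server"):
            new_connections += 1
        elif message.startswith("Trying simple LDAP bind"):
            bind_started[thread_id] = t
        elif message.startswith("LDAP bind successful") and thread_id in bind_started:
            bind_durations.append(t - bind_started.pop(thread_id))
    return new_connections, bind_durations


# A few lines copied from the log above (the bind DN is shortened here).
SAMPLE = """\
I0225 14:15:48.976776 317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58588>
I0225 14:15:48.977216 115929 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj
I0225 14:15:48.989554 115929 authentication.cc:273] LDAP bind successful
I0225 14:15:56.381584 317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58658>
I0225 14:15:56.381961 115943 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj
I0225 14:16:01.398638 115943 authentication.cc:273] LDAP bind successful
"""

if __name__ == "__main__":
    connections, durations = summarize(SAMPLE)
    print(connections, ["%.3f s" % d for d in durations])
```

In the full log above there are 7 new connections and 8 LDAP binds for a single "show databases". Seven of the binds complete in roughly 11-12 ms, while the bind on thread 115943 takes about 5 seconds (14:15:56.38 to 14:16:01.40).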
Please investigate whether it's possible to speed things up by reusing connections.
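As a rough illustration of what connection reuse would change, the sketch below uses only the Python standard library. It is not impala-shell's transport code and it does not model LDAP: a local test server counts accepted TCP connections, and each accepted connection stands in for one connection setup plus one LDAP bind. `CountingHandler`, `send_rpcs` and `count_connections` are hypothetical names.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections alive
    connections = 0                # number of TCP connections accepted

    def setup(self):
        super().setup()
        # One handler instance is created per accepted connection,
        # so setup() runs once per connection, not once per request.
        type(self).connections += 1

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def send_rpcs(port, count, reuse):
    """Send `count` POST requests, on one connection or on a new one each time."""
    conn = None
    for _ in range(count):
        if conn is None or not reuse:
            if conn is not None:
                conn.close()
            conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("POST", "/", body=b"rpc")
        conn.getresponse().read()  # drain the response before the next request
    if conn is not None:
        conn.close()


def count_connections(rpc_count, reuse):
    """Return how many connections the server accepted for `rpc_count` requests."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        send_rpcs(server.server_address[1], rpc_count, reuse)
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == "__main__":
    print("new connection per RPC:", count_connections(7, reuse=False))
    print("reused connection:     ", count_connections(7, reuse=True))
```

With a new connection per request the server accepts one connection per RPC, which matches the pattern in the coordinator log. With a single kept-alive connection it accepts only one, so the per-connection setup and authentication cost would be paid once, assuming the server authenticates per connection rather than per request.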