Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/06/16 04:20:00 UTC

[jira] [Commented] (IMPALA-9421) Metadata operations are slow in impala-shell when using hs2-http with LDAP auth.

    [ https://issues.apache.org/jira/browse/IMPALA-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136279#comment-17136279 ] 

Tim Armstrong commented on IMPALA-9421:
---------------------------------------

[~attilaj] IMPALA-8584 should have avoided the need for re-authentication. Is that not working?
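
To make the question concrete, here is a minimal sketch of how re-authentication could be avoided even when every RPC opens a new connection. It assumes (this thread does not confirm it) that the mechanism is an HTTP cookie issued by the server after the first successful bind. The local server, the cookie name demo.auth, the counter bind_count, and run_requests are hypothetical stand-ins for illustration, not Impala code:

```python
import base64
import threading
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer

bind_count = 0  # how many times the simulated (expensive) LDAP bind ran


class Handler(BaseHTTPRequestHandler):
    """Hypothetical server: does a 'bind' only when no auth cookie is sent."""

    def do_POST(self):
        global bind_count
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the request body
        cookie = self.headers.get("Cookie", "")
        self.send_response(200)
        if "demo.auth=ok" not in cookie:
            # No cookie: fall back to the expensive credential check,
            # then hand the client a cookie for later requests.
            bind_count += 1
            self.send_header("Set-Cookie", "demo.auth=ok; HttpOnly")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_requests(n, use_cookies):
    """Send n POSTs, each on its own connection; return the bind count."""
    global bind_count
    bind_count = 0
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    handlers = [urllib.request.ProxyHandler({})]  # ignore proxy env vars
    if use_cookies:
        handlers.append(urllib.request.HTTPCookieProcessor(CookieJar()))
    opener = urllib.request.build_opener(*handlers)
    url = "http://127.0.0.1:%d/" % server.server_port
    auth = "Basic " + base64.b64encode(b"username:password").decode()
    try:
        for _ in range(n):
            req = urllib.request.Request(
                url, data=b"rpc", headers={"Authorization": auth})
            opener.open(req).read()
    finally:
        server.shutdown()
        server.server_close()
    return bind_count
```

In this sketch a client that keeps no cookies triggers one bind per request, while a client that stores and returns the cookie triggers a single bind for the whole sequence. If the coordinator log still shows a bind for every RPC, one thing worth checking is whether the client actually sends the cookie back.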

> Metadata operations are slow in impala-shell when using hs2-http with LDAP auth.
> --------------------------------------------------------------------------------
>
>                 Key: IMPALA-9421
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9421
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Clients
>    Affects Versions: Impala 3.4.0
>            Reporter: Attila Jeges
>            Priority: Critical
>
> A "show databases" operation takes about 3-4 seconds, and sometimes 8-9 seconds, in impala-shell when connected to a coordinator over hs2-http with LDAP authentication:
> {code:java}
> $ impala-shell.sh --protocol='hs2-http' --ssl -i "impala-coordinator:443" -u username -l
> impala-shell> show databases;
> +------------------------+----------------------------------------------+
> | name | comment |
> +------------------------+----------------------------------------------+
> | _impala_builtins | System database for Impala builtin functions |
> | airline_ontime_orc | |
> | airline_ontime_parquet | |
> | default | Default Hive database |
> +------------------------+----------------------------------------------+
> Fetched 4 row(s) in 8.87s
> {code}
> impala-coordinator logs show that there are multiple new connections set up and authenticated:
> {code:java}
> I0225 16:07:58.143942   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 50216>
> I0225 16:07:58.144042   321 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 50216>
> I0225 16:07:58.144101   321 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 50216>
> I0225 16:07:58.144338 128883 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
> I0225 16:07:58.155827 128883 authentication.cc:273] LDAP bind successful
> I0225 16:07:58.155901 128883 impala-hs2-server.cc:1085] PingImpalaHS2Service(): request=TPingImpalaHS2ServiceReq {
>   01: sessionHandle (struct) = TSessionHandle {
>     01: sessionId (struct) = THandleIdentifier {
>       01: guid (string) = "\xab\x9bS/\r\xd1@\xab\x862z\xee(#\x14h",
>       02: secret (string) = "\x81\x84\xf0\x7f\v\xac@\x9a\x9b\x9e\xdf#\xa1\xc3\xc4\x04",
>     },
>   },
> }
> I0225 16:07:58.876168   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 50222>
> I0225 16:07:58.876317   320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 50222>
> I0225 16:07:58.876364   320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 50222>
> I0225 16:07:58.876847 128884 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
> I0225 16:07:58.887931 128884 authentication.cc:273] LDAP bind successful
> I0225 16:07:58.888008 128884 impala-hs2-server.cc:442] ExecuteStatement(): request=TExecuteStatementReq {
>   01: sessionHandle (struct) = TSessionHandle {
>     01: sessionId (struct) = THandleIdentifier {
>       01: guid (string) = "\xab\x9bS/\r\xd1@\xab\x862z\xee(#\x14h",
>       02: secret (string) = "\x81\x84\xf0\x7f\v\xac@\x9a\x9b\x9e\xdf#\xa1\xc3\xc4\x04",
>     },
>   },
>   02: statement (string) = "show databases",
>   03: confOverlay (map) = map<string,string>[1] {
>     "CLIENT_IDENTIFIER" -> "Impala Shell v3.4.0-SNAPSHOT (cad1561) built on Fri Feb 14 14:15:26 CET 2020",
>   },
>   04: runAsync (bool) = true,
> }
> I0225 16:07:58.888049 128884 impala-hs2-server.cc:230] TExecuteStatementReq: TExecuteStatementReq {
>   01: sessionHandle (struct) = TSessionHandle {
>     01: sessionId (struct) = THandleIdentifier {
>       01: guid (string) = "\xab\x9bS/\r\xd1@\xab\x862z\xee(#\x14h",
>       02: secret (string) = "\x81\x84\xf0\x7f\v\xac@\x9a\x9b\x9e\xdf#\xa1\xc3\xc4\x04",
>     },
>   },
>   02: statement (string) = "show databases",
>   03: confOverlay (map) = map<string,string>[1] {
>     "CLIENT_IDENTIFIER" -> "Impala Shell v3.4.0-SNAPSHOT (cad1561) built on Fri Feb 14 14:15:26 CET 2020",
>   },
>   04: runAsync (bool) = true,
> }
> I0225 16:07:58.898981 128884 impala-hs2-server.cc:268] TClientRequest.queryOptions: TQueryOptions {
>   01: abort_on_error (bool) = false,
>   02: max_errors (i32) = 100,
>   03: disable_codegen (bool) = false,
>   04: batch_size (i32) = 0,
>   05: num_nodes (i32) = 0,
>   06: max_scan_range_length (i64) = 0,
>   07: num_scanner_threads (i32) = 0,
>   11: debug_action (string) = "",
>   12: mem_limit (i64) = 0,
>   15: hbase_caching (i32) = 0,
>   16: hbase_cache_blocks (bool) = false,
>   17: parquet_file_size (i64) = 0,
>   18: explain_level (i32) = 1,
>   19: sync_ddl (bool) = false,
>   24: disable_outermost_topn (bool) = false,
>   26: query_timeout_s (i32) = 0,
>   28: appx_count_distinct (bool) = false,
>   29: disable_unsafe_spills (bool) = false,
>   31: exec_single_node_rows_threshold (i32) = 100,
>   32: optimize_partition_key_scans (bool) = false,
>   33: replica_preference (i32) = 0,
>   34: schedule_random_replica (bool) = false,
>   36: disable_streaming_preaggregations (bool) = false,
>   37: runtime_filter_mode (i32) = 2,
>   38: runtime_bloom_filter_size (i32) = 1048576,
>   39: runtime_filter_wait_time_ms (i32) = 0,
>   40: disable_row_runtime_filtering (bool) = false,
>   41: max_num_runtime_filters (i32) = 10,
>   42: parquet_annotate_strings_utf8 (bool) = false,
>   43: parquet_fallback_schema_resolution (i32) = 0,
>   45: s3_skip_insert_staging (bool) = true,
>   46: runtime_filter_min_size (i32) = 1048576,
>   47: runtime_filter_max_size (i32) = 16777216,
>   48: prefetch_mode (i32) = 1,
>   49: strict_mode (bool) = false,
>   50: scratch_limit (i64) = -1,
>   51: enable_expr_rewrites (bool) = true,
>   52: decimal_v2 (bool) = true,
>   53: parquet_dictionary_filtering (bool) = true,
>   54: parquet_array_resolution (i32) = 0,
>   55: parquet_read_statistics (bool) = true,
>   56: default_join_distribution_mode (i32) = 0,
>   57: disable_codegen_rows_threshold (i32) = 50000,
>   58: default_spillable_buffer_size (i64) = 2097152,
>   59: min_spillable_buffer_size (i64) = 65536,
>   60: max_row_size (i64) = 524288,
>   61: idle_session_timeout (i32) = 900,
>   62: compute_stats_min_sample_size (i64) = 1073741824,
>   63: exec_time_limit_s (i32) = 0,
>   64: shuffle_distinct_exprs (bool) = true,
>   65: max_mem_estimate_for_admission (i64) = 0,
>   66: thread_reservation_limit (i32) = 3000,
>   67: thread_reservation_aggregate_limit (i32) = 0,
>   68: kudu_read_mode (i32) = 0,
>   69: allow_erasure_coded_files (bool) = false,
>   70: timezone (string) = "",
>   71: scan_bytes_limit (i64) = 0,
>   72: cpu_limit_s (i64) = 0,
>   73: topn_bytes_limit (i64) = 536870912,
>   74: client_identifier (string) = "Impala Shell v3.4.0-SNAPSHOT (cad1561) built on Fri Feb 14 14:15:26 CET 2020",
>   75: resource_trace_ratio (double) = 0,
>   76: num_remote_executor_candidates (i32) = 3,
>   77: num_rows_produced_limit (i64) = 0,
>   78: planner_testcase_mode (bool) = false,
>   79: default_file_format (i32) = 4,
>   80: parquet_timestamp_type (i32) = 0,
>   81: parquet_read_page_index (bool) = true,
>   82: parquet_write_page_index (bool) = true,
>   84: disable_hdfs_num_rows_estimate (bool) = false,
>   86: spool_query_results (bool) = true,
>   87: default_transactional_type (i32) = 1,
>   88: statement_expression_limit (i32) = 250000,
>   89: max_statement_length_bytes (i32) = 16777216,
>   90: disable_data_cache (bool) = false,
>   91: max_result_spooling_mem (i64) = 104857600,
>   92: max_spilled_result_spooling_mem (i64) = 1073741824,
>   93: disable_hbase_num_rows_estimate (bool) = false,
>   94: fetch_rows_timeout_ms (i64) = 10000,
>   95: now_string (string) = "",
>   96: parquet_object_store_split_size (i64) = 268435456,
>   97: mem_limit_executors (i64) = 0,
>   98: broadcast_bytes_limit (i64) = 34359738368,
> }
> I0225 16:07:58.899091 128884 impala-server.cc:987] Found local timezone "UTC".
> I0225 16:07:58.900794 128884 impala-server.cc:1042] ac4832ea4ab1a2be:38f5a41400000000] Registered query query_id=ac4832ea4ab1a2be:38f5a41400000000 session_id=ab40d10d2f539bab:68142328ee7a3286
> I0225 16:07:58.901051 128884 Frontend.java:1499] ac4832ea4ab1a2be:38f5a41400000000] Analyzing query: show databases db: default
> I0225 16:07:58.901293 128884 BaseAuthorizationChecker.java:110] ac4832ea4ab1a2be:38f5a41400000000] Authorization check took 0 ms
> I0225 16:07:58.901369 128884 Frontend.java:1541] ac4832ea4ab1a2be:38f5a41400000000] Analysis and authorization finished.
> I0225 16:07:58.903031 128884 impala-server.cc:1080] Query ac4832ea4ab1a2be:38f5a41400000000 has idle timeout of 10m
> I0225 16:07:58.903087 128884 impala-hs2-server.cc:512] ExecuteStatement(): return_val=TExecuteStatementResp {
>   01: status (struct) = TStatus {
>     01: statusCode (i32) = 0,
>   },
>   02: operationHandle (struct) = TOperationHandle {
>     01: operationId (struct) = THandleIdentifier {
>       01: guid (string) = "\xbe\xa2\xb1J\xea2H\xac\x00\x00\x00\x00\x14\xa4\xf58",
>       02: secret (string) = "\x81\x84\xf0\x7f\v\xac@\x9a\x9b\x9e\xdf#\xa1\xc3\xc4\x04",
>     },
>     02: operationType (i32) = 0,
>     03: hasResultSet (bool) = true,
>   },
> }
> I0225 16:07:59.617283   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 50244>
> I0225 16:07:59.617388   321 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 50244>
> I0225 16:07:59.617424   321 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 50244>
> I0225 16:07:59.617705 128886 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
> I0225 16:07:59.629288 128886 authentication.cc:273] LDAP bind successful
> I0225 16:07:59.629354 128886 impala-hs2-server.cc:812] GetResultSetMetadata(): query_id=ac4832ea4ab1a2be:38f5a41400000000
> I0225 16:07:59.629410 128886 impala-hs2-server.cc:847] GetResultSetMetadata(): return_val=TGetResultSetMetadataResp {
>   01: status (struct) = TStatus {
>     01: statusCode (i32) = 0,
>   },
>   02: schema (struct) = TTableSchema {
>     01: columns (list) = list<struct>[2] {
>       [0] = TColumnDesc {
>         01: columnName (string) = "name",
>         02: typeDesc (struct) = TTypeDesc {
>           01: types (list) = list<struct>[1] {
>             [0] = TTypeEntry {
>               01: primitiveEntry (struct) = TPrimitiveTypeEntry {
>                 01: type (i32) = 7,
>               },
>             },
>           },
>         },
>         03: position (i32) = 0,
>       },
>       [1] = TColumnDesc {
>         01: columnName (string) = "comment",
>         02: typeDesc (struct) = TTypeDesc {
>           01: types (list) = list<struct>[1] {
>             [0] = TTypeEntry {
>               01: primitiveEntry (struct) = TPrimitiveTypeEntry {
>                 01: type (i32) = 7,
>               },
>             },
>           },
>         },
>         03: position (i32) = 1,
>       },
>     },
>   },
> }
> I0225 16:08:00.347491 128862 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
> I0225 16:08:00.535367 128862 authentication.cc:273] LDAP bind successful
> I0225 16:08:01.253826   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 50256>
> I0225 16:08:01.253938   320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 50256>
> I0225 16:08:01.253988   320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 50256>
> I0225 16:08:01.254253 128887 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
> I0225 16:08:01.264217 128887 authentication.cc:273] LDAP bind successful
> I0225 16:08:01.982829   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 50282>
> I0225 16:08:01.982926   321 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 50282>
> I0225 16:08:01.982965   321 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 50282>
> I0225 16:08:01.983230 128901 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
> I0225 16:08:07.029768 128901 authentication.cc:273] LDAP bind successful
> I0225 16:08:07.747694 128860 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
> I0225 16:08:07.758265 128860 authentication.cc:273] LDAP bind successful
> I0225 16:08:07.758330 128860 impala-hs2-server.cc:778] CloseOperation(): query_id=ac4832ea4ab1a2be:38f5a41400000000
> I0225 16:08:07.758345 128860 impala-server.cc:1121] UnregisterQuery(): query_id=ac4832ea4ab1a2be:38f5a41400000000
> I0225 16:08:07.758352 128860 impala-server.cc:1223] Cancel(): query_id=ac4832ea4ab1a2be:38f5a41400000000
> I0225 16:08:08.463980   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 50354>
> I0225 16:08:08.464076   320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 50354>
> I0225 16:08:08.464107   320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 50354>
> I0225 16:08:08.464340 128913 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
> I0225 16:08:08.474979 128913 authentication.cc:273] LDAP bind successful
> I0225 16:08:28.186151 128883 impala-server.cc:1957] Connection 5f417413b1c20e07:c36931be0b8aa094 from client 127.0.0.1:50216 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
> I0225 16:08:28.933374 128884 impala-server.cc:1957] Connection 0b418f1d01e5c8c7:ee2c4c7a963619a6 from client 127.0.0.1:50222 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
> I0225 16:08:29.658985 128886 impala-server.cc:1957] Connection c74c832b5368b85a:49e58a673f5d7ba9 from client 127.0.0.1:50244 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
> I0225 16:08:30.565888 128862 impala-server.cc:1957] Connection ce47f868d7284d03:d86d2e769c3726a4 from client 127.0.0.1:50142 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
> I0225 16:08:31.294651 128887 impala-server.cc:1957] Connection df4d3b98d939f104:01f15c102203cc81 from client 127.0.0.1:50256 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
> I0225 16:08:37.060230 128901 impala-server.cc:1957] Connection c6481bc01063b9c4:b9c726aeb14b91a6 from client 127.0.0.1:50282 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
> I0225 16:08:37.789135 128860 impala-server.cc:1957] Connection ad43f983d8502f74:106705062799569e from client 127.0.0.1:50136 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
> I0225 16:08:38.505254 128913 impala-server.cc:1957] Connection d146ef0df1cad245:1e4d8c8dd97fcc81 from client 127.0.0.1:50354 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
> {code}
> It looks like a new connection is set up and a separate LDAP authentication is performed for each RPC call, which adds overhead.
> Please investigate whether reusing connections can speed things up.


