Posted to issues@impala.apache.org by "Attila Jeges (Jira)" <ji...@apache.org> on 2020/02/25 14:49:00 UTC

[jira] [Created] (IMPALA-9421) Metadata operations are slow in impala-shell when using hs2-http with LDAP auth.

Attila Jeges created IMPALA-9421:
------------------------------------

             Summary: Metadata operations are slow in impala-shell when using hs2-http with LDAP auth.
                 Key: IMPALA-9421
                 URL: https://issues.apache.org/jira/browse/IMPALA-9421
             Project: IMPALA
          Issue Type: Improvement
          Components: Clients
    Affects Versions: Impala 3.4.0
            Reporter: Attila Jeges


A "show databases" operation takes 3-4 seconds in impala-shell when connecting to a CDW Azure environment:
{code:java}
$ impala-shell.sh --protocol='hs2-http' --ssl -i "coordinator-attilaj-test-impala-vw.env-q52cn6.dwx.workload-dev.cloudera.com:443" -u csso_attilaj -l

impala-shell> show databases;
+------------------------+----------------------------------------------+
| name | comment |
+------------------------+----------------------------------------------+
| _impala_builtins | System database for Impala builtin functions |
| airline_ontime_orc | |
| airline_ontime_parquet | |
| default | Default Hive database |
+------------------------+----------------------------------------------+

Fetched 4 row(s) in 3.66s
{code}
impala-coordinator logs show that there are multiple new connections set up and authenticated:
{code:java}
I0225 14:15:48.976776   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58588>
I0225 14:15:48.976878   320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58588>
I0225 14:15:48.976912   320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58588>
I0225 14:15:48.977216 115929 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:48.989554 115929 authentication.cc:273] LDAP bind successful
I0225 14:15:48.989639 115929 impala-hs2-server.cc:1085] PingImpalaHS2Service(): request=TPingImpalaHS2ServiceReq {
  01: sessionHandle (struct) = TSessionHandle {
    01: sessionId (struct) = THandleIdentifier {
      01: guid (string) = "#\x8f\xdf\x01\xd7\xd6Bv\xa5\xec\xcd\x17Q\xb9q\x93",
      02: secret (string) = "\xd6\xaaO\v\xedXE!\x89}x\xbds\x1f\xe1\xf0",
    },
  },
}
I0225 14:15:50.152348   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58596>
I0225 14:15:50.152446   321 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58596>
I0225 14:15:50.152493   321 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58596>
I0225 14:15:50.152722 115930 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:50.163576 115930 authentication.cc:273] LDAP bind successful
I0225 14:15:50.163733 115930 impala-hs2-server.cc:442] ExecuteStatement(): request=TExecuteStatementReq {
  01: sessionHandle (struct) = TSessionHandle {
    01: sessionId (struct) = THandleIdentifier {
      01: guid (string) = "#\x8f\xdf\x01\xd7\xd6Bv\xa5\xec\xcd\x17Q\xb9q\x93",
      02: secret (string) = "\xd6\xaaO\v\xedXE!\x89}x\xbds\x1f\xe1\xf0",
    },
  },
  02: statement (string) = "show databases",
  03: confOverlay (map) = map<string,string>[1] {
    "CLIENT_IDENTIFIER" -> "Impala Shell v3.4.0-SNAPSHOT (cad1561) built on Fri Feb 14 14:15:26 CET 2020",
  },
  04: runAsync (bool) = true,
}
I0225 14:15:50.163775 115930 impala-hs2-server.cc:230] TExecuteStatementReq: TExecuteStatementReq {
  01: sessionHandle (struct) = TSessionHandle {
    01: sessionId (struct) = THandleIdentifier {
      01: guid (string) = "#\x8f\xdf\x01\xd7\xd6Bv\xa5\xec\xcd\x17Q\xb9q\x93",
      02: secret (string) = "\xd6\xaaO\v\xedXE!\x89}x\xbds\x1f\xe1\xf0",
    },
  },
  02: statement (string) = "show databases",
  03: confOverlay (map) = map<string,string>[1] {
    "CLIENT_IDENTIFIER" -> "Impala Shell v3.4.0-SNAPSHOT (cad1561) built on Fri Feb 14 14:15:26 CET 2020",
  },
  04: runAsync (bool) = true,
}
I0225 14:15:50.173715 115930 impala-hs2-server.cc:268] TClientRequest.queryOptions: TQueryOptions {
  01: abort_on_error (bool) = false,
  02: max_errors (i32) = 100,
  03: disable_codegen (bool) = false,
  04: batch_size (i32) = 0,
  05: num_nodes (i32) = 0,
  06: max_scan_range_length (i64) = 0,
  07: num_scanner_threads (i32) = 0,
  11: debug_action (string) = "",
  12: mem_limit (i64) = 0,
  15: hbase_caching (i32) = 0,
  16: hbase_cache_blocks (bool) = false,
  17: parquet_file_size (i64) = 0,
  18: explain_level (i32) = 1,
  19: sync_ddl (bool) = false,
  24: disable_outermost_topn (bool) = false,
  26: query_timeout_s (i32) = 0,
  28: appx_count_distinct (bool) = false,
  29: disable_unsafe_spills (bool) = false,
  31: exec_single_node_rows_threshold (i32) = 100,
  32: optimize_partition_key_scans (bool) = false,
  33: replica_preference (i32) = 0,
  34: schedule_random_replica (bool) = false,
  36: disable_streaming_preaggregations (bool) = false,
  37: runtime_filter_mode (i32) = 2,
  38: runtime_bloom_filter_size (i32) = 1048576,
  39: runtime_filter_wait_time_ms (i32) = 0,
  40: disable_row_runtime_filtering (bool) = false,
  41: max_num_runtime_filters (i32) = 10,
  42: parquet_annotate_strings_utf8 (bool) = false,
  43: parquet_fallback_schema_resolution (i32) = 0,
  45: s3_skip_insert_staging (bool) = true,
  46: runtime_filter_min_size (i32) = 1048576,
  47: runtime_filter_max_size (i32) = 16777216,
  48: prefetch_mode (i32) = 1,
  49: strict_mode (bool) = false,
  50: scratch_limit (i64) = -1,
  51: enable_expr_rewrites (bool) = true,
  52: decimal_v2 (bool) = true,
  53: parquet_dictionary_filtering (bool) = true,
  54: parquet_array_resolution (i32) = 0,
  55: parquet_read_statistics (bool) = true,
  56: default_join_distribution_mode (i32) = 0,
  57: disable_codegen_rows_threshold (i32) = 50000,
  58: default_spillable_buffer_size (i64) = 2097152,
  59: min_spillable_buffer_size (i64) = 65536,
  60: max_row_size (i64) = 524288,
  61: idle_session_timeout (i32) = 900,
  62: compute_stats_min_sample_size (i64) = 1073741824,
  63: exec_time_limit_s (i32) = 0,
  64: shuffle_distinct_exprs (bool) = true,
  65: max_mem_estimate_for_admission (i64) = 0,
  66: thread_reservation_limit (i32) = 3000,
  67: thread_reservation_aggregate_limit (i32) = 0,
  68: kudu_read_mode (i32) = 0,
  69: allow_erasure_coded_files (bool) = false,
  70: timezone (string) = "",
  71: scan_bytes_limit (i64) = 0,
  72: cpu_limit_s (i64) = 0,
  73: topn_bytes_limit (i64) = 536870912,
  74: client_identifier (string) = "Impala Shell v3.4.0-SNAPSHOT (cad1561) built on Fri Feb 14 14:15:26 CET 2020",
  75: resource_trace_ratio (double) = 0,
  76: num_remote_executor_candidates (i32) = 3,
  77: num_rows_produced_limit (i64) = 0,
  78: planner_testcase_mode (bool) = false,
  79: default_file_format (i32) = 4,
  80: parquet_timestamp_type (i32) = 0,
  81: parquet_read_page_index (bool) = true,
  82: parquet_write_page_index (bool) = true,
  84: disable_hdfs_num_rows_estimate (bool) = false,
  86: spool_query_results (bool) = true,
  87: default_transactional_type (i32) = 1,
  88: statement_expression_limit (i32) = 250000,
  89: max_statement_length_bytes (i32) = 16777216,
  90: disable_data_cache (bool) = false,
  91: max_result_spooling_mem (i64) = 104857600,
  92: max_spilled_result_spooling_mem (i64) = 1073741824,
  93: disable_hbase_num_rows_estimate (bool) = false,
  94: fetch_rows_timeout_ms (i64) = 10000,
  95: now_string (string) = "",
  96: parquet_object_store_split_size (i64) = 268435456,
  97: mem_limit_executors (i64) = 0,
  98: broadcast_bytes_limit (i64) = 34359738368,
}
I0225 14:15:50.173835 115930 impala-server.cc:987] Found local timezone "UTC".
I0225 14:15:50.177309 115930 impala-server.cc:1042] 4f44d29479adfa14:508106ff00000000] Registered query query_id=4f44d29479adfa14:508106ff00000000 session_id=7642d6d701df8f23:9371b95117cdeca5
I0225 14:15:50.177577 115930 Frontend.java:1499] 4f44d29479adfa14:508106ff00000000] Analyzing query: show databases db: default
I0225 14:15:50.177830 115930 BaseAuthorizationChecker.java:110] 4f44d29479adfa14:508106ff00000000] Authorization check took 0 ms
I0225 14:15:50.177906 115930 Frontend.java:1541] 4f44d29479adfa14:508106ff00000000] Analysis and authorization finished.
I0225 14:15:50.182478 115930 impala-server.cc:1080] Query 4f44d29479adfa14:508106ff00000000 has idle timeout of 10m
I0225 14:15:50.182540 115930 impala-hs2-server.cc:512] ExecuteStatement(): return_val=TExecuteStatementResp {
  01: status (struct) = TStatus {
    01: statusCode (i32) = 0,
  },
  02: operationHandle (struct) = TOperationHandle {
    01: operationId (struct) = THandleIdentifier {
      01: guid (string) = "\x14\xfa\xady\x94\xd2DO\x00\x00\x00\x00\xff\x06\x81P",
      02: secret (string) = "\xd6\xaaO\v\xedXE!\x89}x\xbds\x1f\xe1\xf0",
    },
    02: operationType (i32) = 0,
    03: hasResultSet (bool) = true,
  },
}
I0225 14:15:51.934399   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58622>
I0225 14:15:51.934571   320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58622>
I0225 14:15:51.934634   320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58622>
I0225 14:15:51.934870 115940 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:51.945902 115940 authentication.cc:273] LDAP bind successful
I0225 14:15:51.945957 115940 impala-hs2-server.cc:812] GetResultSetMetadata(): query_id=4f44d29479adfa14:508106ff00000000
I0225 14:15:51.946015 115940 impala-hs2-server.cc:847] GetResultSetMetadata(): return_val=TGetResultSetMetadataResp {
  01: status (struct) = TStatus {
    01: statusCode (i32) = 0,
  },
  02: schema (struct) = TTableSchema {
    01: columns (list) = list<struct>[2] {
      [0] = TColumnDesc {
        01: columnName (string) = "name",
        02: typeDesc (struct) = TTypeDesc {
          01: types (list) = list<struct>[1] {
            [0] = TTypeEntry {
              01: primitiveEntry (struct) = TPrimitiveTypeEntry {
                01: type (i32) = 7,
              },
            },
          },
        },
        03: position (i32) = 0,
      },
      [1] = TColumnDesc {
        01: columnName (string) = "comment",
        02: typeDesc (struct) = TTypeDesc {
          01: types (list) = list<struct>[1] {
            [0] = TTypeEntry {
              01: primitiveEntry (struct) = TPrimitiveTypeEntry {
                01: type (i32) = 7,
              },
            },
          },
        },
        03: position (i32) = 1,
      },
    },
  },
}
I0225 14:15:53.537967   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58628>
I0225 14:15:53.538059   321 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58628>
I0225 14:15:53.538092   321 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58628>
I0225 14:15:53.538578 115941 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:53.550732 115941 authentication.cc:273] LDAP bind successful
I0225 14:15:54.959165 115929 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:54.970517 115929 authentication.cc:273] LDAP bind successful
I0225 14:15:56.381584   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58658>
I0225 14:15:56.381669   320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58658>
I0225 14:15:56.381703   320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58658>
I0225 14:15:56.381961 115943 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:16:01.398638 115943 authentication.cc:273] LDAP bind successful
I0225 14:16:02.591567   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58718>
I0225 14:16:02.591709   321 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58718>
I0225 14:16:02.591747   321 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58718>
I0225 14:16:02.592200 115965 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:16:02.603652 115965 authentication.cc:273] LDAP bind successful
I0225 14:16:02.603735 115965 impala-hs2-server.cc:778] CloseOperation(): query_id=4f44d29479adfa14:508106ff00000000
I0225 14:16:02.603758 115965 impala-server.cc:1121] UnregisterQuery(): query_id=4f44d29479adfa14:508106ff00000000
I0225 14:16:02.603766 115965 impala-server.cc:1223] Cancel(): query_id=4f44d29479adfa14:508106ff00000000
I0225 14:16:04.448861   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58748>
I0225 14:16:04.449045   320 TAcceptQueueServer.cpp:227] TAcceptQueueServer: hiveserver2-http-frontend started connection setup for client <Host: 127.0.0.1 Port: 58748>
I0225 14:16:04.449076   320 TAcceptQueueServer.cpp:245] TAcceptQueueServer: hiveserver2-http-frontend finished connection setup for client <Host: 127.0.0.1 Port: 58748>
I0225 14:16:04.449290 115968 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:16:04.460290 115968 authentication.cc:273] LDAP bind successful
I0225 14:16:20.212851 115930 impala-server.cc:1957] Connection d943451a077f71e4:0a2eb062b208c388 from client 127.0.0.1:58596 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:21.965914 115940 impala-server.cc:1957] Connection ff4ec2f4f3931c3d:792ae752a9916293 from client 127.0.0.1:58622 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:23.581163 115941 impala-server.cc:1957] Connection 4540176beb3e990c:37df4bffa25515b6 from client 127.0.0.1:58628 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:25.000741 115929 impala-server.cc:1957] Connection 974e59d0132ae6a3:6309b762ba77c190 from client 127.0.0.1:58588 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:31.428939 115943 impala-server.cc:1957] Connection 1542d06b26eabcf1:566956f904ba1e80 from client 127.0.0.1:58658 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:32.634624 115965 impala-server.cc:1957] Connection 6e4770d6a982ecd5:d4e40ecc6d1327b3 from client 127.0.0.1:58718 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
I0225 14:16:34.471824 115968 impala-server.cc:1957] Connection 6643d4e11d526c8d:b6772779e91bcc8b from client 127.0.0.1:58748 to server hiveserver2-http-frontend closed. The connection had 1 associated session(s).
{code}
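To put numbers on a log like the one above, a small script can count connection setups and time each LDAP bind. This is only a minimal sketch and not part of Impala: the `summarize` helper and the `SAMPLE` name are hypothetical, the sample lines are copied from the log above, and a bind attempt is paired with its success message by thread id, which is an assumption about how the log lines relate.

```python
import re
from datetime import datetime

# glog prefix: severity + MMDD, HH:MM:SS.micros, thread id, file:line], message.
LINE_RE = re.compile(r"^[IWEF]\d{4} (\d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) \S+\] (.*)$")

# Six lines copied from the coordinator log above.
SAMPLE = """\
I0225 14:15:48.976776   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58588>
I0225 14:15:48.977216 115929 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:15:48.989554 115929 authentication.cc:273] LDAP bind successful
I0225 14:15:56.381584   317 TAcceptQueueServer.cpp:340] New connection to server hiveserver2-http-frontend from client <Host: 127.0.0.1 Port: 58658>
I0225 14:15:56.381961 115943 authentication.cc:261] Trying simple LDAP bind for: uid=csso_attilaj,cn=users,cn=accounts,dc=attilaj,dc=xcu2-8y8x,dc=dev,dc=cldr,dc=work
I0225 14:16:01.398638 115943 authentication.cc:273] LDAP bind successful
"""


def summarize(log_text):
    """Return (number of new connections, list of LDAP bind latencies in seconds)."""
    new_connections = 0
    pending = {}       # thread id -> time the bind attempt was logged
    bind_seconds = []  # one entry per completed bind, in log order
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # continuation lines of multi-line messages are skipped
        ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        tid, msg = m.group(2), m.group(3)
        if msg.startswith("New connection to server"):
            new_connections += 1
        elif msg.startswith("Trying simple LDAP bind"):
            pending[tid] = ts
        elif msg.startswith("LDAP bind successful") and tid in pending:
            bind_seconds.append((ts - pending.pop(tid)).total_seconds())
    return new_connections, bind_seconds


if __name__ == "__main__":
    conns, binds = summarize(SAMPLE)
    print("new connections:", conns)
    print("bind latencies (s):", binds)
```

On this excerpt the two binds come out at roughly 12 ms and 5 s. The sketch looks only at the time of day, so it would mis-measure a bind that spans midnight.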
It looks like a new connection is set up and LDAP-authenticated for nearly every RPC, which adds overhead to each call: the log above shows seven new connections and eight LDAP binds.

Please investigate whether it's possible to speed things up by reusing connections.
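As a rough illustration of what connection reuse would mean on the client side, the sketch below compares opening a new HTTP connection per call with keeping one persistent HTTP/1.1 connection. It is not impala-shell code and makes several assumptions: the server is a local stand-in built on Python's `http.server`, the `/rpc` path and the `CountingHandler`, `call_rpcs` and `connections_used` names are hypothetical, and it counts TCP connections only. It does not model LDAP binds, so it says nothing about whether the server would still authenticate each request on a reused connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections alive
    connections = 0                # number of TCP connections accepted

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        CountingHandler.connections += 1

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # drain the request body
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def call_rpcs(port, n, reuse):
    """Send n POST requests, over one connection if reuse is True."""
    conn = None
    for _ in range(n):
        if conn is None:
            conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("POST", "/rpc", body=b"ping")
        conn.getresponse().read()  # read fully so the socket can be reused
        if not reuse:
            conn.close()
            conn = None
    if conn is not None:
        conn.close()


def connections_used(n, reuse):
    """Run n calls against a fresh local server and return connections accepted."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        call_rpcs(server.server_address[1], n, reuse)
        return CountingHandler.connections
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("new connection per call:", connections_used(4, reuse=False))
    print("one persistent connection:", connections_used(4, reuse=True))
```

With one connection per call the server accepts as many connections as there are calls. With a persistent connection it accepts one, which is the saving this issue asks to investigate.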


