You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by ta...@apache.org on 2018/08/21 21:38:51 UTC

[1/3] impala git commit: IMPALA-7470: SentryServicePinger logs error messages on success

Repository: impala
Updated Branches:
  refs/heads/master 3fa05604a -> bddd7def9


IMPALA-7470: SentryServicePinger logs error messages on success

SentryServicePinger checks if Sentry is running by calling a Sentry API
to get a list of roles. If Sentry is not yet running, an exception will
be thrown. However, the Sentry client implementation will log some error
messages when an exception is thrown. For the purpose of SentryServicePinger,
this can be too noisy and verbose and may also confuse other developers
into thinking it was a failure when starting Sentry. The log messages
were muted in IMPALA-6878, however since Impala no longer uses the
shaded version of Sentry client in IMPALA-7423, the patch in IMPALA-6878
no longer worked. This patch fixes the muting of Sentry error messages by
turning off the log level using the non-shaded version of Sentry Thrift
client.

Testing:
- Manually tested by starting Sentry and did not see any error messages
  logged into stdout.

Change-Id: I75782d23c1cb67564a9265bf3cc94fd590c7b666
Reviewed-on: http://gerrit.cloudera.org:8080/11281
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/f849eff4
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/f849eff4
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/f849eff4

Branch: refs/heads/master
Commit: f849eff42134e7d11311acaf834d20ddceefeb60
Parents: 3fa0560
Author: Fredy Wijaya <fw...@cloudera.com>
Authored: Mon Aug 20 21:17:45 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Aug 21 19:52:40 2018 +0000

----------------------------------------------------------------------
 .../java/org/apache/impala/testutil/SentryServicePinger.java     | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/f849eff4/fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
----------------------------------------------------------------------
diff --git a/fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java b/fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
index 705f58a..4361ef9 100644
--- a/fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
+++ b/fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java
@@ -25,6 +25,7 @@ import org.apache.impala.authorization.SentryConfig;
 import org.apache.impala.authorization.User;
 import org.apache.impala.util.SentryPolicyService;
 import org.apache.log4j.Level;
+import org.apache.sentry.core.common.transport.SentryTransportFactory;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -43,7 +44,8 @@ public class SentryServicePinger {
   public static void main(String[] args) throws Exception {
     // Programmatically disable Sentry Thrift logging since Sentry error logging can be
     // pretty noisy and verbose.
-    org.apache.log4j.Logger logger4j = org.apache.log4j.Logger.getLogger("sentry");
+    org.apache.log4j.Logger logger4j = org.apache.log4j.Logger.getLogger(
+        SentryTransportFactory.class.getPackage().getName());
     logger4j.setLevel(Level.OFF);
 
     // Parse command line options to get config file path.


[2/3] impala git commit: IMPALA-7449: Fix network throughput calculation of DataStreamSender

Posted by ta...@apache.org.
IMPALA-7449: Fix network throughput calculation of DataStreamSender

Currently, the network throughput presented in the query profile
for DataStreamSender is computed by dividing the total bytes sent
by the total network time which is the sum of observed network time
of all individual RPCs. This is wrong in general and may only make
sense if the network throughput is fixed. In addition, RPCs are
asynchronous and they overlap with each other. So, dividing the
total byte sent by network throughput may result in time which exceeds
the wall clock time, making it impossible to interpret.

This change fixes the problem by measuring the network throughput
of each individual RPC and uses a summary counter to track avg/min/max
of network throughputs instead.

Change-Id: I344ac76c0a1a49b4da3d37d2c547f3d5051ebe24
Reviewed-on: http://gerrit.cloudera.org:8080/11241
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/2a60655b
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/2a60655b
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/2a60655b

Branch: refs/heads/master
Commit: 2a60655b09afaa76d3bf2120c7043eb0b22eefcf
Parents: f849eff
Author: Michael Ho <kw...@cloudera.com>
Authored: Wed Aug 15 14:25:16 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Aug 21 21:29:07 2018 +0000

----------------------------------------------------------------------
 be/src/runtime/krpc-data-stream-sender.cc | 17 ++++++++++-------
 be/src/runtime/krpc-data-stream-sender.h  | 11 ++++-------
 2 files changed, 14 insertions(+), 14 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/2a60655b/be/src/runtime/krpc-data-stream-sender.cc
----------------------------------------------------------------------
diff --git a/be/src/runtime/krpc-data-stream-sender.cc b/be/src/runtime/krpc-data-stream-sender.cc
index e6cea1f..6a2e5b3 100644
--- a/be/src/runtime/krpc-data-stream-sender.cc
+++ b/be/src/runtime/krpc-data-stream-sender.cc
@@ -394,10 +394,16 @@ void KrpcDataStreamSender::Channel::TransmitDataCompleteCb() {
   const kudu::Status controller_status = rpc_controller_.status();
   if (LIKELY(controller_status.ok())) {
     DCHECK(rpc_in_flight_batch_ != nullptr);
-    COUNTER_ADD(parent_->bytes_sent_counter_,
-        RowBatch::GetSerializedSize(*rpc_in_flight_batch_));
+    int64_t row_batch_size = RowBatch::GetSerializedSize(*rpc_in_flight_batch_);
     int64_t network_time = total_time - resp_.receiver_latency_ns();
-    COUNTER_ADD(&parent_->total_network_timer_, network_time);
+    COUNTER_ADD(parent_->bytes_sent_counter_, row_batch_size);
+    if (LIKELY(network_time > 0)) {
+      // 'row_batch_size' is bounded by FLAGS_rpc_max_message_size which shouldn't exceed
+      // max 32-bit signed value so multiplication below should not overflow.
+      DCHECK_LE(row_batch_size, numeric_limits<int32_t>::max());
+      int64_t network_throughput = row_batch_size * NANOS_PER_SEC / network_time;
+      parent_->network_throughput_counter_->UpdateCounter(network_throughput);
+    }
     Status rpc_status = Status::OK();
     int32_t status_code = resp_.status().status_code();
     if (status_code == TErrorCode::DATASTREAM_RECVR_CLOSED) {
@@ -584,7 +590,6 @@ KrpcDataStreamSender::KrpcDataStreamSender(int sender_id, const RowDescriptor* r
     sender_id_(sender_id),
     partition_type_(sink.output_partition.type),
     per_channel_buffer_size_(per_channel_buffer_size),
-    total_network_timer_(TUnit::TIME_NS, 0),
     dest_node_id_(sink.dest_node_id),
     next_unknown_partition_(0) {
   DCHECK_GT(destinations.size(), 0);
@@ -642,9 +647,7 @@ Status KrpcDataStreamSender::Prepare(
   bytes_sent_time_series_counter_ =
       ADD_TIME_SERIES_COUNTER(profile(), "BytesSent", bytes_sent_counter_);
   network_throughput_counter_ =
-      profile()->AddDerivedCounter("NetworkThroughput", TUnit::BYTES_PER_SECOND,
-          bind<int64_t>(&RuntimeProfile::UnitsPerSecond, bytes_sent_counter_,
-              &total_network_timer_));
+      ADD_SUMMARY_STATS_COUNTER(profile(), "NetworkThroughput", TUnit::BYTES_PER_SECOND);
   eos_sent_counter_ = ADD_COUNTER(profile(), "EosSent", TUnit::UNIT);
   uncompressed_bytes_counter_ =
       ADD_COUNTER(profile(), "UncompressedRowBatchSize", TUnit::BYTES);

http://git-wip-us.apache.org/repos/asf/impala/blob/2a60655b/be/src/runtime/krpc-data-stream-sender.h
----------------------------------------------------------------------
diff --git a/be/src/runtime/krpc-data-stream-sender.h b/be/src/runtime/krpc-data-stream-sender.h
index 6757c2a..65580b1 100644
--- a/be/src/runtime/krpc-data-stream-sender.h
+++ b/be/src/runtime/krpc-data-stream-sender.h
@@ -207,13 +207,10 @@ class KrpcDataStreamSender : public DataSink {
   /// Total number of rows sent.
   RuntimeProfile::Counter* total_sent_rows_counter_ = nullptr;
 
-  /// Approximate network throughput for sending row batches.
-  RuntimeProfile::Counter* network_throughput_counter_ = nullptr;
-
-  /// Aggregated time spent in network (including queuing time in KRPC transfer queue)
-  /// for transmitting the RPC requests and receiving the responses. Used for computing
-  /// 'network_throughput_'. Not too meaningful by itself so not shown in profile.
-  RuntimeProfile::Counter total_network_timer_;
+  /// Summary of network throughput for sending row batches. Network time also includes
+  /// queuing time in KRPC transfer queue for transmitting the RPC requests and receiving
+  /// the responses.
+  RuntimeProfile::SummaryStatsCounter* network_throughput_counter_ = nullptr;
 
   /// Identifier of the destination plan node.
   PlanNodeId dest_node_id_;


[3/3] impala git commit: IMPALA-7465: fix test_kudu_scan_mem_usage

Posted by ta...@apache.org.
IMPALA-7465: fix test_kudu_scan_mem_usage

The issue was that the row batch queue could grow a lot if the consumer
was slow.

Also add an additional test to exercise the OOM code path in Kudu for
completeness.

Testing:
Added sleep to kudu-scan-node.cc that reproduced the problem.
Looped modified test to flush out flakiness.

Change-Id: Ic4a95b6b6d96a447df68ef4912a86f1e11f219ca
Reviewed-on: http://gerrit.cloudera.org:8080/11285
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/bddd7def
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/bddd7def
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/bddd7def

Branch: refs/heads/master
Commit: bddd7def991f60f87772299df1aa53bc0fde48f2
Parents: 2a60655
Author: Tim Armstrong <ta...@cloudera.com>
Authored: Tue Aug 21 10:53:47 2018 -0700
Committer: Impala Public Jenkins <im...@cloudera.com>
Committed: Tue Aug 21 21:38:11 2018 +0000

----------------------------------------------------------------------
 .../queries/QueryTest/kudu-scan-mem-usage.test           | 11 +++++++++++
 tests/query_test/test_mem_usage_scaling.py               |  2 ++
 2 files changed, 13 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/bddd7def/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test b/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
index aa42edb..7fd6e3c 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-mem-usage.test
@@ -2,6 +2,8 @@
 ---- QUERY
 # IMPALA-7076: this query will fail if the Kudu scan spins up too many scanner threads.
 # Only one scanner thread per impalad should be started.
+# Set num_scanner_threads to reduce the row batch queue length and avoid IMPALA-7465.
+set num_scanner_threads=2;
 set mem_limit=4mb;
 select * from tpch_kudu.orders
 order by o_orderkey limit 3;
@@ -14,3 +16,12 @@ BIGINT,BIGINT,STRING,DECIMAL,STRING,STRING,STRING,INT,STRING
 ---- RUNTIME_PROFILE
 aggregation(SUM, NumScannerThreadsStarted): 3
 ====
+---- QUERY
+# Test with lower limit that reliably hits memory limit exceeded.
+set disable_codegen=true;
+set mem_limit=32k;
+select * from tpch_kudu.orders
+order by o_orderkey limit 3;
+---- CATCH
+Memory limit exceeded
+====

http://git-wip-us.apache.org/repos/asf/impala/blob/bddd7def/tests/query_test/test_mem_usage_scaling.py
----------------------------------------------------------------------
diff --git a/tests/query_test/test_mem_usage_scaling.py b/tests/query_test/test_mem_usage_scaling.py
index 0d3219a..21b320e 100644
--- a/tests/query_test/test_mem_usage_scaling.py
+++ b/tests/query_test/test_mem_usage_scaling.py
@@ -353,6 +353,8 @@ class TestScanMemLimit(ImpalaTestSuite):
   def test_kudu_scan_mem_usage(self, vector):
     """Test that Kudu scans can stay within a low memory limit. Before IMPALA-7096 they
     were not aware of mem_limit and would start up too many scanner threads."""
+    # .test file overrides disable_codegen.
+    del vector.get_value('exec_option')['disable_codegen']
     self.run_test_case('QueryTest/kudu-scan-mem-usage', vector)
 
   def test_hdfs_scanner_thread_mem_scaling(self, vector):