Posted to commits@kudu.apache.org by to...@apache.org on 2016/03/03 00:41:01 UTC

[1/3] incubator-kudu git commit: Submit ProbeStat metrics only once per batch

Repository: incubator-kudu
Updated Branches:
  refs/heads/master 74af46bd6 -> cd3aa0ee6


Submit ProbeStat metrics only once per batch

This is a small optimization to the gathering of the probe statistics for write
operations. We previously collected these metrics on every operation, which
cost us several atomic instructions for incrementing the metrics, as well as a
TRACE entry for each.

Now, we collect the stats for each operation into an arena-allocated array, and
then batch them together before submitting to the tablet metrics. We also now
only log a single trace entry for the probe stats rather than one per row,
making the traces much more manageable.
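
As a rough illustration of the batching described above, the following
self-contained sketch sums the per-operation stats and pre-aggregates one
histogram before anything is submitted. The ProbeStats struct here is a
simplified stand-in that borrows only the field names from the patch, and
BatchSummary and SummarizeBatch are hypothetical names used for illustration;
the real code writes into the tablet metrics rather than returning a struct.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Simplified stand-in for the real ProbeStats (assumption: plain int32_t
// counters that start at zero).
struct ProbeStats {
  int32_t blooms_consulted = 0;
  int32_t keys_consulted = 0;
  int32_t deltas_consulted = 0;
  int32_t mrs_consulted = 0;
};

// Hypothetical result type: per-batch totals plus a value -> op-count map
// for the bloom lookups.
struct BatchSummary {
  ProbeStats sum;
  std::map<int32_t, int32_t> bloom_lookups_hist;
};

// Walks the batch once, accumulating totals and counting how many
// operations saw each distinct bloom lookup count.
BatchSummary SummarizeBatch(const std::vector<ProbeStats>& batch) {
  BatchSummary out;
  for (const ProbeStats& stats : batch) {
    out.sum.blooms_consulted += stats.blooms_consulted;
    out.sum.keys_consulted += stats.keys_consulted;
    out.sum.deltas_consulted += stats.deltas_consulted;
    out.sum.mrs_consulted += stats.mrs_consulted;
    out.bloom_lookups_hist[stats.blooms_consulted]++;
  }

  // Sanity check: every operation landed in exactly one histogram bucket.
  int64_t counted = 0;
  for (const auto& entry : out.bloom_lookups_hist) {
    counted += entry.second;
  }
  assert(counted == static_cast<int64_t>(batch.size()));
  return out;
}
```

With a 1000-row batch in which nearly every row consults the same number of
bloom filters, the map holds only a couple of entries, so the shared counters
and histograms would be touched a handful of times instead of once per row.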

The new code uses the transaction state's arena for allocation to avoid adding
any extra allocator contention, which is already fairly measurable in write
workloads.
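
The allocation pattern can be sketched in isolation as follows. BumpArena is
a deliberately tiny, hypothetical bump allocator written for this example and
is not Kudu's Arena class; only the AllocateBytesAligned(size, alignment)
call shape and the placement-new initialization mirror what the patch does.
ProbeStats is again a simplified stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Simplified stand-in for the real ProbeStats.
struct ProbeStats {
  int32_t blooms_consulted = 0;
  int32_t keys_consulted = 0;
  int32_t deltas_consulted = 0;
  int32_t mrs_consulted = 0;
};

// Hypothetical fixed-capacity bump allocator. Memory is handed out by
// advancing an offset and is released all at once when the arena dies.
class BumpArena {
 public:
  explicit BumpArena(size_t capacity)
      : buf_(new unsigned char[capacity]), capacity_(capacity) {}

  // Returns 'size' bytes aligned to 'align' (a power of two), or nullptr
  // if the arena does not have enough room left.
  void* AllocateBytesAligned(size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    const uintptr_t base = reinterpret_cast<uintptr_t>(buf_.get());
    const uintptr_t cur = base + used_;
    const uintptr_t aligned =
        (cur + align - 1) & ~(static_cast<uintptr_t>(align) - 1);
    const size_t new_used = static_cast<size_t>(aligned - base) + size;
    if (new_used > capacity_) return nullptr;
    used_ = new_used;
    return reinterpret_cast<void*>(aligned);
  }

  size_t used() const { return used_; }

 private:
  std::unique_ptr<unsigned char[]> buf_;
  size_t capacity_;
  size_t used_ = 0;
};

// Carves a contiguous ProbeStats array out of the arena and runs the
// constructor on each slot so every counter starts at zero.
ProbeStats* AllocateStatsArray(BumpArena* arena, int num_ops) {
  void* mem = arena->AllocateBytesAligned(sizeof(ProbeStats) * num_ops,
                                          alignof(ProbeStats));
  if (mem == nullptr) return nullptr;
  ProbeStats* stats_array = static_cast<ProbeStats*>(mem);
  for (int i = 0; i < num_ops; i++) {
    new (&stats_array[i]) ProbeStats();
  }
  return stats_array;
}
```

Because the struct is trivially destructible, nothing needs to be torn down
per element; the storage simply goes away with the arena, which is what makes
this cheaper than a general-purpose heap allocation per batch.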

To measure the impact, I temporarily added an 'exit(0)' after the insert threads
finish in full_stack-insert-scan-test and ran:

$ KUDU_ALLOW_SLOW_TESTS=1 perf record \
    ./build/latest/bin/full_stack-insert-scan-test \
    --gtest_filter=\*WithDiskStress\* --inserts_per_client=2000000 -rows_per_batch=1000

before and after the patch. The patch had the following effect:

- functions related to tracing (SubstituteToBuffer, SubstitutedSize,
  SubstituteAndTrace) reduced from 3.83% of cycles to 0.05% of cycles
- HdrHistogram::IncrementBy(...) reduced from 1.4% of cycles to 0.03% of cycles
- LongAdder::IncrementBy(...) reduced from 1.52% of cycles to 0.65% of cycles

'perf stat' shows a larger reduction in cycles than summing up the above would suggest,
probably due to fewer cross-CPU cache invalidations, less allocator pressure from the
traces, etc:

  without patch:

       347644.057631      task-clock (msec)         #    6.549 CPUs utilized
             341,767      context-switches          #    0.983 K/sec
              57,828      cpu-migrations            #    0.166 K/sec
             506,239      page-faults               #    0.001 M/sec
     779,116,782,135      cycles                    #    2.241 GHz
     <not supported>      stalled-cycles-frontend
     <not supported>      stalled-cycles-backend
     745,063,382,648      instructions              #    0.96  insns per cycle
     134,030,182,890      branches                  #  385.539 M/sec
         689,148,045      branch-misses             #    0.51% of all branches

        53.079492439 seconds time elapsed

  with patch:

       244940.029112      task-clock (msec)         #    5.706 CPUs utilized
             290,294      context-switches          #    0.001 M/sec
              54,321      cpu-migrations            #    0.222 K/sec
             818,060      page-faults               #    0.003 M/sec
     637,474,442,303      cycles                    #    2.603 GHz
     <not supported>      stalled-cycles-frontend
     <not supported>      stalled-cycles-backend
     609,976,573,012      instructions              #    0.96  insns per cycle
     106,824,660,658      branches                  #  436.126 M/sec
         383,546,319      branch-misses             #    0.36% of all branches

        42.928344981 seconds time elapsed
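
To make the comparison concrete, the small sketch below recomputes the
relative reductions from the counters quoted above and adds up the drops in
profile share listed earlier. The function names are hypothetical and exist
only for this example. By this arithmetic, cycles fall by roughly 18% and
elapsed time by roughly 19%, while the three profiled groups account for
about 6 percentage points (an approximate figure, since the before and after
shares are fractions of different totals).

```cpp
#include <cassert>
#include <cstdio>

// Percentage by which 'after' is lower than 'before'.
double PercentReduction(double before, double after) {
  assert(before > 0);
  return 100.0 * (before - after) / before;
}

// Sum of the drops in profile share for the tracing functions,
// HdrHistogram::IncrementBy and LongAdder::IncrementBy, in percentage
// points.
double ProfileShareDrop() {
  return (3.83 - 0.05) + (1.4 - 0.03) + (1.52 - 0.65);
}

// Prints the reductions for the 'perf stat' counters quoted above.
void PrintPerfStatReductions() {
  std::printf("task-clock:   %.1f%%\n",
              PercentReduction(347644.057631, 244940.029112));
  std::printf("cycles:       %.1f%%\n",
              PercentReduction(779116782135.0, 637474442303.0));
  std::printf("instructions: %.1f%%\n",
              PercentReduction(745063382648.0, 609976573012.0));
  std::printf("elapsed:      %.1f%%\n",
              PercentReduction(53.079492439, 42.928344981));
  std::printf("profile drop: %.2f points\n", ProfileShareDrop());
}
```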

Change-Id: I9609ea01375be745a82105f845a21cd7829b3a45
Reviewed-on: http://gerrit.cloudera.org:8080/2377
Tested-by: Kudu Jenkins
Reviewed-by: Jean-Daniel Cryans


Project: http://git-wip-us.apache.org/repos/asf/incubator-kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-kudu/commit/ac3771f4
Tree: http://git-wip-us.apache.org/repos/asf/incubator-kudu/tree/ac3771f4
Diff: http://git-wip-us.apache.org/repos/asf/incubator-kudu/diff/ac3771f4

Branch: refs/heads/master
Commit: ac3771f4078c1f23545494f63384c724f73cc0af
Parents: 74af46b
Author: Todd Lipcon <to...@apache.org>
Authored: Tue Mar 1 11:59:42 2016 -0800
Committer: Todd Lipcon <to...@apache.org>
Committed: Wed Mar 2 22:43:42 2016 +0000

----------------------------------------------------------------------
 src/kudu/tablet/tablet.cc           | 46 +++++++++++++---------
 src/kudu/tablet/tablet.h            |  9 +++--
 src/kudu/tablet/tablet_bootstrap.cc |  3 +-
 src/kudu/tablet/tablet_metrics.cc   | 66 ++++++++++++++++++++++++++------
 src/kudu/tablet/tablet_metrics.h    | 29 ++++----------
 5 files changed, 99 insertions(+), 54 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/ac3771f4/src/kudu/tablet/tablet.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/tablet.cc b/src/kudu/tablet/tablet.cc
index 37bd5bb..ab5b6f4 100644
--- a/src/kudu/tablet/tablet.cc
+++ b/src/kudu/tablet/tablet.cc
@@ -355,7 +355,8 @@ void Tablet::StartTransaction(WriteTransactionState* tx_state) {
 }
 
 Status Tablet::InsertUnlocked(WriteTransactionState *tx_state,
-                              RowOp* insert) {
+                              RowOp* insert,
+                              ProbeStats* stats) {
   const TabletComponents* comps = DCHECK_NOTNULL(tx_state->tablet_components());
 
   CHECK(state_ == kOpen || state_ == kBootstrapping);
@@ -365,11 +366,6 @@ Status Tablet::InsertUnlocked(WriteTransactionState *tx_state,
   DCHECK_EQ(tx_state->schema_at_decode_time(), schema()) << "Raced against schema change";
   DCHECK(tx_state->op_id().IsInitialized()) << "TransactionState OpId needed for anchoring";
 
-  ProbeStats stats;
-
-  // Submit the stats before returning from this function
-  ProbeStatsSubmitter submitter(stats, metrics_.get());
-
   // First, ensure that it is a unique key by checking all the open RowSets.
   if (FLAGS_tablet_do_dup_key_checks) {
     vector<RowSet *> to_check;
@@ -378,7 +374,7 @@ Status Tablet::InsertUnlocked(WriteTransactionState *tx_state,
 
     for (const RowSet *rowset : to_check) {
       bool present = false;
-      RETURN_NOT_OK(rowset->CheckRowPresent(*insert->key_probe, &present, &stats));
+      RETURN_NOT_OK(rowset->CheckRowPresent(*insert->key_probe, &present, stats));
       if (PREDICT_FALSE(present)) {
         Status s = Status::AlreadyPresent("key already present");
         if (metrics_) {
@@ -451,7 +447,8 @@ vector<RowSet*> Tablet::FindRowSetsToCheck(RowOp* mutate,
 }
 
 Status Tablet::MutateRowUnlocked(WriteTransactionState *tx_state,
-                                 RowOp* mutate) {
+                                 RowOp* mutate,
+                                 ProbeStats* stats) {
   DCHECK(tx_state != nullptr) << "you must have a WriteTransactionState";
   DCHECK(tx_state->op_id().IsInitialized()) << "TransactionState OpId needed for anchoring";
   DCHECK_EQ(tx_state->schema_at_decode_time(), schema());
@@ -475,16 +472,12 @@ Status Tablet::MutateRowUnlocked(WriteTransactionState *tx_state,
 
   Timestamp ts = tx_state->timestamp();
 
-  ProbeStats stats;
-  // Submit the stats before returning from this function
-  ProbeStatsSubmitter submitter(stats, metrics_.get());
-
   // First try to update in memrowset.
   s = comps->memrowset->MutateRow(ts,
                             *mutate->key_probe,
                             mutate->decoded_op.changelist,
                             tx_state->op_id(),
-                            &stats,
+                            stats,
                             result.get());
   if (s.ok()) {
     mutate->SetMutateSucceeded(std::move(result));
@@ -503,7 +496,7 @@ Status Tablet::MutateRowUnlocked(WriteTransactionState *tx_state,
                       *mutate->key_probe,
                       mutate->decoded_op.changelist,
                       tx_state->op_id(),
-                      &stats,
+                      stats,
                       result.get());
     if (s.ok()) {
       mutate->SetMutateSucceeded(std::move(result));
@@ -527,22 +520,39 @@ void Tablet::StartApplying(WriteTransactionState* tx_state) {
 }
 
 void Tablet::ApplyRowOperations(WriteTransactionState* tx_state) {
+  // Allocate the ProbeStats objects from the transaction's arena, so
+  // they're all contiguous and we don't need to do any central allocation.
+  int num_ops = tx_state->row_ops().size();
+  ProbeStats* stats_array = static_cast<ProbeStats*>(
+      tx_state->arena()->AllocateBytesAligned(sizeof(ProbeStats) * num_ops,
+                                              alignof(ProbeStats)));
+
   StartApplying(tx_state);
+  int i = 0;
   for (RowOp* row_op : tx_state->row_ops()) {
-    ApplyRowOperation(tx_state, row_op);
+    ProbeStats* stats = &stats_array[i++];
+    // Manually run the constructor to clear the stats to 0 before collecting
+    // them.
+    new (stats) ProbeStats();
+    ApplyRowOperation(tx_state, row_op, stats);
+  }
+
+  if (metrics_) {
+    metrics_->AddProbeStats(stats_array, num_ops, tx_state->arena());
   }
 }
 
 void Tablet::ApplyRowOperation(WriteTransactionState* tx_state,
-                               RowOp* row_op) {
+                               RowOp* row_op,
+                               ProbeStats* stats) {
   switch (row_op->decoded_op.type) {
     case RowOperationsPB::INSERT:
-      ignore_result(InsertUnlocked(tx_state, row_op));
+      ignore_result(InsertUnlocked(tx_state, row_op, stats));
       return;
 
     case RowOperationsPB::UPDATE:
     case RowOperationsPB::DELETE:
-      ignore_result(MutateRowUnlocked(tx_state, row_op));
+      ignore_result(MutateRowUnlocked(tx_state, row_op, stats));
       return;
 
     default:

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/ac3771f4/src/kudu/tablet/tablet.h
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/tablet.h b/src/kudu/tablet/tablet.h
index e8dee8b..0d2a8e1 100644
--- a/src/kudu/tablet/tablet.h
+++ b/src/kudu/tablet/tablet.h
@@ -182,7 +182,8 @@ class Tablet {
   // Apply a single row operation, which must already be prepared.
   // The result is set back into row_op->result
   void ApplyRowOperation(WriteTransactionState* tx_state,
-                         RowOp* row_op);
+                         RowOp* row_op,
+                         ProbeStats* stats);
 
   // Create a new row iterator which yields the rows as of the current MVCC
   // state of this tablet.
@@ -382,13 +383,15 @@ class Tablet {
   // they were already acquired. Requires that handles for the relevant locks
   // and MVCC transaction are present in the transaction state.
   Status InsertUnlocked(WriteTransactionState *tx_state,
-                        RowOp* insert);
+                        RowOp* insert,
+                        ProbeStats* stats);
 
   // A version of MutateRow that does not acquire locks and instead assumes
   // they were already acquired. Requires that handles for the relevant locks
   // and MVCC transaction are present in the transaction state.
   Status MutateRowUnlocked(WriteTransactionState *tx_state,
-                           RowOp* mutate);
+                           RowOp* mutate,
+                           ProbeStats* stats);
 
   // Return the list of RowSets that need to be consulted when processing the
   // given mutation.

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/ac3771f4/src/kudu/tablet/tablet_bootstrap.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/tablet_bootstrap.cc b/src/kudu/tablet/tablet_bootstrap.cc
index ec8ec09..782a973 100644
--- a/src/kudu/tablet/tablet_bootstrap.cc
+++ b/src/kudu/tablet/tablet_bootstrap.cc
@@ -1324,7 +1324,8 @@ Status TabletBootstrap::FilterAndApplyOperations(WriteTransactionState* tx_state
     }
 
     // Actually apply it.
-    tablet_->ApplyRowOperation(tx_state, op);
+    ProbeStats stats; // we don't use this, but tablet internals require non-NULL.
+    tablet_->ApplyRowOperation(tx_state, op, &stats);
     DCHECK(op->result != nullptr);
 
     // We expect that the above Apply() will always succeed, because we're

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/ac3771f4/src/kudu/tablet/tablet_metrics.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/tablet_metrics.cc b/src/kudu/tablet/tablet_metrics.cc
index 6a6d87d..0b8b196 100644
--- a/src/kudu/tablet/tablet_metrics.cc
+++ b/src/kudu/tablet/tablet_metrics.cc
@@ -16,7 +16,12 @@
 // under the License.
 #include "kudu/tablet/tablet_metrics.h"
 
+#include <functional>
+#include <map>
+#include <utility>
+
 #include "kudu/gutil/strings/substitute.h"
+#include "kudu/util/memory/arena.h"
 #include "kudu/util/metrics.h"
 #include "kudu/util/trace.h"
 
@@ -205,6 +210,8 @@ METRIC_DEFINE_counter(tablet, leader_memory_pressure_rejections,
   "Number of RPC requests rejected due to memory pressure while LEADER.");
 
 using strings::Substitute;
+using std::unordered_map;
+
 
 namespace kudu {
 namespace tablet {
@@ -250,20 +257,57 @@ TabletMetrics::TabletMetrics(const scoped_refptr<MetricEntity>& entity)
 #undef MINIT
 #undef GINIT
 
-void TabletMetrics::AddProbeStats(const ProbeStats& stats) {
-  bloom_lookups->IncrementBy(stats.blooms_consulted);
-  key_file_lookups->IncrementBy(stats.keys_consulted);
-  delta_file_lookups->IncrementBy(stats.deltas_consulted);
-  mrs_lookups->IncrementBy(stats.mrs_consulted);
-
-  bloom_lookups_per_op->Increment(stats.blooms_consulted);
-  key_file_lookups_per_op->Increment(stats.keys_consulted);
-  delta_file_lookups_per_op->Increment(stats.deltas_consulted);
+void TabletMetrics::AddProbeStats(const ProbeStats* stats_array, int len,
+                                  Arena* work_arena) {
+  // In most cases, different operations within a batch will have the same
+  // statistics (e.g. 1 or 2 bloom lookups, 0 key lookups, 0 delta lookups).
+  //
+  // Given that, we pre-aggregate our contributions to the tablet histograms
+  // in local maps here. We also pre-aggregate our normal counter contributions
+  // to minimize contention on the counter metrics.
+  //
+  // To avoid any actual expensive allocation, we allocate these local maps from
+  // 'work_arena'.
+  typedef ArenaAllocator<std::pair<const int32_t, int32_t>, false> AllocType;
+  typedef std::map<int32_t, int32_t, std::less<int32_t>, AllocType> MapType;
+  AllocType alloc(work_arena);
+  MapType bloom_lookups_hist(std::less<int32_t>(), alloc);
+  MapType key_file_lookups_hist(std::less<int32_t>(), alloc);
+  MapType delta_file_lookups_hist(std::less<int32_t>(), alloc);
+
+  ProbeStats sum;
+  for (int i = 0; i < len; i++) {
+    const ProbeStats& stats = stats_array[i];
+
+    sum.blooms_consulted += stats.blooms_consulted;
+    sum.keys_consulted += stats.keys_consulted;
+    sum.deltas_consulted += stats.deltas_consulted;
+    sum.mrs_consulted += stats.mrs_consulted;
+
+    bloom_lookups_hist[stats.blooms_consulted]++;
+    key_file_lookups_hist[stats.keys_consulted]++;
+    delta_file_lookups_hist[stats.deltas_consulted]++;
+  }
+
+  bloom_lookups->IncrementBy(sum.blooms_consulted);
+  key_file_lookups->IncrementBy(sum.keys_consulted);
+  delta_file_lookups->IncrementBy(sum.deltas_consulted);
+  mrs_lookups->IncrementBy(sum.mrs_consulted);
+
+  for (const auto& entry : bloom_lookups_hist) {
+    bloom_lookups_per_op->IncrementBy(entry.first, entry.second);
+  }
+  for (const auto& entry : key_file_lookups_hist) {
+    key_file_lookups_per_op->IncrementBy(entry.first, entry.second);
+  }
+  for (const auto& entry : delta_file_lookups_hist) {
+    delta_file_lookups_per_op->IncrementBy(entry.first, entry.second);
+  }
 
   TRACE("ProbeStats: bloom_lookups=$0,key_file_lookups=$1,"
         "delta_file_lookups=$2,mrs_lookups=$3",
-        stats.blooms_consulted, stats.keys_consulted,
-        stats.deltas_consulted, stats.mrs_consulted);
+        sum.blooms_consulted, sum.keys_consulted,
+        sum.deltas_consulted, sum.mrs_consulted);
 }
 
 } // namespace tablet

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/ac3771f4/src/kudu/tablet/tablet_metrics.h
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/tablet_metrics.h b/src/kudu/tablet/tablet_metrics.h
index f8c60cb..0d20a51 100644
--- a/src/kudu/tablet/tablet_metrics.h
+++ b/src/kudu/tablet/tablet_metrics.h
@@ -22,6 +22,7 @@
 
 namespace kudu {
 
+class Arena;
 class Counter;
 template<class T>
 class AtomicGauge;
@@ -36,7 +37,13 @@ struct ProbeStats;
 struct TabletMetrics {
   explicit TabletMetrics(const scoped_refptr<MetricEntity>& metric_entity);
 
-  void AddProbeStats(const ProbeStats& stats);
+  // Add a batch of probe stats to the metrics.
+  //
+  // We use C-style array passing here since the call site allocates the
+  // ProbeStats from an arena.
+  //
+  // This allocates temporary scratch space from work_arena.
+  void AddProbeStats(const ProbeStats* stats_array, int len, Arena* work_arena);
 
   // Operation rates
   scoped_refptr<Counter> rows_inserted;
@@ -82,26 +89,6 @@ struct TabletMetrics {
   scoped_refptr<Counter> leader_memory_pressure_rejections;
 };
 
-class ProbeStatsSubmitter {
- public:
-  ProbeStatsSubmitter(const ProbeStats& stats, TabletMetrics* metrics)
-    : stats_(stats),
-      metrics_(metrics) {
-  }
-
-  ~ProbeStatsSubmitter() {
-    if (metrics_) {
-      metrics_->AddProbeStats(stats_);
-    }
-  }
-
- private:
-  const ProbeStats& stats_;
-  TabletMetrics* const metrics_;
-
-  DISALLOW_COPY_AND_ASSIGN(ProbeStatsSubmitter);
-};
-
 } // namespace tablet
 } // namespace kudu
 #endif /* KUDU_TABLET_TABLET_METRICS_H */


[2/3] incubator-kudu git commit: Improve failure message in WaitUntilCommittedOpIdIndexIs() test utility

Posted by to...@apache.org.
Improve failure message in WaitUntilCommittedOpIdIndexIs() test utility

Print the last seen OpId, if any.

Change-Id: Iea208e4da7d546b9178591ec1d5053c66cbe129a
Reviewed-on: http://gerrit.cloudera.org:8080/2406
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/incubator-kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-kudu/commit/a3419b84
Tree: http://git-wip-us.apache.org/repos/asf/incubator-kudu/tree/a3419b84
Diff: http://git-wip-us.apache.org/repos/asf/incubator-kudu/diff/a3419b84

Branch: refs/heads/master
Commit: a3419b84ccffe60736ac45a5e750feeadc4548b2
Parents: ac3771f
Author: Mike Percy <mp...@apache.org>
Authored: Wed Mar 2 17:46:18 2016 +0200
Committer: Todd Lipcon <to...@apache.org>
Committed: Wed Mar 2 22:48:06 2016 +0000

----------------------------------------------------------------------
 src/kudu/integration-tests/cluster_itest_util.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/a3419b84/src/kudu/integration-tests/cluster_itest_util.cc
----------------------------------------------------------------------
diff --git a/src/kudu/integration-tests/cluster_itest_util.cc b/src/kudu/integration-tests/cluster_itest_util.cc
index f7fecf1..e54728a 100644
--- a/src/kudu/integration-tests/cluster_itest_util.cc
+++ b/src/kudu/integration-tests/cluster_itest_util.cc
@@ -373,9 +373,10 @@ Status WaitUntilCommittedOpIdIndexIs(int64_t opid_index,
     SleepFor(MonoDelta::FromMilliseconds(10));
   }
   return Status::TimedOut(Substitute("Committed consensus opid_index does not equal $0 "
-                                     "after waiting for $1. Last status: $2",
+                                     "after waiting for $1. Last opid: $2. Last status: $3",
                                      opid_index,
                                      MonoTime::Now(MonoTime::FINE).GetDeltaSince(start).ToString(),
+                                     OpIdToString(op_id),
                                      s.ToString()));
 }
 


[3/3] incubator-kudu git commit: KUDU-1347. Improve licensing documentation

Posted by to...@apache.org.
KUDU-1347. Improve licensing documentation

This addresses a few issues and suggestions raised on our
0.7.0 release candidate vote:

- Many of the licenses previously mentioned in LICENSE.txt do not actually
  require that they be reproduced in binary distributions. Since we already
  include the license text in the source headers themselves, we can just
  refer the reader to the source.

  For those licenses which require some form of notice or attribution in
  binary distributions, we continue to copy-paste the license text into
  the LICENSE.txt file.

- Copy the CMake BSD license notice into the FindProtobuf.cmake file.
  Even though we substantially rewrote it, it could be considered a
  derived work.

- Relocate the copyright notices for Slice.java and Slices.java into the
  source header for those files, so that we can remove the extra copy of the
  Apache license from LICENSE.txt.

  Similarly, relocate the authorship information for HdrHistogram into the
  source code. Since it's a public domain library, it doesn't require
  any attribution.

- Remove a stray Cloudera copyright notice from python/Makefile

- Add a copy of the Boost license to the thirdparty/boost_uuid/ directory
  and reference it from the top-level LICENSE instead of copying it. This
  license does not require attribution in binary distributions.

- Fix the reference to the WebRTC code to include random.h (not just
  random-util.cc)

Change-Id: I95e5a86128e677839c84e209ba4f7c910c33517d
Reviewed-on: http://gerrit.cloudera.org:8080/2417
Reviewed-by: Jean-Daniel Cryans
Tested-by: Todd Lipcon <to...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/incubator-kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-kudu/commit/cd3aa0ee
Tree: http://git-wip-us.apache.org/repos/asf/incubator-kudu/tree/cd3aa0ee
Diff: http://git-wip-us.apache.org/repos/asf/incubator-kudu/diff/cd3aa0ee

Branch: refs/heads/master
Commit: cd3aa0ee636ce7008750922d5b55620aab03a97e
Parents: a3419b8
Author: Todd Lipcon <to...@apache.org>
Authored: Tue Mar 1 18:36:31 2016 -0800
Committer: Todd Lipcon <to...@apache.org>
Committed: Wed Mar 2 23:34:15 2016 +0000

----------------------------------------------------------------------
 LICENSE.txt                                     | 176 +++----------------
 cmake_modules/FindProtobuf.cmake                |  61 +++++--
 .../src/main/java/org/kududb/util/Slice.java    |   3 +
 .../src/main/java/org/kududb/util/Slices.java   |   8 +-
 python/Makefile                                 |   2 -
 src/kudu/util/hdr_histogram.cc                  |   7 +
 src/kudu/util/hdr_histogram.h                   |   9 +-
 src/kudu/util/url-coding.cc                     |  11 --
 thirdparty/boost_uuid/LICENSE.txt               |  23 +++
 9 files changed, 118 insertions(+), 182 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/cd3aa0ee/LICENSE.txt
----------------------------------------------------------------------
diff --git a/LICENSE.txt b/LICENSE.txt
index 9a41fdf..5116c0c 100644
--- a/LICENSE.txt
+++ b/LICENSE.txt
@@ -201,10 +201,11 @@
    limitations under the License.
 
 --------------------------------------------------------------------------------
+
 src/kudu/gutil (some portions): Apache 2.0, and 3-clause BSD
 
-This module is derived from code in the Chromium project, copyright
-(c) Google inc and (c) The Chromium Authors and licensed under the
+Some portions of this module are derived from code in the Chromium project,
+copyright (c) Google inc and (c) The Chromium Authors and licensed under the
 Apache 2.0 License or the under the 3-clause BSD license:
 
   Copyright (c) 2013 The Chromium Authors. All rights reserved.
@@ -253,48 +254,8 @@ src/kudu/gutil/utf: licensed under the following terms:
   REPRESENTATION OR WARRANTY OF ANY KIND CONCERNING THE MERCHANTABILITY
   OF THIS SOFTWARE OR ITS FITNESS FOR ANY PARTICULAR PURPOSE.
 
-
 --------------------------------------------------------------------------------
 
-src/kudu/gutil/valgrind.h: Hybrid BSD (half BSD, half zlib)
-
-   This file is part of Valgrind, a dynamic binary instrumentation
-   framework.
-
-   Copyright (C) 2000-2008 Julian Seward.  All rights reserved.
-
-   Redistribution and use in source and binary forms, with or without
-   modification, are permitted provided that the following conditions
-   are met:
-
-   1. Redistributions of source code must retain the above copyright
-      notice, this list of conditions and the following disclaimer.
-
-   2. The origin of this software must not be misrepresented; you must
-      not claim that you wrote the original software.  If you use this
-      software in a product, an acknowledgment in the product
-      documentation would be appreciated but is not required.
-
-   3. Altered source versions must be plainly marked as such, and must
-      not be misrepresented as being the original software.
-
-   4. The name of the author may not be used to endorse or promote
-      products derived from this software without specific prior written
-      permission.
-
-   THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
-   OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-   WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
-   ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
-   DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
-   DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
-   GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
-   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
-   WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
-   NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
-   SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
---------------------------------------------------------------------------------
 src/kudu/util (some portions): 3-clause BSD license
 
 Some portions of this module are derived from code from LevelDB
@@ -329,19 +290,9 @@ Some portions of this module are derived from code from LevelDB
   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
 --------------------------------------------------------------------------------
-src/kudu/util (HdrHistogram-related classes): public domain
-
-Portions of these classes were ported from Java to C++ from the sources
-available at https://github.com/HdrHistogram/HdrHistogram .
-
-  The code in this repository code was Written by Gil Tene, Michael Barker,
-  and Matt Warren, and released to the public domain, as explained at
-  http://creativecommons.org/publicdomain/zero/1.0/
-
---------------------------------------------------------------------------------
-
-src/kudu/util/random-util.cc: some portions adapted from WebRTC project
-(modules/video_coding/main/test/test_util.cc) under a 3-clause BSD license.
+src/kudu/util/{random-util.cc},{random.h}: some portions adapted from WebRTC
+project (modules/video_coding/main/test/test_util.cc) under a 3-clause BSD
+license.
 
   Copyright (c) 2011, The WebRTC project authors. All rights reserved.
 
@@ -445,53 +396,6 @@ BSD license with an additional grant of patent rights:
 
 --------------------------------------------------------------------------------
 
-src/kudu/server/url-coding.cc: some portions adapted from the Boost project
-thirdparty/boost_uuid/:
-
-  Boost Software License - Version 1.0 - August 17th, 2003
-
-  Permission is hereby granted, free of charge, to any person or organization
-  obtaining a copy of the software and accompanying documentation covered by
-  this license (the "Software") to use, reproduce, display, distribute,
-  execute, and transmit the Software, and to prepare derivative works of the
-  Software, and to permit third-parties to whom the Software is furnished to
-  do so, all subject to the following:
-
-  The copyright notices in the Software and this entire statement, including
-  the above license grant, this restriction and the following disclaimer,
-  must be included in all copies of the Software, in whole or in part, and
-  all derivative works of the Software, unless such copies or derivative
-  works are solely in the form of machine-executable object code generated by
-  a source language processor.
-
-  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-  FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
-  SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
-  FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
-  ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
-  DEALINGS IN THE SOFTWARE.
-
---------------------------------------------------------------------------------
-
-www/bootstrap: Apache 2.0 license
-
-   Copyright 2012 Twitter, Inc
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-       http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
-
---------------------------------------------------------------------------------
-
 www/d3.v2.js: BSD 3-clause license
 
    Copyright (c) 2012, Michael Bostock
@@ -520,6 +424,7 @@ www/d3.v2.js: BSD 3-clause license
    OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
    NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
    EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
 --------------------------------------------------------------------------------
 
 www/epoch.*: MIT license
@@ -576,28 +481,6 @@ www/jquery*.js: MIT license
 
 --------------------------------------------------------------------------------
 
-java/kudu-client/src/main/java/org/kududb/util/: Slice.java and Slices.java
-
-  Derived from the LevelDB Java project at https://github.com/dain/leveldb/
-  Licensed under the Apache 2.0 license with the following copyrights:
-
-  Copyright 2011 Dain Sundstrom <da...@iq80.com>
-  Copyright 2011 FuseSource Corp. http://fusesource.com
-
-  Licensed under the Apache License, Version 2.0 (the "License");
-  you may not use this file except in compliance with the License.
-  You may obtain a copy of the License at
-
-      http://www.apache.org/licenses/LICENSE-2.0
-
-  Unless required by applicable law or agreed to in writing, software
-  distributed under the License is distributed on an "AS IS" BASIS,
-  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-  See the License for the specific language governing permissions and
-  limitations under the License.
-
---------------------------------------------------------------------------------
-
 java/kudu-client/: Some classes are derived from the AsyncHBase project
 under the following 3-clause BSD license:
 
@@ -626,34 +509,33 @@ under the following 3-clause BSD license:
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
 
---------------------------------------------------------------------------------
-
-.ycm_extra_conf.py: public domain
+================================================================================
 
-  This is free and unencumbered software released into the public domain.
+The following dependencies or pieces of incorporated source code have licenses
+such that either:
+  (a) do not require their license text to be re-distributed with binary
+      distributions, or
+  (b) have no requirements about re-distributing their license text in either
+      source or binary distributions, or
+  (c) are the same Apache 2.0 license reproduced in its entirety above.
 
-  Anyone is free to copy, modify, publish, use, compile, sell, or
-  distribute this software, either in source code form or as a compiled
-  binary, for any purpose, commercial or non-commercial, and by any
-  means.
+Therefore, we do not reproduce their licenses in their entirety in this file.
+--------------------------------------------------------------------------------
 
-  In jurisdictions that recognize copyright laws, the author or authors
-  of this software dedicate any and all copyright interest in the
-  software to the public domain. We make this dedication for the benefit
-  of the public at large and to the detriment of our heirs and
-  successors. We intend this dedication to be an overt act of
-  relinquishment in perpetuity of all present and future rights to this
-  software under copyright law.
+.ycm_extra_conf.py: public domain
+cmake_modules/FindGMock.cmake: MIT license
+cmake_modules/FindProtobuf.cmake: BSD 3-clause license
+src/kudu/util (HdrHistogram-related classes): public domain
+src/kudu/gutil/valgrind.h: Hybrid BSD (half BSD, half zlib)
+  - See the file headers for full license text.
 
-  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
-  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
-  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
-  IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
-  OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
-  ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
-  OTHER DEALINGS IN THE SOFTWARE.
+src/kudu/server/url-coding.cc (some portions): Boost software license
+thirdparty/boost_uuid/: Boost software license
+  - See thirdparty/boost_uuid/LICENSE.txt
 
-  For more information, please refer to <http://unlicense.org/>
+www/bootstrap: Apache 2.0 license
+java/kudu-client/src/main/java/org/kududb/util/{Slice,Slices}.java: Apache 2.0 license
+  - See above for full text.
 
 ================================================================================
 

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/cd3aa0ee/cmake_modules/FindProtobuf.cmake
----------------------------------------------------------------------
diff --git a/cmake_modules/FindProtobuf.cmake b/cmake_modules/FindProtobuf.cmake
index 128bbf2..ab393c7 100644
--- a/cmake_modules/FindProtobuf.cmake
+++ b/cmake_modules/FindProtobuf.cmake
@@ -14,6 +14,51 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+#
+#=============================================================================
+# This file is heavily modified/rewritten from FindProtobuf.cmake from the
+# CMake project:
+#
+#   Copyright 2011 Kirill A. Korinskiy <ca...@catap.ru>
+#   Copyright 2009 Kitware, Inc.
+#   Copyright 2009 Philip Lowman <ph...@yhbt.com>
+#   Copyright 2008 Esben Mose Hansen, Ange Optimization ApS
+#
+#   Distributed under the OSI-approved BSD License (the "License"):
+#
+#   CMake - Cross Platform Makefile Generator
+#   Copyright 2000-2015 Kitware, Inc.
+#   Copyright 2000-2011 Insight Software Consortium
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#   * Redistributions of source code must retain the above copyright
+#     notice, this list of conditions and the following disclaimer.
+#
+#   * Redistributions in binary form must reproduce the above copyright
+#     notice, this list of conditions and the following disclaimer in the
+#     documentation and/or other materials provided with the distribution.
+#
+#   * Neither the names of Kitware, Inc., the Insight Software Consortium,
+#     nor the names of their contributors may be used to endorse or promote
+#     products derived from this software without specific prior written
+#     permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#=============================================================================
 
 #########
 # Local rewrite of the protobuf support in cmake.
@@ -62,22 +107,6 @@
 #          in order to "serialize" the protoc invocations
 #  ====================================================================
 
-#=============================================================================
-# Copyright 2011 Kirill A. Korinskiy <ca...@catap.ru>
-# Copyright 2009 Kitware, Inc.
-# Copyright 2009 Philip Lowman <ph...@yhbt.com>
-# Copyright 2008 Esben Mose Hansen, Ange Optimization ApS
-#
-# Distributed under the OSI-approved BSD License (the "License");
-# see accompanying file Copyright.txt for details.
-#
-# This software is distributed WITHOUT ANY WARRANTY; without even the
-# implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-# See the License for more information.
-#=============================================================================
-# (To distributed this file outside of CMake, substitute the full
-#  License text for the above reference.)
-
 function(PROTOBUF_GENERATE_CPP SRCS HDRS TGTS)
   if(NOT ARGN)
     message(SEND_ERROR "Error: PROTOBUF_GENERATE_CPP() called without any proto files")

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/cd3aa0ee/java/kudu-client/src/main/java/org/kududb/util/Slice.java
----------------------------------------------------------------------
diff --git a/java/kudu-client/src/main/java/org/kududb/util/Slice.java b/java/kudu-client/src/main/java/org/kududb/util/Slice.java
index a2d5ad1..c9d2719 100644
--- a/java/kudu-client/src/main/java/org/kududb/util/Slice.java
+++ b/java/kudu-client/src/main/java/org/kududb/util/Slice.java
@@ -12,6 +12,9 @@
  * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the
  * License for the specific language governing permissions and limitations
  * under the License.
+ *
+ * Copyright 2011 Dain Sundstrom <da...@iq80.com>
+ * Copyright 2011 FuseSource Corp. http://fusesource.com
  */
 package org.kududb.util;
 

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/cd3aa0ee/java/kudu-client/src/main/java/org/kududb/util/Slices.java
----------------------------------------------------------------------
diff --git a/java/kudu-client/src/main/java/org/kududb/util/Slices.java b/java/kudu-client/src/main/java/org/kududb/util/Slices.java
index 7fb9f17..c2cdbde 100644
--- a/java/kudu-client/src/main/java/org/kududb/util/Slices.java
+++ b/java/kudu-client/src/main/java/org/kududb/util/Slices.java
@@ -1,9 +1,4 @@
 /**
- * Copyright (C) 2011 the original author or authors.
- *
- * See the LICENSE.txt file distributed with this work for additional
- * information regarding copyright ownership.
- *
  * Licensed to the Apache Software Foundation (ASF) under one
  * or more contributor license agreements.  See the NOTICE file
  * distributed with this work for additional information
@@ -20,6 +15,9 @@
  * KIND, either express or implied.  See the License for the
  * specific language governing permissions and limitations
  * under the License.
+ *
+ * Copyright 2011 Dain Sundstrom <da...@iq80.com>
+ * Copyright 2011 FuseSource Corp. http://fusesource.com
  */
 package org.kududb.util;
 

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/cd3aa0ee/python/Makefile
----------------------------------------------------------------------
diff --git a/python/Makefile b/python/Makefile
index 3fdd0b6..f8df953 100644
--- a/python/Makefile
+++ b/python/Makefile
@@ -1,5 +1,3 @@
-# Copyright 2016 Cloudera, Inc.
-#
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/cd3aa0ee/src/kudu/util/hdr_histogram.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/hdr_histogram.cc b/src/kudu/util/hdr_histogram.cc
index 43bd2a9..e510df3 100644
--- a/src/kudu/util/hdr_histogram.cc
+++ b/src/kudu/util/hdr_histogram.cc
@@ -14,6 +14,13 @@
 // KIND, either express or implied.  See the License for the
 // specific language governing permissions and limitations
 // under the License.
+//
+// Portions of these classes were ported from Java to C++ from the sources
+// available at https://github.com/HdrHistogram/HdrHistogram .
+//
+//   The code in this repository code was Written by Gil Tene, Michael Barker,
+//   and Matt Warren, and released to the public domain, as explained at
+//   http://creativecommons.org/publicdomain/zero/1.0/
 #include "kudu/util/hdr_histogram.h"
 
 #include <algorithm>

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/cd3aa0ee/src/kudu/util/hdr_histogram.h
----------------------------------------------------------------------
diff --git a/src/kudu/util/hdr_histogram.h b/src/kudu/util/hdr_histogram.h
index bbfedd1..19e31cc 100644
--- a/src/kudu/util/hdr_histogram.h
+++ b/src/kudu/util/hdr_histogram.h
@@ -18,7 +18,14 @@
 #define KUDU_UTIL_HDRHISTOGRAM_H_
 
 // C++ (TR1) port of HdrHistogram.
-// Original java implementation: http://giltene.github.io/HdrHistogram/
+//
+// Portions of these classes were ported from Java to C++ from the sources
+// available at https://github.com/HdrHistogram/HdrHistogram .
+//
+//   The code in this repository code was Written by Gil Tene, Michael Barker,
+//   and Matt Warren, and released to the public domain, as explained at
+//   http://creativecommons.org/publicdomain/zero/1.0/
+// ---------------------------------------------------------------------------
 //
 // A High Dynamic Range (HDR) Histogram
 //

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/cd3aa0ee/src/kudu/util/url-coding.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/url-coding.cc b/src/kudu/util/url-coding.cc
index 4aae48e..6c3f26c 100644
--- a/src/kudu/util/url-coding.cc
+++ b/src/kudu/util/url-coding.cc
@@ -15,17 +15,6 @@
 // specific language governing permissions and limitations
 // under the License.
 //
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
 
 #include "kudu/util/url-coding.h"
 

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/cd3aa0ee/thirdparty/boost_uuid/LICENSE.txt
----------------------------------------------------------------------
diff --git a/thirdparty/boost_uuid/LICENSE.txt b/thirdparty/boost_uuid/LICENSE.txt
new file mode 100644
index 0000000..0076b2d
--- /dev/null
+++ b/thirdparty/boost_uuid/LICENSE.txt
@@ -0,0 +1,23 @@
+  Boost Software License - Version 1.0 - August 17th, 2003
+
+  Permission is hereby granted, free of charge, to any person or organization
+  obtaining a copy of the software and accompanying documentation covered by
+  this license (the "Software") to use, reproduce, display, distribute,
+  execute, and transmit the Software, and to prepare derivative works of the
+  Software, and to permit third-parties to whom the Software is furnished to
+  do so, all subject to the following:
+
+  The copyright notices in the Software and this entire statement, including
+  the above license grant, this restriction and the following disclaimer,
+  must be included in all copies of the Software, in whole or in part, and
+  all derivative works of the Software, unless such copies or derivative
+  works are solely in the form of machine-executable object code generated by
+  a source language processor.
+
+  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+  FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
+  SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
+  FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
+  ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+  DEALINGS IN THE SOFTWARE.