You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2019/07/10 07:02:40 UTC

[kudu-CR] [metrics] Merge metrics by the same attribute

Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13580 )

Change subject: [metrics] Merge metrics by the same attribute
......................................................................


Patch Set 5:

(37 comments)

I didn't review metrics-test.cc yet.

http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/server/default_path_handlers.cc
File src/kudu/server/default_path_handlers.cc:

http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/server/default_path_handlers.cc@353
PS5, Line 353:     if (entity_type == "tablet") {
             :       // Entities in type of 'tablet' who have the same attribute of 'table_name' will be merged
             :       // to a new entity with type 'table' and id the value of 'table_name'.
             :       EmplaceIfNotPresent(&opts.merge_attributes_by_entity_type, entity_type,
             :           new MetricJsonOptions::MergeAttributes("table", "table_name"));
             :     }
This leaks table/tablet concepts into the server/ module, which, like util/ should remain generic.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/hdr_histogram-test.cc
File src/kudu/util/hdr_histogram-test.cc:

http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/hdr_histogram-test.cc@133
PS5, Line 133:   hist.Merge(other);
Should ASSERT_TRUE here.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/hdr_histogram-test.cc@137
PS5, Line 137:   ASSERT_EQ(hist.MinValue(), std::min(old.MinValue(), other.MinValue()));
             :   ASSERT_EQ(hist.MaxValue(), std::max(old.MaxValue(), other.MaxValue()));
Maybe just test for 1 and 250 here instead? Would be clearer.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/hdr_histogram-test.cc@139
PS5, Line 139:   ASSERT_LE(hist.MeanValue() - (1 + 250) / 2.0, 1e-6);
Maybe ASSERT_NEAR would help here (and maybe below too)?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/hdr_histogram.h
File src/kudu/util/hdr_histogram.h:

http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/hdr_histogram.h@174
PS5, Line 174:   bool Merge(const HdrHistogram& other);
Doc.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/hdr_histogram.cc
File src/kudu/util/hdr_histogram.cc:

http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/hdr_histogram.cc@348
PS5, Line 348:   if (PREDICT_FALSE(highest_trackable_value_ != other.highest_trackable_value())) {
             :     LOG(WARNING) << "highest_trackable_value_ not match.";
             :     return false;
             :   }
             : 
             :   if (PREDICT_FALSE(num_significant_digits_ != other.num_significant_digits())) {
             :     LOG(WARNING) << "num_significant_digits_ not match.";
             :     return false;
             :   }
Should these DCHECK/CHECK instead, like L358-363? Under what circumstances would you expect to see this case at runtime?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/hdr_histogram.cc@368
PS5, Line 368:   // Update min, if needed.
             :   {
             :     Atomic64 min_val;
             :     while (other.min_value_ < (min_val = MinValue())) {
             :       Atomic64 old_val = NoBarrier_CompareAndSwap(&min_value_, min_val, other.min_value_);
             :       if (PREDICT_TRUE(old_val == min_val)) break; // CAS success.
             :     }
             :   }
             : 
             :   // Update max, if needed.
             :   {
             :     Atomic64 max_val;
             :     while (other.max_value_ > (max_val = MaxValue())) {
             :       Atomic64 old_val = NoBarrier_CompareAndSwap(&max_value_, max_val, other.max_value_);
             :       if (PREDICT_TRUE(old_val == max_val)) break; // CAS success.
             :     }
             :   }
             : 
             :   for (int i = 0; i < counts_array_length_; i++) {
             :     Atomic64 count = NoBarrier_Load(&other.counts_[i]);
             :     if (count > 0) {
             :       NoBarrier_AtomicIncrement(&counts_[i], count);
             :     }
             :   }
Could you share this with IncrementBy rather than copying it?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h
File src/kudu/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@226
PS5, Line 226: #include <string.h>
Nit: prefer <cstring>


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@445
PS5, Line 445: set
setting


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@467
PS5, Line 467:   struct MergeAttributes : public RefCountedThreadSafe<MergeAttributes> {
Not clear why this needs to be RefCountedThreadSafe.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@474
PS5, Line 474: merge
merged


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@477
PS5, Line 477: in
Nit: "is in"


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@478
PS5, Line 478: to acknowledge
Nit: replace with "for"


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@480
PS5, Line 480: name
Nit "name is"


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@512
PS5, Line 512: class MetricPrototype {
Did this change at all, or was it just moved?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@567
PS5, Line 567: MetricPrototypeEqual
Convention is to call these EqualTo rather than Equal.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@573
PS5, Line 573: struct MetricCollectionEntity {
Doc.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@579
PS5, Line 579: 
             : struct MetricCollectionEntityHash {
             :   size_t operator()(const MetricCollectionEntity& entity) const {
             :     return std::hash<std::string>()(entity.id_);
             :   }
             : };
             : 
             : struct MetricCollectionEntityEqual {
             :   bool operator()(const MetricCollectionEntity& first, const MetricCollectionEntity& second) const {
             :     return first.type_ == second.type_ && first.id_ == second.id_;
             :   }
             : };
Why does Hash include just id_, but Equal compare both id_ and type_? That seems inconsistent.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@627
PS5, Line 627:   Status CollectTo(MetricCollection* collections,
Doc. Also not clear why, if this doesn't write JSON, it takes a MetricJsonOptions. Maybe MetricJsonOptions should be broken into different classes that are composed together?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@681
PS5, Line 681: Get attributes and snapshot of metrics
Nit: "Get a snapshot of the entity's metrics as well as the entity's attributes..."


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@683
PS5, Line 683:   // Return Status::NotFound if all attributes or metrics are filtered,
             :   // otherwise return Status::OK.
Couldn't we just test for this by looking at the size of 'metrics' and 'attrs'? Or is it important to distinguish between "everything was filtered" and "there were no attributes and metrics to begin with"?

Looking at the implementation, it seems that every instance of NotFound is just converted back to OK. So it's not clear why we care about the distinction here.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@685
PS5, Line 685: MetricJsonOptions
Seems like a poor fit given there's no JSON involved here.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@711
PS5, Line 711: , we
             :   // can use the new one for some other use, like Merge with another one.
You can drop this part of the comment, I don't think it's necessary.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@923
PS5, Line 923: std::set<std::string> initial_unique_values = {}
Where is this used?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@934
PS5, Line 934: values
Call this "unique_values()"? Or doc what it means if it's not obvious.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@937
PS5, Line 937:   std::set<std::string> unique_values_;
This only exists to support merging, right? May want to use unordered_set instead; it may be more efficient.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@951
PS5, Line 951: dynamic_cast
Use down_cast (gutil/casts) instead of dynamic_cast.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@1057
PS5, Line 1057:     // The bounded function is associated with another MetricEntity instance, here we don't know
              :     // when it release, it's not safe to keep the function as a member, so it's needed to
              :     // call DetachToCurrentValue() to make it safe.
This is pretty icky because it violates the clone() contract (of doing a deep copy).


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@1113
PS5, Line 1113:   // value() will be constant after Merge()
Is this why we need to deep copy metrics?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@1215
PS5, Line 1215:       scoped_refptr<Metric> m = new Histogram(dynamic_cast<const HistogramPrototype*>(prototype_),
Nit: indentation.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.h@1262
PS5, Line 1262:   Histogram(const HistogramPrototype* proto, const HdrHistogram& hdr_hist);
Where is this used?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.cc
File src/kudu/util/metrics.cc:

http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.cc@72
PS5, Line 72: 
Shouldn't we also (optionally) write entity attributes?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.cc@75
PS5, Line 75:     for (const auto& val : collection.second) {
Why doesn't this replicate the epoch/untouched check that WriteAsJson does?

On that note, could we reuse that code instead of duplicating it?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.cc@365
PS5, Line 365:   std::pair<typename MetricCollection::iterator, bool> ret =
             :       collections->insert(typename MetricCollection::value_type(e, CollectMetrics()));
Is there a gutil/map-util.h function that'd be appropriate (and more clear) here?


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.cc@461
PS5, Line 461: opts.include_unmerged_entities_if_merged
Is this an important use case for you? It adds complexity and there's an annoyance here where we actually end up taking two different "snapshots" of the metrics. Meaning, if you manually compared the collected/merged values with the unmerged values, you might see discrepancies if there were changes to metric values in between.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.cc@671
PS5, Line 671: std::
Don't need std:: prefixes in these sets and strings.


http://gerrit.cloudera.org:8080/#/c/13580/5/src/kudu/util/metrics.cc@710
PS5, Line 710:   std::lock_guard<simple_spinlock> l(lock_);
Hairy locking: we'll hold two StringGauge locks together at the same time for every pair being merged.



-- 
To view, visit http://gerrit.cloudera.org:8080/13580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8db3d082ae847eb1d83b9e4aee57d5e4bf13e1b5
Gerrit-Change-Number: 13580
Gerrit-PatchSet: 5
Gerrit-Owner: Yingchun Lai <40...@qq.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Yingchun Lai <40...@qq.com>
Gerrit-Comment-Date: Wed, 10 Jul 2019 07:02:40 +0000
Gerrit-HasComments: Yes