You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Abhishek Rawat (Code Review)" <ge...@cloudera.org> on 2019/04/08 23:09:04 UTC

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Abhishek Rawat has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12956


Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................

IMPALA-8375: Add metrics for spill disk usage

Added a new metric tmp-file-mgr.scratch-space-bytes-used-high-water-mark
for tracking HWM for spilled bytes.

A new class AtomicHighWaterMarkMetric was added to keep track of the
current value and the HWM value. The current value for spilled bytes is
incremented every time a new range is allocated from a temporary file.
The current value for spilled bytes is decremented when a temporary file
is closed. The new metric is not updated when ranges are recycled from a
file. We can add a new metric in future for keeping track of actual
spilled bytes. The HWM value is updated whenever the current value is
greater than the HWM value.

Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
---
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.h
M common/thrift/metrics.json
5 files changed, 98 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/12956/3
-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 3
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 6:

Thanks for your first contribution!


-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Apr 2019 03:56:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4028/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 5
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Apr 2019 22:09:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2807/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 5
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Apr 2019 22:06:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2691/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 3
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Apr 2019 23:40:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................

IMPALA-8375: Add metrics for spill disk usage

Added two new metrics tmp-file-mgr.scratch-space-bytes-used-high-water-mark
& tmp-file-mgr.scratch-space-bytes-used for tracking HWM and current
value for spilled bytes, respectively.

A new class AtomicHighWaterMarkGauge was added to keep track of the HWM
value. The new class also encapsulates a metric object which keeps track
of the current value for the spilled bytes.

The current value is incremented every time a new range is allocated from
a temporary file. The current value for spilled bytes is decremented when
a temporary file is closed. The new metrics are not updated when ranges
are recycled from a file. We can add a new metric in future for keeping
track of actual spilled bytes. The HWM value is updated whenever the
current value is greater than the HWM value.

Testing:
- Added new unit tests to the metrics-test test case.
- E2E testing for both the metrics by running concurrent spilling queries
  and ensuring that both the current value metric and the HWM metric were
  behaving as expected. Ran concurrent queries and monitored the metrics
  on the impala daemon's metric page.

Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Reviewed-on: http://gerrit.cloudera.org:8080/12956
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/CMakeLists.txt
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.h
M common/thrift/metrics.json
6 files changed, 135 insertions(+), 3 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 5
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Apr 2019 22:09:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Abhishek Rawat (Code Review)" <ge...@cloudera.org>.
Abhishek Rawat has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................

IMPALA-8375: Add metrics for spill disk usage

Added two new metrics tmp-file-mgr.scratch-space-bytes-used-high-water-mark
& tmp-file-mgr.scratch-space-bytes-used for tracking HWM and current
value for spilled bytes, respectively.

A new class AtomicHighWaterMarkGauge was added to keep track of the HWM
value. The new class also encapsulates a metric object which keeps track
of the current value for the spilled bytes.

The current value is incremented every time a new range is allocated from
a temporary file. The current value for spilled bytes is decremented when
a temporary file is closed. The new metrics are not updated when ranges
are recycled from a file. We can add a new metric in future for keeping
track of actual spilled bytes. The HWM value is updated whenever the
current value is greater than the HWM value.

Testing:
- Added new unit tests to the metrics-test test case.
- E2E testing for both the metrics by running concurrent spilling queries
  and ensuring that both the current value metric and the HWM metric were
  behaving as expected. Ran concurrent queries and monitored the metrics
  on the impala daemon's metric page.

Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
---
M be/CMakeLists.txt
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.h
M common/thrift/metrics.json
6 files changed, 134 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/12956/4
-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12956/4/be/src/util/metrics.h
File be/src/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/12956/4/be/src/util/metrics.h@254
PS4, Line 254: /// maintains the current value. Note that since two separate atomics are used
> I have made the suggested change. I was looking for a standard way to docum
I'm open to the idea. We use the three-slash comments so that they're picked up by doxygen but don't use any other features.

Some of the earlier authors didn't particularly like overly-structured javadoc-style comments - sometimes people get over-zealous and start over-documenting obvious parameters and return values.

dev@impala.apache.org is a good place to float ideas like this, in the past we've made changes to the style guide based on informal consensus on the dev list and implemented them incrementally as code was touched.



-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Apr 2019 22:15:38 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Abhishek Rawat (Code Review)" <ge...@cloudera.org>.
Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 4:

(7 comments)

Addressed review comments on patch #3.

http://gerrit.cloudera.org:8080/#/c/12956/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12956/3//COMMIT_MSG@20
PS3, Line 20: are recycled from a file. We can add a new metric in future for keeping
> Can you add a brief "Testing:" section to describe the automated tests you 
Done


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/runtime/tmp-file-mgr.cc@285
PS3, Line 285:     Status status = file->Remove();
> nit: unnecessary parentheses
Done


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics-test.cc
File be/src/util/metrics-test.cc:

http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics-test.cc@112
PS3, Line 112:   AddMetricDef("hwm_gauge", TMetricKind::GAUGE, TUnit::NONE);
> I think if you cast this to the base class IntGauge the behaviour will chan
Addressed this by making the new class AtomicHighWaterMarkGauge inherit directly from ScalarMetric<>.


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h
File be/src/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h@255
PS3, Line 255: /// for maintaining the current value and HWM, they could be out of sync for a short
> I'm wondering if it's better to have this implemented as two metrics - one 
Done


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h@263
PS3, Line 263:     : ScalarMetric<int64_t, TMetricKind::GAUGE>(metric_def),
> Let's mark it is an override so it's clear that it's overriding something f
Done


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h@269
PS3, Line 269:   ~AtomicHighWaterMarkGauge() {}
> I think we have a problem here - SetValue() is not a virtual function so we
Addressed this by making the new class AtomicHighWaterMarkGauge inherit directly from ScalarMetric<>.


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h@276
PS3, Line 276:     current_value_->SetValue(value);
> Same issue here with it being non-virtual.
Addressed this by making the new class AtomicHighWaterMarkGauge inherit directly from ScalarMetric<>.



-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sun, 14 Apr 2019 23:56:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 4:

(4 comments)

Looks good, just had a few comments about comments and naming.

http://gerrit.cloudera.org:8080/#/c/12956/4/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/12956/4/be/src/runtime/tmp-file-mgr.h@429
PS4, Line 429: scratch_space_bytes_used_high_water_mark_metric_
This could be more concise, e.g. scratch_bytes_used_metric_


http://gerrit.cloudera.org:8080/#/c/12956/4/be/src/util/metrics.h
File be/src/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/12956/4/be/src/util/metrics.h@254
PS4, Line 254: /// maintains the current value. Note that since two separate atomics are used
Everything except the first sentence is describing implementation details. This is fine, but it's helpful to separate from the actual interface. Maybe just have as a separate implementation notes section, e.g.

  /// ... description of public interface...
  ///
  /// Implementation notes:
  /// ...


http://gerrit.cloudera.org:8080/#/c/12956/4/be/src/util/metrics.h@271
PS4, Line 271: Atomicaly
Atomically.


http://gerrit.cloudera.org:8080/#/c/12956/4/be/src/util/metrics.h@271
PS4, Line 271: hwm_value_
Prefer not to mention private members in public interface. Could just rewrite as:

  /// Returns the current high water mark value.



-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Apr 2019 05:44:38 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 5: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 5
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Apr 2019 03:19:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 6:

Please close the JIRA and set the "Fix version" when you have a chance.


-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Apr 2019 03:57:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Abhishek Rawat (Code Review)" <ge...@cloudera.org>.
Abhishek Rawat has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................

IMPALA-8375: Add metrics for spill disk usage

Added two new metrics tmp-file-mgr.scratch-space-bytes-used-high-water-mark
& tmp-file-mgr.scratch-space-bytes-used for tracking HWM and current
value for spilled bytes, respectively.

A new class AtomicHighWaterMarkGauge was added to keep track of the HWM
value. The new class also encapsulates a metric object which keeps track
of the current value for the spilled bytes.

The current value is incremented every time a new range is allocated from
a temporary file. The current value for spilled bytes is decremented when
a temporary file is closed. The new metrics are not updated when ranges
are recycled from a file. We can add a new metric in future for keeping
track of actual spilled bytes. The HWM value is updated whenever the
current value is greater than the HWM value.

Testing:
- Added new unit tests to the metrics-test test case.
- E2E testing for both the metrics by running concurrent spilling queries
  and ensuring that both the current value metric and the HWM metric were
  behaving as expected. Ran concurrent queries and monitored the metrics
  on the impala daemon's metric page.

Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
---
M be/CMakeLists.txt
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.h
M common/thrift/metrics.json
6 files changed, 135 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/12956/5
-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 5
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2774/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 15 Apr 2019 00:40:05 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8375: Add metrics for spill disk usage

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12956 )

Change subject: IMPALA-8375: Add metrics for spill disk usage
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/12956/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12956/3//COMMIT_MSG@20
PS3, Line 20: 
Can you add a brief "Testing:" section to describe the automated tests you added and any manual tests you ran (e.g. looked at the metrics page.

I find that helpful for reviewing.


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/runtime/tmp-file-mgr.cc@285
PS3, Line 285:           -1 * (scratch_space_bytes_used_counter_->value()));
nit: unnecessary parentheses


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics-test.cc
File be/src/util/metrics-test.cc:

http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics-test.cc@112
PS3, Line 112:   IntHWMGauge* int_hwm_gauge = metrics.AddHWMGauge("gauge", 0);
I think if you cast this to the base class IntGauge the behaviour will change because of the non-virtual methods.


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h
File be/src/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h@255
PS3, Line 255: class AtomicHighWaterMarkMetric : public IntGauge {
I'm wondering if it's better to have this implemented as two metrics - one which is the underlying metric and other which is the high water mark. I think if the HWM is useful then the other metric is also useful to expose. E.g. this could be a wrapper around an AtomicMetric similar to NegatedGauge below.


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h@263
PS3, Line 263:   int64_t GetValue() { return hwm_value_.Load(); }
Let's mark it is an override so it's clear that it's overriding something from the base class.


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h@269
PS3, Line 269:   void SetValue(const int64_t& value) {
I think we have a problem here - SetValue() is not a virtual function so we can easily call the wrong implementation if we have a pointer to IntGauge rather than AtomicHighWaterMarkMetric.


http://gerrit.cloudera.org:8080/#/c/12956/3/be/src/util/metrics.h@276
PS3, Line 276:   int64_t Increment(int64_t delta) {
Same issue here with it being non-virtual.

I think if we implemented this as a subclass of ScalarMetric<> wrapping an AtomicMetric, we would avoid this issue since it wouldn't be in the same hierarchy as AtomicMetric.



-- 
To view, visit http://gerrit.cloudera.org:8080/12956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1b3dd604c7234a8d8af34d70ca731544a46d298
Gerrit-Change-Number: 12956
Gerrit-PatchSet: 3
Gerrit-Owner: Abhishek Rawat <ar...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Apr 2019 23:50:25 +0000
Gerrit-HasComments: Yes