Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/11/27 19:45:00 UTC

[jira] [Commented] (IMPALA-9382) Prototype denser runtime profile implementation

    [ https://issues.apache.org/jira/browse/IMPALA-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239822#comment-17239822 ] 

ASF subversion and git services commented on IMPALA-9382:
---------------------------------------------------------

Commit 9429bd779de986d3e61858bef7e258bd73a2cacd in impala's branch refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=9429bd7 ]

IMPALA-9382: part 2/3: aggregate profiles sent to coordinator

This reworks the status reporting so that serialized
AggregatedRuntimeProfile objects are sent from executors
to coordinators. These profiles are substantially denser
and faster to process for higher mt_dop values. The aggregation
is also done in a single step, merging the aggregated thrift
profile from the executor directly into the final aggregated
profile, instead of converting it to an unaggregated profile
first.

The changes required were:
* A new Update() method for AggregatedRuntimeProfile that
  updates the profile from a serialized AggregatedRuntimeProfile
  for a subset of the instances. The code is generalized from the
  existing InitFromThrift() code path.
* Per-fragment reports included in the status report protobuf
  when --gen_experimental_profile=true.
* Logic on the coordinator that either consumes serialized
  AggregatedRuntimeProfile per fragment, when
  --gen_experimental_profile=true, or consumes a serialized
  RuntimeProfile per finstance otherwise.
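
The single-step merge described above can be sketched roughly as follows. This is a minimal illustration, not Impala's code: the names PartialAggregatedProfile and AggregatedProfileSketch, the per-instance vector layout, and the Sum()/Value() helpers are all assumptions made for the example. It only shows the idea of an Update() that writes a report covering a subset of instances directly into the final aggregate, without first expanding it into per-instance profiles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one executor's serialized report: counter values
// for a contiguous subset of a fragment's instances.
struct PartialAggregatedProfile {
  std::size_t start_idx = 0;      // index of the first instance covered
  std::size_t num_instances = 0;  // number of instances covered
  // counter name -> one value per covered instance
  std::unordered_map<std::string, std::vector<int64_t>> counters;
};

// Hypothetical coordinator-side aggregate: one slot per instance per counter.
class AggregatedProfileSketch {
 public:
  explicit AggregatedProfileSketch(std::size_t total_instances)
      : total_instances_(total_instances) {}

  // Merges a partial report straight into the final aggregate.
  // Returns false, leaving the aggregate unchanged, if the report is malformed.
  bool Update(const PartialAggregatedProfile& partial) {
    if (partial.start_idx > total_instances_ ||
        partial.num_instances > total_instances_ - partial.start_idx) {
      return false;
    }
    for (const auto& entry : partial.counters) {
      if (entry.second.size() != partial.num_instances) return false;
    }
    for (const auto& entry : partial.counters) {
      std::vector<int64_t>& slots = counters_[entry.first];
      slots.resize(total_instances_, 0);  // no-op after the first report
      for (std::size_t i = 0; i < partial.num_instances; ++i) {
        slots[partial.start_idx + i] = entry.second[i];
      }
    }
    return true;
  }

  // Sum of a counter across all instances reported so far (0 if unknown).
  int64_t Sum(const std::string& name) const {
    auto it = counters_.find(name);
    if (it == counters_.end()) return 0;
    int64_t total = 0;
    for (int64_t v : it->second) total += v;
    return total;
  }

  // Value of a counter for a single instance (0 if the counter is unknown).
  int64_t Value(const std::string& name, std::size_t instance_idx) const {
    assert(instance_idx < total_instances_);
    auto it = counters_.find(name);
    return it == counters_.end() ? 0 : it->second[instance_idx];
  }

 private:
  std::size_t total_instances_;
  std::unordered_map<std::string, std::vector<int64_t>> counters_;
};
```

In this sketch, the coordinator would construct one aggregate per fragment and call Update() once per executor report. Each call costs time proportional to the number of counters times the number of instances in that report.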

This also adds support for event sequences and time series
in the aggregated profile, so the amount of information
in the aggregated profile is now on par with the basic profile.

We also finish off support for the JSON profile. The JSON profile is
more stripped down because we do not need to round-trip profiles
via JSON, and JSON is a much less dense profile representation.

Part 3 will clean up and improve the display of the profile.

Testing:
* Added sanity tests for the aggregated runtime profile.
* Added unit tests to exercise aggregation of the various counter types.
* Ran core tests.

Change-Id: Ic680cbfe94c939c2a8fad9d0943034ed058c6bca
Reviewed-on: http://gerrit.cloudera.org:8080/16057
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Prototype denser runtime profile implementation
> -----------------------------------------------
>
>                 Key: IMPALA-9382
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9382
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>         Attachments: profile_504b379400cba9f2_2d2cf00700000000, tpcds_q10_profile_v1.txt, tpcds_q10_profile_v2.txt, tpcds_q10_profile_v2.txt
>
>
> RuntimeProfile trees can potentially stress the memory allocator and use up a lot more memory and cache than is really necessary:
> * std::map is used throughout, and allocates a node per map entry. We do depend on the counters being displayed in-order, but we would probably be better off storing the counters in a vector and lazily sorting when needed (since the set of counters is generally static after Prepare()).
> * We store the same counter names redundantly all over the place. We'd probably be best off using a pool of constant counter names (we could just require registering them upfront).
> There may be a small gain from switching thrift to using unordered_map, e.g. for the info strings that appear with some frequency in profiles.
> However, I think we need to restructure the thrift representation and in-memory representation to get significant gains.
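
The two ideas in the description above (a shared pool of counter names, and a vector of counters that is sorted lazily) can be sketched roughly as follows. This is an illustrative sketch only: CounterNamePool and CounterList are hypothetical names, not Impala classes, and the real RuntimeProfile counters carry more state than a single integer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical name pool: each distinct counter name is stored once.
// std::unordered_set is node-based, so pointers to its elements stay valid
// when the set grows or rehashes.
class CounterNamePool {
 public:
  const std::string* Intern(const std::string& name) {
    return &*names_.insert(name).first;
  }

 private:
  std::unordered_set<std::string> names_;
};

// Hypothetical counter container: contiguous storage instead of one heap
// node per entry, sorted by name only when an ordered view is requested.
class CounterList {
 public:
  using Entry = std::pair<const std::string*, int64_t>;

  void Add(const std::string* name, int64_t value) {
    counters_.emplace_back(name, value);
    sorted_ = false;
  }

  // Returns the counters ordered by name, sorting only if something was
  // added since the last call.
  const std::vector<Entry>& Sorted() {
    if (!sorted_) {
      std::sort(counters_.begin(), counters_.end(),
                [](const Entry& a, const Entry& b) {
                  return *a.first < *b.first;
                });
      sorted_ = true;
    }
    return counters_;
  }

 private:
  std::vector<Entry> counters_;
  bool sorted_ = true;  // an empty list is trivially sorted
};
```

If the set of counters is static after setup, as the description suggests, the sort in this sketch runs at most once per profile and each later ordered read is a plain vector scan.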


