Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/10/03 20:00:01 UTC
[jira] [Commented] (IMPALA-9382) Prototype denser runtime profile implementation

    [ https://issues.apache.org/jira/browse/IMPALA-9382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206847#comment-17206847 ] 

ASF subversion and git services commented on IMPALA-9382:
---------------------------------------------------------

Commit e60292fb3bd71f25b90119d0d48292f4c49e158f in impala's branch refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e60292f ]

IMPALA-9711: incrementally update aggregate profile

In order to not cause additional work in the default mode,
we still only compute the average once per instance,
when it completes or when the query finishes.

When --gen_experimental_profile=true, we update the aggregated
profile for each status report, so that the live profile
can be viewed as the query executes.

The implications of this are as follows:
* More work is done on the KRPC control service RPC thread
  (although this is largely moot after part 2 of IMPALA-9382,
   where we merge into the aggregated profile directly
   and so avoid the extra update).
* For complex multi-stage queries, the profile merging
  work is done earlier as each stage completes, which
  shortens the critical path of the query.
* Multiple RPC threads may be merging profiles concurrently.
* Multiple threads may be calling AggregatedRuntimeProfile::Update()
  on the same profile, whereas previously all merging was done by
  a single thread. I looked through the locking in that function to
  check correctness.
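
To illustrate the two modes and the locking concern described above, here
is a minimal sketch. It is not Impala's code: AggregatedProfileSketch,
OnStatusReport() and the "incremental" constructor argument are
hypothetical names standing in for AggregatedRuntimeProfile::Update() and
--gen_experimental_profile, and a single mutex is assumed to be enough.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <vector>

using Counters = std::map<std::string, int64_t>;

// Hypothetical stand-in for an aggregated profile; not Impala's actual class.
class AggregatedProfileSketch {
 public:
  // 'incremental' plays the role of --gen_experimental_profile (assumption).
  AggregatedProfileSketch(int num_instances, bool incremental)
    : num_instances_(num_instances), incremental_(incremental) {}

  // Called for every status report. May be invoked by several RPC threads
  // at once, so all shared state is touched only while holding lock_.
  void OnStatusReport(int instance_idx, const Counters& counters, bool instance_done) {
    // Default mode: skip intermediate reports and merge only the final one.
    if (!incremental_ && !instance_done) return;
    std::lock_guard<std::mutex> l(lock_);
    for (const auto& entry : counters) {
      std::vector<int64_t>& per_instance = values_[entry.first];
      if (per_instance.empty()) per_instance.resize(num_instances_, 0);
      per_instance[instance_idx] = entry.second;  // Newer report overwrites older.
    }
  }

  // Average of one counter across all instances; 0 if it was never reported.
  double Average(const std::string& name) const {
    std::lock_guard<std::mutex> l(lock_);
    auto it = values_.find(name);
    if (it == values_.end()) return 0.0;
    int64_t sum = 0;
    for (int64_t v : it->second) sum += v;
    return static_cast<double>(sum) / num_instances_;
  }

 private:
  const int num_instances_;
  const bool incremental_;
  mutable std::mutex lock_;
  std::map<std::string, std::vector<int64_t>> values_;
};
```

In this sketch, a profile constructed with incremental=false ignores a
report with instance_done=false, while one constructed with
incremental=true reflects it immediately in Average().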

Testing:
Ran core tests.

Ran a subset of the Python tests under TSAN and confirmed that no races
were introduced in this code.

Change-Id: Ib03e79a40a33d8e74464640ae5f95a1467a6713a
Reviewed-on: http://gerrit.cloudera.org:8080/15931
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Prototype denser runtime profile implementation
> -----------------------------------------------
>
>                 Key: IMPALA-9382
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9382
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>         Attachments: profile_504b379400cba9f2_2d2cf00700000000, tpcds_q10_profile_v1.txt, tpcds_q10_profile_v2.txt, tpcds_q10_profile_v2.txt
>
>
> RuntimeProfile trees can potentially stress the memory allocator and use up a lot more memory and cache than is really necessary:
> * std::map is used throughout, and allocates a node per map entry. We do depend on the counters being displayed in-order, but we would probably be better off storing the counters in a vector and lazily sorting when needed (since the set of counters is generally static after Prepare()).
> * We store the same counter names redundantly all over the place. We'd probably be best off using a pool of constant counter names (we could just require registering them upfront).
> There may be a small gain from switching thrift to using unordered_map, e.g. for the info strings that appear with some frequency in profiles.
> However, I think we need to restructure the thrift representation and in-memory representation to get significant gains.
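
The two ideas in the description above (a vector that is sorted lazily,
and a shared pool of counter names) could look roughly like the sketch
below. CounterNamePool and DenseCounters are hypothetical names invented
for illustration; they are not taken from the Impala code base, and the
actual prototype may be structured quite differently.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical pool: each distinct counter name is registered once and
// referred to elsewhere by a small integer id.
class CounterNamePool {
 public:
  // Returns the id for 'name', registering it on first use.
  int Intern(const std::string& name) {
    auto it = ids_.find(name);
    if (it != ids_.end()) return it->second;
    int id = static_cast<int>(names_.size());
    names_.push_back(name);
    ids_.emplace(name, id);
    return id;
  }

  const std::string& Name(int id) const { return names_[id]; }

 private:
  std::vector<std::string> names_;
  std::unordered_map<std::string, int> ids_;
};

// Hypothetical counter set: one contiguous vector instead of a node-based
// std::map, sorted by name only when an ordered view is requested.
class DenseCounters {
 public:
  explicit DenseCounters(const CounterNamePool* pool) : pool_(pool) {}

  void Add(int name_id, int64_t value) {
    counters_.emplace_back(name_id, value);
    sorted_ = false;  // Defer the sort until someone needs ordered output.
  }

  // Returns (name id, value) pairs ordered by counter name.
  const std::vector<std::pair<int, int64_t>>& Sorted() {
    if (!sorted_) {
      std::sort(counters_.begin(), counters_.end(),
                [this](const auto& a, const auto& b) {
                  return pool_->Name(a.first) < pool_->Name(b.first);
                });
      sorted_ = true;
    }
    return counters_;
  }

 private:
  const CounterNamePool* pool_;
  std::vector<std::pair<int, int64_t>> counters_;
  bool sorted_ = true;
};
```

Because the set of counters is described as generally static after
Prepare(), the sort in this sketch would normally run once and later
calls to Sorted() would return the cached order.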



--
This message was sent by Atlassian Jira
(v8.3.4#803005)