You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org> on 2020/05/01 22:43:30 UTC

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Thomas Tauber-Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15844


Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................

IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

The new admission control service will be written in protobuf, so
there are various admission control related structures currently
stored in Thrift that it would be convenient to convert to protobuf,
to minimize the amount of converting back and forth that needs to be
done.

This patch converts some portions of TExecPlanFragmentInfo to
protobuf. TExecPlanFragmentInfo is sent as a sidecar with the Exec()
rpc, so the refactored parts are now just directly included in the
ExecQueryFInstancesRequestPB.

The portions that are converted are those that are part of the
QuerySchedule, in particular the TPlanFragmentDestination,
TScanRangeParams, and TJoinBuildInput.

This patch is just a refactor and doesn't contain any functional
changes.

Testing:
- Passed a full run of existing tests.

Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exec/scan-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/rpc/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/query-state.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/fe-support.cc
M be/src/util/uid-util.h
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
M common/protobuf/control_service.proto
A common/protobuf/planner.proto
M common/thrift/ImpalaInternalService.thrift
49 files changed, 617 insertions(+), 372 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/15844/1
-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/15844/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15844/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
> This patch is the last set of stuff I'm planning on refactoring for this pr
yeah, just having trouble understanding why these specific parts of the QuerySchedule need to be converted to protobuf? If there was a design doc for the global admission controller along with the kRPC interface we plan to use for communication between the coordinator and the global admission controller, that would probably help a lot.

it would be nice to know why the conversion to protobuf is strictly necessary for use with kRPC (I'm assuming thats why we are doing this?). I see that we sometimes just serialize Thrift objects and set them as kRPC sidecars, is there a reason we couldn't do the same thing for all these Thrift objects in QuerySchedule?

but if this is the last patch, its not a huge deal.


http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/coordinator-backend-state.cc@144
PS4, Line 144:     instance_ctx.fragment_idx = fragment_idx;
             :     instance_ctx_pb->set_fragment_idx(fragment_idx);
if you decide to create the PlanFragmentInstanceCtx struct, it would be nice if it had a SetFragmentIdx method that encapsulated this.


http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/fragment-state.h
File be/src/runtime/fragment-state.h:

http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/fragment-state.h@63
PS4, Line 63: pb
nit: I'm not sure suffix-ing these variables with '_pb' is adding that much value. feel like it makes the variable names more verbose than necessary

although I guess TPlanFragmentInstanceCtx and PlanFragmentInstanceCtxPB would be the one exception, since they are exactly the same.


http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/fragment-state.h@163
PS4, Line 163: fragment_
its confusing to me that we call this 'fragment_' seems like 'fragment_info_' would be a better name given that the docs for TPlanFragment say:

 TPlanFragment encapsulates info needed to execute a particular plan fragment

Although I guess calling it 'fragment_' is the standard throughout the code. I'll leave it up to you to decide if its worth renaming or not.


http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/fragment-state.h@167
PS4, Line 167: PlanFragmentInstanceCtxPB
for TPlanFragmentInstanceCtx / PlanFragmentInstanceCtxPB, would it make sense to have a struct to represent the logical PlanFragmentInstanceCtx? - e.g.

 struct PlanFragmentInstanceCtx {
   TPlanFragmentInstanceCtx instance_ctx_thrift_;
   PlanFragmentInstanceCtxPB instance_ctx_pb_;
 }

I haven't read through all the code yet, but I would guess you have to pass around all the objects together. I think it makes the code easier to understand as well.


http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/scheduling/scheduler.cc@1054
PS4, Line 1054: CompressionType THdfsCompressionToProto(const THdfsCompression::type& compression) {
is this a copy of what has in hdfs-scan-node? should we re-factor these to a common util class?


http://gerrit.cloudera.org:8080/#/c/15844/4/common/protobuf/common.proto
File common/protobuf/common.proto:

http://gerrit.cloudera.org:8080/#/c/15844/4/common/protobuf/common.proto@44
PS4, Line 44: CompressionType
should be CompressionTypePB?



-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 May 2020 19:23:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/6103/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 May 2020 23:02:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 4: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 May 2020 06:39:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 6: Code-Review+2

A few nits, but otherwise LGTM


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 6
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 May 2020 19:57:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................

IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

The new admission control service will be written in protobuf, so
there are various admission control related structures currently
stored in Thrift that it would be convenient to convert to protobuf,
to minimize the amount of converting back and forth that needs to be
done.

This patch converts some portions of TExecPlanFragmentInfo to
protobuf. TExecPlanFragmentInfo is sent as a sidecar with the Exec()
rpc, so the refactored parts are now just directly included in the
ExecQueryFInstancesRequestPB.

The portions that are converted are those that are part of the
QuerySchedule, in particular the TPlanFragmentDestination,
TScanRangeParams, and TJoinBuildInput.

This patch is just a refactor and doesn't contain any functional
changes.

One notable related change is that DataSink::CreateSink() has two
parameters removed - TPlanFragmentCtx (which no longer exists) and
TPlanFragmentInstanceCtx. These variables and the new PB eqivalents
are available via the RuntimeState that was already being passed in as
another parameter and don't need to be individually passed in.

Testing:
- Passed a full run of existing tests.
- Ran the single node perf test and didn't detect any regressions.

Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Reviewed-on: http://gerrit.cloudera.org:8080/15844
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exec/scan-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/rpc/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/query-state.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/fe-support.cc
M be/src/util/CMakeLists.txt
A be/src/util/compression-util.cc
A be/src/util/compression-util.h
M be/src/util/container-util.h
M be/src/util/uid-util.h
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
M common/protobuf/control_service.proto
A common/protobuf/planner.proto
M common/protobuf/row_batch.proto
M common/thrift/ImpalaInternalService.thrift
M common/thrift/PlanNodes.thrift
56 files changed, 699 insertions(+), 385 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 9
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 8: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 8
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Thu, 28 May 2020 20:06:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5930/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 May 2020 23:28:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15844/5/be/src/util/compression-util.h
File be/src/util/compression-util.h:

http://gerrit.cloudera.org:8080/#/c/15844/5/be/src/util/compression-util.h@20
PS5, Line 20: #include "gen-cpp/CatalogObjects_types.h"
            : #include "gen-cpp/common.pb.h"
can u just use forward declarations for this instead of including the whole header file?


http://gerrit.cloudera.org:8080/#/c/15844/5/be/src/util/compression-util.h@25
PS5, Line 25: Comvert
nit: typo here and below



-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 May 2020 19:57:04 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/6150/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Thu, 28 May 2020 19:10:16 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15844

to look at the new patch set (#4).

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................

IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

The new admission control service will be written in protobuf, so
there are various admission control related structures currently
stored in Thrift that it would be convenient to convert to protobuf,
to minimize the amount of converting back and forth that needs to be
done.

This patch converts some portions of TExecPlanFragmentInfo to
protobuf. TExecPlanFragmentInfo is sent as a sidecar with the Exec()
rpc, so the refactored parts are now just directly included in the
ExecQueryFInstancesRequestPB.

The portions that are converted are those that are part of the
QuerySchedule, in particular the TPlanFragmentDestination,
TScanRangeParams, and TJoinBuildInput.

This patch is just a refactor and doesn't contain any functional
changes.

One notable related change is that DataSink::CreateSink() has two
parameters removed - TPlanFragmentCtx (which no longer exists) and
TPlanFragmentInstanceCtx. These variables and the new PB eqivalents
are available via the RuntimeState that was already being passed in as
another parameter and don't need to be individually passed in.

Testing:
- Passed a full run of existing tests.
- Ran the single node perf test and didn't detect any regressions.

Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exec/scan-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/rpc/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/query-state.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/fe-support.cc
M be/src/util/container-util.h
M be/src/util/uid-util.h
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
M common/protobuf/control_service.proto
A common/protobuf/planner.proto
M common/thrift/ImpalaInternalService.thrift
M common/thrift/PlanNodes.thrift
51 files changed, 635 insertions(+), 379 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/15844/4
-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15844

to look at the new patch set (#3).

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................

IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

The new admission control service will be written in protobuf, so
there are various admission control related structures currently
stored in Thrift that it would be convenient to convert to protobuf,
to minimize the amount of converting back and forth that needs to be
done.

This patch converts some portions of TExecPlanFragmentInfo to
protobuf. TExecPlanFragmentInfo is sent as a sidecar with the Exec()
rpc, so the refactored parts are now just directly included in the
ExecQueryFInstancesRequestPB.

The portions that are converted are those that are part of the
QuerySchedule, in particular the TPlanFragmentDestination,
TScanRangeParams, and TJoinBuildInput.

This patch is just a refactor and doesn't contain any functional
changes.

One notable related change is that DataSink::CreateSink() has two
parameters removed - TPlanFragmentCtx (which no longer exists) and
TPlanFragmentInstanceCtx. These variables and the new PB eqivalents
are available via the RuntimeState that was already being passed in as
another parameter and don't need to be individually passed in.

Testing:
- Passed a full run of existing tests.

Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exec/scan-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/rpc/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/query-state.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/fe-support.cc
M be/src/util/container-util.h
M be/src/util/uid-util.h
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
M common/protobuf/control_service.proto
A common/protobuf/planner.proto
M common/thrift/ImpalaInternalService.thrift
50 files changed, 623 insertions(+), 375 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/15844/3
-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5815/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 May 2020 01:27:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/15844/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15844/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
> yeah, just having trouble understanding why these specific parts of
 > the QuerySchedule need to be converted to protobuf? If there was a
 > design doc for the global admission controller along with the kRPC
 > interface we plan to use for communication between the coordinator
 > and the global admission controller, that would probably help a
 > lot.

There's a good design doc that Lars wrote that we've been using internally. Its a little outdated, and its good to make these things public, so I went ahead and copied it to a public google doc and made some updates:
https://docs.google.com/document/d/1nLovzk4rwZLct2yl-eT7VMHFBP7LswcJp7RqiP59RAc/edit?usp=sharing

Something that might help is if you see the final patch in this chain that pulls it all together. I submitted a WIP version: https://gerrit.cloudera.org/#/c/15961/

 > 
 > it would be nice to know why the conversion to protobuf is strictly
 > necessary for use with kRPC (I'm assuming thats why we are doing
 > this?). I see that we sometimes just serialize Thrift objects and
 > set them as kRPC sidecars, is there a reason we couldn't do the
 > same thing for all these Thrift objects in QuerySchedule?
 > 
 > but if this is the last patch, its not a huge deal.

I think serializing Thrift into sidecars ourselves makes sense if you just have a single Thrift struct, the way with Exec() we send one TExecPlanFragmentInfo per rpc. The way QuerySchedule is currently written, there are a bunch of individual Thrift structs (each of TPlanFragmentDestination, TJoinBuildInput, and TScanRangeParams have many instances per schedule) and so converting them to sidecars would have been a bunch of extra boiler plate that needed to get added.

It also might not even be possible - there's currently a limit in krpc of 10,000 sidecars per rpc, which we maybe could hit with a complicated schedule.

Another option would be to convert the whole schedule into a single TQuerySchedule, which would make a sidecar reasonable, but I think we're better off moving away from using Thrift where possible not adding even more Thrift, so while we're here it makes sense to do these couple of refactor patches.


http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/fragment-state.h
File be/src/runtime/fragment-state.h:

http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/fragment-state.h@63
PS4, Line 63: ) 
> nit: I'm not sure suffix-ing these variables with '_pb' is adding that much
Done


http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/fragment-state.h@163
PS4, Line 163: fragment_
> its confusing to me that we call this 'fragment_' seems like 'fragment_info
Yeah, its not ideal, but as you note its standard and its also nice that its concise, so I would prefer to leave it.


http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/runtime/fragment-state.h@167
PS4, Line 167: PlanFragmentInstanceCtxPB
> for TPlanFragmentInstanceCtx / PlanFragmentInstanceCtxPB, would it make sen
Yeah I mean, we already have RuntimeState, which is passed around and provides references to these structs. I'm not sure that another layer is helpful, esp. since it'll eventually go away when we finish refactoring TPlanFragmentInstanceCtx.

If you like, I can promise to finish that refactor soon, or even just get it all done in this patch. I was avoiding that, since this patch is already so large, but looking at it now there's not all that much left except for TDebugOptions and a bunch of ints.


http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/15844/4/be/src/scheduling/scheduler.cc@1054
PS4, Line 1054: 
> is this a copy of what has in hdfs-scan-node? should we re-factor these to 
Done


http://gerrit.cloudera.org:8080/#/c/15844/4/common/protobuf/common.proto
File common/protobuf/common.proto:

http://gerrit.cloudera.org:8080/#/c/15844/4/common/protobuf/common.proto@44
PS4, Line 44: CompressionType
> should be CompressionTypePB?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 6
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 May 2020 23:41:21 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/15844/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15844/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
> is there a list of all the Thrift structs that need to be converted to prot
This patch is the last set of stuff I'm planning on refactoring for this project. As noted below, the particular things being refactorered here are those that are part of QuerySchedule.

Ultimately it would be nice to convert everything to protobuf, so that we don't have to maintain both stacks, but the value of converting most stuff is fairly low relative to the work (and we'll probably never really be able to get rid of Thrift anyways since its used for external clients) so further refactoring is pretty low priority unless there are specific reasons to do it like this project.


http://gerrit.cloudera.org:8080/#/c/15844/1/be/src/scheduling/scheduler-test.cc
File be/src/scheduling/scheduler-test.cc:

http://gerrit.cloudera.org:8080/#/c/15844/1/be/src/scheduling/scheduler-test.cc@760
PS1, Line 760:       ++range_length_count[instance_ranges[0].scan_range().hdfs_file_split().length()
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/15844/1/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/15844/1/be/src/scheduling/scheduler.cc@561
PS1, Line 561:     TUniqueIdToUniqueIdPB(instance_exec_params->back().instance_id,
> line too long (110 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/common.proto
File common/protobuf/common.proto:

http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/common.proto@44
PS3, Line 44: enum CompressionType {
> is this a mirror of THdfsCompression in CatalogObjects.thrift? Can we inclu
Done


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/control_service.proto@292
PS3, Line 292: // Protobuf portion of the execution parameters of a single fragment instance. Every
             : // fragment instance will also have a corresponding TPlanFragmentInstanceCtx with the same
             : // fragment_idx.
> this seems a bit confusing to me. this basically splits up TPlanFragmentIns
Yeah I don't think there's any clear names that would apply only to the converted/non-converted parameters. I think its good to keep the names the same to make it clear that they correspond, and ultimately we'll get rid of TPlanFragmentInstanceCtx

I updated a bunch of comments to make this more clear.


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/control_service.proto@335
PS3, Line 335:   optional int64 per_backend_mem_limit = 6;
             : 
             :   // General execution parameters for different fragments. Corresponds to 'fragments' in
             :   // the TExecPlanFragmentInfo sidecar.
             :   repeated PlanFragmentCtxPB fragment_ctxs = 7;
             : 
             :   // Execution parameters for specific fragment instances. Corresponds to
             :   // 'fragment_instance_ctxs' in the TExecPlanFragmentInfo sidec
> I'm still trying to understand the code, but it seems some of the info in t
If there is a perf change, it would just be if protobuf is faster/slower than thrift to serialize/deserialize. I would assume if anything protobuf is faster, since its a more mature project, but probably not by very much.

We're not introducing any new copies, its still the amount of data being sent (other than possible representation differences between thrift and protobuf).

I'll run some basic perf tests just for a sanity check.


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/planner.proto
File common/protobuf/planner.proto:

http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/planner.proto@1
PS3, Line 1: // Licensed to the Apache Software Foundation (ASF) under one
> why name this planner.proto? looks like most of these structs come from Pla
In the thrift, its really not intuitive to me what the difference is between PlanNodes.thrift and Planner.thrift, there's not an obvious reason why the particular structs I converted are specifically related to plan nodes, and since Planner.thrift is pretty short I figured it was fine to lose the distinction when doing the refactoring.

We haven't in general tried to match the protobuf and thrift files up, see eg. common.proto which really doesn't have any thrift equivalent, though it might have been nice to have done so.

I don't feel strongly if you would prefer I name it plan_nodes.proto


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/planner.proto@25
PS3, Line 25: // should be kept in sync with it.
> Is it a lot of effort to delete the corresponding THdfsFileSplit in PlanNod
Yeah there are THdfsFileSplits generated in the FE along with the rest of the TPlan. Its a pretty big project to convert all of the Planner over to protobuf, definitely not something that I think has much value in the short term vs. how much work it would be.



-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 May 2020 00:34:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5932/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Sat, 02 May 2020 00:32:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15844

to look at the new patch set (#2).

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................

IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

1;95;0cThe new admission control service will be written in protobuf, so
there are various admission control related structures currently
stored in Thrift that it would be convenient to convert to protobuf,
to minimize the amount of converting back and forth that needs to be
done.

This patch converts some portions of TExecPlanFragmentInfo to
protobuf. TExecPlanFragmentInfo is sent as a sidecar with the Exec()
rpc, so the refactored parts are now just directly included in the
ExecQueryFInstancesRequestPB.

The portions that are converted are those that are part of the
QuerySchedule, in particular the TPlanFragmentDestination,
TScanRangeParams, and TJoinBuildInput.

This patch is just a refactor and doesn't contain any functional
changes.

One notable related change is that DataSink::CreateSink() has two
parameters removed - TPlanFragmentCtx (which no longer exists) and
TPlanFragmentInstanceCtx. These variables and the new PB eqivalents
are available via the RuntimeState that was already being passed in as
another parameter and don't need to be individually passed in.

Testing:
- Passed a full run of existing tests.

Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exec/scan-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/rpc/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/query-state.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/fe-support.cc
M be/src/util/container-util.h
M be/src/util/uid-util.h
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
M common/protobuf/control_service.proto
A common/protobuf/planner.proto
M common/thrift/ImpalaInternalService.thrift
50 files changed, 623 insertions(+), 375 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/15844/2
-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15844

to look at the new patch set (#5).

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................

IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

The new admission control service will be written in protobuf, so
there are various admission control related structures currently
stored in Thrift that it would be convenient to convert to protobuf,
to minimize the amount of converting back and forth that needs to be
done.

This patch converts some portions of TExecPlanFragmentInfo to
protobuf. TExecPlanFragmentInfo is sent as a sidecar with the Exec()
rpc, so the refactored parts are now just directly included in the
ExecQueryFInstancesRequestPB.

The portions that are converted are those that are part of the
QuerySchedule, in particular the TPlanFragmentDestination,
TScanRangeParams, and TJoinBuildInput.

This patch is just a refactor and doesn't contain any functional
changes.

One notable related change is that DataSink::CreateSink() has two
parameters removed - TPlanFragmentCtx (which no longer exists) and
TPlanFragmentInstanceCtx. These variables and the new PB eqivalents
are available via the RuntimeState that was already being passed in as
another parameter and don't need to be individually passed in.

Testing:
- Passed a full run of existing tests.
- Ran the single node perf test and didn't detect any regressions.

Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exec/scan-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/rpc/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/query-state.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/fe-support.cc
M be/src/util/CMakeLists.txt
A be/src/util/compression-util.cc
A be/src/util/compression-util.h
M be/src/util/container-util.h
M be/src/util/uid-util.h
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
M common/protobuf/control_service.proto
A common/protobuf/planner.proto
M common/protobuf/row_batch.proto
M common/thrift/ImpalaInternalService.thrift
M common/thrift/PlanNodes.thrift
56 files changed, 699 insertions(+), 385 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/15844/5
-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15844

to look at the new patch set (#7).

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................

IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

The new admission control service will be written in protobuf, so
there are various admission control related structures currently
stored in Thrift that it would be convenient to convert to protobuf,
to minimize the amount of converting back and forth that needs to be
done.

This patch converts some portions of TExecPlanFragmentInfo to
protobuf. TExecPlanFragmentInfo is sent as a sidecar with the Exec()
rpc, so the refactored parts are now just directly included in the
ExecQueryFInstancesRequestPB.

The portions that are converted are those that are part of the
QuerySchedule, in particular the TPlanFragmentDestination,
TScanRangeParams, and TJoinBuildInput.

This patch is just a refactor and doesn't contain any functional
changes.

One notable related change is that DataSink::CreateSink() has two
parameters removed - TPlanFragmentCtx (which no longer exists) and
TPlanFragmentInstanceCtx. These variables and the new PB eqivalents
are available via the RuntimeState that was already being passed in as
another parameter and don't need to be individually passed in.

Testing:
- Passed a full run of existing tests.
- Ran the single node perf test and didn't detect any regressions.

Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exec/scan-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/rpc/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/query-state.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/fe-support.cc
M be/src/util/CMakeLists.txt
A be/src/util/compression-util.cc
A be/src/util/compression-util.h
M be/src/util/container-util.h
M be/src/util/uid-util.h
M common/protobuf/CMakeLists.txt
M common/protobuf/common.proto
M common/protobuf/control_service.proto
A common/protobuf/planner.proto
M common/protobuf/row_batch.proto
M common/thrift/ImpalaInternalService.thrift
M common/thrift/PlanNodes.thrift
56 files changed, 699 insertions(+), 385 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/15844/7
-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 7: Code-Review+2

carrying forward


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Thu, 28 May 2020 20:06:00 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15844/1/be/src/scheduling/scheduler-test.cc
File be/src/scheduling/scheduler-test.cc:

http://gerrit.cloudera.org:8080/#/c/15844/1/be/src/scheduling/scheduler-test.cc@760
PS1, Line 760:       ++range_length_count[instance_ranges[0].scan_range().hdfs_file_split().length() - 1];
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/15844/1/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/15844/1/be/src/scheduling/scheduler.cc@561
PS1, Line 561:     TUniqueIdToUniqueIdPB(instance_exec_params->back().instance_id, build_input.mutable_input_finstance_id());
line too long (110 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 May 2020 22:44:14 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 3:

(6 comments)

only looked through the proto and thrift file changes so far

http://gerrit.cloudera.org:8080/#/c/15844/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15844/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
is there a list of all the Thrift structs that need to be converted to protobuf? listing them out either in the JIRA or some type doc would be great, a long with an explanation of why thrift --> protobuf conversion is necessary


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/common.proto
File common/protobuf/common.proto:

http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/common.proto@44
PS3, Line 44: enum CompressionType {
is this a mirror of THdfsCompression in CatalogObjects.thrift? Can we include comments in both this file and CatalogObjects.thrift stating that.


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/control_service.proto@292
PS3, Line 292: // Execution parameters of a single fragment instance. Corresponds to a
             : // TPlanFragmentInstanceCtx with the same fragment_idx.
             : message PlanFragmentInstanceCtxPB {
this seems a bit confusing to me. this basically splits up TPlanFragmentInstanceCtx into two structs, some of the struct is defined in Thrift and some in Protobuf. I guess it is probably fine to re-factor it that way, but we should update the comments of both this struct and the Thrift version to reflect that. for example, the docs of each struct state that it is the "Execution parameters of a single fragment instance.", however, that is not really true anymore, it is just the partial set of execution parameters.
we could consider renaming this and TPlanFragmentInstanceCtx altogether to encapsulate the exact parameters that each one captures, however, not sure if that is possible.


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/control_service.proto@335
PS3, Line 335: 
             :   // General execution parameters for different fragments. Corresponds to 'fragments' in
             :   // the TExecPlanFragmentInfo sidecar.
             :   repeated PlanFragmentCtxPB fragment_ctxs = 7;
             : 
             :   // Execution parameters for specific fragment instances. Corresponds to
             :   // 'fragment_instance_ctxs' in the TExecPlanFragmentInfo sidecar.
             :   repeated PlanFragmentInstanceCtxPB fragment_instance_ctxs = 8;
I'm still trying to understand the code, but it seems some of the info in these fields were being sent in the TExecPlanFragmentInfo sidecar, right? and now we are moving them out of the sidecar into protobuf fields? is it possible that moving things out of the sidecar can affect the perf of the Exec RPCs?
if the issue exists, might be a minor one; but just thought I'd bring it up to ensure we have considered it.


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/planner.proto
File common/protobuf/planner.proto:

http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/planner.proto@1
PS3, Line 1: // Licensed to the Apache Software Foundation (ASF) under one
why name this planner.proto? looks like most of these structs come from PlanNodes.thrift


http://gerrit.cloudera.org:8080/#/c/15844/3/common/protobuf/planner.proto@25
PS3, Line 25: message HdfsFileSplitPB {
Is it a lot of effort to delete the corresponding THdfsFileSplit in PlanNodes.thrift? if we plan to keep these duplicated across Thrift and Protobuf, would be good to include a note in PlanNodes.thrift saying any changes to THdfsFileSplit, should be made to HdfsFileSplitPB in this file.
Same applies for all the other structs in this file.



-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Mon, 11 May 2020 15:23:09 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15844/5/be/src/util/compression-util.h
File be/src/util/compression-util.h:

http://gerrit.cloudera.org:8080/#/c/15844/5/be/src/util/compression-util.h@20
PS5, Line 20: #include "gen-cpp/CatalogObjects_types.h"
            : #include "gen-cpp/common.pb.h"
> can u just use forward declarations for this instead of including the whole
No, I don't think we can.

For the Thrift enum, its defined as a nested class ('type' is the enum defined inside a struct THdfsCompression) so it definitely can't be forward declared.

For the protobuf enum, as far as I can figure out, in order to forward declare an enum we would need the enum to have an explicitly specified underlying type:
https://stackoverflow.com/questions/42766839/c11-enum-forward-causes-underlying-type-mismatch
protobuf doesn't automatically emit explicit types for enums, and I couldn't find any sort of setting to get it to.

I think that its possible to provide hooks to the protobuf code generator that would allow us to rewrite the generated code like this, but as we don't already do that setting up the infrastructure for it is probably more work than removing this include is worth.


http://gerrit.cloudera.org:8080/#/c/15844/5/be/src/util/compression-util.h@25
PS5, Line 25: Convert
> nit: typo here and below
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Thu, 28 May 2020 18:20:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/6046/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 May 2020 01:25:48 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5900/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 8
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Thu, 28 May 2020 20:06:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 8: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 8
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Fri, 29 May 2020 01:26:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15844 )

Change subject: IMPALA-9692 (part 2): Refactor parts of TExecPlanFragmentInfo to protobuf
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5999/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8e46767b257bbf677171ac2f4efb1b623ba41b
Gerrit-Change-Number: 15844
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 May 2020 00:16:22 +0000
Gerrit-HasComments: No