You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Yu-Wen Lai (Code Review)" <ge...@cloudera.org> on 2022/09/28 23:36:41 UTC

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Yu-Wen Lai has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19052


Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................

IMPALA-8592: Add support for insert events for 'LOAD DATA' statements
from Impala

In this patch, we use TUpdateCatalogRequest to refresh metadata after
'LOAD DATA' instead of TResetMetadataRequest so that we can reuse the
code for 'INSERT' statements. It will fire an insert event just same
as what we did for 'INSERT' statements.

Testing:
- Run existing test_load.py

Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
---
M be/src/service/client-request-state.cc
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
4 files changed, 67 insertions(+), 28 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/19052/1
-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8652/


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 04 Oct 2022 01:21:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Yu-Wen Lai (Code Review)" <ge...@cloudera.org>.
Yu-Wen Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 3:

(3 comments)

> Patch Set 1:
> 
> (3 comments)
> 
> This is a pretty nice fix!

http://gerrit.cloudera.org:8080/#/c/19052/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19052/1//COMMIT_MSG@16
PS1, Line 16: - Run existing test_load.py
> We also need tests to verify the INSERT events. Could you add some tests in
I realized that replication cannot be used as a verification of insert event for external tables because hive replication for external tables relies on distcp instead of insert events. Given that LOAD DATA is only applicable to external tables, we need to use another way to verify the INSERT events. Therefore, I added a test and used number of skipped events as an implicit indicator. Let me know if you have better idea.


http://gerrit.cloudera.org:8080/#/c/19052/1/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/19052/1/be/src/service/client-request-state.cc@806
PS1, Line 806: string for unpartitione
> nit: Could you add a comment mentioning that the partition_name is an empty
Done


http://gerrit.cloudera.org:8080/#/c/19052/1/be/src/service/client-request-state.cc@809
PS1, Line 809:   catalog_update.__set_sync_ddl(exec_request_->query_options.sync_ddl);
             :   catalog_update.__set_header(GetCatalogServiceRequestHeader());
             :   catalog_update.target_table = exec_request_->load_data_request.table_name.table_name;
             :   catalog_update.db_name = exec_request_->load_data_request.table_name.db_name;
             :   catalog_update.is_overwrite = exec_request_->load_data_request.overwrite;
             : 
             :   const TNetworkAddress& address =
> nit: these duplicate the code in ClientRequestState::ExecLoadDataRequestImp
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Sat, 01 Oct 2022 01:59:52 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11521/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 03 Oct 2022 20:33:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19052/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19052/1//COMMIT_MSG@16
PS1, Line 16: - Run existing test_load.py
> I realized that replication cannot be used as a verification of insert even
I see. Can we use the hive_client to fetch and verify the INSERT events directly? We use it to fetch the latest event id here:
https://github.com/apache/impala/blob/68650057a163ac23e1ca85b7d9d8dbfd975a69ff/tests/util/event_processor_utils.py#L125

Probably we can use get_next_notification() to fetch the INSERT events.


http://gerrit.cloudera.org:8080/#/c/19052/3/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/19052/3/be/src/service/client-request-state.cc@2047
PS3, Line 2047:     
nit: 2 spaces indent here



-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 03 Oct 2022 08:27:41 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11522/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 5
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 04 Oct 2022 00:51:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8652/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 03 Oct 2022 20:13:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Yu-Wen Lai (Code Review)" <ge...@cloudera.org>.
Yu-Wen Lai has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................

IMPALA-8592: Add support for insert events for 'LOAD DATA' statements
from Impala

In this patch, we use TUpdateCatalogRequest to refresh metadata after
'LOAD DATA' instead of TResetMetadataRequest so that we can reuse the
code for 'INSERT' statements. It will fire an insert event just same
as what we did for 'INSERT' statements.

We also fix the inconsistent indentation in event_processor_utils.py.

Testing:
- Run existing test_load.py
- Added test_load_data_from_impala() in test_event_processing.py

Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/metadata/test_event_processing.py
M tests/util/event_processor_utils.py
7 files changed, 194 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/19052/6
-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 6
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 3: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Sat, 01 Oct 2022 08:51:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19052/4/tests/metadata/test_event_processing.py
File tests/metadata/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/19052/4/tests/metadata/test_event_processing.py@396
PS4, Line 396:         parquet".format(unique_database, tbl_part))
Can we keep the wait of "EventProcessorUtils.wait_for_event_processing(self)"? I'm just concerning the test become flaky if the CREATE_TABLE events come late.


http://gerrit.cloudera.org:8080/#/c/19052/4/tests/metadata/test_event_processing.py@408
PS4, Line 408:       assert len(events) == 1
I think we need to mark this test using @pytest.mark.execute_serially. Otherwise, it runs concurrently with other metadata tests, which can be flaky as well.



-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 03 Oct 2022 23:17:17 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8648/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Sat, 01 Oct 2022 03:40:05 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Yu-Wen Lai (Code Review)" <ge...@cloudera.org>.
Yu-Wen Lai has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................

IMPALA-8592: Add support for insert events for 'LOAD DATA' statements
from Impala

In this patch, we use TUpdateCatalogRequest to refresh metadata after
'LOAD DATA' instead of TResetMetadataRequest so that we can reuse the
code for 'INSERT' statements. It will fire an insert event just same
as what we did for 'INSERT' statements.

Testing:
- Run existing test_load.py
- Added test_load_data_from_impala() in test_event_processing.py

Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/metadata/test_event_processing.py
6 files changed, 129 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/19052/3
-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11477/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 29 Sep 2022 00:00:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 1: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 29 Sep 2022 05:19:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Yu-Wen Lai (Code Review)" <ge...@cloudera.org>.
Yu-Wen Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19052/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19052/1//COMMIT_MSG@16
PS1, Line 16: - Run existing test_load.py
> I see. Can we use the hive_client to fetch and verify the INSERT events dir
Cool. Let me try that.


http://gerrit.cloudera.org:8080/#/c/19052/3/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/19052/3/be/src/service/client-request-state.cc@2047
PS3, Line 2047:     
> nit: 2 spaces indent here
Ack



-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 03 Oct 2022 16:57:25 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Yu-Wen Lai (Code Review)" <ge...@cloudera.org>.
Yu-Wen Lai has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................

IMPALA-8592: Add support for insert events for 'LOAD DATA' statements
from Impala

In this patch, we use TUpdateCatalogRequest to refresh metadata after
'LOAD DATA' instead of TResetMetadataRequest so that we can reuse the
code for 'INSERT' statements. It will fire an insert event just same
as what we did for 'INSERT' statements.

We also fix the inconsistent indentation in event_processor_utils.py.

Testing:
- Run existing test_load.py
- Added test_load_data_from_impala() in test_event_processing.py

Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/metadata/test_event_processing.py
M tests/util/event_processor_utils.py
7 files changed, 195 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/19052/5
-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 5
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Yu-Wen Lai (Code Review)" <ge...@cloudera.org>.
Yu-Wen Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19052/4/tests/metadata/test_event_processing.py
File tests/metadata/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/19052/4/tests/metadata/test_event_processing.py@408
PS4, Line 408:         into table {1}.{2}".format(staging_dir, unique_database, tbl_nopart))
> I think we need to mark this test using @pytest.mark.execute_serially. Othe
Thanks for pointing out this mark!



-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 5
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 04 Oct 2022 00:31:13 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 6: Code-Review+2

LGTM. Thanks for fixing this gap!


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 6
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 04 Oct 2022 12:50:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11523/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 6
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 04 Oct 2022 01:03:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11510/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 3
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Sat, 01 Oct 2022 02:10:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Yu-Wen Lai (Code Review)" <ge...@cloudera.org>.
Yu-Wen Lai has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................

IMPALA-8592: Add support for insert events for 'LOAD DATA' statements
from Impala

In this patch, we use TUpdateCatalogRequest to refresh metadata after
'LOAD DATA' instead of TResetMetadataRequest so that we can reuse the
code for 'INSERT' statements. It will fire an insert event just same
as what we did for 'INSERT' statements.

We also fix the inconsistent indentation in event_processor_utils.py.

Testing:
- Run existing test_load.py
- Added test_load_data_from_impala() in test_event_processing.py

Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/metadata/test_event_processing.py
M tests/util/event_processor_utils.py
7 files changed, 193 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/19052/4
-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 4
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................

IMPALA-8592: Add support for insert events for 'LOAD DATA' statements
from Impala

In this patch, we use TUpdateCatalogRequest to refresh metadata after
'LOAD DATA' instead of TResetMetadataRequest so that we can reuse the
code for 'INSERT' statements. It will fire an insert event just same
as what we did for 'INSERT' statements.

We also fix the inconsistent indentation in event_processor_utils.py.

Testing:
- Run existing test_load.py
- Added test_load_data_from_impala() in test_event_processing.py

Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Reviewed-on: http://gerrit.cloudera.org:8080/19052
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Quanlong Huang <hu...@gmail.com>
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/metadata/test_event_processing.py
M tests/util/event_processor_utils.py
7 files changed, 194 insertions(+), 84 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Quanlong Huang: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 7
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 1:

(3 comments)

This is a pretty nice fix!

http://gerrit.cloudera.org:8080/#/c/19052/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19052/1//COMMIT_MSG@16
PS1, Line 16: - Run existing test_load.py
We also need tests to verify the INSERT events. Could you add some tests in tests/metadata/test_event_processing.py, e.g. add some cases for LOAD DATA in test_event_based_replication?


http://gerrit.cloudera.org:8080/#/c/19052/1/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/19052/1/be/src/service/client-request-state.cc@806
PS1, Line 806: response.partition_name
nit: Could you add a comment mentioning that the partition_name is an empty string for unpartitioned tables? It's the same as DataSink::ROOT_PARTITION_KEY.


http://gerrit.cloudera.org:8080/#/c/19052/1/be/src/service/client-request-state.cc@809
PS1, Line 809:   catalog_update.__set_header(TCatalogServiceRequestHeader());
             :   catalog_update.header.__set_requesting_user(effective_user());
             :   catalog_update.header.__set_client_ip(session()->network_address.hostname);
             :   catalog_update.header.__set_want_minimal_response(FLAGS_use_local_catalog);
             :   catalog_update.header.__set_redacted_sql_stmt(
             :       query_ctx_.client_request.__isset.redacted_stmt ?
             :           query_ctx_.client_request.redacted_stmt : query_ctx_.client_request.stmt);
nit: these duplicate the code in ClientRequestState::ExecLoadDataRequestImpl(). It'd be nice to extract them into a method like GetCatalogServiceRequestHeader(), and simplify the code to

  catalog_update.__set_header(GetCatalogServiceRequestHeader())



-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 30 Sep 2022 01:55:21 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8653/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 6
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 04 Oct 2022 01:47:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 6: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 6
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 04 Oct 2022 06:50:07 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19052/5/tests/metadata/test_event_processing.py
File tests/metadata/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/19052/5/tests/metadata/test_event_processing.py@380
PS5, Line 380: p
flake8: F821 undefined name 'pytest'



-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 5
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 04 Oct 2022 00:32:03 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19052 )

Change subject: IMPALA-8592: Add support for insert events for 'LOAD DATA' statements from Impala
......................................................................


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8638/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/19052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Gerrit-Change-Number: 19052
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 28 Sep 2022 23:54:44 +0000
Gerrit-HasComments: No