You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Xiaomeng Zhang (Code Review)" <ge...@cloudera.org> on 2020/04/06 02:32:16 UTC
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15648
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
IMPALA-8632: Add support for self-event detection for insert events
In case of INSERT_EVENTS if Impala inserts into a table it causes a
refresh to the underlying table/partition. This could be unnecessary
when there is only one Impala cluster in the system.
We can detect a self-event in such cases when the HMS API to fire a
listener event returns the event id. This is used by EventProcessor
to ignore the event when it is fetched later in the next polling cycle.
Testing:
Add testInsertEvents() in MetastoreEventsProcessorTest.java to test
insert event self-event detection when insert into table and partition.
Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/events/EventId.java
M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
10 files changed, 297 insertions(+), 83 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/15648/1
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 1
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
Patch Set 4:
(9 comments)
mostly looks good to me. Should be okay once we add the suggestions.
http://gerrit.cloudera.org:8080/#/c/15648/3/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:
http://gerrit.cloudera.org:8080/#/c/15648/3/be/src/common/global-flags.cc@254
PS3, Line 254:
: DEFINE_int32(hms_event_polling_interval_s, 0,
> I think this is implementation detail and can be skipped. Also the summary
Are you planning to update this in a subsequent patch?
http://gerrit.cloudera.org:8080/#/c/15648/4/bin/impala-config.sh
File bin/impala-config.sh:
http://gerrit.cloudera.org:8080/#/c/15648/4/bin/impala-config.sh@177
PS4, Line 177: export CDP_BUILD_NUMBER=2506706
The changes in this file should be made in a separate change. Also, this build number has a problem which causes a test failure. I uploaded a different patch to bump it up a better build. https://gerrit.cloudera.org/#/c/15668/. May be revert the changes to this file and rebase on top of my patch linked above?
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@555
PS4, Line 555: null
May be change this to Collections.emptyList() so that the callers don't need to check for null.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@997
PS4, Line 997: public static class InsertEventInfo {
this duplicates a lot of code between the two version of the MetastoreShim. I think we can keep the existing fireInsertEvent method in the MetastoreUtils method. Here we can just make use of the return types. For instance, in hive-2 shim we do
public static List<Long> fireInsertEvent(IMetaStoreClient msClient,
List<InsertEventInfo> insertEventInfos) throws TException {
MetastoreUtil.fireInsertEvent(msClient, insertEventInfos);
return Collections.emptyList();
}
in hive-3 shim we do
public static List<Long> fireInsertEvent(IMetaStoreClient msClient,
List<InsertEventInfo> insertEventInfos) throws TException {
FireEventResponse resp = MetastoreUtil.fireInsertEvent(msClient, insertEventInfos);
Preconditions.checkState(resp.getEventIds() != null && !resp.getEventIds().isEmpty());
return resp.getEventIds();
}
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
File fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java@35
PS4, Line 35: private final long idFromEvent_;
rename to insertEventId_?
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java@69
PS4, Line 69: idFromEvent_ = eventId;
can you add a preconditions check that this eventId>0
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4487
PS4, Line 4487: // Firing insert events by making calls to HMS APIs can be slow for tables with
: // large number of partitions.
this comment should be updated with the bulk API it should not be a problem anymore. May be add a TODO to evaluate the performance of bulk API
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4506
PS4, Line 4506: e
can be moved to earlier line.
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@725
PS3, Line 725: Assume.assumeTrue("Skipping this test because it only works with Hive-3 or greater",
> Can you also enable the which is commented out here: https://github.com/apa
ping
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 4
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Apr 2020 22:21:40 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Xiaomeng Zhang (Code Review)" <ge...@cloudera.org>.
Xiaomeng Zhang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
IMPALA-8632: Add support for self-event detection for insert events
In case of INSERT_EVENTS if Impala inserts into a table it causes a
refresh to the underlying table/partition. This could be unnecessary
when there is only one Impala cluster in the system.
We can detect a self-event in such cases when the HMS API to fire a
listener event returns the event id. This is used by EventProcessor
to ignore the event when it is fetched later in the next polling cycle.
Testing:
Add testInsertFromImpala() in MetastoreEventsProcessorTest.java to test
insert event self-event detection when insert into table and partition.
Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M bin/impala-config.sh
M common/thrift/BackendGflags.thrift
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
18 files changed, 487 insertions(+), 171 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/15648/5
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 5
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Xiaomeng Zhang (Code Review)" <ge...@cloudera.org>.
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
Patch Set 4:
(5 comments)
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@286
PS3, Line 286: *
> guess this will be removed once we bump up the CDP_BUILD right?
Yes. It will be removed after I have hive jar updated.
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@849
PS3, Line 849: versionNumber == -1
> Why would this be a case?
As we check -1 for version number. I am checking -1 for eventId as well. Is it guaranteed that eventId will not be less than 0?
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@897
PS3, Line 897: Preconditions.checkState(inFlightEvents_.size(false) == 0);
> do we need a similar check for inFlightEvents_.size(false)?
Done
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4411
PS3, Line 4411: ram isInsertOverwrite indicates if the operation was an inse
> I think this check should be ((!catalog_.isEventProcessingActive() && isIns
Done
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4466
PS3, Line 4466: sMapBeforeInsert.entrySet().iterator().next();
> I think we need to do this via MetastoreShim since otherwise the response w
Done
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 4
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Apr 2020 21:41:18 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
Patch Set 1:
Build Failed
https://jenkins.impala.io/job/gerrit-code-review-checks/5703/ : Initial code review checks failed. See linked job for details on the failure.
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 1
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Apr 2020 03:17:17 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Anurag Mantripragada (Code Review)" <ge...@cloudera.org>.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
Patch Set 5: Code-Review+1
Thanks for addressing my comments. I will let Vihang +2 the change.
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 5
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Apr 2020 05:17:14 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
Patch Set 5:
Build Failed
https://jenkins.impala.io/job/gerrit-code-review-checks/5746/ : Initial code review checks failed. See linked job for details on the failure.
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 5
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Apr 2020 05:27:55 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
Patch Set 4:
Build Successful
https://jenkins.impala.io/job/gerrit-code-review-checks/5741/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 4
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Apr 2020 22:23:09 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
IMPALA-8632: Add support for self-event detection for insert events
In case of INSERT_EVENTS if Impala inserts into a table it causes a
refresh to the underlying table/partition. This could be unnecessary
when there is only one Impala cluster in the system.
We can detect a self-event in such cases when the HMS API to fire a
listener event returns the event id. This is used by EventProcessor
to ignore the event when it is fetched later in the next polling cycle.
Testing:
Add testInsertFromImpala() in MetastoreEventsProcessorTest.java to test
insert event self-event detection when insert into table and partition.
Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Reviewed-on: http://gerrit.cloudera.org:8080/15648
Reviewed-by: Vihang Karajgaonkar <vi...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
17 files changed, 504 insertions(+), 171 deletions(-)
Approvals:
Vihang Karajgaonkar: Looks good to me, approved
Impala Public Jenkins: Verified
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 10
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Anurag Mantripragada (Code Review)" <ge...@cloudera.org>.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
Patch Set 4:
(15 comments)
Thanks for the change, I have added some comments, most of them are style-related and some nits.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@54
PS4, Line 54: import org.apache.impala.catalog.events.NoOpEventProcessor;
Nit: Duplicate import, could you remove this?
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@934
PS4, Line 934:
Nit: Incorrect indentation. Remove the empty spaces.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@937
PS4, Line 937:
Incorrect indentation. Remove the empty spaces.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/Table.java@807
PS4, Line 807:
Nit: Incorrect indentation. Remove the empty spaces.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/Table.java@826
PS4, Line 826:
Nit: Incorrect indentation. Remove the empty spaces.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java@65
PS4, Line 65: eventIds_
Here and everywhere else, if this means "only" insert events, may I suggest using insertEventIds_?
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java@69
PS4, Line 69: @param isInsertEvent if true, return list of versions for in-flight Insert events
: * if false, return list of eventIds for in-flight DDL events
Should this be the other way around? i.e., return eventIds for in-flight insert events if it is insert events and versions if it is not.
Also, indentation is incorrect, please remove the extra spaces.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java@87
PS4, Line 87:
Nit: Indentation incorrect, remove extra spaces.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java@107
PS4, Line 107:
Nit: Indentation incorrect, remove extra spaces.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@773
PS4, Line 773: // long eventId_test = catalog_.public_eventId.consumeId();
Could you remove this comment?
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@787
PS4, Line 787: /**
: * Check self-event for in flight Insert event using eventId
: */
Nit: Not sure this is needed since this is an overridden method and its purpose is the same as described in L318.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@75
PS4, Line 75: *;
We should be careful with *. Are you sure we are importing everything from org.apache.impala.catalog?
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4424
PS4, Line 4424: List of all catalog object that was insert into
Did you mean "List of all partitions we insert into"?
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4487
PS4, Line 4487: // Firing insert events by making calls to HMS APIs can be slow for tables with
: // large number of partitions.
> this comment should be updated with the bulk API it should not be a problem
I see that MetastoreShim.fireInsertEvent() for hive-2 does not insert in bulk and is still making one RPC per partition. We should keep the earlier asynchronous fireInsertEvent() for hive-2.
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java:
http://gerrit.cloudera.org:8080/#/c/15648/4/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java@33
PS4, Line 33: import org.apache.hadoop.hive.metastore.api.FireEventRequest;
: import org.apache.hadoop.hive.metastore.api.FireEventRequestData;
: import org.apache.hadoop.hive.metastore.api.FireEventResponse;
: import org.apache.hadoop.hive.metastore.api.InsertEventRequestData;
These imports are not needed anymore.
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 4
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Apr 2020 23:47:36 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Xiaomeng Zhang (Code Review)" <ge...@cloudera.org>.
Xiaomeng Zhang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
IMPALA-8632: Add support for self-event detection for insert events
In case of INSERT_EVENTS if Impala inserts into a table it causes a
refresh to the underlying table/partition. This could be unnecessary
when there is only one Impala cluster in the system.
We can detect a self-event in such cases when the HMS API to fire a
listener event returns the event id. This is used by EventProcessor
to ignore the event when it is fetched later in the next polling cycle.
Testing:
Add testInsertFromImpala() in MetastoreEventsProcessorTest.java to test
insert event self-event detection when insert into table and partition.
Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M bin/impala-config.sh
M common/thrift/BackendGflags.thrift
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
17 files changed, 506 insertions(+), 205 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/15648/4
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 4
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
Patch Set 9:
Build Successful
https://jenkins.impala.io/job/gerrit-code-review-checks/5770/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 9
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 05:22:36 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
Patch Set 9:
(2 comments)
http://gerrit.cloudera.org:8080/#/c/15648/9/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java:
http://gerrit.cloudera.org:8080/#/c/15648/9/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@513
PS9, Line 513: fireInsertEventHelper(msClient.getHiveClient(), insertEventInfos, dbName, tableName);
line too long (93 > 90)
http://gerrit.cloudera.org:8080/#/c/15648/9/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:
http://gerrit.cloudera.org:8080/#/c/15648/9/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@1004
PS9, Line 1004: return fireInsertEventHelper(msClient.getHiveClient(), insertEventInfos, dbName, tableName);
line too long (98 > 90)
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 9
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Comment-Date: Thu, 09 Apr 2020 04:31:20 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Posted by "Xiaomeng Zhang (Code Review)" <ge...@cloudera.org>.
Xiaomeng Zhang has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
......................................................................
IMPALA-8632: Add support for self-event detection for insert events
In case of INSERT_EVENTS if Impala inserts into a table it causes a
refresh to the underlying table/partition. This could be unnecessary
when there is only one Impala cluster in the system.
We can detect a self-event in such cases when the HMS API to fire a
listener event returns the event id. This is used by EventProcessor
to ignore the event when it is fetched later in the next polling cycle.
Testing:
Add testInsertFromImpala() in MetastoreEventsProcessorTest.java to test
insert event self-event detection when insert into table and partition.
Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
17 files changed, 504 insertions(+), 171 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/15648/9
--
To view, visit http://gerrit.cloudera.org:8080/15648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 9
Gerrit-Owner: Xiaomeng Zhang <xi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Xiaomeng Zhang <xi...@cloudera.com>