You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Sourabh Goyal (Code Review)" <ge...@cloudera.org> on 2021/09/20 23:29:10 UTC

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Sourabh Goyal has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17859


Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,535 insertions(+), 240 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9504/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 7
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Mon, 27 Sep 2021 17:40:56 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7483/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 4
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 03:53:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#24).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,495 insertions(+), 283 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/24
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 24
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#27).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and apply them asynchronously. These two stream of updates cause
consistency issues. For example consider a following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. While excuting a ddl, db/table
   will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,548 insertions(+), 292 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/27
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 27
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 33:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7673/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 33
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Nov 2021 11:02:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 33:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9840/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 33
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Nov 2021 11:24:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 32:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9837/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 32
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 23 Nov 2021 17:21:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17859/18/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/18/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402
PS18, Line 2402:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 18
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 11 Oct 2021 18:24:04 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 37: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7690/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 37
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Dec 2021 00:51:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 34:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9847/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 34
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 29 Nov 2021 18:07:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 39:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17859/39/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/39/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@3177
PS39, Line 3177:               String.format("Drop database event not received for db: %s from event id: %s. "
line too long (93 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 39
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Dec 2021 16:18:25 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 36:

You probably have to rebase and resolve the conflicts. Can you please update the patch with rebase?


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 36
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Dec 2021 20:57:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 18: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7529/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 18
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Oct 2021 00:26:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 15:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9575/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 15
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 07 Oct 2021 20:25:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 11: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 11
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Sep 2021 23:04:56 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 26: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7582/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 26
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 02 Nov 2021 03:50:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#26).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and apply them asynchronously. These two stream of updates cause
consistency issues. For example consider a following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:

- Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
this improvement. It is turned off by default.

- Sync db/table to latest event id for ddls from catalog HMS endpoints.
A subsequent patch would address the same for DDLs executed from Impala
shell

- Event processor skips processing an event if db/table is already
synced till that event id and sets that event id in db/table in case
the event is processed

- When EventProcessor detects a self event, it sets the last synced
event id in db/table before skipping an event

- Full table refresh sets the last event processed in table cache

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,561 insertions(+), 292 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/26
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 26
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#28).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and apply them asynchronously. These two stream of updates cause
consistency issues. For example consider a following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. While excuting a ddl, db/table
   will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,558 insertions(+), 293 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/28
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 28
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 35:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7679/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 35
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 29 Nov 2021 18:49:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#35).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and apply them asynchronously. These two stream of updates cause
consistency issues. For example consider a following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. While excuting a ddl, db/table
   will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,668 insertions(+), 312 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/35
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 35
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 36:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9860/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 36
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Dec 2021 13:29:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 31:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7650/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 31
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 19 Nov 2021 12:07:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Sourabh Goyal has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 36:

@Vihang: Thanks for reviewing it thoroughly. I already have a jira for the followup https://issues.apache.org/jira/browse/IMPALA-10976.

I will resolve the conflicts after rebasing the patch. 
> Patch Set 36: Code-Review+2
> 
> The patch looks good to me. Thanks a lot for seeing it through. I know it took a lot of iterations. I think as a followup, we would need to update the lastSyncedEventId for DDLs from catalogOpExecutor too. For now, this patch enables strong consistency guarantee for catalog's metastore service endpoint.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 36
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 Dec 2021 12:42:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9494/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 5
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 18:17:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#16).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,401 insertions(+), 274 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/16
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 16
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 17:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9579/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 17
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 Oct 2021 18:17:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7497/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 10
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Sep 2021 01:07:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 3:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/9488/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 19:19:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9475/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Sep 2021 23:50:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 27:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7603/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 27
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Nov 2021 14:35:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 20:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7556/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 20
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Oct 2021 14:16:02 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 20:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9643/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 20
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Oct 2021 14:27:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#33).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and apply them asynchronously. These two stream of updates cause
consistency issues. For example consider a following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. While excuting a ddl, db/table
   will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,606 insertions(+), 301 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/33
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 33
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#2).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,325 insertions(+), 250 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 2
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 19:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/17859/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17859/12//COMMIT_MSG@7
PS12, Line 7: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
            : DDL operations via catalog HMS endpoints
> Ack
ping


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2672
PS12, Line 2672: per.getT
> I think the intention for this change was - currently for removeTable() api
I am not sure I fully understand. If this change is not needed for the patch, please remove this. It makes sense to me serialize write operations on the table. It is not clear to me why 2 read operations on a table should be serialized on the read lock for version number. The code doesn't change the table version number. Any concurrent table modification operation atomically replaces the table which is modified.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@288
PS19, Line 288: syncToLatestEventFactory_
Based on my understanding this field here is unnecessary and is complicating the code unnecessarily. Can we remove this field altogether? From what I looked at, this is used in MetastoreEventsProcessor.syncToLatestEventId() which is called from tableLoader and CatalogMetastoreServiceHandler. 

Within the method MetastoreEventsProcessor.syncToLatestEventId() it is only used to get the events so ultimately, the only purpose of having this is to disable isSelfEvent() method for certain event types. Why can't we just do it in the isSelfEvent() itself (look for enable_sync_to_latest_event_on_ddls and if it enabled return false.)

The advantage of doing this would be we would get rid of this unnecessarily complicated way to initialize this factory and then inject in the Catalog from JniCatalog.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@316
PS19, Line 316: numLoadingThreads_
this is unused and can be removed.


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java@115
PS6, Line 115: 
> The intention was - by making lastSyncedEventId_ volatile, the get and set 
All the places where I see this variable getting accessed, I see that it is under the db lock. Am I missing something?


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Db.java@134
PS19, Line 134:     Preconditions.checkState(eventId <= lastSyncedEventId_,
              :         String.format("create event id: %s to be set for db %s should "
              :             + "be <= lastSyncedEvent id: %s", eventId, getName(), lastSyncedEventId_));
              :      */
Pls remove if not needed anymore.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Db.java@153
PS19, Line 153:     /*
              :     Preconditions.checkState(eventId >= createEventId_,
              :         String.format("last synced event id: %s to be set for db %s should "
              :                 + "be >= createEvent id: %s", eventId, getName(), createEventId_));
              :      */
pls remove if not needed.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Table.java@186
PS12, Line 186: 
> nit, add a comment on what this field represents and why it is volatile.
Thanks for adding the comment. But similar to the Db.java's volatile keyword, I don't see any code which is accessing this variable without a lock on the table object. I would prefer to not keep it volatile unless it is absolutely necessary since AFAIK, introducing volatile keyword will disable certain compiler optimizations not just for this variable but for other variables as well to make sure that the "happens-before" relationship is maintained. See https://www.baeldung.com/java-volatile#happens-before for example.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java@60
PS19, Line 60: import org.apache.log4j.Logger;
please remove this


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java@77
PS19, Line 77: org.slf4j.
this is not needed if you remove line 60


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java@213
PS19, Line 213:     Preconditions.checkState(eventId <= lastSyncedEventId_,
              :         String.format("create event id: %s to be set for table %s should "
              :             + "be <= lastSyncedEvent id: %s", eventId, getFullName(),
              :             lastSyncedEventId_));
              :      */
remove if not needed.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java@220
PS19, Line 220:     // TODO: Should we reset lastSyncedEvent Id if it is less than event Id?
              :     // If we don't reset it - we may start syncing table from an event id which
              :     // is less than create event id
the createEventId is set when the table is created, I am not sure when would it happen that lastSyncedEventId will be different than -1 in such a case.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java@234
PS19, Line 234:     Preconditions.checkState(eventId >= createEventId_,
              :         String.format("last synced event id: %s to be set for table %s "
              :                 + "should be >= createEvent id: %s", eventId, getFullName(),
              :             createEventId_));
              :      */
pls remove if not needed.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@170
PS12, Line 170: initMetrics
> Are you suggesting to pass metrics as null in 
I was suggesting that we should not pass the metrics in which case we would use the MetastoreEventsProcessor's metrics object. In CatalogMetastoreServiceHandler, we can explicitly pass a metrics object since the intent there is to have separate metrics eventually. In the long run these metrics should be moved to table objects instead of keeping them at a eventsprocessor or metastoreServiceHandler's level.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@34
PS19, Line 34: import
remove if not necessary? In fact all the changes to this file can be removed from the patch since I don't see any relevant code changes related to this patch.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@461
PS12, Line 461: 
> Sorry I didn't understand your comment
Actually, I see why you have a new abstract method now. However, instead of calling it from processIfEnabled() it would be cleaner in my opinion to have all this logic in one single method for readability reasons. isSelfEvent() is already hooked with the metrics correctly so it would make sense to just modify that to something like below:

boolean isSelfEvent() {
  boolean canBeSkipped = false;
  if (BackendConfig.INSTANCE.enableSyncToLatestEventOnDdls()) {
  canBeSkipped = shouldSkipWhenSyncingToLatestEventId();
} else {
  canBeSkipped = catalog_.evaluateSelfEvent(getSelfEventContext()));
}

if (canBeSkipped) {
  metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC)
    .inc(getNumberOfEvents());
  infoLog("Incremented events skipped counter to {}",
    metrics_.getCounter(MetastoreEventsProcessor.EVENTS_SKIPPED_METRIC)
      .getCount());

}
return canBeSkipped;
}


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@316
PS19, Line 316: EventFactoryForSyncToLatestEvent
Ideally would be great to get rid of this class since all it is really doing is to return false on isSelfEvent() for some events which can be done using the config check directly in MetastoreEventFactory


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@479
PS19, Line 479: BackendConfig
pls see my comment about moving this logic into isSelfEvent() so that there is only one place where all the self-event evaluation (both legacy and new way).


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@875
PS19, Line 875: tbl.getLastSyncedEventId() == -1
Why do we need to logically AND with this condition here?


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@371
PS19, Line 371:         if (currentEvent.isDropEvent()) {
if we are stopping at the drop event it would be weird I feel. For instance, after table load there would not be any table. I think it would make sense to keep things simple and process all the events.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@435
PS19, Line 435: if (currentEvent.isDropEvent()) {
same as previous comment.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1939
PS19, Line 1939: null
I am surprised we are not hitting NPE due to this and other similar calls below.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java@40
PS19, Line 40: import org.apache.impala.catalog.TableLoadingMgr;
             : import org.apache.impala.catalog.events.EventFactory;
please remove the unused imports



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 19
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 21 Oct 2021 18:25:35 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 14:

(49 comments)

Sorry for adding some comments on the older patch sets. I reviewed this over multiple days and the update was updated during that. Many of these comments are questions to understand things better.

http://gerrit.cloudera.org:8080/#/c/17859/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17859/11//COMMIT_MSG@7
PS11, Line 7: Sync db/table in catalog cache to latest HMS event id when performing
            : DDL operations via catalog HMS endpoints
If this patch is close to getting merged, now is a good to add more details here and follow the commit message format styles.


http://gerrit.cloudera.org:8080/#/c/17859/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17859/12//COMMIT_MSG@7
PS12, Line 7: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
            : DDL operations via catalog HMS endpoints
Since this patch is not a WIP anymore, can you please follow the commit message conventions (limit the subject to 72 chars) and add a detailed description of change?


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@457
PS6, Line 457:     for(int i = 0; i < numTables; i++) {
             :       Table tbl = tables[i];
             :       if (!tryWriteLock(tbl)) {
             :         LOG.debug("Could not acquire write lock on table: " + tbl.getFullName());
             :         // unlock previously locked tables
             :         for(int j = 0; j < i; j++) {
             :           tables[j].releaseWriteLock();
             :         }
             :         return false;
             :       }
             :       // unlock version write lock for all tables
             :       // except last
             :       if (i < numTables-1) {
             :         versionLock_.writeLock().unlock();
             :       }
I think the versionLock.writeLock().unlock() should be moved out finally block. Also if tryWriteLock() throws an exception the locks on the previous tables is not released. It is critical to make sure that this method is releasing locks correctly under error conditions.


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3652
PS6, Line 3652: setEventFactoryForSyncToLatestEvent
annotate with @VisibleForTesting if this was for testing.


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3658
PS6, Line 3658: public
nit, missing newline.


http://gerrit.cloudera.org:8080/#/c/17859/11/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/11/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@459
PS11, Line 459: tryWriteLock
I had left some comments in the older gerrit url for this patch. Can you please address them?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@457
PS12, Line 457:     for(int i = 0; i < numTables; i++) {
              :       Table tbl = tables[i];
              :       if (!tryWriteLock(tbl)) {
              :         LOG.debug("Could not acquire write lock on table: " + tbl.getFullName());
              :         // unlock previously locked tables
              :         for(int j = 0; j < i; j++) {
              :           tables[j].releaseWriteLock();
              :         }
              :         return false;
              :       }
              :       // unlock version write lock for all tables
              :       // except last
              :       if (i < numTables-1) {
              :         versionLock_.writeLock().unlock();
              :       }
a RuntimeException thrown at line 459 will not release the table locks as well as the versionLock. Can you please make this more robust? Specifically, we want locks on all the tables or on none. A typical way to implement this would be
List<Table> lockedTbls = new ArrayList<>(tables.length);
try {
  for (Table tbl : tables) {
    if (!tryWriteLock(tbl)) throw new CatalogException("Could not acquire lock on " + tbl.fullName());
    lockedTbls.add(tbl);
  }
} finally {
  try {
  if (lockedTbls.size() != tables.length) {
    for (Table tbl : lockedTbls) tbl.releaseWriteLock();
  }
  } catch (Exception e) {
    LOG.error("Some write locks may not have been released on the tables in + lockedTbls.toString()", e);
  }
  // versionLock_ must be released before leaving the method.
  versionLock_.writeLock().unlock();
}


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2672
PS12, Line 2672: getTable
why do we need to override this method?


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java@115
PS6, Line 115: lastSyncedEventId_
please add a comment here explaining what the value of this field signifies.


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java@115
PS6, Line 115: volatile
need this?


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java@132
PS6, Line 132:   }
add a newline.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Db.java@115
PS12, Line 115: lastSyncedEventId_
nit, add a comment on what this field represents and why it is volatile.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Db.java@132
PS12, Line 132:   }
nit, add a single blank line.


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Table.java@186
PS6, Line 186: volatile
not sure I understand why this needs to be volatile?


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Table.java@211
PS6, Line 211: }
new line.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Table.java@186
PS12, Line 186: lastSyncedEventId_
nit, add a comment on what this field represents and why it is volatile.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Table.java@212
PS12, Line 212: public
nit, single line space.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@170
PS12, Line 170: initMetrics
since these metrics are not getting exposed anywhere. can we remove these?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@168
PS12, Line 168: start
do we need these changes to this class?


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@166
PS6, Line 166: getInstance
can be simply renamed to get()


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@461
PS12, Line 461: shouldSkipWhenSyncingToLatestEventId
why are we not just using isSelfEvent() here? Would it not be cleaner to change its implementation to use lastSyncedEventId instead of self event identifiers?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@576
PS12, Line 576: boolean
is this commented for a reason?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@800
PS12, Line 800:         if (tbl == null) {
              :           infoLog("Skipping on table {}.{} since it does not exist in cache", dbName,
              :               tblName);
              :           return true;
              :         }
if the table is not present do we need to check if it is in the deletelog?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@990
PS12, Line 990: InsertEvent
I think we should implement a check for lastSyncedEventId based self-event detection in this class as well.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1175
PS12, Line 1175: !shouldSkipHelper(tableBefore_.getDbName(), tableBefore_.getTableName()) &&
               :           shouldSkipHelper(tableAfter_.getDbName(), tableAfter_.getTableName())
It is not clear to me why we are checking on both the before and after table?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1699
PS12, Line 1699: AlterPartitionEvent
Why does this class not implement shouldSkipWhenSyncingToLatestEventId?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@328
PS12, Line 328: shouldBeAlreadyLocked
Is there a caller which passes this as false?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@329
PS12, Line 329: Preconditions.checkState(tbl.isWriteLockedByCurrentThread(),
              :           String.format("Write lock is not held on table %s by current thread",
              :               tbl.getFullName()));
I think this preconditions check should be done in any case to catch bugs where a caller is calling this method without holding the lock and passing shouldBeAlreadyLocked as false.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@344
PS12, Line 344: if (events.isEmpty()) {
              :         LOG.debug("table {} synced till event id {}. No new HMS events to process from "
              :                 + "event id: {}", tbl.getFullName(), lastEventId,
              :             lastEventId + 1);
              :         return;
              :       }
it looks like the lastSyncedEventId represents the lastEventId of the table not the lastEventId globally. We should update the documentation of the field to reflect that accurately.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@354
PS12, Line 354: TODO:
              :        1. Should we stop after processing drop_table
              :           event because drop_table event would remove table
              :           from cache and subsequent create_table event would add
              :           new table object in cache?
Since the objective of this method to sync this table to its latest eventid, I think we should continue processing inspite of a drop table event. E.g if the table has been recreated since the lastSyncedEvent, it makes sense to me that this table will be recreated by this method.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@386
PS12, Line 386: tbl.setLastSyncedEventId(currentEvent.getEventId());
can this logic be moved to the event.processIfEnabled() itself?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@458
PS12, Line 458: TODO: should we ignore case
yes, case should be ignored. Also I think we should make sure that the catalog name is same as well or we skip the events on non-default catalogs like events processor.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@478
PS12, Line 478: event.getTableName() == null
not sure I get this part. Why do we need to do this?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@533
PS12, Line 533: storeEventFactory.get
can we use the singleton way to get the factory here?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@283
PS12, Line 283: dropDbIfExists
why not sync to latest event here?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@857
PS12, Line 857: // TODO: Check if HMS events are generated for
              :       // both source and dest table. I think exchange partition
              :       // generates drop_partition event for source table
              :       // and add_partition event for destination table
Can this be removed now?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@905
PS12, Line 905:       // TODO: Check if HMS events are generated for
              :       // both source and dest table. I think exchange partition
              :       // generates drop_partition event for source table
              :       // and add_partition event for destination table
remove if verified


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1105
PS12, Line 1105: dropTableIfExists
why not sync to latest event here?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1279
PS12, Line 1279: renameTable
Do we need a preconditions check to confirm that table is locked by this thread?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1402
PS12, Line 1402:       catalog_.addIncompleteTable(dbName, tblName, createEventId);
               :       LOG.info("Added incomplete table {}.{} with create event id: {}", dbName, tblName,
               :           createEventId);
               :       // sync to latest event ID
               :       tbl = getTableAndAcquireWriteLock(dbName, tblName, apiName);
               :       catalog_.getLock().writeLock().unlock();
               :       syncToLatestEventId(tbl, apiName);
intuitively it probably makes more sense to let the syncToLatestEventId create the table for you.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@747
PS12, Line 747: TODO: Should we recreate table if its create event id
              :       // does not match the one passed in fn argument?
That sounds like a error condition to me. We could add a Preconditions.checkState(existingTable.getCreateEventId() >= eventId) in such a case.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@909
PS12, Line 909:   if (dbToAlter == null) {
              :       LOG.debug("Event id: {}, not altering db {} since it does not exist in catalogd",
              :           eventId, dbName);
              :       
check deleteEventLog here?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@926
PS12, Line 926: updateDbIfExists
not sure if you should use this method here because it takes a version lock again. It looks like updateDbIfExists isn't used anymore so it would be good to remove it altogether and do the update from this method itself. You should also get the catalogVersion for the update before releasing the versionLock on 919 so that we don't need to take it again.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@757
PS12, Line 757: TODO
still WIP?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@1611
PS12, Line 1611:         if (!BackendConfig.INSTANCE.enableSyncToLatestEventOnDdls()) {
               :           assertEquals(EventProcessorStatus.NEEDS_INVALIDATE,
               :               eventsProcessor_.getStatus());
               :         }
not sure I fully understand this.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@199
PS12, Line 199: tsProcessor_.processEve
Will this cause flakiness of the test since the HMS is shared among multiple tests. If another test is running concurrently which generates the events, this check will fail.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@240
PS12, Line 240: // stored in catalog cache
              :             assertTrue(tbl.getPartitions().size() == 0);
              : 
not sure I understand this. The add_partitions API in HMS handler should sync the table to latest event here. Why are we asserting that there would not be partitions?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@550
PS12, Line 550: 
can you also assert here that lastSyncedEventId > createEventId


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@561
PS12, Line 561: log_.setMetastoreEventProcessor(eventsPr
not sure what is the goal of this test? This is testing functionality which was pre-existing and mostly covered in MetastoreEventsProcessorTest, right?



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 14
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Oct 2021 19:27:41 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#19).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,410 insertions(+), 281 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/19
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 19
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 16:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9578/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 16
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 Oct 2021 17:08:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#17).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,413 insertions(+), 280 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/17
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 17
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 33:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17859/33/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/33/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1299
PS33, Line 1299:             MetastoreEventsProcessor.getNextMetastoreEventsInBatches(catalog_, currentEventId,
line too long (94 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 33
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Nov 2021 11:03:09 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 41: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 41
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 14 Dec 2021 02:18:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 41: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 41
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 13 Dec 2021 20:03:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 22:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7562/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 19:37:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#20).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,362 insertions(+), 282 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/20
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 20
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 27:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9730/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 27
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Nov 2021 13:02:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7493/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 9
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Sep 2021 01:15:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@166
PS3, Line 166:     public synchronized static MetastoreEventFactory getInstance(CatalogOpExecutor opExecutor) {
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@227
PS3, Line 227:     List<MetastoreEvent> getFilteredEvents(List<NotificationEvent> events, Metrics metrics)
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@295
PS3, Line 295:     public static synchronized MetastoreEventFactory getInstance(CatalogOpExecutor opExecutor) {
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@245
PS3, Line 245:       metastoreEventFactory_.get(events.get(0), metastoreEventsMetrics_).processIfEnabled();
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@509
PS3, Line 509:     org.apache.impala.catalog.Table tbl = getTableAndAcquireWriteLock(partition.getDbName(),
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@367
PS3, Line 367:             HdfsTable currentTbl = (HdfsTable) catalog_.getTable(TEST_DB_NAME, tblNameLowerCase);
line too long (97 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/3/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@116
PS3, Line 116:     cs.setEventFactoryForSyncToLatestEvent(EventFactoryForSyncToLatestEvent.getInstance(opExecutor));
line too long (101 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 19:11:59 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7485/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 5
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Fri, 24 Sep 2021 00:53:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7478/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Sep 2021 23:31:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@509
PS2, Line 509:     org.apache.impala.catalog.Table tbl = getTableAndAcquireWriteLock(partition.getDbName(),
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@367
PS2, Line 367:             HdfsTable currentTbl = (HdfsTable) catalog_.getTable(TEST_DB_NAME, tblNameLowerCase);
line too long (97 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 2
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Sep 2021 22:01:01 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 10: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7497/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 10
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Sep 2021 07:21:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 18:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7529/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 18
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 11 Oct 2021 18:23:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#6).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
29 files changed, 3,362 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/6
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 6
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 17:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7525/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 17
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 Oct 2021 18:13:47 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 15:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17859/15/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/17859/15/fe/src/main/java/org/apache/impala/catalog/Db.java@150
PS15, Line 150:         String.format("last synced event id: %s to be set for db %s should be >= createEvent id: %s",
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/15/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/15/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1311
PS15, Line 1311:         MetastoreEvents.AlterTableEvent alterEvent = (MetastoreEvents.AlterTableEvent) event;
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/15/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1313
PS15, Line 1313:         org.apache.hadoop.hive.metastore.api.Table oldMsTable = alterEvent.getBeforeTable();
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/15/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1314
PS15, Line 1314:         org.apache.hadoop.hive.metastore.api.Table newMsTable = alterEvent.getAfterTable();
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 15
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 07 Oct 2021 20:04:01 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#21).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,397 insertions(+), 289 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/21
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 21:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@63
PS21, Line 63: // import org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent;
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2412
PS21, Line 2412:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS21, Line 85:     private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/21/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@28
PS21, Line 28: //import org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent;
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 12:00:59 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 23:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17859/23/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/23/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402
PS23, Line 2402:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/23/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/23/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS23, Line 85:     private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 23
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 28 Oct 2021 10:05:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#23).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,362 insertions(+), 282 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/23
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 23
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#40).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design, catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread which polls HMS
events from notifications log table and apply them asynchronously.
These two stream of updates cause consistency issues. For example
consider the following sequence of alter table events on a table
t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Db/Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd from both catalog HMS
metastore server and Impala shell) will follow the following steps
to update the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache in the
same order as they appear in HMS thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. And while excuting a ddl,
   db/table will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,674 insertions(+), 314 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/40
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 40
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 39: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 39
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Dec 2021 22:30:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 31:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7650/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 31
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 19 Nov 2021 18:20:51 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 30:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7647/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 30
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Nov 2021 11:39:23 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#4).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
29 files changed, 3,403 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/4
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 4
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 19:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7532/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 19
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Oct 2021 17:02:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7493/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 9
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Mon, 27 Sep 2021 19:00:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7501/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 11
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Sep 2021 16:52:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#10).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,367 insertions(+), 272 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/10
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 10
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design, catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread which polls HMS
events from notifications log table and apply them asynchronously.
These two stream of updates cause consistency issues. For example
consider the following sequence of alter table events on a table
t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Db/Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd from both catalog HMS
metastore server and Impala shell) will follow the following steps
to update the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache in the
same order as they appear in HMS thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. And while excuting a ddl,
   db/table will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Reviewed-on: http://gerrit.cloudera.org:8080/17859
Reviewed-by: Vihang Karajgaonkar <vi...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,674 insertions(+), 314 deletions(-)

Approvals:
  Vihang Karajgaonkar: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 42
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 27:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17859/27/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/27/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2423
PS27, Line 2423:    * String, long)} which passes false for {@code refreshUpdatedPartitions} argument and ignore
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/27/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/27/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5878
PS27, Line 5878:               updatedThriftTable = catalog_.reloadTable(tbl, req, resultType, cmdString, -1);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/27/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/27/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2471
PS27, Line 2471:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 27
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Nov 2021 12:41:03 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 28:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9766/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 28
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 12 Nov 2021 11:47:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 30: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7647/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 30
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Nov 2021 18:03:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 31: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7648/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 31
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 19 Nov 2021 00:41:48 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#37).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design, catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread which polls HMS
events from notifications log table and apply them asynchronously.
These two stream of updates cause consistency issues. For example
consider the following sequence of alter table events on a table
t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Db/Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd from both catalog HMS
metastore server and Impala shell) will follow the following steps
to update the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache in the
same order as they appear in HMS thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. And while excuting a ddl,
   db/table will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,674 insertions(+), 314 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/37
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 37
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#34).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and apply them asynchronously. These two stream of updates cause
consistency issues. For example consider a following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. While excuting a ddl, db/table
   will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,670 insertions(+), 312 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/34
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 34
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 24:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9695/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 24
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 29 Oct 2021 10:36:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 22: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7562/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Oct 2021 01:40:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#9).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,358 insertions(+), 272 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/9
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 9
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#8).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,359 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/8
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 8
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#3).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
29 files changed, 3,396 insertions(+), 272 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/3
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 3
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 11:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9519/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 11
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Sep 2021 17:12:56 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#5).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
29 files changed, 3,396 insertions(+), 269 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/5
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 5
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 1: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Sep 2021 05:40:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7483/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 4
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 21:33:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 22:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9655/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 18:33:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 41:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7702/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 41
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 13 Dec 2021 20:03:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 25:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9696/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 25
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 29 Oct 2021 13:14:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 25:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17859/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/25/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2423
PS25, Line 2423:    * String, long)} which passes false for {@code refreshUpdatedPartitions} argument and ignore
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/25/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/25/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@5877
PS25, Line 5877:               updatedThriftTable = catalog_.reloadTable(tbl, req, resultType, cmdString, -1);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/25/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2471
PS25, Line 2471:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 25
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 29 Oct 2021 12:54:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 31:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7656/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 31
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 22 Nov 2021 11:58:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 26:

(7 comments)

The patch makes sense to me. There is definitely more work needed like updating the lastSyncedEventId from catalogOpExecutor#execDdl. Currently, this patch supports updates to the lastSyncedEventId from CatalogMetastoreServiceHandler and uses it for its self-event detection. It would be good to do the same on the execDdl path in the longer run. I have left some questions/comments which when resolved, I can +2 the patch.

http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@328
PS26, Line 328: EventFactoryForSyncToLatestEvent
remove if not needed.


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@919
PS26, Line 919: if (!isSelfEvent || !BackendConfig.INSTANCE.enableSyncToLatestEventOnDdls()) {
              :         return isSelfEvent;
              :       }
Not sure if I understand this condition correctly. Why are evaluating old selfEvent when flag is true? Also, it looks like if the old selfEvent evaluation is false, then irrespective of where the syncToLatest flag is true or not we return early.


Can we change this to:


if (!BackendConfig.INSTANCE.enableSyncToLatestEventOnDdls()) {
  return isSelfEvent();
}


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@932
PS26, Line 932: tbl.setLastSyncedEventId(getEventId());
Why do we need to set the lastSyncedEventId here? Can we keep the scope of this method to only detecting self-events? I thought we are already setting the lastSyncedEvent when we actually process the method.


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@988
PS26, Line 988: if (!isSelfEvent || !BackendConfig.INSTANCE.enableSyncToLatestEventOnDdls()) {
              :         return isSelfEvent;
              :       }
same as previous comment.


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@3088
PS26, Line 3088: Reference
remove if not needed.


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@749
PS26, Line 749: // TODO: Should we recreate table if its create event id
              :       // does not match the one passed in fn argument?
Is this something that you are working on?


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@929
PS26, Line 929:       // TODO: should event id be set only in case of success?
Is this TODO still unresolved? Please remove if it is. My understanding is the it returns false when the Db doesn't exist anymore in which case adding the lastSyncedEventId here doesn't make sense.



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 26
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 02 Nov 2021 20:37:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#36).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and apply them asynchronously. These two stream of updates cause
consistency issues. For example consider a following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. While excuting a ddl, db/table
   will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,668 insertions(+), 312 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/36
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 36
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 27: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7603/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 27
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 09 Nov 2021 00:36:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17859/17/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/17/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402
PS17, Line 2402:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 17
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 Oct 2021 17:57:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 37:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7690/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 37
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 03 Dec 2021 18:41:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 39:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9892/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 39
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Dec 2021 16:39:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#30).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and apply them asynchronously. These two stream of updates cause
consistency issues. For example consider a following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. While excuting a ddl, db/table
   will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,578 insertions(+), 296 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/30
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 30
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 40:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9919/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 40
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 13 Dec 2021 20:06:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Sourabh Goyal has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 20:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/17859/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17859/12//COMMIT_MSG@7
PS12, Line 7: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
            : DDL operations via catalog HMS endpoints
> ping
@Vihang: I already have this in my to-do list. Will write a detailed commit message.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2672
PS12, Line 2672: lse;
> I am not sure I fully understand. If this change is not needed for the patc
I understand and as discussed over call, I will remove this method


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@288
PS19, Line 288: syncToLatestEventFactory_
> Based on my understanding this field here is unnecessary and is complicatin
@Vihang: We do need to pass event factory object to syncToLatestEventId() in metastoreEventProcessor since it is a static method. However I agree that we should not create a new event factory and instead modify isSelfEvent() to accomodate sync to latest event id flag. One issue that I see is - if event processing is disabled then MetastoreEventProcessor is not initialized and there would be no way to access eventFactory object from it. 

Few ways to solve this: 
1. Decouple MetastoreEventProcessor and EventFactory creation. In JniCatalog, we can create a common event factory that would be set in EventProcessor as well as used elsewhere. Doing so, we need to make sure that the factory is thread safe.

2. Make MetastoreEventFactory singleton and use it from all the places. This would avoid JniCatalog route. I had tried this approach in the past and encountered some test failures. Didn't investigate the failures in depth but it appeared to be race conditions issues. I can take a shot at it again if the approach seems cleaner. 

Let me know your thoughts.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@316
PS19, Line 316: 
> this is unused and can be removed.
Ack


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java@115
PS6, Line 115: 
> All the places where I see this variable getting accessed, I see that it is
As discussed over call, we will keep it as volatile so that EventProcessor can check if it needs to process an event or not without acquiring readlock on db/table.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Db.java@134
PS19, Line 134:     LOG.debug("createEventId_ for db: {} set to: {}", getName(), createEventId_);
              :     if (lastSyncedEventId_ < eventId) {
              :       setLastSyncedEventId(eventId);
              :     }
> Pls remove if not needed anymore.
Sure. I had added this check earlier but then saw some failures in the tests related to ddls from Impala shell . For now, I will add a TODO comment and we can address it in a follow up jira.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Db.java@153
PS19, Line 153:   /**
              :    * Creates a Db object with no tables based on the given TDatabase thrift struct.
              :    */
              :   public static Db fromTDatabase(TDatabase db) {
              :     ret
> pls remove if not needed.
Same as previous comment


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Table.java@186
PS12, Line 186: 
> Thanks for adding the comment. But similar to the Db.java's volatile keywor
As discussed, we will keep the variable as volatile so that event processor can read it (to check if event should be skipped or not) without acquiring read lock on table object.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/Table.java@212
PS12, Line 212:   crea
> nit, single line space.
Ack


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java@60
PS19, Line 60: 
> please remove this
Ack


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java@77
PS19, Line 77: Logger LOG
> this is not needed if you remove line 60
Ack


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java@220
PS19, Line 220:   }
              : 
              :   public long getLastSyncedEventId(
> the createEventId is set when the table is created, I am not sure when woul
I did see an occurrence of this scenario in catalogOpExecutor code but don't remember clearly now.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/Table.java@234
PS19, Line 234:    * Returns if the given HMS table is an external table (uses table type if
              :    * available or else uses table properties). Implementation is based on org.apache
              :    * .hadoop.hive.metastore.utils.MetaStoreUtils.isExternalTable()
              :    */
              :   publi
> pls remove if not needed.
Sure I will add a TODO for now and we can take it up in a follow up jira.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@170
PS12, Line 170: alidate met
> I was suggesting that we should not pass the metrics in which case we would
I understand. However reusing MetastoreEventProcessor's metrics seems to be tricky because:

What if event processing is disabled? In that case eventProcessor will be an instance of NoOpEventProcessor. Also ExternalEventProcessor currently does not have getMetrics() api. We can expose this api  and create a default metrics for NoOpEventProcessor as well. 

Let me know what you think.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@34
PS19, Line 34: import
> remove if not necessary? In fact all the changes to this file can be remove
Sure.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@461
PS12, Line 461:  return dbName_; }
> Actually, I see why you have a new abstract method now. However, instead of
As discussed over call, we will keep this method separate from isSelfEvent() because when sync to latest event id is enabled by default, we can then get rid of self event logic


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@875
PS19, Line 875: 
> Why do we need to logically AND with this condition here?
As discussed, I will add a comment for the same.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@371
PS19, Line 371:    * @param db
> if we are stopping at the drop event it would be weird I feel. For instance
As discussed over call, there are few complications in processing events beyond drop event. For example - as a result of drop event, the table would be dropped from the cache. If we continue processing next events, the first such event would be create_table. While processing that event, we would again have to acquire write lock on the new table. But this method assumes that the write lock is already acquired. 
For now, we have discussed to break after drop event.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1939
PS19, Line 1939:  pro
> I am surprised we are not hitting NPE due to this and other similar calls b
It is because we are not processing these events. But I agree that we shouldn't pass null metrics.


http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/main/java/org/apache/impala/service/JniCatalog.java@40
PS19, Line 40: import org.apache.impala.catalog.events.ExternalEventsProcessor;
             : import org.apache.impala.catalog.events.MetastoreEven
> please remove the unused imports
Sure.



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 20
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Oct 2021 18:01:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 20: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 20
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Oct 2021 20:29:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 32:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7672/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 32
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Nov 2021 10:06:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 21:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7559/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 12:03:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Sourabh Goyal has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java:

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@253
PS4, Line 253:   protected int getPort() throws CatalogException {
> This is supposed to be only used during testing ?
I don't recollect now why I had to make it public (maybe did this when adding new tests). Will fix it.


http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@609
PS4, Line 609:       syncToLatestEventId(tbl, apiName);
> There are so many TODOs in patch which needs discussion. I think you should
Yes that makes sense. I had intentionally added so that I don't miss getting feedback on them. Will remove them before the merge into master branch


http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java@54
PS4, Line 54:   @Test
> Looks like no changes in this file.
Ack


http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@118
PS4, Line 118:     cs.setCatalogOpExecutor(opExecutor);
> Please remove the statements, instead of commenting
Left them as such because of some back and forth code changes. Will remove it.


http://gerrit.cloudera.org:8080/#/c/17859/4/tests/custom_cluster/test_metastore_service.py
File tests/custom_cluster/test_metastore_service.py:

http://gerrit.cloudera.org:8080/#/c/17859/4/tests/custom_cluster/test_metastore_service.py@126
PS4, Line 126:                       "--enable_sync_to_latest_event_on_ddls=false"
> Will it not be false by default ? Do we need these changes ?
Yes the flag is false by default. But these tests would start failing in future when the flag is turned on. And these tests are supposed to work with flag set to false. Therefore added a config for it here.



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 8
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Mon, 27 Sep 2021 17:39:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9489/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 4
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 20:27:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 13:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9559/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 13
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Oct 2021 17:21:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#13).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,358 insertions(+), 272 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/13
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 13
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
kishen@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java:

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java@253
PS4, Line 253:   public int getPort() throws CatalogException {
This is supposed to be only used during testing ?
Why it has to be made public ?


http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@609
PS4, Line 609:     // TODO: Should we sync the table till latest event id if partitions list
There are so many TODOs in patch which needs discussion. I think you should just discuss these internally and finalize the implementation and remove these TODOs, if they are not about some missing logic that needs to be implemented in future.


http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java@54
PS4, Line 54: 
Looks like no changes in this file.


http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@174
PS4, Line 174:     public void testAlterDatabase() throws Exception {
How do make sure future DDLs will sync to latest event ID ?


http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/4/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@118
PS4, Line 118:     //cs.setTableLoadingMgr(new TableLoadingMgr(opExecutor, 16));
Please remove the statements, instead of commenting


http://gerrit.cloudera.org:8080/#/c/17859/4/tests/custom_cluster/test_metastore_service.py
File tests/custom_cluster/test_metastore_service.py:

http://gerrit.cloudera.org:8080/#/c/17859/4/tests/custom_cluster/test_metastore_service.py@126
PS4, Line 126:                       "--enable_sync_to_latest_event_on_ddls=false"
Will it not be false by default ? Do we need these changes ?



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 4
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 17:10:21 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7485/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 5
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 17:56:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 26:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7582/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 26
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Nov 2021 17:50:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 35: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 35
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 30 Nov 2021 01:17:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 35:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9848/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 35
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 29 Nov 2021 18:13:46 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 38:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7693/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 38
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Dec 2021 12:19:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Sourabh Goyal has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 6:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@223
PS2, Line 223:   //value of timeout for the topic update thread while waiting on the table lock.
> 7200000 seems to be too large. How did you pick this value ?
I didn't choose the value. It was already present.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2142
PS2, Line 2142:     // Return the table if it is already loaded or submit a new load request.
> Why do you have many .trace() instead of .debug() ?
VIhang had suggested to convert them to trace instead of debug so as to have lesser logs.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3654
PS2, Line 3654:     Preconditions.checkArgument(factory instanceof EventFactoryForSyncToLatestEvent,
> Since its no longer final, can you check the code to make sure, we are not 
I have refactored the code in later patches to not set tableloading manager from outside.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@51
PS2, Line 51:   private final CatalogServiceCatalog catalog_;
> What is the difference between LoggerFactory vs Logger ?
LoggerFactory is from slf4j whereas previous logger was from log4j


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@164
PS2, Line 164:     }
> Can there be multiple threads trying to do full table reload at the same ti
We create a new table object at line 138 so it is thread safe and no lock is required.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@159
PS2, Line 159:   private final CatalogServiceCatalog catalog_;
> Why are you removing "final" for CatalogServiceCatalog and CatalogOpExecuto
I have fixed it in subsequent patch sets.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@266
PS2, Line 266:         }
> I think it all depends on whether we can toggle this flag, without restarti
I tried reusing the existing event factory and caught few issues in the test suite. For now, we are keeping event factory for syncing to latest event id separate.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@835
PS2, Line 835:      * Skip this event if the table is already synced till this event id. Otherwise
> Please write the Javadoc for this method.
Sure. Thanks for catching it.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@966
PS2, Line 966:               && dbName_.equalsIgnoreCase(alterTableEvent.msTbl_.getDbName())
> May be you can add default interface method which returns false in the abst
The parent class i.e MetastoreTableEvent has a default implementation which works for all table events except create and drop event.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1163
PS2, Line 1163: 
> Please use the variables directly from tableBefore and tableAfter.
Sure.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1504
PS2, Line 1504: 
> You can avoid most of these by having default implementation, which returns
Have fixed it for DatabaseEvent classes


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@357
PS2, Line 357:           from cache and subsequent create_table event would add
> Please remove these TODOs or update them as per what is needed in future, b
Yes, I will remove/update them after the discussion.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@475
PS2, Line 475:           return db.getName().equalsIgnoreCase(event.getDbName());
> I think it should become a utility method. Otherwise, everywhere we have to
Didn't understand. The method is static. In a way, it is already a utility method.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@899
PS2, Line 899:     return metrics_;
> Why this has to be changed to "protected", if its only for testing ?
Can't recollect my thoughts at that time. I am overriding this method in SynchronousHMSEventProcessorForTests for testing purpose. Removing this change.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@110
PS2, Line 110: 
> Please remove the variable, if you don't need it.
Ack


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@117
PS2, Line 117:   public GetTableResult get_table_req(GetTableRequest getTableRequest)
> I don't think that would be a good idea. 
I too had the same thought, just wanted to confirm. Thanks for the confirmation.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@509
PS2, Line 509:       syncToLatestEventId(tbl, apiName);
> Hope you are using the Java template.
Yes I am


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@607
PS2, Line 607: 
> There is no harm in doing it.
Ack


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1332
PS2, Line 1332:         T resp = task.execute();
> Do you have to distinguish between database rename and table rename ?
Didn't get you. We consider a table as renamed if 
 It is moved to a different db or it is in same db but name is changed.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@749
PS2, Line 749:       if (existingTable != null) {
@Vihang: I have modified the check to also check for create event id in an existing table. Please take a look.



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 6
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 22:09:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17859/1/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/1/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@704
PS1, Line 704:     org.apache.impala.catalog.Table tbl = getTableAndAcquireWriteLock(partition.getDbName(),
line too long (92 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 1
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 20 Sep 2021 23:29:58 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9495/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 6
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Sep 2021 22:30:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 17: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7525/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 17
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Sat, 09 Oct 2021 00:22:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#15).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,399 insertions(+), 274 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/15
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 15
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 19: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7532/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 19
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Oct 2021 23:15:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9505/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 8
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Mon, 27 Sep 2021 17:51:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#7).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
M fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
29 files changed, 3,363 insertions(+), 274 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/7
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 7
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#22).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,398 insertions(+), 290 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/22
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 22:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@63
PS22, Line 63: // import org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent;
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2412
PS22, Line 2412:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS22, Line 85:     private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/22/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java@28
PS22, Line 28: //import org.apache.impala.catalog.events.MetastoreEvents.EventFactoryForSyncToLatestEvent;
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 22
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 18:11:54 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 21: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 18:14:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 38: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 38
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Dec 2021 18:36:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#25).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,561 insertions(+), 292 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/25
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 25
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 33: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 33
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Nov 2021 17:19:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 31: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 31
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 22 Nov 2021 18:23:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#32).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread, polls HMS events
and apply them asynchronously. These two stream of updates cause
consistency issues. For example consider a following sequence of
alter table events on a table t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd (from both catalog HMS
endpoint and Impala shell) will follow the following steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache (in the
same order as they appear in HMS) thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. While excuting a ddl, db/table
   will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,611 insertions(+), 307 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/32
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 32
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 34:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17859/34/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/34/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1299
PS34, Line 1299:             MetastoreEventsProcessor.getNextMetastoreEventsInBatches(catalog_, currentEventId,
line too long (94 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 34
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 29 Nov 2021 17:45:13 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 37:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17859/37/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/37/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@3177
PS37, Line 3177:               String.format("Drop database event not received for db: %s from event id: %s. "
line too long (93 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 37
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 03 Dec 2021 18:39:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 37:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9871/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 37
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 03 Dec 2021 18:59:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 32: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7672/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 32
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Nov 2021 16:24:30 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7492/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 7
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Mon, 27 Sep 2021 17:24:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7492/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 7
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Mon, 27 Sep 2021 23:42:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
kishen@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 14:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@373
PS14, Line 373: 
Extra line


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@436
PS14, Line 436:    * Acquire write lock on multiple tables If the lock couldn't be acquired on any
multiple tables. If the


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@440
PS14, Line 440:    * @return true if lock was acquired on all tables successfully. False
true if lock was acquired on all tables successfully; false otherwise.


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@468
PS14, Line 468:       // except last
Why except last, can you add that in the comment itself ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@470
PS14, Line 470:         versionLock_.writeLock().unlock();
Is it possible to get some exception here where we end up with some tables that are still locked  ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2261
PS14, Line 2261: 
extra line


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2661
PS14, Line 2661:       // TODO: If reloadTable succeeds, should we sync table till current HMS
Yes, we should sync it to the latest. 
Please address all TODO before submit it again for review.


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3658
PS14, Line 3658:   public MetastoreEventFactory getEventFactoryForSyncToLatestEvent() {
white line missing between methods.


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/Table.java@199
PS14, Line 199:   // TODO: Get rid of get and createEventId
Will this TODO be done as part of this patch ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/Table.java@201
PS14, Line 201:   // last synced id in full table reload
last or latest ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/Table.java@212
PS14, Line 212:   public void setLastSyncedEventId(long eventId) {
white line missing between methods.


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@57
PS14, Line 57:   private Metrics metrics_ = new Metrics();
Where are you publishing or logging this ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@165
PS14, Line 165:     tblLoader_ = new TableLoader(catalog);
Why is this change required ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@840
PS14, Line 840:     protected boolean shouldSkipWhenSyncingToLatestEventId() throws CatalogException {
Can it not be a separate utility method, which takes eventId, dbName and table name as arguments ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@329
PS14, Line 329:       Preconditions.checkState(tbl.isWriteLockedByCurrentThread(),
Since you anyway check, why the caller has to send the boolean flag ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@354
PS14, Line 354:       TODO:
Can you write the final version of this TODO ?
Are you going to do it in the same patch ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@474
PS14, Line 474:           // TODO: should we ignore case?
You might have to remove this TODO.


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@3060
PS14, Line 3060:     String[] catAndDbName = MetaStoreUtils.parseDbName(dbNameWithCatalog, serverConf_);
Why did you remove the try catch block ?


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@896
PS14, Line 896:    * Updates the catalog db with alteredMsDb. To do so, first acquire lock on catalog db
What is alteredMsDb  ? Instead of saying "alteredMsDb", may be you have to say what it's meant for.


http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/14/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@421
PS14, Line 421: 
Remove all the unnecessary new lines



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 14
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Oct 2021 22:05:59 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello kishen@cloudera.com, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#11).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,370 insertions(+), 272 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/11
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 11
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 10:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9511/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 10
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Sep 2021 01:30:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9506/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 9
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Comment-Date: Mon, 27 Sep 2021 19:21:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 31:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7648/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 31
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Nov 2021 18:17:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 30:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9801/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 30
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Nov 2021 11:58:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 36: Code-Review+2

The patch looks good to me. Thanks a lot for seeing it through. I know it took a lot of iterations. I think as a followup, we would need to update the lastSyncedEventId for DDLs from catalogOpExecutor too. For now, this patch enables strong consistency guarantee for catalog's metastore service endpoint.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 36
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Dec 2021 20:57:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 39:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7696/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 39
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Dec 2021 16:18:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#39).

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................

IMPALA-10926: Improve catalogd consistency and self events detection

In the current design, catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place whereas
MetastoreEventProcessor runs as a background thread which polls HMS
events from notifications log table and apply them asynchronously.
These two stream of updates cause consistency issues. For example
consider the following sequence of alter table events on a table
t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The #3 alter table ddl operation would get reflected in the local
cache immediately. However, later on event processor would process
events from #1 and #2 above and try to alter the table. In an ideal
scenario, these alters should have been applied before #3 i.e in the
same order as they appear in HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to keep track of the last event id
for a given table as eventId which the catalogd has synced to in the
Db/Table object. The events processor ignores any event whose EVENT_ID
is less than or equal to the eventId stored in the table. Once the
events processor successfully processes a given event, it updates the
value of eventId in the table before releasing the table lock. Also,
any DDL or refresh operation on the catalogd from both catalog HMS
metastore server and Impala shell) will follow the following steps
to update the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync table till the latest event id (as per HMS) since its last
   synced event id

The above steps ensure that any concurrent updates applied on a same
db/table from multiple sources like Hive, Impala or say multiple
Impala clusters, get reflected in the local catalogd cache in the
same order as they appear in HMS thus removing any inconsistencies.
Also the solution relies on the existing locking mechanism in the
catalogd to prevent any other concurrent updates to the table (even
via EventsProcessor). In case of database objects, we will also have
a similar eventId which represents the events on the database object
(CREATE, DROP, ALTER database) to which the catalogd as synced to.

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If flag in #1 is enabled then apart from Impala shell and
   MetastoreEventProcessor the cache would also get updated for ddls
   executed via catalog HMS endpoints. And while excuting a ddl,
   db/table will be synced till latest event id.
3. Event processor skips processing an event if db/table is already
   synced till that event id. Sets that event id in db/table if
   the event is processed.
4. When EventProcessor detects a self event, it sets the last synced
   event id in db/table before skipping the processing of an event.
5. Full table refresh sets the last event processed in table cache.

Future Work:
1. Sync db/table to latest event id for ddls executed from Impala
   shell (execDdlRequest() in catalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
26 files changed, 3,675 insertions(+), 315 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/39
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 39
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 20:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17859/20/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/20/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402
PS20, Line 2402:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/20/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/20/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS20, Line 85:     private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 20
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Oct 2021 14:06:18 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 21:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9649/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 21
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 Oct 2021 12:22:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 23:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9687/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 23
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 28 Oct 2021 10:25:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 24:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17859/24/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/24/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2452
PS24, Line 2452:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/24/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/24/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@85
PS24, Line 85:     private static boolean flagEnableCatalogCache ,flagInvalidateCache, flagSyncToLatestEventId;
line too long (96 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 24
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 29 Oct 2021 10:14:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 26:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7586/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 26
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 02 Nov 2021 09:44:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 26:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7586/


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 26
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 02 Nov 2021 19:44:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 18:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9586/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 18
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 11 Oct 2021 18:44:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Hello Vihang Karajgaonkar, kishen@cloudera.com, Yu-Wen Lai, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17859

to look at the new patch set (#18).

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................

IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
DDL operations via catalog HMS endpoints

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/AlterDatabaseTest.java
A fe/src/test/java/org/apache/impala/catalog/MetastoreApiTestUtils.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/catalog/metastore/AbstractCatalogMetastoreTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_metastore_service.py
27 files changed, 3,390 insertions(+), 280 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/17859/18
-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 18
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 19:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/9595/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 19
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Oct 2021 17:23:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/19/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2402
PS19, Line 2402:       batchEvents = eventFactory.createBatchEvents(mockEvents, eventsProcessor_.getMetrics());
line too long (94 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 19
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Tue, 12 Oct 2021 17:02:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 16:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17859/16/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/17859/16/fe/src/main/java/org/apache/impala/catalog/Db.java@150
PS16, Line 150:         String.format("last synced event id: %s to be set for db %s should be >= createEvent id: %s",
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/16/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/16/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1311
PS16, Line 1311:         MetastoreEvents.AlterTableEvent alterEvent = (MetastoreEvents.AlterTableEvent) event;
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/16/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1313
PS16, Line 1313:         org.apache.hadoop.hive.metastore.api.Table oldMsTable = alterEvent.getBeforeTable();
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17859/16/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1314
PS16, Line 1314:         org.apache.hadoop.hive.metastore.api.Table newMsTable = alterEvent.getAfterTable();
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 16
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 Oct 2021 16:44:14 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
kishen@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 2:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@223
PS2, Line 223:   public static final long LOCK_RETRY_TIMEOUT_MS = 7200000;
7200000 seems to be too large. How did you pick this value ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2142
PS2, Line 2142:       LOG.trace("table {} exits in cache, last synced id {}", tbl.getFullName(),
Why do you have many .trace() instead of .debug() ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3654
PS2, Line 3654:     tableLoadingMgr_.start();
Since its no longer final, can you check the code to make sure, we are not keeping its reference somewhere else ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@51
PS2, Line 51:   private static final Logger LOG = LoggerFactory.getLogger(TableLoader.class);
What is the difference between LoggerFactory vs Logger ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@164
PS2, Line 164:         // write lock is not required since it is full table reload
Can there be multiple threads trying to do full table reload at the same time ? Would a write lock be useful in that case ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@159
PS2, Line 159:   private CatalogServiceCatalog catalog_;
Why are you removing "final" for CatalogServiceCatalog and CatalogOpExecutor ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@266
PS2, Line 266:   // TODO: Should we skip creating a new metastore events factory
I think it all depends on whether we can toggle this flag, without restarting CatalogD or not.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@835
PS2, Line 835:     protected boolean shouldSkipWhenSyncingToLatestEventId() throws CatalogException {
Please write the Javadoc for this method.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@966
PS2, Line 966:     protected boolean shouldSkipWhenSyncingToLatestEventId() {
May be you can add default interface method which returns false in the abstract class ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1163
PS2, Line 1163:       String oldDbName = tableBefore_.getDbName();
Please use the variables directly from tableBefore and tableAfter.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1504
PS2, Line 1504:     @Override
You can avoid most of these by having default implementation, which returns false.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@357
PS2, Line 357:        1. Should we stop after processing drop_table
Please remove these TODOs or update them as per what is needed in future, before you commit the changes. Looks like some of them are discussions and not strictly TODOs.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@475
PS2, Line 475:           // TODO: should we ignore case?
I think it should become a utility method. Otherwise, everywhere we have to keep thinking whether it should be ignore case or not !


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@899
PS2, Line 899:   protected Metrics getMetrics() {
Why this has to be changed to "protected", if its only for testing ?


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@110
PS2, Line 110:   // protected final boolean invalidateCacheOnDDLs_;
Please remove the variable, if you don't need it.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@117
PS2, Line 117:     // TODO: Should we honor fallbacktoHMS in case we
I don't think that would be a good idea. 
If you keep doing it for some reason, events can pile up. 
So, I prefer you rather fail so that we can detect the problem early.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@509
PS2, Line 509:     org.apache.impala.catalog.Table tbl = getTableAndAcquireWriteLock(partition.getDbName(),
> line too long (92 > 90)
Hope you are using the Java template.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@607
PS2, Line 607:     // TODO: Should we sync the table till latest event id if partitions list
There is no harm in doing it.


http://gerrit.cloudera.org:8080/#/c/17859/2/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1332
PS2, Line 1332:     boolean isRename = !dbname.equalsIgnoreCase(newTable.getDbName()) ||
Do you have to distinguish between database rename and table rename ?



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 2
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Sep 2021 17:43:09 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 2:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/9480/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 2
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Sep 2021 22:09:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Sourabh Goyal has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing DDL operations via catalog HMS endpoints
......................................................................


Patch Set 15:

(35 comments)

http://gerrit.cloudera.org:8080/#/c/17859/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17859/11//COMMIT_MSG@7
PS11, Line 7: Sync db/table in catalog cache to latest HMS event id when performing
            : DDL operations via catalog HMS endpoints
> If this patch is close to getting merged, now is a good to add more details
Yes I will add more details in the commit message.  Will fix message format styles as well.


http://gerrit.cloudera.org:8080/#/c/17859/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17859/12//COMMIT_MSG@7
PS12, Line 7: IMPALA-10926: Sync db/table in catalog cache to latest HMS event id when performing
            : DDL operations via catalog HMS endpoints
> Since this patch is not a WIP anymore, can you please follow the commit mes
Ack


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@457
PS6, Line 457:           tableInfo);
             :     }
             :     int tableIndex=-1, versionLockCount = 0;
             :     try {
             :       for(tableIndex = 0; tableIndex < numTables; tableIndex++) {
             :         Table tbl = tables[tableIndex];
             :         if (!tryWriteLock(tbl)) {
             :           LOG.debug("Could not acquire write lock on table: " + tbl.getFullName());
             :           return false;
             :         }
             :         versionLockCount += 1;
             :       }
             :       // in case of success, release version write lock for all tables except last
             :       if (tableIndex == numTables) {
             :        
> I think the versionLock.writeLock().unlock() should be moved out finally bl
Currently tryWriteLock(tbl) does not throw an exception. But it still makes sense to release version lock in a finally block. Will fix it.


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3652
PS6, Line 3652: 
> annotate with @VisibleForTesting if this was for testing.
It is used in testing as well as in JniCatalog (after initializing catalogServiceCatalog object)


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3658
PS6, Line 3658: 
> nit, missing newline.
Ack


http://gerrit.cloudera.org:8080/#/c/17859/11/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/11/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@459
PS11, Line 459: leIndex=-1, 
> I had left some comments in the older gerrit url for this patch. Can you pl
Sure


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@457
PS12, Line 457:           tableInfo);
              :     }
              :     int tableIndex=-1, versionLockCount = 0;
              :     try {
              :       for(tableIndex = 0; tableIndex < numTables; tableIndex++) {
              :         Table tbl = tables[tableIndex];
              :         if (!tryWriteLock(tbl)) {
              :           LOG.debug("Could not acquire write lock on table: " + tbl.getFullName());
              :           return false;
              :         }
              :         versionLockCount += 1;
              :       }
              :       // in case of success, release version write lock for all tables except last
              :       if (tableIndex == numTables) {
              :        
> a RuntimeException thrown at line 459 will not release the table locks as w
Makes sense. I didn't think about the RunTimeException. Will address it.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2672
PS12, Line 2672: 
> why do we need to override this method?
I think the intention for this change was - currently for removeTable() api , we acquire global write lock. So the read for the table should acquire global read lock.


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java@115
PS6, Line 115: 
> need this?
The intention was - by making lastSyncedEventId_ volatile, the get and set apis for this variable would become thread safe. Otherwise to read this, one would have to take read lock.


http://gerrit.cloudera.org:8080/#/c/17859/6/fe/src/main/java/org/apache/impala/catalog/Db.java@115
PS6, Line 115: 
> please add a comment here explaining what the value of this field signifies
Ack


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@170
PS12, Line 170: initMetrics
> since these metrics are not getting exposed anywhere. can we remove these?
Are you suggesting to pass metrics as null in 
 MetastoreEventsProcessor.syncToLatestEventId(catalog_, table,
            catalog_.getEventFactoryForSyncToLatestEvent(), metrics_, false); ? 
If yes, then we need to handle null condition for metrics in MetastoreEvents.java


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
File fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java@168
PS12, Line 168: // St
> do we need these changes to this class?
Not really. I will revert back to original behavior.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@461
PS12, Line 461: shouldSkipWhenSyncingToLatestEventId
> why are we not just using isSelfEvent() here? Would it not be cleaner to ch
Sorry I didn't understand your comment


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@576
PS12, Line 576:   if (c
> is this commented for a reason?
No. Forgot to clean it up. Will do it.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@800
PS12, Line 800:           return true;
              :         }
              :       } catch (DatabaseNotFoundException e) {
              :         infoLog("Skipping on table {} because db {} not found in cache", tblName,
              :          
> if the table is not present do we need to check if it is in the deletelog?
How would checking in deletelog help? Even if it is present, should we not skip processing this event?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@990
PS12, Line 990: 
> I think we should implement a check for lastSyncedEventId based self-event 
MetastoreTableEvent class already has a default implementation for skipping the event if the table is already synced till this event id. Do we need a custom implementation here?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1175
PS12, Line 1175:  the ALTER_TABLE event is due a table rename, this method removes the old table
               :      * and creates a new table with the new name. Else, this just issues a refr
> It is not clear to me why we are checking on both the before and after tabl
This check is done for rename table. Already have an explanation for this in method's description 

 // If the alter table event is generated because of table rename then event
 // should *NOT* be skipped if old table is not synced this till event AND
 //  new table doesn't exist in cache. Skip otherwise

Revisiting the logic and I think we can skip this logic for rename and simply rely on renameTableFromEvent() when actually processing the event. 

Thoughts?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1699
PS12, Line 1699: 
> Why does this class not implement shouldSkipWhenSyncingToLatestEventId?
MetastoreTableEvent class already has the default implementation for this.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@328
PS12, Line 328: shouldBeAlreadyLocked
> Is there a caller which passes this as false?
Yes, during full table reload from TableLoader. WriteLock is not required there since we are doing reload on a fresh table
Rethinking about this - It probably makes more sense to acquire write lock on a newly created table in TableLoader  and get rid of shouldBeAlreadyLocked. 

Thoughts?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@329
PS12, Line 329: Preconditions.checkState(tbl.isWriteLockedByCurrentThread(),
              :           String.format("Write lock is not held on table %s by current thread",
              :               tbl.getFullName()));
> I think this preconditions check should be done in any case to catch bugs w
Sorry I couldn't understand your comment.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@386
PS12, Line 386: tbl.setLastSyncedEventId(currentEvent.getEventId());
> can this logic be moved to the event.processIfEnabled() itself?
Actual event processing logic like: addPartitionsIfNotRemovedLater, reloadPartition() (for alter partition) is already taking care of setting event id in table/db.
I think we shouldn't move this to event.processIfEnabled() because the actual implementation of process() api should decide whether a given event id should be set in table/db or not


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@458
PS12, Line 458: TODO: should we ignore case
> yes, case should be ignored. Also I think we should make sure that the cata
Do you mean we support only "hive" catalog? If yes should the filter be modified to 

if (event.getCatName() != null && event.getDbName() != null && event.getTableName() != null) {
          return event.getCatName().equalsIgnoreCase("hive") && tbl.getDb().getName().equalsIgnoreCase(event.getDbName()) &&
              tbl.getName().equalsIgnoreCase(event.getTableName());

Same question for db filter as well. Should we check for catalog name?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@475
PS12, Line 475:           return db.getName().equalsIgnoreCase(event.getDbName());
Should we have catalog name check here? Ignore events if catalog name is not hive?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@478
PS12, Line 478: event.getTableName() == null
> not sure I get this part. Why do we need to do this?
The check is to ignore any table specific events since those events would have table name as non null and accept the rest


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@533
PS12, Line 533: storeEventFactory.get
> can we use the singleton way to get the factory here?
I tried using singleton object after our discussion but that introduced some tests failures. Most likely some race conditions. I couldn't repro them on local. Also since the logs from pre commit tests were missing upstream I couldn't debug it further. Therefore switched to using non singleton object of event factory.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@283
PS12, Line 283: dropDbIfExists
> why not sync to latest event here?
My thought was - when the db is eventually going to be dropped, why sync it to latest event i.e drop event and then  drop the db? We can by pass the sync and drop the db right away. 

Thoughts?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@857
PS12, Line 857: syncToLatestEventId(srcTbl, apiName);
              :       syncToLatestEventId(destinationTbl, apiName);
              :     } catch (Exception e) {
              :       rethrowException(e, apiName);
> Can this be removed now?
Ack


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1105
PS12, Line 1105: 
> why not sync to latest event here?
Same reasoning as dropDbIfExists. 

Currently when syncing to latest event if we encounter drop table/db event then we drop the db/table and stop processing further events (if any). So why not drop the table right away?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java@1279
PS12, Line 1279: ename = !db
> Do we need a preconditions check to confirm that table is locked by this th
Yes we should have. 
For now, I have removed this method and moved the logic in the caller since there is only one caller of this method.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@909
PS12, Line 909:   if (dbToAlter == null) {
              :       LOG.debug("Event id: {}, not altering db {} since it does not exist in catalogd",
              :           eventId, dbName);
              :       
> check deleteEventLog here?
How would that help? Even if db is present in deleteEventLog, is it not already deleted?


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@1611
PS12, Line 1611:         if (!BackendConfig.INSTANCE.enableSyncToLatestEventOnDdls()) {
               :           assertEquals(EventProcessorStatus.NEEDS_INVALIDATE,
               :               eventsProcessor_.getStatus());
               :         }
> not sure I fully understand this.
When sync to latest event id is enabled, any  event on a non existing table is skipped (check shouldSkipWhenSyncingToLatestEventId in MetastoreTableEvent).
 Regarding this change, I am actually not sure if this is causing any regression but the behavior in case the flag is turned on seems to be right. 
Let me know your thoughts


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java
File fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java:

http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@199
PS12, Line 199: prevSyncedEventId);
> Will this cause flakiness of the test since the HMS is shared among multipl
Yes the check will fail in that case. Thanks for pointing it out. I will modify the check at all places in this file to

getLastSyncedEventId() > prevSyncedId


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@240
PS12, Line 240: // added partitions should not reflect in table
              :             // stored in catalog cache
              :             assertTrue(tbl.getPartitions().size() == 0);
> not sure I understand this. The add_partitions API in HMS handler should sy
addPartitionsInHms() method does not use catalogHmsClient and uses normal hms client to add partitions to metastore. Therefore if cataligHmsClient is by passed, then that ddl operation should not affect the cache.


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@550
PS12, Line 550:             catalogHmsClient_.dropTable(TEST_DB_NAME, tblName, true, true);
> can you also assert here that lastSyncedEventId > createEventId
Ack


http://gerrit.cloudera.org:8080/#/c/17859/12/fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsSyncToLatestEventIdTest.java@561
PS12, Line 561: EventsProcessor prevEventProcessor =
> not sure what is the goal of this test? This is testing functionality which
Thats correct. However tests in MetastoreEventsProcessorTest currently do not test if table has been synced to latest event id. We can modify those tests. May be run all metastore events processor tests as parameterized tests with and without sync to latest event id patch ? 

Thoughts?



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 15
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Thu, 07 Oct 2021 20:03:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10926: Improve catalogd consistency and self events detection

Posted by "Sourabh Goyal (Code Review)" <ge...@cloudera.org>.
Sourabh Goyal has posted comments on this change. ( http://gerrit.cloudera.org:8080/17859 )

Change subject: IMPALA-10926: Improve catalogd consistency and self events detection
......................................................................


Patch Set 27:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@328
PS26, Line 328: toryForSyncToLatestEvent(Catalog
> remove if not needed.
Ack


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@919
PS26, Line 919: }
              :       org.apache.impala.catalog.Table tbl = null;
              :       t
> Not sure if I understand this condition correctly. Why are evaluating old s
This is not the right condition because if sync to latest event id flag is set to true and the event is *not* a self event then the code from line no: 922 shall not get executed as tbl.setLastSyncedEventId(getEventId()) would get called while processing the event. 

To make this code more readable I can modify the if condition to: 

if (isSelfEvent && BackendConfig.INSTANCE.enableSyncToLatestEventOnDdls() ) {
    tbl = catalog_.getTable(getDbName(), getTableName());

        if (tbl != null && catalog_.tryWriteLock(tbl)) {
          catalog_.getLock().writeLock().unlock();
          if (tbl.getLastSyncedEventId() < getEventId()) {
            infoLog("is a self event. last synced event id for "
                    + "table {} is {}. Setting it to {}", tbl.getFullName(),
                tbl.getLastSyncedEventId(), getEventId());
            tbl.setLastSyncedEventId(getEventId());
          }
}

Thoughts?


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@932
PS26, Line 932: 
> Why do we need to set the lastSyncedEventId here? Can we keep the scope of 
MetastoreEventFactory (and not EventFactoryForSyncToLatestEvent) skips processing an event which is self event. In that case, for a table, event factory should set the last synced event id to this self event id if enableSyncToLatestEventOnDdls() is set before skipping the processing of an event (more details in method comments)

I am not sure what is the right place to do that and thats what this overridden method isSelfEvent does.


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java:

http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@3088
PS26, Line 3088: String ms
> remove if not needed.
Ack


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@749
PS26, Line 749: if (existingTable != null) {
              :         LOG.debug("EventId: {} Table {} was not added 
> Is this something that you are working on?
I am not looking into it right now. But when I was working on it, the question in TODO comment crossed my mind. I am not sure if it is a valid scenario and for now I am thinking of adding a warning message if existing table's create event id does not match the event id passed in method argument.


http://gerrit.cloudera.org:8080/#/c/17859/26/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@929
PS26, Line 929:       }
> Is this TODO still unresolved? Please remove if it is. My understanding is 
Ack



-- 
To view, visit http://gerrit.cloudera.org:8080/17859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Gerrit-Change-Number: 17859
Gerrit-PatchSet: 27
Gerrit-Owner: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <ki...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sourabh Goyal <so...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yu-Wen Lai <yu...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Nov 2021 12:40:54 +0000
Gerrit-HasComments: Yes