You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org> on 2019/11/26 20:08:22 UTC

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14799


Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................

IMPALA-9101: Add support for detecting self-events on partition events

This commit redoes some of the self-event detection logic, specifically for the partition
events. Before the patch, the self-event identifiers for a partition were stored at a
table level when generating the partition events. This was problematic since unlike
ADD_PARTITION and DROP_PARTITION event, ALTER_PARTITION event is generated one per
partition. Due to this if there are multiple ALTER_PARTITION events generated, only the
first event is identified as a self-event and the reset of the events are processed. This
patch fixes this by adding the self-event identifiers to each partition so that when the
event is later received, each ALTER_PARTITION uses the state stored in HdfsPartition to
evaluate the self-events. The patch makes sure that the event processor takes a table lock
during self-event evaluation to avoid races with other parts of the code which try to
modify the table at the same time.

Additionally, this patch also changes the event processor to refresh a loaded table
(incomplete tables are not refreshed) when a ALTER_TABLE event is received instead of
invalidating the table. This makes the events processor consistent with respect to all the
other event types. In future, we should add a flag to choose the behavior preference
(prefer invalidate or refresh).

Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events when user modifies the
comment on the database.
2. In case of queries like "alter table add if not exists partition...", the partition is
not added since its pre-existing. The self-event identifiers should not be added in such
cases since no event is expected from such queries.
3. Changed wait_for_event_processing test util method in EventProcessorUtils to use a more
deterministic way to determine if the catalog updates have propogated to impalad instead
of waiting for a random duration of time. This also speeds up the event processing tests
significantly.

Testing Done:
1. Added a e2e self-events test which runs multiple impala queries and makes sure that the
event is skips processing.
2. Ran MetastoreEventsProcessorTest
3. [TODO] Run core tests on CDH and CDP builds.

Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
M tests/util/event_processor_utils.py
13 files changed, 983 insertions(+), 593 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/14799/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 5:

(12 comments)

It takes some time to review since this patch is larger... Still looking at the DDL code paths.

http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG@9
PS2, Line 9: This commit redoes some of the self-event detection logic, specifically for the partition
> Please wrap the lines to fix the 72 character limit (to be consistent with 
Could you address on this?


http://gerrit.cloudera.org:8080/#/c/14799/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14799/5//COMMIT_MSG@14
PS5, Line 14: reset
rest?


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2251
PS5, Line 2251: Returns true if partition is found and is refreshed,
              :    * false otherwise.
stale comment


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2262
PS5, Line 2262:     if (table == null || table instanceof IncompleteTable) return new Pair<>(false,
              :         false);
nit: refactor into multiple lines if-clause


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@863
PS5, Line 863: did'nt
typo: didn't


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@875
PS5, Line 875: .
Also log the table name here?


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@595
PS5, Line 595:         Pair<Boolean, Boolean> result = catalog_.reloadPartitionIfExists(dbName_,
Codes not quite clear due to the use of Pair. I think we can throw an exception (e.g. TableNotLoadedException) from reloadPartitionIfExists() if the first boolean is false. And deal with the exception in the catch-clause.


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@906
PS5, Line 906: Else, this just issues a invalidate
             :      * table on the tblName from the event
this comment needs updates too


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3527
PS5, Line 3527: if the HDFSPartition is provided
Now the callers all provide it. Add skip logic for null partition at the end?


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3564
PS5, Line 3564: { return; }
nit: don't need braces


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@891
PS5, Line 891: invalidate
stale comment: refresh. Same in the following comments.


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@892
PS5, Line 892:     loadTable(testTblName);
Is there test coverage on alter events on unloaded tables?



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Jan 2020 09:38:57 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5366/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 3
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Jan 2020 19:28:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5148/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 26 Nov 2019 20:51:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 2:

(33 comments)

Did a first round review. Most of the comments are nits :)

Still need some time to digest this patch deeper. Thanks!

http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG@9
PS2, Line 9: This commit redoes some of the self-event detection logic, specifically for the partition
Please wrap the lines to fix the 72 character limit (to be consistent with other commits). There's an interesting blog about explaining the 72 character limit: https://preslav.me/2015/02/21/what-s-with-the-50-72-rule/


http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG@42
PS2, Line 42: [TODO]
Is it done?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@52
PS2, Line 52: import org.apache.impala.catalog.events.SelfEventContext;
nit: keep the import list sorted


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@397
PS2, Line 397:         .equals(((MetastoreEventsProcessor) metastoreEventProcessor_).getStatus());
Looks like we check ACTIVE in many places. It may worth to add a function isActive() in MetastoreEventsProcessor.


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@817
PS2, Line 817: a
nit: an


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@831
PS2, Line 831: versionNumber == -1
Is this safe enough or should be something like "versionNumber < 0" ?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@868
PS2, Line 868: a
nit: an


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@872
PS2, Line 872:               LOG.warn(String.format("Partition %s not found during self-event "
I think we need to log the table full name too since partName is just something like "year=2019/month=1"


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2617
PS2, Line 2617:           } else {
              :             LOG.info(String.format("Partition metadata for %s was not refreshed since "
              :                     + "it does not exist in metastore anymore",
              :                 hdfsTable.getFullName() + " " + partitionName));
In this case, I think we can return hdfsTable.toMinimalTCatalogObject() since the whole Thrift object won't be used. I left my reason here: https://gerrit.cloudera.org/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#3983


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@623
PS2, Line 623: 20
Just be interested, is there a reason we choose 20?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java@44
PS2, Line 44:   private static final Logger LOG = LoggerFactory.getLogger(InFlightEvents.class);
nit: move it above to be together with "private static final int DEFAULT_MAX_NUMBER_OF_INFLIGHT_EVENTS" ?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@567
PS2, Line 567: incomplete
change to unloaded?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@577
PS2, Line 577: T
nit: T => t


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1379
PS2, Line 1379:           //TODO refresh all the partition together instead of looping one by one
Could you create a JIRA for this follow-up?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1532
PS2, Line 1532: ,
nit: need space


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
File fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java@48
PS2, Line 48: parameters
Could you add a comment about this variable? Looks like it could be tblproperties of a table or something similar for a db.


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@37
PS2, Line 37: import javax.annotation.Nullable;
nit: move it one line above to be together with the "java" packages?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3525
PS2, Line 3525: a
nit: an


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3982
PS2, Line 3982:                 // TODO is the partition was not really refresed because the partSpec
typos: is => if, refresed => refreshed


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3983
PS2, Line 3983:                 // was wrong, do we still need to send back the table?
I think we won't send back the table to coordinators in this case since the catalog version is unchanged (the table is not modified). Then it won't be collected in topic update since its version is not in range.

However, we can do a minor optimization in reloadPartition() that if the partSpec references a partition that is non-exists in both HMS and catalog, we can return hdfsTable.toMinimalTCatalogObject() instead of the whole hdfsTable.toTCatalogObject().


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4182
PS2, Line 4182: 
nit: empty line


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@858
PS2, Line 858: numberOfInvalidatesBefore
need rename to numberOfRefreshesBefore


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@927
PS2, Line 927: an
nit: a


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2204
PS2, Line 2204: Assert.
nit: don't need the class name since this static function is imported.


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2226
PS2, Line 2226: Assert.
nit: don't need class name


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2230
PS2, Line 2230: Assert.
nit: don't need class name


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2233
PS2, Line 2233: Assert.
nit: don't need class name


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2243
PS2, Line 2243: 
unfinished?


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@17
PS2, Line 17: import random
            : import string
nit: be together with the other imports and leave one empty line after the file comments?


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@94
PS2, Line 94:  
nit: should be two spaces indent inside this block.


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@215
PS2, Line 215:     time.sleep(build_flavor_timeout(2, slow_build_timeout=4))
Can do the same change as EventProcessorUtils.wait_for_event_processing() that waits for 'catalog.curr-version' instead.


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@302
PS2, Line 302:             db_name, tbl_name)
Also add coverage for "alter table xxx recover partitions" statement here?


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/util/event_processor_utils.py
File tests/util/event_processor_utils.py:

http://gerrit.cloudera.org:8080/#/c/14799/2/tests/util/event_processor_utils.py@21
PS2, Line 21: import logging
nit: be together with the other imports and leave one empty line after the file comments?



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 16 Dec 2019 07:01:01 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5399/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 04:29:00 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 7: Code-Review+2

(3 comments)

Thanks for addressing the comments! Carry on Anurag's +1.

http://gerrit.cloudera.org:8080/#/c/14799/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/14799/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2265
PS7, Line 2265: table == null || 
nit: don't need it. already checked above


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4307
PS5, Line 4307:           commitTransaction(update.getTransaction_id());
> Unfortunately, adding the catalog version for inserted table and partitions
Thanks for the explanation!


http://gerrit.cloudera.org:8080/#/c/14799/7/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/7/tests/custom_cluster/test_event_processing.py@245
PS7, Line 245: [
> flake8: E131 continuation line unaligned for hanging indent
nit: Not sure if moving the "[" above to be right after "True: " cound fix this.
https://www.flake8rules.com/rules/E131.html



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 06:53:52 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14799/7/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/7/tests/custom_cluster/test_event_processing.py@245
PS7, Line 245: [
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/7/tests/util/event_processor_utils.py
File tests/util/event_processor_utils.py:

http://gerrit.cloudera.org:8080/#/c/14799/7/tests/util/event_processor_utils.py@42
PS7, Line 42: )
flake8: E501 line too long (91 > 90 characters)



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 03:59:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5368/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Jan 2020 23:21:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................

IMPALA-9101: Add support for detecting self-events on partition events

This commit redoes some of the self-event detection logic, specifically
for the partition events. Before the patch, the self-event identifiers
for a partition were stored at a table level when generating the
partition events. This was problematic since unlike ADD_PARTITION and
DROP_PARTITION event, ALTER_PARTITION event is generated one per
partition. Due to this if there are multiple ALTER_PARTITION events
generated, only the first event is identified as a self-event and the
rest of the events are processed. This patch fixes this by adding the
self-event identifiers to each partition so that when the event is later
received, each ALTER_PARTITION uses the state stored in HdfsPartition to
valuate the self-events. The patch makes sure that the event processor
takes a table lock during self-event evaluation to avoid races with
other parts of the code which try to modify the table at the same time.

Additionally, this patch also changes the event processor to refresh a
loaded table (incomplete tables are not refreshed) when a ALTER_TABLE
event is received instead of invalidating the table. This makes the
events processor consistent with respect to all the other event types.
In future, we should add a flag to choose the behavior preference
(prefer invalidate or refresh).

Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events when
user modifies the comment on the database.
2. In case of queries like "alter table add if not exists partition...",
the partition is not added since its pre-existing. The self-event
identifiers should not be added in such cases since no event is expected
from such queries.
3. Changed wait_for_event_processing test util method in
EventProcessorUtils to use a more deterministic way to determine if the
catalog updates have propogated to impalad instead of waiting for a
random duration of time.  This also speeds up the event processing tests
significantly.

Testing Done:
1. Added a e2e self-events test which runs multiple impala
queries and makes sure that the event is skips processing.
2. Ran MetastoreEventsProcessorTest
3. Ran core tests on CDH and CDP builds.

Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TableNotLoadedException.java
A fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
M tests/util/event_processor_utils.py
14 files changed, 1,091 insertions(+), 666 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/14799/7
-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@892
PS2, Line 892:         .parseLong(hmsParameters_.get(MetastoreEventPropertyKey.CATALOG_VERSION.getKey())));
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@765
PS2, Line 765:         // now that HMS alter operation has succeeded, add this version to list of inflight
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3852
PS2, Line 3852:               addToInflightVersionsOfPartition(msPartition.getParameters(), hdfsPartition);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2666
PS2, Line 2666:   private void alterTableComputeStats(String tblName, List<List<String>> partValsList) throws ImpalaException {
line too long (111 > 90)


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@251
PS2, Line 251: [
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@290
PS2, Line 290: .
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@323
PS2, Line 323: d
flake8: E501 line too long (99 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@338
PS2, Line 338: .
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@408
PS2, Line 408: 
flake8: W292 no newline at end of file


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/util/event_processor_utils.py
File tests/util/event_processor_utils.py:

http://gerrit.cloudera.org:8080/#/c/14799/2/tests/util/event_processor_utils.py@43
PS2, Line 43: )
flake8: E501 line too long (91 > 90 characters)



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 26 Nov 2019 20:09:12 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 4:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/5367/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Jan 2020 22:35:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 8: Code-Review+2

Carrying code-review votes from Quanlong and Anurag.


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:10:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/14799/3/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/3/tests/custom_cluster/test_event_processing.py@101
PS3, Line 101: e
flake8: E501 line too long (91 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/14799/3/tests/custom_cluster/test_event_processing.py@232
PS3, Line 232: [
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/3/tests/custom_cluster/test_event_processing.py@271
PS3, Line 271: .
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/3/tests/custom_cluster/test_event_processing.py@304
PS3, Line 304: d
flake8: E501 line too long (99 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/14799/3/tests/custom_cluster/test_event_processing.py@319
PS3, Line 319: .
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/3/tests/custom_cluster/test_event_processing.py@389
PS3, Line 389: 
flake8: W292 no newline at end of file


http://gerrit.cloudera.org:8080/#/c/14799/3/tests/util/event_processor_utils.py
File tests/util/event_processor_utils.py:

http://gerrit.cloudera.org:8080/#/c/14799/3/tests/util/event_processor_utils.py@42
PS3, Line 42: )
flake8: E501 line too long (91 > 90 characters)



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 3
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Jan 2020 18:58:22 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................

IMPALA-9101: Add support for detecting self-events on partition events

This commit redoes some of the self-event detection logic, specifically for the partition
events. Before the patch, the self-event identifiers for a partition were stored at a
table level when generating the partition events. This was problematic since unlike
ADD_PARTITION and DROP_PARTITION event, ALTER_PARTITION event is generated one per
partition. Due to this if there are multiple ALTER_PARTITION events generated, only the
first event is identified as a self-event and the reset of the events are processed. This
patch fixes this by adding the self-event identifiers to each partition so that when the
event is later received, each ALTER_PARTITION uses the state stored in HdfsPartition to
evaluate the self-events. The patch makes sure that the event processor takes a table lock
during self-event evaluation to avoid races with other parts of the code which try to
modify the table at the same time.

Additionally, this patch also changes the event processor to refresh a loaded table
(incomplete tables are not refreshed) when a ALTER_TABLE event is received instead of
invalidating the table. This makes the events processor consistent with respect to all the
other event types. In future, we should add a flag to choose the behavior preference
(prefer invalidate or refresh).

Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events when user modifies the
comment on the database.
2. In case of queries like "alter table add if not exists partition...", the partition is
not added since its pre-existing. The self-event identifiers should not be added in such
cases since no event is expected from such queries.
3. Changed wait_for_event_processing test util method in EventProcessorUtils to use a more
deterministic way to determine if the catalog updates have propogated to impalad instead
of waiting for a random duration of time. This also speeds up the event processing tests
significantly.

Testing Done:
1. Added a e2e self-events test which runs multiple impala queries and makes sure that the
event is skips processing.
2. Ran MetastoreEventsProcessorTest
3. [TODO] Run core tests on CDH and CDP builds.

Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
M tests/util/event_processor_utils.py
13 files changed, 1,032 insertions(+), 643 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/14799/3
-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 3
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14799/5/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/5/tests/custom_cluster/test_event_processing.py@245
PS5, Line 245: [
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/5/tests/util/event_processor_utils.py
File tests/util/event_processor_utils.py:

http://gerrit.cloudera.org:8080/#/c/14799/5/tests/util/event_processor_utils.py@42
PS5, Line 42: )
flake8: E501 line too long (91 > 90 characters)



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Jan 2020 22:51:40 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14799/8/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/8/tests/custom_cluster/test_event_processing.py@283
PS8, Line 283: "
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/8/tests/util/event_processor_utils.py
File tests/util/event_processor_utils.py:

http://gerrit.cloudera.org:8080/#/c/14799/8/tests/util/event_processor_utils.py@42
PS8, Line 42: )
flake8: E501 line too long (91 > 90 characters)



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:10:48 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Anurag Mantripragada (Code Review)" <ge...@cloudera.org>.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 2:

(3 comments)

Thanks for making this change. Overall, the patch looks good to me. I just posted some questions to understand a few things better.

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1383
PS2, Line 1383: reloadPartition(tPartSpec, "ADD_PARTITION");
Just for my understanding, here and in DropPartitionEvents, are we changing the behavior from bailing out of refreshing the rest of the partitions when we encounter a failed refresh to a best-effort (refresh as many partitions in the event as possible) refresh?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@210
PS2, Line 210: NUMBER_OF_PARTITION_REFRESHES
Should this metric be initialized similarly to table refreshes in line 316?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@631
PS2, Line 631: catalog_.addVersionsForInflightEvents(tbl, newCatalogVersion);
Is there a reason why we are not doing this for DROP_PARTITION case below on L650?



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 17 Dec 2019 18:44:11 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/14799/4/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/4/tests/custom_cluster/test_event_processing.py@103
PS4, Line 103: e
flake8: E501 line too long (91 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/14799/4/tests/custom_cluster/test_event_processing.py@245
PS4, Line 245: [
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/4/tests/custom_cluster/test_event_processing.py@282
PS4, Line 282: [
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/4/tests/custom_cluster/test_event_processing.py@286
PS4, Line 286: .
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/4/tests/custom_cluster/test_event_processing.py@319
PS4, Line 319: d
flake8: E501 line too long (99 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/14799/4/tests/custom_cluster/test_event_processing.py@334
PS4, Line 334: .
flake8: E131 continuation line unaligned for hanging indent


http://gerrit.cloudera.org:8080/#/c/14799/4/tests/custom_cluster/test_event_processing.py@405
PS4, Line 405: 
flake8: W292 no newline at end of file


http://gerrit.cloudera.org:8080/#/c/14799/4/tests/util/event_processor_utils.py
File tests/util/event_processor_utils.py:

http://gerrit.cloudera.org:8080/#/c/14799/4/tests/util/event_processor_utils.py@42
PS4, Line 42: )
flake8: E501 line too long (91 > 90 characters)



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Jan 2020 22:06:21 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 5:

(3 comments)

Looks good to me in general. Will give +1 after the comments are addressed. Thanks!

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2641
PS5, Line 2641:         wasPartitionReloaded.setRef(true);
It'd be better to move this line down after line 2643, or move line 2643 before this line. So the logic is more clear that whenever we bump table's catalog version, we set wasPartitionReloaded to true. (Same as line 2626-2629).


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@663
PS5, Line 663:             refreshedTable.setCatalogVersion(newCatalogVersion);
Could you comment about why we don't need addVersionsForInflightEvents here?


http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4307
PS5, Line 4307:       loadTableMetadata(table, newCatalogVersion, true, false, partsToLoadMetadata,
Should we add newCatalogVersion to the inflight versions of table and partitions? Could you add a test case for "insert into table xxx as select ..." to cover changes here?



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Thu, 09 Jan 2020 07:14:43 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................

IMPALA-9101: Add support for detecting self-events on partition events

This commit redoes some of the self-event detection logic, specifically
for the partition events. Before the patch, the self-event identifiers
for a partition were stored at a table level when generating the
partition events. This was problematic since unlike ADD_PARTITION and
DROP_PARTITION event, ALTER_PARTITION event is generated one per
partition. Due to this if there are multiple ALTER_PARTITION events
generated, only the first event is identified as a self-event and the
rest of the events are processed. This patch fixes this by adding the
self-event identifiers to each partition so that when the event is later
received, each ALTER_PARTITION uses the state stored in HdfsPartition to
valuate the self-events. The patch makes sure that the event processor
takes a table lock during self-event evaluation to avoid races with
other parts of the code which try to modify the table at the same time.

Additionally, this patch also changes the event processor to refresh a
loaded table (incomplete tables are not refreshed) when a ALTER_TABLE
event is received instead of invalidating the table. This makes the
events processor consistent with respect to all the other event types.
In future, we should add a flag to choose the behavior preference
(prefer invalidate or refresh).

Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events when
user modifies the comment on the database.
2. In case of queries like "alter table add if not exists partition...",
the partition is not added since its pre-existing. The self-event
identifiers should not be added in such cases since no event is expected
from such queries.
3. Changed wait_for_event_processing test util method in
EventProcessorUtils to use a more deterministic way to determine if the
catalog updates have propogated to impalad instead of waiting for a
random duration of time.  This also speeds up the event processing tests
significantly.

Testing Done:
1. Added a e2e self-events test which runs multiple impala
queries and makes sure that the event is skips processing.
2. Ran MetastoreEventsProcessorTest
3. Ran core tests on CDH and CDP builds.

Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TableNotLoadedException.java
A fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
M tests/util/event_processor_utils.py
14 files changed, 1,091 insertions(+), 666 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/14799/8
-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 9: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 22:45:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 9: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:11:07 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Anurag Mantripragada (Code Review)" <ge...@cloudera.org>.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 5: Code-Review+1

(1 comment)

Did another round. This patch looks good to me.

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/14799/5/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@924
PS5, Line 924: full
             :       // invalidate
Should this be refresh?



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Thu, 09 Jan 2020 04:12:24 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5405/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:40:02 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14799/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/14799/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2265
PS7, Line 2265: table == null || 
> nit: don't need it. already checked above
Thanks for catching that. Fixed.


http://gerrit.cloudera.org:8080/#/c/14799/7/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/7/tests/custom_cluster/test_event_processing.py@245
PS7, Line 245: [
> nit: Not sure if moving the "[" above to be right after "True: " cound fix 
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:09:43 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................

IMPALA-9101: Add support for detecting self-events on partition events

This commit redoes some of the self-event detection logic, specifically
for the partition events. Before the patch, the self-event identifiers
for a partition were stored at a table level when generating the
partition events. This was problematic since unlike ADD_PARTITION and
DROP_PARTITION event, ALTER_PARTITION event is generated one per
partition. Due to this if there are multiple ALTER_PARTITION events
generated, only the first event is identified as a self-event and the
rest of the events are processed. This patch fixes this by adding the
self-event identifiers to each partition so that when the event is later
received, each ALTER_PARTITION uses the state stored in HdfsPartition to
valuate the self-events. The patch makes sure that the event processor
takes a table lock during self-event evaluation to avoid races with
other parts of the code which try to modify the table at the same time.

Additionally, this patch also changes the event processor to refresh a
loaded table (incomplete tables are not refreshed) when a ALTER_TABLE
event is received instead of invalidating the table. This makes the
events processor consistent with respect to all the other event types.
In future, we should add a flag to choose the behavior preference
(prefer invalidate or refresh).

Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events when
user modifies the comment on the database.
2. In case of queries like "alter table add if not exists partition...",
the partition is not added since its pre-existing. The self-event
identifiers should not be added in such cases since no event is expected
from such queries.
3. Changed wait_for_event_processing test util method in
EventProcessorUtils to use a more deterministic way to determine if the
catalog updates have propogated to impalad instead of waiting for a
random duration of time.  This also speeds up the event processing tests
significantly.

Testing Done:
1. Added a e2e self-events test which runs multiple impala
queries and makes sure that the event is skips processing.
2. Ran MetastoreEventsProcessorTest
3. Ran core tests on CDH and CDP builds.

Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Reviewed-on: http://gerrit.cloudera.org:8080/14799
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TableNotLoadedException.java
A fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
M tests/util/event_processor_utils.py
14 files changed, 1,091 insertions(+), 666 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 10
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5398/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Fri, 10 Jan 2020 18:11:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@302
PS2, Line 302:             db_name, tbl_name)
> Thanks for pointing this out. Will add it the coverage for that statement.
Done in patchset#4



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Jan 2020 22:06:27 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................

IMPALA-9101: Add support for detecting self-events on partition events

This commit redoes some of the self-event detection logic, specifically for the partition
events. Before the patch, the self-event identifiers for a partition were stored at a
table level when generating the partition events. This was problematic since unlike
ADD_PARTITION and DROP_PARTITION event, ALTER_PARTITION event is generated one per
partition. Due to this if there are multiple ALTER_PARTITION events generated, only the
first event is identified as a self-event and the reset of the events are processed. This
patch fixes this by adding the self-event identifiers to each partition so that when the
event is later received, each ALTER_PARTITION uses the state stored in HdfsPartition to
valuate the self-events. The patch makes sure that the event processor takes a table lock
during self-event evaluation to avoid races with other parts of the code which try to
modify the table at the same time.

Additionally, this patch also changes the event processor to refresh a loaded table
(incomplete tables are not refreshed) when a ALTER_TABLE event is received instead of
invalidating the table. This makes the events processor consistent with respect to all
the other event types. In future, we should add a flag to choose the behavior preference
(prefer invalidate or refresh).

Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events when user modifies the
comment on the database.
2. In case of queries like "alter table add if not exists partition...", the partition is
not added since its pre-existing. The self-event identifiers should not be added in such
cases since no event is expected from such queries.
3. Changed wait_for_event_processing test util method in
EventProcessorUtils to use a more deterministic way to determine if the catalog updates
have propogated to impalad instead of waiting for a random duration of time. This also
speeds up the event processing tests significantly.

Testing Done:
1. Added a e2e self-events test which runs multiple impala queries and makes sure that
the event is skips processing.
2. Ran MetastoreEventsProcessorTest
3. [TODO] Run core tests on CDH and CDP builds.

Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
M tests/util/event_processor_utils.py
13 files changed, 1,042 insertions(+), 641 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/14799/5
-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Posted by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org>.
Vihang Karajgaonkar has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................

IMPALA-9101: Add support for detecting self-events on partition events

This commit redoes some of the self-event detection logic, specifically for the partition
events. Before the patch, the self-event identifiers for a partition were stored at a
table level when generating the partition events. This was problematic since unlike
ADD_PARTITION and DROP_PARTITION event, ALTER_PARTITION event is generated one per
partition. Due to this if there are multiple ALTER_PARTITION events generated, only the
first event is identified as a self-event and the reset of the events are processed. This
patch fixes this by adding the self-event identifiers to each partition so that when the
event is later received, each ALTER_PARTITION uses the state stored in HdfsPartition to
valuate the self-events. The patch makes sure that the event processor takes a table lock
during self-event evaluation to avoid races with other parts of the code which try to
modify the table at the same time.

Additionally, this patch also changes the event processor to refresh a loaded table
(incomplete tables are not refreshed) when a ALTER_TABLE event is received instead of
invalidating the table. This makes the events processor consistent with respect to all
the other event types. In future, we should add a flag to choose the behavior preference
(prefer invalidate or refresh).

Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events when user modifies the
comment on the database.
2. In case of queries like "alter table add if not exists partition...", the partition is
not added since its pre-existing. The self-event identifiers should not be added in such
cases since no event is expected from such queries.
3. Changed wait_for_event_processing test util method in
EventProcessorUtils to use a more deterministic way to determine if the catalog updates
have propogated to impalad instead of waiting for a random duration of time. This also
speeds up the event processing tests significantly.

Testing Done:
1. Added a e2e self-events test which runs multiple impala queries and makes sure that
the event is skips processing.
2. Ran MetastoreEventsProcessorTest
3. [TODO] Run core tests on CDH and CDP builds.

Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M tests/custom_cluster/test_event_processing.py
M tests/util/event_processor_utils.py
13 files changed, 1,045 insertions(+), 641 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/14799/4
-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>