You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Quanlong Huang (Code Review)" <ge...@cloudera.org> on 2019/12/16 07:01:01 UTC

[Impala-ASF-CR] IMPALA-9101: Add support for detecting self-events on partition events

Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )

Change subject: IMPALA-9101: Add support for detecting self-events on partition events
......................................................................


Patch Set 2:

(33 comments)

Did a first round review. Most of the comments are nits :)

Still need some time to digest this patch deeper. Thanks!

http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG@9
PS2, Line 9: This commit redoes some of the self-event detection logic, specifically for the partition
Please wrap the lines to fix the 72 character limit (to be consistent with other commits). There's an interesting blog about explaining the 72 character limit: https://preslav.me/2015/02/21/what-s-with-the-50-72-rule/


http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG@42
PS2, Line 42: [TODO]
Is it done?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@52
PS2, Line 52: import org.apache.impala.catalog.events.SelfEventContext;
nit: keep the import list sorted


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@397
PS2, Line 397:         .equals(((MetastoreEventsProcessor) metastoreEventProcessor_).getStatus());
Looks like we check ACTIVE in many places. It may worth to add a function isActive() in MetastoreEventsProcessor.


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@817
PS2, Line 817: a
nit: an


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@831
PS2, Line 831: versionNumber == -1
Is this safe enough or should be something like "versionNumber < 0" ?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@868
PS2, Line 868: a
nit: an


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@872
PS2, Line 872:               LOG.warn(String.format("Partition %s not found during self-event "
I think we need to log the table full name too since partName is just something like "year=2019/month=1"


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2617
PS2, Line 2617:           } else {
              :             LOG.info(String.format("Partition metadata for %s was not refreshed since "
              :                     + "it does not exist in metastore anymore",
              :                 hdfsTable.getFullName() + " " + partitionName));
In this case, I think we can return hdfsTable.toMinimalTCatalogObject() since the whole Thrift object won't be used. I left my reason here: https://gerrit.cloudera.org/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#3983


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@623
PS2, Line 623: 20
Just be interested, is there a reason we choose 20?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java@44
PS2, Line 44:   private static final Logger LOG = LoggerFactory.getLogger(InFlightEvents.class);
nit: move it above to be together with "private static final int DEFAULT_MAX_NUMBER_OF_INFLIGHT_EVENTS" ?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@567
PS2, Line 567: incomplete
change to unloaded?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@577
PS2, Line 577: T
nit: T => t


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1379
PS2, Line 1379:           //TODO refresh all the partition together instead of looping one by one
Could you create a JIRA for this follow-up?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1532
PS2, Line 1532: ,
nit: need space


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java
File fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java@48
PS2, Line 48: parameters
Could you add a comment about this variable? Looks like it could be tblproperties of a table or something similar for a db.


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@37
PS2, Line 37: import javax.annotation.Nullable;
nit: move it one line above to be together with the "java" packages?


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3525
PS2, Line 3525: a
nit: an


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3982
PS2, Line 3982:                 // TODO is the partition was not really refresed because the partSpec
typos: is => if, refresed => refreshed


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3983
PS2, Line 3983:                 // was wrong, do we still need to send back the table?
I think we won't send back the table to coordinators in this case since the catalog version is unchanged (the table is not modified). Then it won't be collected in topic update since its version is not in range.

However, we can do a minor optimization in reloadPartition() that if the partSpec references a partition that is non-exists in both HMS and catalog, we can return hdfsTable.toMinimalTCatalogObject() instead of the whole hdfsTable.toTCatalogObject().


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4182
PS2, Line 4182: 
nit: empty line


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@858
PS2, Line 858: numberOfInvalidatesBefore
need rename to numberOfRefreshesBefore


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@927
PS2, Line 927: an
nit: a


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2204
PS2, Line 2204: Assert.
nit: don't need the class name since this static function is imported.


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2226
PS2, Line 2226: Assert.
nit: don't need class name


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2230
PS2, Line 2230: Assert.
nit: don't need class name


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2233
PS2, Line 2233: Assert.
nit: don't need class name


http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2243
PS2, Line 2243: 
unfinished?


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py
File tests/custom_cluster/test_event_processing.py:

http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@17
PS2, Line 17: import random
            : import string
nit: be together with the other imports and leave one empty line after the file comments?


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@94
PS2, Line 94:  
nit: should be two spaces indent inside this block.


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@215
PS2, Line 215:     time.sleep(build_flavor_timeout(2, slow_build_timeout=4))
Can do the same change as EventProcessorUtils.wait_for_event_processing() that waits for 'catalog.curr-version' instead.


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@302
PS2, Line 302:             db_name, tbl_name)
Also add coverage for "alter table xxx recover partitions" statement here?


http://gerrit.cloudera.org:8080/#/c/14799/2/tests/util/event_processor_utils.py
File tests/util/event_processor_utils.py:

http://gerrit.cloudera.org:8080/#/c/14799/2/tests/util/event_processor_utils.py@21
PS2, Line 21: import logging
nit: be together with the other imports and leave one empty line after the file comments?



-- 
To view, visit http://gerrit.cloudera.org:8080/14799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Gerrit-Change-Number: 14799
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <an...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 16 Dec 2019 07:01:01 +0000
Gerrit-HasComments: Yes