You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2019/04/06 00:11:35 UTC

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Hello Bharath Vissapragada, Vihang Karajgaonkar, Anonymous Coward (486),

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/12950

to review the following change.


Change subject: WIP: Refactor file descriptor loading code
......................................................................

WIP: Refactor file descriptor loading code

This refactors various file-descriptor loading code out of HdfsTable
into new standalone classes. In order to support ACID tables, we'll need
to make various changes to these bits of code, and having them extracted
and cleaned up will make that easier.

This consolidates all of the places in which we list partition
directories into one method which does the appropriate thing regardless
of situation.

WIP: may have lost some logging. Need to do some more thorough testing.

Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
---
A fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M fe/src/test/java/org/apache/impala/planner/TestCaseLoaderTest.java
8 files changed, 428 insertions(+), 460 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/12950/1
-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@51
PS4, Line 51: private final Path partDir_;
            :   private final ImmutableMap<String, FileDescriptor> oldFdsByName_;
            :   private final ListMap<TNetworkAddress> hostIndex_;
            : 
            :   private boolean forceRefreshLocations = false;
            : 
            :   private List<FileDescriptor> loadedFds_;
            :   private LoadStats loadStats_;
> javadocs
Added javadocs on the getters/setters where missing. I think it's redundant to document both the member and the getter, and the public accessor is more useful to document since callers would likely look there for usage info rather than the implementation


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@137
PS4, Line 137:         if (listWithLocations || forceRefreshLocations || hasFileChanged(fd, fileStatus)) {
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@143
PS4, Line 143:   Preconditions.checkNotNull(fd);
             :         loadedFds_.add(fd);
> nit:merge?
Done


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@162
PS4, Line 162: f (fileStatus instanceof LocatedFileStatus) {
             :       locations = ((LocatedFileStatus)fileStatus).getBlockLocations();
             :     } else {
             :       locations = fs.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
             :     }
> nit: document this behavior in the method doc? I feel the behavior is subtl
Done


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@171
PS4, Line 171:  public List<FileDescriptor> getLoadedFds() {
             :     return loadedFds_;
             :   }
             : 
             :   public LoadStats getStats() {
             :     return loadStats_;
             :   }
             : 
             :   Path getPartDir() {
             :     return partDir_;
             :   }
> nit: single liners
changed the top two to have Preconditions checks so they don't fit on single line anymore. But made partDir be single-line


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@545
PS4, Line 545:    * @param isRefresh whether this is a refresh operation or an initial load. This only
> I think this is confusing. How about getting rid of this flag here and just
The problem with that is that the FileMetadataLoader.load() logging is only at trace-level. We don't want to log one line per path. I'll add a TODO that perhaps we could aggregate the stats better and log it at the end. It would be useful to see a log message like:

Loaded metadata for table t: loaded metadata for 10/20 partitions, 9 incrementally. Fetched 120 individual block locations... blah blah

but don't want to scope-creep this patch too much. I'll add a TODO


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@587
PS4, Line 587:     // we're in the middle of. This could cause a partial metadata update -- eg we may have
> line too long (91 > 90)
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 07:02:49 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2730/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 04:18:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: WIP: Refactor file descriptor loading code
......................................................................


Patch Set 2:

is it still a WIP or is this ready for review?


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 01:20:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Bharath Vissapragada, Vihang Karajgaonkar, Anonymous Coward (486), Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12950

to look at the new patch set (#2).

Change subject: WIP: Refactor file descriptor loading code
......................................................................

WIP: Refactor file descriptor loading code

This refactors various file-descriptor loading code out of HdfsTable
into new standalone classes. In order to support ACID tables, we'll need
to make various changes to these bits of code, and having them extracted
and cleaned up will make that easier.

This consolidates all of the places in which we list partition
directories into one method which does the appropriate thing regardless
of situation.

WIP: may have lost some logging. Need to do some more thorough testing.

Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
---
A fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M fe/src/test/java/org/apache/impala/planner/TestCaseLoaderTest.java
9 files changed, 437 insertions(+), 460 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/12950/2
-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4066/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 16:58:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4007/


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 08:45:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2909/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 09:48:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 3:

Got a few test failures on the first dryrun due to not properly handling the case of missing partitions. Should be fixed in the new rev (at least I spot-checked a few of the tests that failed the first time around).


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 07:20:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: WIP: Refactor file descriptor loading code
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@48
PS2, Line 48: Parallel
> remove?
Done


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@49
PS2, Line 49: private static final Configuration CONF = new Configuration();
> nit: Use the cached FileSystemUtil.CONF?
since this is static anyway and Configuration objects are not a memory hog, I'd rather avoid a cross-class reference where unnecessary.

(for many years now 'new Configuration()' has been pretty fast since it just keeps a pointer back to a common defaults reference rather than reloading the config files or anything like that)


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@58
PS2, Line 58: stats_
> nit: maybe call loadStats_ or something? There are too many stats in the co
Done. Renamed the type to LoadStats as well.


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@560
PS2, Line 560: Map<Path, FileMetadataLoader> loadersByPath = Maps.newHashMap();
             :     for (Map.Entry<Path, List<HdfsPartition>> e : partsByPath.entrySet()) {
             :       List<FileDescriptor> oldFds = e.getValue().get(0).getFileDescriptors();
             :       FileMetadataLoader loader = new FileMetadataLoader(e.getKey(), oldFds, hostIndex_);
> does it make sense to move the loaders init into ParallelFileMetadata c'tor
I think that would increase coupling since, in order to init the FileMetadataLoader, we need to look at some partition metadata info (L567 below), and I don't want to put HdfsPartition dependencies into the metadata-loading classes themselves. Though, maybe I misunderstood your suggestion.


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@588
PS2, Line 588:     // TODO(todd): it seems like the old implementation would actually modify
I filed a bug here: https://issues.apache.org/jira/browse/IMPALA-8406

Probably merits some discussion on that JIRA what our intended behavior is supposed to be. The current behavior (prior to this patch) is not very good. The behavior on this patch might be incrementally better, so I'm tempted to leave as is and separately fix IMPALA-8406 in one of the two ways suggested there.


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1083
PS2, Line 1083:    * objects grouped by their base directory path.
> Nit: Update javadoc
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 00:07:08 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@137
PS4, Line 137:         if (listWithLocations || forceRefreshLocations || hasFileChanged(fd, fileStatus)) {
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@587
PS4, Line 587:     // we're in the middle of. This could cause a partial metadata update -- eg we may have
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 07:20:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: WIP: Refactor file descriptor loading code
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3993/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Sat, 06 Apr 2019 03:41:47 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: WIP: Refactor file descriptor loading code
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2666/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Sat, 06 Apr 2019 00:45:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4078/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 05:56:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: WIP: Refactor file descriptor loading code
......................................................................


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3993/


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Sat, 06 Apr 2019 05:26:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2733/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 08:03:56 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Bharath Vissapragada, Yongzhi Chen, Vihang Karajgaonkar, Anonymous Coward (486), Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12950

to look at the new patch set (#4).

Change subject: Refactor file descriptor loading code
......................................................................

Refactor file descriptor loading code

This refactors various file-descriptor loading code out of HdfsTable
into new standalone classes. In order to support ACID tables, we'll need
to make various changes to these bits of code, and having them extracted
and cleaned up will make that easier.

This consolidates all of the places in which we list partition
directories into one method which does the appropriate thing regardless
of situation.

This has a small behavior change related to IMPALA-8406: previously, we
had a bug where, while refreshing a table, if one or more partitions
failed to refresh, the other partitions might still get refreshed
despite an error being returned. Those other partitions wouldn't be
available immediately until some other operation caused the table's
catalog version number to increase. This was buggy behavior.

Rather than tackle that problem in this "refactor" patch, this patch
just slightly improves the behavior: we'll either atomically update or
not update all partitions, but we might still add new partitions noticed
by the REFRESH, and might still update other HMS metadata.

This patch may end up slightly improving various other code paths that
refresh file descriptor lists. We used to have slightly different ways
of doing this in three different places, with different sets of
optimizations. Now we do it all in one place, and we pull all the known
tricks.

Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
---
A fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M fe/src/test/java/org/apache/impala/planner/TestCaseLoaderTest.java
10 files changed, 464 insertions(+), 466 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/12950/4
-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12950/5/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12950/5/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@86
PS5, Line 86: ");
> nit: worth including the partition name? (same with other logs)
would rather not add code for this since it's just a developer-facing assertion. This shouldn't ever fire in real life



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 20:50:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 6: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 05:56:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Bharath Vissapragada, Yongzhi Chen, Vihang Karajgaonkar, Sudhanshu Arora, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12950

to look at the new patch set (#5).

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................

IMPALA-8454 (part 1): Refactor file descriptor loading code

This refactors various file-descriptor loading code out of HdfsTable
into new standalone classes. In order to support ACID tables, we'll need
to make various changes to these bits of code, and having them extracted
and cleaned up will make that easier.

This consolidates all of the places in which we list partition
directories into one method which does the appropriate thing regardless
of situation.

This has a small behavior change related to IMPALA-8406: previously, we
had a bug where, while refreshing a table, if one or more partitions
failed to refresh, the other partitions might still get refreshed
despite an error being returned. Those other partitions wouldn't be
available immediately until some other operation caused the table's
catalog version number to increase. This was buggy behavior.

Rather than tackle that problem in this "refactor" patch, this patch
just slightly improves the behavior: we'll either atomically update or
not update all partitions, but we might still add new partitions noticed
by the REFRESH, and might still update other HMS metadata.

This patch may end up slightly improving various other code paths that
refresh file descriptor lists. We used to have slightly different ways
of doing this in three different places, with different sets of
optimizations. Now we do it all in one place, and we pull all the known
tricks.

Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
---
A fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M fe/src/test/java/org/apache/impala/planner/TestCaseLoaderTest.java
10 files changed, 478 insertions(+), 466 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/12950/5
-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4066/


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 22:02:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 4: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 12:29:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: WIP: Refactor file descriptor loading code
......................................................................


Patch Set 2:

(5 comments)

refactor generally makes sense to me.

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@48
PS2, Line 48: Parallel
remove?


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@49
PS2, Line 49: private static final Configuration CONF = new Configuration();
nit: Use the cached FileSystemUtil.CONF?


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@58
PS2, Line 58: stats_
nit: maybe call loadStats_ or something? There are too many stats in the code, hoping the rename clears some ambiguity.


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@560
PS2, Line 560: Map<Path, FileMetadataLoader> loadersByPath = Maps.newHashMap();
             :     for (Map.Entry<Path, List<HdfsPartition>> e : partsByPath.entrySet()) {
             :       List<FileDescriptor> oldFds = e.getValue().get(0).getFileDescriptors();
             :       FileMetadataLoader loader = new FileMetadataLoader(e.getKey(), oldFds, hostIndex_);
does it make sense to move the loaders init into ParallelFileMetadata c'tor?


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@288
PS2, Line 288: fml
lol :-)



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Apr 2019 18:23:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12950/3/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12950/3/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@134
PS3, Line 134:         if (listWithLocations || forceRefreshLocations || hasFileChanged(fd, fileStatus)) {
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/12950/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12950/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@587
PS3, Line 587:     // we're in the middle of. This could cause a partial metadata update -- eg we may have
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 03:37:12 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4007/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 03:44:00 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 6: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 10:01:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Anonymous Coward (486) has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: WIP: Refactor file descriptor loading code
......................................................................


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1083
PS2, Line 1083:    * objects grouped by their base directory path.
Nit: Update javadoc



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Tue, 09 Apr 2019 23:00:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 3:

Just uploaded a new version which should be ready for review after a few tweaks. I kicked off a test run as well since I didn't get a chance to fully test it locally


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 03:44:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 4:

(6 comments)

Looks pretty good to me, a bunch of nits and I can +2.

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@51
PS4, Line 51: private final Path partDir_;
            :   private final ImmutableMap<String, FileDescriptor> oldFdsByName_;
            :   private final ListMap<TNetworkAddress> hostIndex_;
            : 
            :   private boolean forceRefreshLocations = false;
            : 
            :   private List<FileDescriptor> loadedFds_;
            :   private LoadStats loadStats_;
javadocs


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@143
PS4, Line 143:   Preconditions.checkNotNull(fd);
             :         loadedFds_.add(fd);
nit:merge?


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@162
PS4, Line 162: f (fileStatus instanceof LocatedFileStatus) {
             :       locations = ((LocatedFileStatus)fileStatus).getBlockLocations();
             :     } else {
             :       locations = fs.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
             :     }
nit: document this behavior in the method doc? I feel the behavior is subtle since one of the branches is an RPC and the other is not.


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@171
PS4, Line 171:  public List<FileDescriptor> getLoadedFds() {
             :     return loadedFds_;
             :   }
             : 
             :   public LoadStats getStats() {
             :     return loadStats_;
             :   }
             : 
             :   Path getPartDir() {
             :     return partDir_;
             :   }
nit: single liners


http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12950/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@560
PS2, Line 560: Map<Path, FileMetadataLoader> loadersByPath = Maps.newHashMap();
             :     for (Map.Entry<Path, List<HdfsPartition>> e : partsByPath.entrySet()) {
             :       List<FileDescriptor> oldFds = e.getValue().get(0).getFileDescriptors();
             :       FileMetadataLoader loader = new FileMetadataLoader(e.getKey(), oldFds, hostIndex_);
> I think that would increase coupling since, in order to init the FileMetada
Fair point. Makes sense to me.


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@545
PS4, Line 545:    * @param isRefresh whether this is a refresh operation or an initial load. This only
I think this is confusing. How about getting rid of this flag here and just say

Loading block metadata for table foo, paths xxx

and then in the actual FileMetadataLoader::cal() log whether it is refresh/load from scratch for each path?



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Sat, 13 Apr 2019 02:36:28 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4077/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 04:30:48 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 7: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 05:56:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Bharath Vissapragada, Yongzhi Chen, Vihang Karajgaonkar, Anonymous Coward (486), Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12950

to look at the new patch set (#3).

Change subject: Refactor file descriptor loading code
......................................................................

Refactor file descriptor loading code

This refactors various file-descriptor loading code out of HdfsTable
into new standalone classes. In order to support ACID tables, we'll need
to make various changes to these bits of code, and having them extracted
and cleaned up will make that easier.

This consolidates all of the places in which we list partition
directories into one method which does the appropriate thing regardless
of situation.

This has a small behavior change related to IMPALA-8406: previously, we
had a bug where, while refreshing a table, if one or more partitions
failed to refresh, the other partitions might still get refreshed
despite an error being returned. Those other partitions wouldn't be
available immediately until some other operation caused the table's
catalog version number to increase. This was buggy behavior.

Rather than tackle that problem in this "refactor" patch, this patch
just slightly improves the behavior: we'll either atomically update or
not update all partitions, but we might still add new partitions noticed
by the REFRESH, and might still update other HMS metadata.

This patch may end up slightly improving various other code paths that
refresh file descriptor lists. We used to have slightly different ways
of doing this in three different places, with different sets of
optimizations. Now we do it all in one place, and we pull all the known
tricks.

Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
---
A fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M fe/src/test/java/org/apache/impala/planner/TestCaseLoaderTest.java
10 files changed, 462 insertions(+), 466 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/12950/3
-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: WIP: Refactor file descriptor loading code
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2668/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Sat, 06 Apr 2019 04:24:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: Refactor file descriptor loading code
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4009/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Apr 2019 07:20:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 5: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12950/5/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12950/5/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@86
PS5, Line 86: ");
nit: worth including the partition name? (same with other logs)


http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12950/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@545
PS4, Line 545:    * @param isRefresh whether this is a refresh operation or an initial load. This only
> The problem with that is that the FileMetadataLoader.load() logging is only
cool



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 20:02:12 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................


Patch Set 7: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 10:54:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP: Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: WIP: Refactor file descriptor loading code
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12950/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/12950/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@500
PS1, Line 500:   public static RemoteIterator<FileStatus> listStatus(FileSystem fs, Path p) throws IOException {
line too long (97 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Anonymous Coward (486)
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Comment-Date: Sat, 06 Apr 2019 00:12:32 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8454 (part 1): Refactor file descriptor loading code

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12950 )

Change subject: IMPALA-8454 (part 1): Refactor file descriptor loading code
......................................................................

IMPALA-8454 (part 1): Refactor file descriptor loading code

This refactors various file-descriptor loading code out of HdfsTable
into new standalone classes. In order to support ACID tables, we'll need
to make various changes to these bits of code, and having them extracted
and cleaned up will make that easier.

This consolidates all of the places in which we list partition
directories into one method which does the appropriate thing regardless
of situation.

This has a small behavior change related to IMPALA-8406: previously, we
had a bug where, while refreshing a table, if one or more partitions
failed to refresh, the other partitions might still get refreshed
despite an error being returned. Those other partitions wouldn't be
available immediately until some other operation caused the table's
catalog version number to increase. This was buggy behavior.

Rather than tackle that problem in this "refactor" patch, this patch
just slightly improves the behavior: we'll either atomically update or
not update all partitions, but we might still add new partitions noticed
by the REFRESH, and might still update other HMS metadata.

This patch may end up slightly improving various other code paths that
refresh file descriptor lists. We used to have slightly different ways
of doing this in three different places, with different sets of
optimizations. Now we do it all in one place, and we pull all the known
tricks.

Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Reviewed-on: http://gerrit.cloudera.org:8080/12950
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
A fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M fe/src/test/java/org/apache/impala/planner/TestCaseLoaderTest.java
10 files changed, 478 insertions(+), 465 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/12950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I59edf493b9ba38be5f556b4795a7684d9c9e3a07
Gerrit-Change-Number: 12950
Gerrit-PatchSet: 8
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Sudhanshu Arora <su...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>