You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org> on 2018/10/09 23:49:59 UTC

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Bharath Vissapragada has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11638


Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................

IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

The bug here was that any partial RPC on an IncompleteTable was throwing
an NPE.

Ideally, we attempt to load the table (if we find that it is not loaded)
before making the partial info request, but a concurrent invalidate could
reset the table state and move it back to an uninitialized state.

This patch handles this case better by propagating a meaningful error to
the caller.

Testing:
-------
- I was able to repro the bug by running invalidate metadata and DDLs in
  a tight loop in parallel shells. Without the patch I see the NPEs and
  with the patch I can see the exception being propagated.

- It is tricky to repro this in a test since any getPartialInfo()
  request attempts to reload the table if it is in an incomplete state.
  We need some tight loops that invalidate metadata in parallel.

Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
---
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
1 file changed, 9 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/11638/1
-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Vuk Ercegovac (Code Review)" <ge...@cloudera.org>.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11638/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11638/2//COMMIT_MSG@22
PS2, Line 22: tight loop in parallel shells
worthwhile making an e2e test for this? we have a couple of these for query re-planning.


http://gerrit.cloudera.org:8080/#/c/11638/2/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/11638/2/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@145
PS2, Line 145: cause_ == null
This could be guarded equivalently at the call-site with isLoaded(). Here is an example where isLoaded can be either true or false. Default impl is that its always loaded but its unclear if all impls need to (or will need to) behave the same way. If partitions are fetched from a table that's not loaded, seems like a precondition violation. 
Not too strong an opinion here... putting it here makes all this easier to use and less error prone but at the risk of not uniformly handling the Table api more generally.


http://gerrit.cloudera.org:8080/#/c/11638/2/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@149
PS2, Line 149: more meaningful lookup status
perhaps TABLE_NOT_LOADED?



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Oct 2018 07:24:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Hello Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11638

to look at the new patch set (#2).

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................

IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

The bug here was that any partial RPC on an IncompleteTable was throwing
an NPE.

Ideally, we attempt to load the table (if we find that it is not loaded)
before making the partial info request, but a concurrent invalidate could
reset the table state and move it back to an uninitialized state.

This patch handles this case better by propagating a meaningful error to
the caller.

Testing:
-------
- I was able to repro the bug by running invalidate metadata and DDLs in
  a tight loop in parallel shells. Without the patch I see the NPEs and
  with the patch I can see the exception being propagated.

- Added a unit test that consistently fails without the patch.

Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
---
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
2 files changed, 27 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/11638/2
-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 4: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Oct 2018 04:03:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 1:

(1 comment)

Unit test?

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@144
PS1, Line 144: if (cause_ == null) {
             :       throw new TableLoadingException("Table state on the Catalog server has been " +
             :           "invalidated by another concurrent request. Please retry the query.");
this is kind of lame from the user's perspective. Can't we just run against the new table, which will get the frontend to throw InconsistentMetadataFetchException and automatically re-plan?



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Oct 2018 20:55:37 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1070/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Oct 2018 00:39:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Hello Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11638

to look at the new patch set (#3).

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................

IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

The bug here was that any partial RPC on an IncompleteTable was throwing
an NPE.

Ideally, we attempt to load the table (if we find that it is not loaded)
before making the partial info request, but a concurrent invalidate could
reset the table state and move it back to an uninitialized state.

This patch handles this case better by propagating a meaningful error to
the caller.

Testing:
-------
- Added a test that fails consistently with an NPE without this patch.

Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
---
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
M tests/custom_cluster/test_local_catalog.py
6 files changed, 60 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/11638/3
-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@144
PS1, Line 144: if (cause_ == null) {
             :       throw new TableLoadingException("Table state on the Catalog server has been " +
             :           "invalidated by another concurrent request. Please retry the query.");
> Fair point. 
perhaps, but I'm actually still a little confused how we get to this case in the first place. Don't we call getOrLoadTable which ends up always returning either a "complete" table or a "failed" table?



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Oct 2018 21:57:00 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@144
PS1, Line 144: if (cause_ == null) {
             :       throw new TableLoadingException("Table state on the Catalog server has been " +
             :           "invalidated by another concurrent request. Please retry the query.");
> ah, I see. I think in other places we also make an assumption that getOrLoa
I was looking at the callers of this. I see a few of them actually expect that the table could still be in an invalidated by the time this returns [1,2] (multiple DDLs actually call the wrapper [2])

One interesting caller here is [3,4] (that you modified :)) is probably affected. 

[1] https://github.com/apache/impala/blob/0047191262d6a90eb704dff880efe6e625b805bc/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L452

[2] https://github.com/apache/impala/blob/0047191262d6a90eb704dff880efe6e625b805bc/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L3672

[3] https://github.com/apache/impala/blob/0047191262d6a90eb704dff880efe6e625b805bc/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1412

[4] https://github.com/apache/impala/commit/fa6db57769ddef60f6d4f8aa342fb1b0a4d4eb8b#diff-b0fabd1ea9efd6f9e7fdd6f01ecedfbd



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Fri, 12 Oct 2018 01:17:03 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3321/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Oct 2018 00:06:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@144
PS1, Line 144: // distinguish between these scenarios based on 'cause_'.
             :     if (cause_ == null) {
             :       TGetPartialCatalogObjectResponse resp = new TGetPartialCatalogObjectRespon
> I was looking at the callers of this. I see a few of them actually expect t
I've stared at the code ([3]) for a bit and I don't think it is affected by it being an IncompleteTable. Looks like we are only interested in the backing hms table for rest of the method. So, I'm leaving it as is. (Please correct me if you think I got something wrong).



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Mon, 15 Oct 2018 23:35:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 4: Code-Review+2

(1 comment)

carrying +2.

http://gerrit.cloudera.org:8080/#/c/11638/3/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/11638/3/tests/custom_cluster/test_local_catalog.py@274
PS3, Line 274: ert failed_queries.empty(),\
             :           "Failed 
> can simplify to: assert not failed_queries.empty(), "..."
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Oct 2018 00:06:48 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@144
PS1, Line 144: if (cause_ == null) {
             :       throw new TableLoadingException("Table state on the Catalog server has been " +
             :           "invalidated by another concurrent request. Please retry the query.");
> The problem seems to be here [1] (I added some additional logging to confir
ah, I see. I think in other places we also make an assumption that getOrLoadTable always returns either a failed or a complete table (never an incomplete one). Is this problem more widespread than just this API call?



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Oct 2018 22:48:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1051/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Oct 2018 00:12:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Vuk Ercegovac (Code Review)" <ge...@cloudera.org>.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11638/3/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/11638/3/tests/custom_cluster/test_local_catalog.py@274
PS3, Line 274: not failed_queries.empty():
             :         assert Fal
can simplify to: assert not failed_queries.empty(), "..."



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Oct 2018 23:58:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1003/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 10 Oct 2018 00:19:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Hello Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11638

to look at the new patch set (#4).

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................

IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

The bug here was that any partial RPC on an IncompleteTable was throwing
an NPE.

Ideally, we attempt to load the table (if we find that it is not loaded)
before making the partial info request, but a concurrent invalidate could
reset the table state and move it back to an uninitialized state.

This patch handles this case better by propagating a meaningful error to
the caller.

Testing:
-------
- Added a test that fails consistently with an NPE without this patch.

Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
---
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
M tests/custom_cluster/test_local_catalog.py
6 files changed, 60 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/11638/4
-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11638/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11638/2//COMMIT_MSG@22
PS2, Line 22: tight loop in parallel shells
> worthwhile making an e2e test for this? we have a couple of these for query
IIRC I added a few sleeps in some codepaths to repro this, since the race window was small. Lemme try to write an e-e test and see if it repros without the patch.


http://gerrit.cloudera.org:8080/#/c/11638/2/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/11638/2/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@145
PS2, Line 145: cause_ == null
> This could be guarded equivalently at the call-site with isLoaded(). Here i
you mean use !isLoaded() here instead of (cause_ == null)?


http://gerrit.cloudera.org:8080/#/c/11638/2/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@149
PS2, Line 149: more meaningful lookup status
> perhaps TABLE_NOT_LOADED?
will do.



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Oct 2018 17:44:48 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@144
PS1, Line 144: if (cause_ == null) {
             :       throw new TableLoadingException("Table state on the Catalog server has been " +
             :           "invalidated by another concurrent request. Please retry the query.");
> perhaps, but I'm actually still a little confused how we get to this case i
The problem seems to be here [1] (I added some additional logging to confirm that). If a racy invalidate changes the table state, we return the invalidated table [2].

[1] https://github.com/apache/impala/blob/0047191262d6a90eb704dff880efe6e625b805bc/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L1317

[2] https://github.com/apache/impala/blob/0047191262d6a90eb704dff880efe6e625b805bc/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L1341



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Oct 2018 22:32:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1069/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Oct 2018 23:10:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................

IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

The bug here was that any partial RPC on an IncompleteTable was throwing
an NPE.

Ideally, we attempt to load the table (if we find that it is not loaded)
before making the partial info request, but a concurrent invalidate could
reset the table state and move it back to an uninitialized state.

This patch handles this case better by propagating a meaningful error to
the caller.

Testing:
-------
- Added a test that fails consistently with an NPE without this patch.

Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Reviewed-on: http://gerrit.cloudera.org:8080/11638
Reviewed-by: Bharath Vissapragada <bh...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
M tests/custom_cluster/test_local_catalog.py
6 files changed, 60 insertions(+), 3 deletions(-)

Approvals:
  Bharath Vissapragada: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>

[Impala-ASF-CR] IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11638 )

Change subject: IMPALA-7669: Gracefully handle concurrent invalidate/partial fetch RPCs
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java:

http://gerrit.cloudera.org:8080/#/c/11638/1/fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java@144
PS1, Line 144: if (cause_ == null) {
             :       throw new TableLoadingException("Table state on the Catalog server has been " +
             :           "invalidated by another concurrent request. Please retry the query.");
> this is kind of lame from the user's perspective. Can't we just run against
Fair point. 

Rather than triggering another reload from here, does it make sense to send a custom lookup status (INVALIDATED_TABLE_FOUND or something like that) and wait for the frontend to trigger a re-plan?



-- 
To view, visit http://gerrit.cloudera.org:8080/11638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8533f73f25ca42a20f146ddfd95d4213add9b705
Gerrit-Change-Number: 11638
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Oct 2018 21:42:50 +0000
Gerrit-HasComments: Yes