You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Henry Robinson (Code Review)" <ge...@cloudera.org> on 2017/06/01 21:00:49 UTC

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7045

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................

IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

If a query was waiting for metadata to be loaded at the same time as the
statestore restarted, it would not correctly see the metadata even after
the statestore had recovered and sent it.

This is because the AnalysisContext used for analysis is created with a
reference to a ImpaladCatalog once, before the analysis->metadata-load
loop started. If the catalog sent a *complete* update (which it would do
after a statestore restart), the ImpaladCatalog would be replaced, not
updated, and so the tables would exist in the new object, but not the
old one that the AnalysisContext had a reference to.

The fix is to recreate the AnalysisContext every time through the loop,
taking a recent reference to the ImpaladCatalog every time. That way, if
the ImpaladCatalog is updated, the next time through the loop the
analysis phase will see those changes.

Also fix a potential bug where access to ImpaladCatalog is not
synchronized between update and analysis threads, by using an
AtomicReference<ImpaladCatalog>.

Testing: manual - queries completed after statestore restarts with this
patch, where previously they consumed 100% CPU spinning forever.

Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
1 file changed, 26 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/7045/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Dimitris Tsirogiannis (Code Review)" <ge...@cloudera.org>.
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 4: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7045/4/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

PS4, Line 553: ImpaladCatalog catalog = impaladCatalog_.get();
nit: move this above L549 and use 'catalog' in both if and else blocks.


PS4, Line 927:       Preconditions.checkNotNull(analysisCtx);
Don't think this is needed anymore.


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7045/4/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 237:         impaladCatalog_.get().getAuthPolicy()));
use catalog.getAuthPolicy()


Line 610:     List<String> tblNames = impaladCatalog_.get().getTableNames(dbName, matcher);
I'm a little worried about the repercussions of operations seeing different versions of the catalog.

For example, take a look at MetadataOp.getDbsMetadata() which accesses the catalog several times, and might see different versions.

Some more comments along similar lines below.

Maybe the new behavior is better than before, but if we do hit one of these edge cases, it might be very hard to debug.

Thoughts?


Line 890:     AnalysisContext analysisCtx = new AnalysisContext(impaladCatalog_.get(), queryCtx,
Are we guaranteed that this new catalog returns true for isReady()?


Line 901:         analysisCtx.setCatalog(impaladCatalog_.get());
Are we guaranteed that this new catalog returns true for isReady()?


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7045/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 901:         analysisCtx = new AnalysisContext(impaladCatalog_.get(), queryCtx,
> Ugh, I think you're right. But I don't think we want to put the atomic ref 
Ahh yes, you are right. +1 for AnalysisContext.setCatalog() or similar


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 2:

Rebased and included an unsquashed fix.

-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7045/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 901:         analysisCtx = new AnalysisContext(impaladCatalog_.get(), queryCtx,
> Doesn't this destroy the analysisCtx.timeline_ and effectively hide how lon
Ugh, I think you're right. But I don't think we want to put the atomic ref into the analysisctx - we don't want to read two different catalogs at different points during analysis by dereferencing the atomic ref at different times. Probably needs AnalysisContext.setCatalog() or similar.


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7045/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 901:         analysisCtx = new AnalysisContext(impaladCatalog_.get(), queryCtx,
Doesn't this destroy the analysisCtx.timeline_ and effectively hide how long the metadata loading took?

An alternative is to put add the atomic ref into the AnalysisCtx instead of a plain ref.


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7045/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

PS1, Line 552:       destPathString = impaladCatalog_.get().getTable(tableName.getDb(), tableName.getTbl())
> nit: long line
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7045/4/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 237:         impaladCatalog_.get().getAuthPolicy()));
> use catalog.getAuthPolicy()
Done


PS4, Line 553: ImpaladCatalog catalog = impaladCatalog_.get();
> nit: move this above L549 and use 'catalog' in both if and else blocks.
Done


PS4, Line 927:       Preconditions.checkNotNull(analysisCtx);
> Don't think this is needed anymore.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 4:

Henry, you mentioned you'd post a new patch once my concerns are addressed. You have address them, I'm just waiting for the latest PS. Change looks good.

-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Hello Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7045

to look at the new patch set (#2).

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................

IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

If a query was waiting for metadata to be loaded at the same time as the
statestore restarted, it would not correctly see the metadata even after
the statestore had recovered and sent it.

This is because the AnalysisContext used for analysis is created with a
reference to a ImpaladCatalog once, before the analysis->metadata-load
loop started. If the catalog sent a *complete* update (which it would do
after a statestore restart), the ImpaladCatalog would be replaced, not
updated, and so the tables would exist in the new object, but not the
old one that the AnalysisContext had a reference to.

The fix is to recreate the AnalysisContext every time through the loop,
taking a recent reference to the ImpaladCatalog every time. That way, if
the ImpaladCatalog is updated, the next time through the loop the
analysis phase will see those changes.

Also fix a potential bug where access to ImpaladCatalog is not
synchronized between update and analysis threads, by using an
AtomicReference<ImpaladCatalog>.

Testing: manual - queries completed after statestore restarts with this
patch, where previously they consumed 100% CPU spinning forever.

Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
1 file changed, 27 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/7045/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Hello Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7045

to look at the new patch set (#3).

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................

IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

If a query was waiting for metadata to be loaded at the same time as the
statestore restarted, it would not correctly see the metadata even after
the statestore had recovered and sent it.

This is because the AnalysisContext used for analysis is created with a
reference to a ImpaladCatalog once, before the analysis->metadata-load
loop started. If the catalog sent a *complete* update (which it would do
after a statestore restart), the ImpaladCatalog would be replaced, not
updated, and so the tables would exist in the new object, but not the
old one that the AnalysisContext had a reference to.

The fix is to recreate the AnalysisContext every time through the loop,
taking a recent reference to the ImpaladCatalog every time. That way, if
the ImpaladCatalog is updated, the next time through the loop the
analysis phase will see those changes.

Also fix a potential bug where access to ImpaladCatalog is not
synchronized between update and analysis threads, by using an
AtomicReference<ImpaladCatalog>.

Testing: manual - queries completed after statestore restarts with this
patch, where previously they consumed 100% CPU spinning forever.

Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
1 file changed, 28 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/7045/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 4:

Alex / Dimitris - any further thoughts here?

-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Hello Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7045

to look at the new patch set (#5).

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................

IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

If a query was waiting for metadata to be loaded at the same time as the
statestore restarted, it would not correctly see the metadata even after
the statestore had recovered and sent it.

This is because the AnalysisContext used for analysis is created with a
reference to a ImpaladCatalog once, before the analysis->metadata-load
loop started. If the catalog sent a *complete* update (which it would do
after a statestore restart), the ImpaladCatalog would be replaced, not
updated, and so the tables would exist in the new object, but not the
old one that the AnalysisContext had a reference to.

The fix is to update the AnalysisContext's catalog every time through
the loop, taking a recent reference to the ImpaladCatalog every
time. That way, if the ImpaladCatalog is updated, the next time through
the loop the analysis phase will see those changes.

Also fix a potential bug where access to ImpaladCatalog is not
synchronized between update and analysis threads, by using an
AtomicReference<ImpaladCatalog>.

Testing: manual - queries completed after statestore restarts with this
patch, where previously they consumed 100% CPU spinning forever.

Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
2 files changed, 37 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/7045/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 4:

(2 comments)

Will post a new patch when I'm sure I'm addressing Alex's concerns.

http://gerrit.cloudera.org:8080/#/c/7045/4/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 610:     List<String> tblNames = impaladCatalog_.get().getTableNames(dbName, matcher);
> I'm a little worried about the repercussions of operations seeing different
Can you clarify - is this an issue that you see introduced by this patch? 

getDbsMetadata() retrieves the catalog reference once. So it should see a consistent snapshot throughout its execution (or at least as consistent as it did previously).

Previously, the naked catalog reference would be updated asynchronously. The only change here is an atomic ref that makes sure the update is published cross-thread. 

I agree that this is all a bit confusing. We should be able to snapshot a version of the catalog and keep it fixed during analysis.


Line 890:     AnalysisContext analysisCtx = new AnalysisContext(impaladCatalog_.get(), queryCtx,
> Are we guaranteed that this new catalog returns true for isReady()?
I believe so (since the new catalog is created from the statestore update). I'll look further and make the code clearer.


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7045/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 901:         analysisCtx = new AnalysisContext(impaladCatalog_.get(), queryCtx,
> Ahh yes, you are right. +1 for AnalysisContext.setCatalog() or similar
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

If a query was waiting for metadata to be loaded at the same time as the
statestore restarted, it would not correctly see the metadata even after
the statestore had recovered and sent it.

This is because the AnalysisContext used for analysis is created with a
reference to a ImpaladCatalog once, before the analysis->metadata-load
loop started. If the catalog sent a *complete* update (which it would do
after a statestore restart), the ImpaladCatalog would be replaced, not
updated, and so the tables would exist in the new object, but not the
old one that the AnalysisContext had a reference to.

The fix is to update the AnalysisContext's catalog every time through
the loop, taking a recent reference to the ImpaladCatalog every
time. That way, if the ImpaladCatalog is updated, the next time through
the loop the analysis phase will see those changes.

Also fix a potential bug where access to ImpaladCatalog is not
synchronized between update and analysis threads, by using an
AtomicReference<ImpaladCatalog>.

Testing: manual - queries completed after statestore restarts with this
patch, where previously they consumed 100% CPU spinning forever.

Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Reviewed-on: http://gerrit.cloudera.org:8080/7045
Reviewed-by: Henry Robinson <he...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
2 files changed, 37 insertions(+), 29 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Henry Robinson: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7045/4/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 610:     List<String> tblNames = impaladCatalog_.get().getTableNames(dbName, matcher);
> Can you clarify - is this an issue that you see introduced by this patch? 
getDbsMetadata() calls fe.getTableNames() which gets the catalog again

I see your point though that your patch does not introduce this issue, so nvm my comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 6: Code-Review+2

Rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Hello Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7045

to look at the new patch set (#4).

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................

IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

If a query was waiting for metadata to be loaded at the same time as the
statestore restarted, it would not correctly see the metadata even after
the statestore had recovered and sent it.

This is because the AnalysisContext used for analysis is created with a
reference to a ImpaladCatalog once, before the analysis->metadata-load
loop started. If the catalog sent a *complete* update (which it would do
after a statestore restart), the ImpaladCatalog would be replaced, not
updated, and so the tables would exist in the new object, but not the
old one that the AnalysisContext had a reference to.

The fix is to update the AnalysisContext's catalog every time through
the loop, taking a recent reference to the ImpaladCatalog every
time. That way, if the ImpaladCatalog is updated, the next time through
the loop the analysis phase will see those changes.

Also fix a potential bug where access to ImpaladCatalog is not
synchronized between update and analysis threads, by using an
AtomicReference<ImpaladCatalog>.

Testing: manual - queries completed after statestore restarts with this
patch, where previously they consumed 100% CPU spinning forever.

Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
2 files changed, 35 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/7045/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/699/

-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Dimitris Tsirogiannis (Code Review)" <ge...@cloudera.org>.
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 1: Code-Review+2

(1 comment)

Thanks. Not sure if I am technically allowed to +2 it... so feel free to ask someone to take a look as well.

http://gerrit.cloudera.org:8080/#/c/7045/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

PS1, Line 552:       destPathString = impaladCatalog_.get().getTable(tableName.getDb(), tableName.getTbl())
nit: long line


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading

Posted by "Dimitris Tsirogiannis (Code Review)" <ge...@cloudera.org>.
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5056: Ensure analysis uses 'fresh' catalog after metadata loading
......................................................................


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7045/2/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

PS2, Line 553:       destPathString = impaladCatalog_.get().getTable(tableName.getDb(), tableName.getTbl())
nit: long line


-- 
To view, visit http://gerrit.cloudera.org:8080/7045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I574d69fa75198499523dc291fbbd0d7e3d8d968f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes