You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Alex Behm (Code Review)" <ge...@cloudera.org> on 2017/01/13 08:18:21 UTC

[Impala-ASF-CR] Improve logging of table loading.

Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5709

Change subject: Improve logging of table loading.
......................................................................

Improve logging of table loading.

Changes the log level from trace to debug for several
important events, in partiular, during table loading.
The goal is to improve supportability without having
to turn on trace debugging which can generate a
significant log volume. Also improves the log output.

Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
5 files changed, 83 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/5709/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5709/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS4, Line 327: if (unknownDiskIdCount > 0) {
             :           if (LOG.isWarnEnabled()) {
             :             LOG.warn("Unknown disk id count for filesystem " + fs + ":" +
             :                 unknownDiskIdCount);
             :           }
             :         }
> Do we print for every file? Shouldn't this be outside the while loop?
This was a mistake. Fixed.


PS4, Line 753: Loading
> "Loaded"
Done


http://gerrit.cloudera.org:8080/#/c/5709/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

PS3, Line 890: LOG.info("Compiling query: " + queryCtx.client_request.stmt);
> Don't we also want to mark the successful compilation of a query in L926?
For some queries the SQL text can be pretty large, I added the successful compilation msg without the stmt.


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Dimitris Tsirogiannis (Code Review)" <ge...@cloudera.org>.
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 3:

(2 comments)

Overall, not a big fan of this change. There are better ways to generate logging of important function calls without adding matching LOG.info("Starting...") - LOG.info("Started...") at the beginning and end of each function. That said, if we believe these new statements really help us debug the catalog, I will not object this change.

http://gerrit.cloudera.org:8080/#/c/5709/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS3, Line 241: BlockMetadataLoadStats
I don't see any block-metadata-specific stats, so I am wondering if the name of this class is accurate.


PS3, Line 244: // Number of files that were found to belong to the table being loaded.
             :     public long tableFileCount = 0;
Not sure I understand what this field represents? Can you please explain ?


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] Improve logging of table loading.

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change.

Change subject: Improve logging of table loading.
......................................................................


Patch Set 1:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/5709/1//COMMIT_MSG
Commit Message:

Line 7: Improve logging of table loading.
JIRA id? I think it helps in backporting to other branches.


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

Line 528:       LOG.trace("Loading native functions for database: " + db.getName());
Should we enable logging related to fns (here and below) ? I think it helps figuring out startup issues and IIUC this is not a frequent operation (I might be wrong here).


Line 898:       LOG.debug(String.format("Refreshing metadata: %s", tbl.getFullName()));
Given we support both table & partition granularity, Is it better to explicitly mention table/partition in the statement? Refreshing table metadata...? (Same below)


Line 1036:           tableName.getDb_name(), tableName.getTable_name()));
this and invalidateDb()? Can potentially help the frequency with which invalidates are run.


Line 1262:   public Table reloadPartition(Table tbl, List<TPartitionKeyValue> partitionSpec)
Should we add this too?


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 681:       LOG.debug("Creating empty partition objects: " + getFullName());
Log statment would probably be confusing for non-partitioned tables?  Should we move it to after L713 or rephrase here?


PS1, Line 1057: db_.getName() + "." + name_
getFullName()?


Line 1081:           LOG.debug("Incrementally loaded metadata for: " + tableName);
I have a feeling these could be noisy and can do a lot of logging.


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

Line 570:       LOG.debug("Loading table metadata: " + tbl.getFullName());
This will be logged twice for non-hdfs tables, once here and once inside Table.load()?


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

PS1, Line 832: info
debug like others for consistency?


Line 915:               LOG.trace(String.format("Missing tables were not received in %dms. Load " +
Isn't this a warn/error?


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/190/

-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No

[Impala-ASF-CR] Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: Improve logging of table loading.
......................................................................


Patch Set 1:

(10 comments)

Thanks, Bharath. Good suggestions.

http://gerrit.cloudera.org:8080/#/c/5709/1//COMMIT_MSG
Commit Message:

Line 7: Improve logging of table loading.
> JIRA id? I think it helps in backporting to other branches.
Don't think we're going to backport this anywhere, but done anyway.


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

Line 528:       LOG.trace("Loading native functions for database: " + db.getName());
> Should we enable logging related to fns (here and below) ? I think it helps
I agree. The basic loading steps should be included in the logs.

To make the intention clearer, I changed the new log messages to LOG.info() because LOG.debug() could mislead a reader.


Line 898:       LOG.debug(String.format("Refreshing metadata: %s", tbl.getFullName()));
> Given we support both table & partition granularity, Is it better to explic
Agree. Done.


Line 1036:           tableName.getDb_name(), tableName.getTable_name()));
> this and invalidateDb()? Can potentially help the frequency with which inva
Agree. invalidateDb() is a private function only called from reset(), so I added the logging there (indicates an "invalidate metadata").


Line 1262:   public Table reloadPartition(Table tbl, List<TPartitionKeyValue> partitionSpec)
> Should we add this too?
Yea I think so.


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 681:       LOG.debug("Creating empty partition objects: " + getFullName());
> Log statment would probably be confusing for non-partitioned tables?  Shoul
I'm not even sure this message is useful. Removed.


PS1, Line 1057: db_.getName() + "." + name_
> getFullName()?
Done


Line 1081:           LOG.debug("Incrementally loaded metadata for: " + tableName);
> I have a feeling these could be noisy and can do a lot of logging.
We only get into this code path for INSERT, ALTER TABLE, TRUNCATE TABLE and DROP STATS. Do you still think it will be too noisy?


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

Line 570:       LOG.debug("Loading table metadata: " + tbl.getFullName());
> This will be logged twice for non-hdfs tables, once here and once inside Ta
Assuming you mean HDFS tables. The HDFS loading message is a little more detailed, it says whether an incremental or complete load is being done.

I think it also makes sense to track the caller of load in this case. What do you think?


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 915:               LOG.trace(String.format("Missing tables were not received in %dms. Load " +
> Isn't this a warn/error?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 3:

(2 comments)

I agree our logging story needs an overhaul, but that's a project and not a fix.

http://gerrit.cloudera.org:8080/#/c/5709/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS3, Line 241: BlockMetadataLoadStats
> Thanks for explaining. Based on the name, I would also expect something lik
We can easily get the file count info from HDFS if we suspect such a perf issue. I removed this to minimize the changes.


PS3, Line 244: // Number of files that were found to belong to the table being loaded.
             :     public long tableFileCount = 0;
> Hm, maybe a better name for these fields? e.g. examinedFilesCount and table
I removed the BlockMetadataLoadStats to simplify this patch.


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


IMPALA-4768: Improve logging of table loading.

- Improves the logging for several important events,
  in particular, during table loading.
- Uses LOG.info() for such messages to clarify their
  intent.

The goal is to improve supportability without having
to turn on trace debugging which can generate a
significant log volume.

Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Reviewed-on: http://gerrit.cloudera.org:8080/5709
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
5 files changed, 39 insertions(+), 39 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 3: Code-Review+1

Carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Hello Bharath Vissapragada, Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5709

to look at the new patch set (#5).

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................

IMPALA-4768: Improve logging of table loading.

- Improves the logging for several important events,
  in particular, during table loading.
- Uses LOG.info() for such messages to clarify their
  intent.

The goal is to improve supportability without having
to turn on trace debugging which can generate a
significant log volume.

Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
5 files changed, 39 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/5709/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5709/2//COMMIT_MSG
Commit Message:

PS2, Line 10: partiular
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/5709/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

PS2, Line 938:  
> ...table...
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Hello Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5709

to look at the new patch set (#3).

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................

IMPALA-4768: Improve logging of table loading.

- Improves the logging for several important events,
  in particular, during table loading.
- Uses LOG.info() for such messages to clarify their
  intent.

The goal is to improve supportability without having
to turn on trace debugging which can generate a
significant log volume.

Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
5 files changed, 66 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/5709/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Hello Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5709

to look at the new patch set (#4).

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................

IMPALA-4768: Improve logging of table loading.

- Improves the logging for several important events,
  in particular, during table loading.
- Uses LOG.info() for such messages to clarify their
  intent.

The goal is to improve supportability without having
to turn on trace debugging which can generate a
significant log volume.

Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
5 files changed, 43 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/5709/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 2: Code-Review+1

(6 comments)

LGTM.

http://gerrit.cloudera.org:8080/#/c/5709/2//COMMIT_MSG
Commit Message:

PS2, Line 10: partiular
nit: typo


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

Line 528:       if (!key.startsWith(Db.FUNCTION_INDEX_PREFIX)) continue;
> I agree. The basic loading steps should be included in the logs.
Thanks Alex.


http://gerrit.cloudera.org:8080/#/c/5709/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

PS2, Line 938:  
...table...


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 681:     // Map of partition paths to their corresponding HdfsPartition objects. Populated
> I'm not even sure this message is useful. Removed.
Agreed.


Line 1081:   }
> We only get into this code path for INSERT, ALTER TABLE, TRUNCATE TABLE and
Yea I think so. Not too strong about it, I think its fine too.


http://gerrit.cloudera.org:8080/#/c/5709/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

Line 570:     try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) {
> Assuming you mean HDFS tables. The HDFS loading message is a little more de
Makes sense to track the caller, Agreed.


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 6: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has uploaded a new patch set (#2).

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................

IMPALA-4768: Improve logging of table loading.

- Improves the logging for several important events,
  in partiular, during table loading.
- Uses LOG.info() for such messages to clarify their
  intent.

The goal is to improve supportability without having
to turn on trace debugging which can generate a
significant log volume.

Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
5 files changed, 66 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/5709/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Dimitris Tsirogiannis (Code Review)" <ge...@cloudera.org>.
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 4: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5709/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS4, Line 327: if (unknownDiskIdCount > 0) {
             :           if (LOG.isWarnEnabled()) {
             :             LOG.warn("Unknown disk id count for filesystem " + fs + ":" +
             :                 unknownDiskIdCount);
             :           }
             :         }
Do we print for every file? Shouldn't this be outside the while loop?


PS4, Line 753: Loading
"Loaded"


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 3:

(2 comments)

I'm not set on implementing the logging this way, but shipping the current code as is will be a supportability nightmare. Bharath and I have felt the impact of the lack of log messages while debugging a problem in an internal cluster.

I'm happy to implement these changes in a different way if you have a suggestion.

http://gerrit.cloudera.org:8080/#/c/5709/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS3, Line 241: BlockMetadataLoadStats
> I don't see any block-metadata-specific stats, so I am wondering if the nam
The file counts are specific to the block metadata loading. For full loads of partitioned tables, the common case is that we recursively iterate over *all* files in the table's root directory and then map the files to partitions based on the their containing directory. This means that there could be files in the table's root dir that don't belong to any partition. So the number of files scanned and the number of files actually belonging to the table could be different, and a large discrepancy could explain loading perf regressions (if any).

I'm ok with removing this change if you prefer.


PS3, Line 244: // Number of files that were found to belong to the table being loaded.
             :     public long tableFileCount = 0;
> Not sure I understand what this field represents? Can you please explain ?
See above comment, hope it explains it.

Let me know if you want to keep or remove this. If you prefer to keep, I'll expand the comments.


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Dimitris Tsirogiannis (Code Review)" <ge...@cloudera.org>.
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 3:

(1 comment)

I see a comment about removing BlockMetadataLoadStats but this is not reflected in the code. Are you sending a new patch?

http://gerrit.cloudera.org:8080/#/c/5709/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

PS3, Line 890: LOG.info("Compiling query: " + queryCtx.client_request.stmt);
Don't we also want to mark the successful compilation of a query in L926?


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Hello Bharath Vissapragada, Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5709

to look at the new patch set (#6).

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................

IMPALA-4768: Improve logging of table loading.

- Improves the logging for several important events,
  in particular, during table loading.
- Uses LOG.info() for such messages to clarify their
  intent.

The goal is to improve supportability without having
to turn on trace debugging which can generate a
significant log volume.

Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
5 files changed, 39 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/5709/6
-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>

[Impala-ASF-CR] IMPALA-4768: Improve logging of table loading.

Posted by "Dimitris Tsirogiannis (Code Review)" <ge...@cloudera.org>.
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4768: Improve logging of table loading.
......................................................................


Patch Set 3:

(2 comments)

To answer Alex's question for alternatives to logging approaches. I would prefer a more principled approach like java method annotations using AOP. I had a proof-of-concept patch out for review but it got discarded with some gerrit changes; I can resume it. The reason I prefer the annotation based approach is that a) avoids spurious log statements in the codebase and in the logs, b) it automatically gives latency of each annotated method execution, c) you can control the log level at which each set of methods is logged and d) can log information about input and output parameters. For performance monitoring and debugging I would prefer the use of something like Java JMX. 

I am not arguing we should do these changes for 5.10. I understand the immediate need for better logging in some cases but we should definitely consider other options other than adding log statements here and there.

http://gerrit.cloudera.org:8080/#/c/5709/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS3, Line 241: BlockMetadataLoadStats
> The file counts are specific to the block metadata loading. For full loads 
Thanks for explaining. Based on the name, I would also expect something like # blocks loaded, avg block size, etc. If we believe the information here will help us diagnose perf problems, I don't mind having this change.


PS3, Line 244: // Number of files that were found to belong to the table being loaded.
             :     public long tableFileCount = 0;
> See above comment, hope it explains it.
Hm, maybe a better name for these fields? e.g. examinedFilesCount and tableFileCount, or something along these lines?


-- 
To view, visit http://gerrit.cloudera.org:8080/5709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dt...@cloudera.com>
Gerrit-HasComments: Yes