You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Philip Zeyliger (Code Review)" <ge...@cloudera.org> on 2017/09/26 23:03:55 UTC

[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected.

Hello Alex Behm, Mostafa Mokhtar, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8100

to look at the new patch set (#2).

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected.
......................................................................

IMPALA-5940: Avoid log spew by using Status::Expected.

In IMPALA-5926, we fixed a case where closing the session triggered a
stack trace in the logs which impacted performance for short-running
queries. Looking at log files from several active clusters, I identified a few
other cases where we could clean up log spew with the same (trivial)
approach.

The following messages will no longer log:

  "Failed to codegen MaterializeTuple() due to unsupported type: $0"
  "Not implemented for this format."
  "ScalarFnCall Codegen not supported for CHAR"
  "Could not codegen CodegenMaterializeExprs: "
  "Could not codegen TupleRowComparator::Compare(): $0"

These codegen related messages were happening at every execution node, and were
therefore very common.

In addition, the following messages, which were very frequent in the logs, now
will log, but without the back trace:

  "... Client ... timed-out during recv call."
  "Query Id ... not found."

To preserve the source of the messages, I used preprocessor macros __FILE__ and
__LINE__ and stuffed them into the error string. As you can see in the worked
example in the testing section below, __FILE__ includes the full source code
path, which is uglier than the logging equivalent. (glog has a helper
called "const_basename" which handles this at
https://github.com/google/glog/blob/2a6df66252d266080489c310b8146e63b66b2add/src/utilities.cc#L279)

The snippet I used to identify these was:

  find . -type f -name '*IMPALAD*.gz' | xargs gzcat  | awk '/^I/ { if(x) { print x; } x = "" } /status.cc/ { x=" "; } { if(x) { x=x  $0 } }'  | sed -e 's/0x[0-9a-fx]* //g' | sed -e 's/[0-9a-f]\{16\}:[0-9a-f]*/QUERYID/g' |  tr -s '\t' ' ' | tr '[0-9]' 'N' | sort | uniq -c  | sort -n | tee output.txt

I also analyzed some logs using SQL, against a pre-processed logs table:

  with v as (
    select regexp_replace(
        regexp_replace(
          translate(substr(message, 42), "\n\t", "  "),
          "[a-zA-Z0-9.-]*[.][a-zA-Z0-9-]*:[0-9]*",
          "<host>"),
        "@.*$", "@@@...") as m
    from logs_table where `class`="status.cc")
  select m, count(*) from v group by 1 order by 2 desc limit 100

Testing:
* Automated tests.
* Manual testing for one NoTrace code path:

  $ impala-python shell/gen-py/ImpalaService/ImpalaService-remote -h localhost:21000 GetRuntimeProfile 'beeswaxd.ttypes.QueryHandle()'
  Traceback (most recent call last):
    File "shell/gen-py/ImpalaService/ImpalaService-remote", line 106, in <module>
      pp.pprint(client.GetRuntimeProfile(eval(args[0]),))
    File "/home/philip/src/impala/shell/gen-py/ImpalaService/ImpalaService.py", line 161, in GetRuntimeProfile
      return self.recv_GetRuntimeProfile()
    File "/home/philip/src/impala/shell/gen-py/ImpalaService/ImpalaService.py", line 184, in recv_GetRuntimeProfile
      raise result.error
  beeswaxd.ttypes.BeeswaxException: BeeswaxException(handle=QueryHandle(log_context='', id=''), log_context='', SQLState='HY000', _message='GetRuntimeProfile error: /home/philip/src/impala/be/src/service/impala-server.cc:641: Query id 0:0 not found.\n', errorCode=0)

  $ tail -n 1 logs/cluster/impalad.INFO
  I0926 14:51:19.867837 30582 status.cc:147] /home/philip/src/impala/be/src/service/impala-server.cc:641: Query id 0:0 not found.

Change-Id: I38088482377a1c3e794a9c8178ef83f29957a330
---
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/runtime/client-cache.h
M be/src/runtime/tuple.cc
M be/src/service/impala-server.cc
M be/src/util/tuple-row-compare.cc
9 files changed, 70 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8100/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I38088482377a1c3e794a9c8178ef83f29957a330
Gerrit-Change-Number: 8100
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>