You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2017/11/04 00:28:16 UTC

[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes

Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
......................................................................


Patch Set 2:

(12 comments)

Did a first pass over it.

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/codegen/gen_ir_descriptions.py
File be/src/codegen/gen_ir_descriptions.py:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/codegen/gen_ir_descriptions.py@99
PS2, Line 99:   ["HASH_FAST", "IrFastHash"],
Do we use this? I don't see any places where it's used currently.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc@a468
PS2, Line 468: 
Lol, weird.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc@474
PS2, Line 474: partition_expr_evals_[j
Use 'eval' directly.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/raw-value.cc
File be/src/runtime/raw-value.cc:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/raw-value.cc@220
PS2, Line 220:       return HashUtil::FastHash64(v, static_cast<size_t>(type.GetByteSize()), seed);
I think this might be slightly slower - with the previous approach I think we get a specialised version of the hash function for that input length for each data type, whereas here we have to go through another switch in GetByteSize().


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h
File be/src/util/hash-util.h:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@235
PS2, Line 235:   /* The MIT License
I think this should go at the top of the file beneath the apache license. Then we can just say that the FastHash64 implementation came with that license. E.g.

/* 
  FastHash64 implementation derived from MIT-licensed code written by Zilong Tan

  The MIT License
  ...
*/


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@263
PS2, Line 263: size_t
use int64_t - we generally use signed integers for lengths.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@266
PS2, Line 266: (const uint64_t *)
We use c-style casts - i.e. reinterpret_cast<>


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@267
PS2, Line 267: pos
I believe the C++ standard doesn't allow pointer arithmetic on void - we should convert to uint8_t for doing that.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@278
PS2, Line 278:  (const unsigned char*
See above comments.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@282
PS2, Line 282: (uint64_t)
should use static_cast<uint64_t>()


http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test
File testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test:

http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test@145
PS2, Line 145: 8,'k1',-1,'k1',1
Why does this make a difference when we have VERIFY_IS_EQUAL_SORTED above?


http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/tpcds/queries/tpcds-q77a.test
File testdata/workloads/tpcds/queries/tpcds-q77a.test:

http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/tpcds/queries/tpcds-q77a.test@125
PS2, Line 125: 'catalog channel',NULL,538912.55,2050279.74,-1383554.73
I'll file a JIRA to make this test deterministically pass: IMPALA-6155. In the meantime, can you change this to

   ---- RESULTS: VERIFY_IS_EQUAL_SORTED

That will make it ignore the order of results.



-- 
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Nov 2017 00:28:16 +0000
Gerrit-HasComments: Yes