Posted to commits@orc.apache.org by om...@apache.org on 2019/06/10 18:00:30 UTC
[orc] branch master updated: Fix specification of float encoding in
BloomFilter Floats are converted to double and added to BloomFilter
according to the double rules.
This is an automated email from the ASF dual-hosted git repository.
omalley pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/orc.git
The following commit(s) were added to refs/heads/master by this push:
new 2df3f15 Fix specification of float encoding in BloomFilter Floats are converted to double and added to BloomFilter according to the double rules.
2df3f15 is described below
commit 2df3f15739b34e597ce8f37bbb1a5798108e736c
Author: Dain Sundstrom <da...@iq80.com>
AuthorDate: Thu Jun 6 23:01:13 2019 -0700
Fix specification of float encoding in BloomFilter
Floats are converted to double and added to BloomFilter according to the double rules.
Fixes #400
Signed-off-by: Owen O'Malley <om...@apache.org>
---
site/specification/ORCv1.md | 7 +++----
site/specification/ORCv2.md | 7 +++----
2 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/site/specification/ORCv1.md b/site/specification/ORCv1.md
index cf29272..db63356 100644
--- a/site/specification/ORCv1.md
+++ b/site/specification/ORCv1.md
@@ -1271,10 +1271,9 @@ message BloomFilterIndex {
Bloom filter internally uses two different hash functions to map a key
to a position in the bit set. For tinyint, smallint, int, bigint, float
and double types, Thomas Wang's 64-bit integer hash function is used.
-Floats are converted to IEEE-754 32 bit representation
-(using Java's Float.floatToIntBits(float)). Similary, Doubles are
-converted to IEEE-754 64 bit representation (using Java's
-Double.doubleToLongBits(double)). All these primitive types
+Doubles are converted to IEEE-754 64 bit representation (using Java's
+Double.doubleToLongBits(double)). Floats are as converted to double
+(using Java's float to double cast). All these primitive types
are cast to long base type before being passed on to the hash function.
For strings and binary types, Murmur3 64 bit hash algorithm is used.
The 64 bit variant of Murmur3 considers only the most significant
diff --git a/site/specification/ORCv2.md b/site/specification/ORCv2.md
index 98ed070..b84dac9 100644
--- a/site/specification/ORCv2.md
+++ b/site/specification/ORCv2.md
@@ -1287,10 +1287,9 @@ message BloomFilterIndex {
Bloom filter internally uses two different hash functions to map a key
to a position in the bit set. For tinyint, smallint, int, bigint, float
and double types, Thomas Wang's 64-bit integer hash function is used.
-Floats are converted to IEEE-754 32 bit representation
-(using Java's Float.floatToIntBits(float)). Similary, Doubles are
-converted to IEEE-754 64 bit representation (using Java's
-Double.doubleToLongBits(double)). All these primitive types
+Doubles are converted to IEEE-754 64 bit representation (using Java's
+Double.doubleToLongBits(double)). Floats are as converted to double
+(using Java's float to double cast). All these primitive types
are cast to long base type before being passed on to the hash function.
For strings and binary types, Murmur3 64 bit hash algorithm is used.
The 64 bit variant of Murmur3 considers only the most significant
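
The rule this patch documents can be illustrated with a minimal Java sketch. It is not taken from the ORC code base: the class and method names are hypothetical, and it covers only how the 64-bit key is derived, not the Thomas Wang hash applied afterwards. The old rule is included only to show that the two conventions yield different keys for the same float, which is why the specification needed correcting.

```java
// Illustrative sketch only; names are hypothetical, not ORC APIs.
final class BloomFilterFloatKey {

    // Double rule from the spec text: use the IEEE-754 64-bit representation.
    static long keyForDouble(double value) {
        return Double.doubleToLongBits(value);
    }

    // Float rule after this patch: widen to double, then follow the double rule.
    static long keyForFloat(float value) {
        return keyForDouble((double) value);
    }

    // Rule described by the removed lines: 32-bit representation cast to long.
    static long oldKeyForFloat(float value) {
        return (long) Float.floatToIntBits(value);
    }

    public static void main(String[] args) {
        float f = 1.5f;
        System.out.printf("float widened to double: 0x%016x%n", keyForFloat(f));
        System.out.printf("float as 32-bit pattern: 0x%016x%n", oldKeyForFloat(f));
    }
}
```

For 1.5f the widened key is 0x3ff8000000000000, while the 32-bit pattern is 0x3fc00000, so a reader that hashes floats by the old description would probe different bit positions than a writer following the double rules.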