Posted to dev@hive.apache.org by Navis Ryu <na...@nexr.com> on 2014/05/24 09:44:04 UTC
Review Request 21886: Column stats: LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0)
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21886/
-----------------------------------------------------------
Review request for hive.
Bugs: HIVE-4561
https://issues.apache.org/jira/browse/HIVE-4561
Repository: hive-git
Description
-------
If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0;
likewise, if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G;
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000 # Wrong result! Expected 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.0000
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
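The output above is what one would expect if the running minimum starts at 0.0 instead of at the first observed value. The following is only a minimal sketch of that pattern and of one way to avoid it. The class and method names (MinMaxSketch, buggyMin, safeMin) are hypothetical and are not taken from GenericUDAFComputeStats or any other file in this patch.

```java
import java.util.OptionalDouble;

// Hypothetical illustration only; not the actual Hive implementation.
public class MinMaxSketch {

    // Problematic pattern: the running minimum is seeded with 0.0,
    // so it can never end up above 0.0, whatever the data contains.
    static double buggyMin(double[] values) {
        double min = 0.0;
        for (double v : values) {
            min = Math.min(min, v);
        }
        return min;
    }

    // Safer pattern: there is no minimum until a value has been seen,
    // and an empty input yields "no value" rather than 0.0.
    static OptionalDouble safeMin(double[] values) {
        boolean seen = false;
        double min = 0.0; // placeholder, overwritten by the first value
        for (double v : values) {
            if (!seen || v < min) {
                min = v;
                seen = true;
            }
        }
        return seen ? OptionalDouble.of(min) : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        double[] prices = {1.0, 2.0, 3.0};
        System.out.println(buggyMin(prices));              // prints 0.0
        System.out.println(safeMin(prices).getAsDouble()); // prints 1.0
    }
}
```

The same reasoning applies symmetrically to a running maximum seeded with 0.0 when every value is negative.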
Diffs
-----
metastore/if/hive_metastore.thrift eef1b80
metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 43869c2
metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 9e440bb
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DecimalColumnStatsData.java 5661252
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DoubleColumnStatsData.java d3f3f68
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LongColumnStatsData.java 2cf4380
metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py c4b583b
metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 79b7a1a
metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java dc0e266
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java f61cdf0
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java 85f6427
ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 3dc02f0
ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java ee4d56c
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 3b063eb
ql/src/test/queries/clientpositive/metadata_only_queries.q b549a56
ql/src/test/results/clientpositive/compute_stats_empty_table.q.out 50d6c8d
ql/src/test/results/clientpositive/compute_stats_long.q.out 2f5cbdd
ql/src/test/results/clientpositive/metadata_only_queries.q.out 531ea41
Diff: https://reviews.apache.org/r/21886/diff/
Testing
-------
Thanks,
Navis Ryu
Re: Review Request 21886: Column stats: LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0)
Posted by Zhuoluo Yang <zh...@taobao.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21886/#review44095
-----------------------------------------------------------
Ship it!
Thanks, looks good to me!
- Zhuoluo Yang
Re: Review Request 21886: Column stats: LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the column values larger than 0.0 (or if all column values smaller than 0.0)
Posted by Navis Ryu <na...@nexr.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21886/
-----------------------------------------------------------
(Updated May 28, 2014, 5:45 a.m.)
Review request for hive.
Changes
-------
Fixed test failures and refactored.
Bugs: HIVE-4561
https://issues.apache.org/jira/browse/HIVE-4561
Repository: hive-git
Description
-------
If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0;
likewise, if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;
mysql> select * from TAB_COL_STATS \G;
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0.0000 # Wrong result! Expected 1.0000
DOUBLE_HIGH_VALUE: 3.0000
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.0000
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
Diffs (updated)
-----
metastore/if/hive_metastore.thrift eef1b80
metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 43869c2
metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 9e440bb
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DecimalColumnStatsData.java 5661252
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DoubleColumnStatsData.java d3f3f68
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LongColumnStatsData.java 2cf4380
metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py c4b583b
metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 79b7a1a
metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java dc0e266
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java f61cdf0
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java 85f6427
ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 3dc02f0
ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java ee4d56c
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 3b063eb
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java 24159b8
ql/src/test/queries/clientpositive/metadata_only_queries.q b549a56
ql/src/test/results/clientpositive/compute_stats_empty_table.q.out 50d6c8d
ql/src/test/results/clientpositive/compute_stats_long.q.out 2f5cbdd
ql/src/test/results/clientpositive/metadata_only_queries.q.out 531ea41
ql/src/test/results/clientpositive/metadata_only_queries_with_filters.q.out c8e2c0c
Diff: https://reviews.apache.org/r/21886/diff/
Testing
-------
Thanks,
Navis Ryu