Posted to dev@hive.apache.org by pengcheng xiong <px...@hortonworks.com> on 2015/05/22 01:34:50 UTC

Review Request 34576: Bucketized Table feature fails in some cases

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/
-----------------------------------------------------------

Review request for hive and John Pullokkaran.


Repository: hive-git


Description
-------

The bucketized table feature fails in some cases. If the source and destination are bucketed on the same key, but the actual data in the source is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed when it is written to the destination.
Example
----------------------------------------------------------------------
CREATE TABLE P1(key STRING, val STRING)
CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
-- perform an insert to make sure there are 2 files
INSERT OVERWRITE TABLE P1 select key, val from P1;
--------------------------------------------------
This is not a regression; it has never worked.
It was only discovered because of Hadoop2 changes.
In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the application requests. Hadoop2 now honors the number-of-reducers setting in local mode (by spawning threads).
The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
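
To make the invariant at stake concrete, here is a minimal Python sketch of what "bucketed" means for a table declared INTO 2 BUCKETS: file i should hold only keys that map to bucket i. The function names and the byte-sum hash are assumptions made for illustration only; this is not Hive's actual hash function or code path. It only shows why a file copied in as-is by LOAD DATA can violate the layout that the table metadata promises.

```python
NUM_BUCKETS = 2  # mirrors "INTO 2 BUCKETS" in the example above


def bucket_for(key, num_buckets=NUM_BUCKETS):
    """Hypothetical stand-in hash (sum of UTF-8 bytes); NOT Hive's real one."""
    return sum(key.encode("utf-8")) % num_buckets


def is_properly_bucketed(files, num_buckets=NUM_BUCKETS):
    """True only if there is one file per bucket and file i holds bucket-i keys."""
    if len(files) != num_buckets:
        return False
    return all(
        bucket_for(key, num_buckets) == index
        for index, keys in enumerate(files)
        for key in keys
    )


def rebucket(keys, num_buckets=NUM_BUCKETS):
    """Redistribute keys so each output file holds one bucket, sorted by key."""
    files = [[] for _ in range(num_buckets)]
    for key in keys:
        files[bucket_for(key, num_buckets)].append(key)
    return [sorted(bucket) for bucket in files]


# A file loaded as-is: one file that mixes keys from both buckets.
loaded = [["a", "b", "c", "d"]]
print(is_properly_bucketed(loaded))            # False
print(is_properly_bucketed(rebucket(loaded[0])))  # True
```

In this sketch the loaded data fails the check until it is explicitly redistributed by key, which is the step that is skipped when the metadata of both source and destination claims the data is already bucketed.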


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
  ql/src/test/results/clientnegative/alter_partition_invalidspec.q.out 404115f 
  ql/src/test/results/clientnegative/alter_partition_nodrop.q.out 1c78cff 
  ql/src/test/results/clientnegative/alter_partition_nodrop_table.q.out 3c425da 
  ql/src/test/results/clientnegative/alter_partition_offline.q.out c70fcb4 
  ql/src/test/results/clientnegative/archive_corrupt.q.out 56e8ec4 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
  ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d 
  ql/src/test/results/clientnegative/columnstats_partlvl_invalid_values.q.java1.7.out 4ea70e3 
  ql/src/test/results/clientnegative/columnstats_partlvl_multiple_part_clause.q.out ce79830 
  ql/src/test/results/clientnegative/dynamic_partitions_with_whitelist.q.out f069ae8 
  ql/src/test/results/clientnegative/exim_02_all_part_over_overlap.q.out 3c05600 
  ql/src/test/results/clientnegative/exim_15_part_nonpart.q.out dfbf025 
  ql/src/test/results/clientnegative/exim_16_part_noncompat_schema.q.out 4cb6ca7 
  ql/src/test/results/clientnegative/exim_17_part_spec_underspec.q.out 23caa4a 
  ql/src/test/results/clientnegative/exim_18_part_spec_missing.q.out 23caa4a 
  ql/src/test/results/clientnegative/exim_21_part_managed_external.q.out fd27f29 
  ql/src/test/results/clientnegative/exim_24_import_part_authfail.q.out 1a9a34d 
  ql/src/test/results/clientnegative/insertover_dynapart_ifnotexists.q.out a40ffab 
  ql/src/test/results/clientnegative/load_exist_part_authfail.q.out 491cfd0 
  ql/src/test/results/clientnegative/load_part_authfail.q.out 4ea8be9 
  ql/src/test/results/clientnegative/load_part_nospec.q.out bebaf92 
  ql/src/test/results/clientnegative/nopart_load.q.out 8815146 
  ql/src/test/results/clientnegative/protectmode_part2.q.out 16d58c7 
  ql/src/test/results/clientpositive/alter_concatenate_indexed_table.q.out ffcbcf9 
  ql/src/test/results/clientpositive/alter_merge.q.out 17d86b8 
  ql/src/test/results/clientpositive/alter_merge_2.q.out e118c39 
  ql/src/test/results/clientpositive/alter_merge_stats.q.out fdd2ddc 
  ql/src/test/results/clientpositive/alter_partition_protect_mode.q.out 80990d9 
  ql/src/test/results/clientpositive/alter_rename_table.q.out 732d8a2 
  ql/src/test/results/clientpositive/alter_table_cascade.q.out 0139466 
  ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
  ql/src/test/results/clientpositive/auto_sortmerge_join_16.q.out d4ecb19 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
  ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
  ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
  ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
  ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
  ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
  ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
  ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
  ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
  ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
  ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
  ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
  ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
  ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
  ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
  ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
  ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
  ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
  ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
  ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
  ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
  ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
  ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
  ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
  ql/src/test/results/clientpositive/columnstats_partlvl.q.out 3c22d40 
  ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out 18a6909 
  ql/src/test/results/clientpositive/database.q.out 043d91b 
  ql/src/test/results/clientpositive/database_drop.q.out 225104f 
  ql/src/test/results/clientpositive/drop_partition_with_stats.q.out e27e557 
  ql/src/test/results/clientpositive/exim_02_part.q.out 6e0988a 
  ql/src/test/results/clientpositive/exim_04_all_part.q.out 862efa3 
  ql/src/test/results/clientpositive/exim_05_some_part.q.out 1b6a515 
  ql/src/test/results/clientpositive/exim_06_one_part.q.out 39c83c3 
  ql/src/test/results/clientpositive/exim_07_all_part_over_nonoverlap.q.out b55a0bd 
  ql/src/test/results/clientpositive/exim_08_nonpart_rename.q.out 740833b 
  ql/src/test/results/clientpositive/exim_09_part_spec_nonoverlap.q.out d71f36f 
  ql/src/test/results/clientpositive/exim_15_external_part.q.out d24f18a 
  ql/src/test/results/clientpositive/exim_16_part_external.q.out af748c9 
  ql/src/test/results/clientpositive/exim_17_part_managed.q.out a92f95a 
  ql/src/test/results/clientpositive/exim_18_part_external.q.out a082a11 
  ql/src/test/results/clientpositive/exim_19_00_part_external_location.q.out 5a97e03 
  ql/src/test/results/clientpositive/exim_19_part_external_location.q.out f9a20f7 
  ql/src/test/results/clientpositive/exim_20_part_managed_location.q.out b196ba5 
  ql/src/test/results/clientpositive/exim_23_import_part_authsuccess.q.out 5f78a76 
  ql/src/test/results/clientpositive/exim_hidden_files.q.out e449e0e 
  ql/src/test/results/clientpositive/global_limit.q.out 7da20d5 
  ql/src/test/results/clientpositive/groupby_sort_6.q.out c5cb8b9 
  ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
  ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
  ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
  ql/src/test/results/clientpositive/infer_bucket_sort_dyn_part.q.out 773a2a8 
  ql/src/test/results/clientpositive/input40.q.out bb0eabe 
  ql/src/test/results/clientpositive/inputddl6.q.out 5a040e6 
  ql/src/test/results/clientpositive/inputddl7.q.out 0d64baf 
  ql/src/test/results/clientpositive/insert1_overwrite_partitions.q.out 900babe 
  ql/src/test/results/clientpositive/insert2_overwrite_partitions.q.out 25c438f 
  ql/src/test/results/clientpositive/leftsemijoin.q.out 11f0bb0 
  ql/src/test/results/clientpositive/load_exist_part_authsuccess.q.out 8ec7e62 
  ql/src/test/results/clientpositive/load_part_authsuccess.q.out 8249dce 
  ql/src/test/results/clientpositive/loadpart2.q.out 201a957 
  ql/src/test/results/clientpositive/merge_dynamic_partition.q.out da19b32 
  ql/src/test/results/clientpositive/merge_dynamic_partition2.q.out 5a2afb0 
  ql/src/test/results/clientpositive/merge_dynamic_partition3.q.out 86978f3 
  ql/src/test/results/clientpositive/merge_dynamic_partition4.q.out 86af660 
  ql/src/test/results/clientpositive/merge_dynamic_partition5.q.out c1468c1 
  ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
  ql/src/test/results/clientpositive/nullgroup3.q.out 7712d4d 
  ql/src/test/results/clientpositive/nullgroup5.q.out 8a94d62 
  ql/src/test/results/clientpositive/orc_analyze.q.out a61a2e6 
  ql/src/test/results/clientpositive/orc_split_elimination.q.out 7134ff5 
  ql/src/test/results/clientpositive/parquet_serde.q.out e753180 
  ql/src/test/results/clientpositive/partition_type_check.q.out e25d527 
  ql/src/test/results/clientpositive/partition_wise_fileformat17.q.out 028a26e 
  ql/src/test/results/clientpositive/partition_wise_fileformat18.q.out 6303d44 
  ql/src/test/results/clientpositive/repl_1_drop.q.out 9fb65d1 
  ql/src/test/results/clientpositive/repl_2_exim_basic.q.out 8df0653 
  ql/src/test/results/clientpositive/repl_3_exim_metadata.q.out 8387c02 
  ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
  ql/src/test/results/clientpositive/spark/auto_join32.q.out 361a968 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 8102ec1 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_16.q.out d4ecb19 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
  ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
  ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
  ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
  ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
  ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
  ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out 44f4d0c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
  ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
  ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
  ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
  ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
  ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
  ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
  ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
  ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
  ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
  ql/src/test/results/clientpositive/spark/leftsemijoin.q.out 11f0bb0 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
  ql/src/test/results/clientpositive/spark/stats18.q.out a061846 
  ql/src/test/results/clientpositive/spark/stats_counter_partitioned.q.out 4b84eca 
  ql/src/test/results/clientpositive/spark/statsfs.q.out b0bca41 
  ql/src/test/results/clientpositive/stats11.q.out e51f049 
  ql/src/test/results/clientpositive/stats18.q.out a061846 
  ql/src/test/results/clientpositive/stats_counter_partitioned.q.out ab1270c 
  ql/src/test/results/clientpositive/statsfs.q.out b0bca41 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out 8c8a3bf 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_16.q.out d4ecb19 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
  ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
  ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
  ql/src/test/results/clientpositive/tez/dynamic_partition_pruning_2.q.out 8b0b81d 
  ql/src/test/results/clientpositive/tez/explainuser_1.q.out b684858 
  ql/src/test/results/clientpositive/tez/explainuser_2.q.out f84524b 
  ql/src/test/results/clientpositive/tez/leftsemijoin.q.out 11f0bb0 
  ql/src/test/results/clientpositive/tez/mergejoin.q.out 97df12a 
  ql/src/test/results/clientpositive/tez/orc_analyze.q.out a61a2e6 
  ql/src/test/results/clientpositive/tez/stats_counter_partitioned.q.out ab1270c 
  ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
  ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
  ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 6183390 
  ql/src/test/results/clientpositive/truncate_table.q.out 4d8f38c 
  ql/src/test/results/clientpositive/view_cast.q.out 34444ae 

Diff: https://reviews.apache.org/r/34576/diff/


Testing
-------


Thanks,

pengcheng xiong


Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by pengcheng xiong <px...@hortonworks.com>.

(Updated May 29, 2015, 6:15 p.m.)


Review request for hive and John Pullokkaran.


Repository: hive-git


Description
-------

The bucketized table feature fails in some cases. If the source and destination are bucketed on the same key, but the actual data in the source is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed when it is written to the destination.
Example
----------------------------------------------------------------------
CREATE TABLE P1(key STRING, val STRING)
CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
-- perform an insert to make sure there are 2 files
INSERT OVERWRITE TABLE P1 select key, val from P1;
--------------------------------------------------
This is not a regression; it has never worked.
It was only discovered because of Hadoop2 changes.
In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the application requests. Hadoop2 now honors the number-of-reducers setting in local mode (by spawning threads).
The long-term solution seems to be to prevent LOAD DATA for bucketed tables.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
  ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out f4522d2 
  ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d 
  ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 9220c8e 
  ql/src/test/results/clientpositive/auto_join32.q.out f862870 
  ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 5114038 
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out b2e782f 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 210f1ab 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out a307b13 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out f4ceee7 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 3c2951a 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e1f3888 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 38ecdbe 
  ql/src/test/results/clientpositive/bucket_map_join_1.q.out 42e6a3f 
  ql/src/test/results/clientpositive/bucket_map_join_2.q.out af73309 
  ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
  ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
  ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
  ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
  ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
  ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
  ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
  ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
  ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
  ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
  ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
  ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out 471ff73 
  ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
  ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
  ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
  ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
  ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
  ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
  ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
  ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
  ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
  ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
  ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
  ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
  ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
  ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 34cd1ff 
  ql/src/test/results/clientpositive/groupby_sort_2.q.out b5e52f1 
  ql/src/test/results/clientpositive/groupby_sort_3.q.out c16911a 
  ql/src/test/results/clientpositive/groupby_sort_4.q.out a6b1c3d 
  ql/src/test/results/clientpositive/groupby_sort_5.q.out 369e2b5 
  ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
  ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
  ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
  ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 0d631ce 
  ql/src/test/results/clientpositive/groupby_sort_test_1.q.out 8c1765d 
  ql/src/test/results/clientpositive/insert_orig_table.q.out 5eea74d 
  ql/src/test/results/clientpositive/insert_values_orig_table.q.out 684cd1b 
  ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
  ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
  ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
  ql/src/test/results/clientpositive/skewjoin_mapjoin11.q.out 51445a5 
  ql/src/test/results/clientpositive/skewjoinopt19.q.out 91167db 
  ql/src/test/results/clientpositive/skewjoinopt20.q.out 15e96fd 
  ql/src/test/results/clientpositive/smb_mapjoin_1.q.out 9ab334b 
  ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
  ql/src/test/results/clientpositive/smb_mapjoin_2.q.out 379dc0d 
  ql/src/test/results/clientpositive/smb_mapjoin_25.q.out c0a8959 
  ql/src/test/results/clientpositive/smb_mapjoin_3.q.out 26fa5d4 
  ql/src/test/results/clientpositive/smb_mapjoin_4.q.out 9fc7f93 
  ql/src/test/results/clientpositive/smb_mapjoin_5.q.out 6e6882a 
  ql/src/test/results/clientpositive/smb_mapjoin_7.q.out 82f5804 
  ql/src/test/results/clientpositive/spark/auto_join32.q.out e26e4a2 
  ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out a70b161 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 3eceb0b 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
  ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out 7570ebe 
  ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out 80b44e9 
  ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
  ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
  ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
  ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
  ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
  ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out d4a9c98 
  ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
  ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
  ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
  ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
  ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
  ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
  ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
  ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
  ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
  ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out a09904e 
  ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out 65a8374 
  ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 9f30e15 
  ql/src/test/results/clientpositive/spark/skewjoinopt19.q.out f51d805 
  ql/src/test/results/clientpositive/spark/skewjoinopt20.q.out 338da34 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 1ff1262 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out a0d51f3 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_25.q.out cb811ed 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out f46b833 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out a421a42 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out af65010 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out 622b950 
  ql/src/test/results/clientpositive/stats11.q.out e51f049 
  ql/src/test/results/clientpositive/tez/auto_join_filters.q.out 8fde41d 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out e90af15 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_5.q.out adcc1fa 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
  ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
  ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
  ql/src/test/results/clientpositive/tez/explainuser_2.q.out 0511819 
  ql/src/test/results/clientpositive/tez/insert_orig_table.q.out 5eea74d 
  ql/src/test/results/clientpositive/tez/mergejoin.q.out c4be404 
  ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
  ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
  ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 52e1750 
  ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 14a6874 

Diff: https://reviews.apache.org/r/34576/diff/


Testing
-------


Thanks,

pengcheng xiong


Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by pengcheng xiong <px...@hortonworks.com>.

> On May 24, 2015, 2:03 a.m., Xuefu Zhang wrote:
> > Have you thought of what if the client is not interactive, such as JDBC or thrift?

I am sorry, we have not thought about that yet. We admit that the patch does not cover the case where the client is not interactive. Do you have any good ideas that you can share with us? Do you think that logging this, in addition to printing a warning message, is good enough? Thanks.


- pengcheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85082
-----------------------------------------------------------


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34576/
> -----------------------------------------------------------
> 
> (Updated May 23, 2015, 5:47 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The bucketized table feature fails in some cases. If the source and destination are bucketed on the same key, but the actual data in the source is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed when it is written to the destination.
> Example
> ----------------------------------------------------------------------
> CREATE TABLE P1(key STRING, val STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
> -- perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE P1 select key, val from P1;
> --------------------------------------------------
> This is not a regression; it has never worked.
> It was only discovered because of Hadoop2 changes.
> In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the application requests. Hadoop2 now honors the number-of-reducers setting in local mode (by spawning threads).
> The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out f4522d2 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d 
>   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 9220c8e 
>   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
>   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
>   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
>   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
>   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
>   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
>   ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
>   ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
>   ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
>   ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
>   ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
>   ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
>   ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
>   ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
>   ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
>   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
>   ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
>   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
>   ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
>   ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
>   ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
>   ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
>   ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
>   ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
>   ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
>   ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
>   ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 34cd1ff 
>   ql/src/test/results/clientpositive/groupby_sort_2.q.out b5e52f1 
>   ql/src/test/results/clientpositive/groupby_sort_3.q.out c16911a 
>   ql/src/test/results/clientpositive/groupby_sort_4.q.out a6b1c3d 
>   ql/src/test/results/clientpositive/groupby_sort_5.q.out 369e2b5 
>   ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
>   ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
>   ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
>   ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 0d631ce 
>   ql/src/test/results/clientpositive/groupby_sort_test_1.q.out 8c1765d 
>   ql/src/test/results/clientpositive/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/insert_values_orig_table.q.out 684cd1b 
>   ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
>   ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
>   ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
>   ql/src/test/results/clientpositive/skewjoin_mapjoin11.q.out dd084e8 
>   ql/src/test/results/clientpositive/skewjoinopt19.q.out fd43409 
>   ql/src/test/results/clientpositive/skewjoinopt20.q.out a28e433 
>   ql/src/test/results/clientpositive/smb_mapjoin_1.q.out 9ab334b 
>   ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
>   ql/src/test/results/clientpositive/smb_mapjoin_2.q.out 379dc0d 
>   ql/src/test/results/clientpositive/smb_mapjoin_25.q.out c0a8959 
>   ql/src/test/results/clientpositive/smb_mapjoin_3.q.out 26fa5d4 
>   ql/src/test/results/clientpositive/smb_mapjoin_4.q.out 9fc7f93 
>   ql/src/test/results/clientpositive/smb_mapjoin_5.q.out 6e6882a 
>   ql/src/test/results/clientpositive/smb_mapjoin_7.q.out 82f5804 
>   ql/src/test/results/clientpositive/spark/auto_join32.q.out 361a968 
>   ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 8102ec1 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 3eceb0b 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out 4e8ce0d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out c0a3c3d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out 44f4d0c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out a09904e 
>   ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out cfbce61 
>   ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 9343805 
>   ql/src/test/results/clientpositive/spark/skewjoinopt19.q.out eb9bb84 
>   ql/src/test/results/clientpositive/spark/skewjoinopt20.q.out 22de156 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 1ff1262 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out a0d51f3 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_25.q.out cb811ed 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out f46b833 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out a421a42 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out af65010 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out 622b950 
>   ql/src/test/results/clientpositive/stats11.q.out e51f049 
>   ql/src/test/results/clientpositive/tez/auto_join_filters.q.out 8fde41d 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out 8c8a3bf 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_5.q.out adcc1fa 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out f84524b 
>   ql/src/test/results/clientpositive/tez/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/tez/mergejoin.q.out 97df12a 
>   ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
>   ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
>   ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 6183390 
>   ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 14a6874 
> 
> Diff: https://reviews.apache.org/r/34576/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>


Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by John Pullokkaran <jp...@hortonworks.com>.

> On May 24, 2015, 2:03 a.m., Xuefu Zhang wrote:
> > Have you thought about what happens if the client is not interactive, such as JDBC or Thrift?
> 
> pengcheng xiong wrote:
>     I am sorry that we have not thought about it yet. We admit that the patch does not cover the case when the client is not interactive. Do you have any good ideas that you can share with us? Do you think logging this, in addition to printing a warning message, is good enough? Thanks.
> 
> Xuefu Zhang wrote:
>     There are all kinds of issues with data loading into bucketed tables. While advanced users might be able to load data correctly, I think that's really rare. The data in a bucketed table needs to be generated by Hive. Therefore, I think we should disable "insert into" and "load data into|overwrite" for a bucketed table. We should also disallow external tables for the same reason.
>     
>     To allow the advanced user to achieve what they used to do, we can have a flag, such as "hive.enforce.strict.bucketing", which defaults to true. Those users can proceed by turning this off.
>     
>     Another option for "insert into" would be supporting appending new data, such as proposed in HIVE-3244.
> 
> Gopal V wrote:
>     Why would you disable "insert into" bucketed tables? How else would ACID work?
> 
> Xuefu Zhang wrote:
>     Yeah, but I guess we were talking about things outside the context of ACID. Even before ACID, a user could do "insert into" a bucketed table, which can be very harmful.

This patch only addresses the "Load" path, which I think we all agree is a problem.
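
To make the "Load" problem concrete, here is a minimal sketch of the invariant a bucketed table relies on. It is not Hive code: the function names are hypothetical, and the 31-multiplier hash is only a stand-in assumed for illustration. The point is that a file copied in by LOAD DATA is never hashed, so the table's files do not match its declared bucket count or key placement until the rows are rewritten through an insert.

```python
def java_style_hash(key: str) -> int:
    """Stand-in hash (assumption): 31-multiplier rolling hash, signed 32-bit."""
    h = 0
    for b in key.encode("utf-8"):
        h = (31 * h + b) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h


def bucket_for(key: str, num_buckets: int) -> int:
    """Bucket index: non-negative hash modulo the declared bucket count."""
    return (java_style_hash(key) & 0x7FFFFFFF) % num_buckets


def write_bucketed(rows, num_buckets):
    """What an insert is expected to do: route each row by key, sort each file."""
    files = [[] for _ in range(num_buckets)]
    for key, val in rows:
        files[bucket_for(key, num_buckets)].append((key, val))
    return [sorted(f) for f in files]


def is_correctly_bucketed(files, num_buckets):
    """True only if there is one file per bucket and every key sits in its bucket."""
    if len(files) != num_buckets:
        return False
    return all(
        bucket_for(key, num_buckets) == i
        for i, f in enumerate(files)
        for key, _ in f
    )


rows = [("a", "1"), ("b", "2"), ("c", "3"), ("d", "4")]

# LOAD DATA copies the input file as-is: one file, no hashing by key.
loaded = [rows]

# Rewriting the rows through an insert produces one file per bucket.
rewritten = write_bucketed(rows, 2)

print(is_correctly_bucketed(loaded, 2))     # loaded layout vs. declared 2 buckets
print(is_correctly_bucketed(rewritten, 2))  # rewritten layout vs. declared 2 buckets
```

Any reader that trusts the table metadata (bucket map join, SMB join) assumes the second layout, which is why the first one is harmful.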


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85082
-----------------------------------------------------------


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34576/
> -----------------------------------------------------------
> 
> (Updated May 23, 2015, 5:47 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The bucketized table feature fails in some cases. If the source and destination are bucketed on the same key, and the actual data in the source is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed when it is written to the destination.
> Example
> ----------------------------------------------------------------------
> CREATE TABLE P1(key STRING, val STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
> -- perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE P1 select key, val from P1;
> --------------------------------------------------
> This is not a regression. This has never worked.
> It was only discovered because of Hadoop2 changes.
> In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the application requests. Hadoop2 now honors the number-of-reducers setting in local mode (by spawning threads).
> The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out f4522d2 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d 
>   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 9220c8e 
>   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
>   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
>   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
>   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
>   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
>   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
>   ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
>   ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
>   ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
>   ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
>   ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
>   ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
>   ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
>   ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
>   ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
>   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
>   ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
>   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
>   ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
>   ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
>   ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
>   ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
>   ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
>   ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
>   ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
>   ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
>   ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 34cd1ff 
>   ql/src/test/results/clientpositive/groupby_sort_2.q.out b5e52f1 
>   ql/src/test/results/clientpositive/groupby_sort_3.q.out c16911a 
>   ql/src/test/results/clientpositive/groupby_sort_4.q.out a6b1c3d 
>   ql/src/test/results/clientpositive/groupby_sort_5.q.out 369e2b5 
>   ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
>   ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
>   ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
>   ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 0d631ce 
>   ql/src/test/results/clientpositive/groupby_sort_test_1.q.out 8c1765d 
>   ql/src/test/results/clientpositive/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/insert_values_orig_table.q.out 684cd1b 
>   ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
>   ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
>   ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
>   ql/src/test/results/clientpositive/skewjoin_mapjoin11.q.out dd084e8 
>   ql/src/test/results/clientpositive/skewjoinopt19.q.out fd43409 
>   ql/src/test/results/clientpositive/skewjoinopt20.q.out a28e433 
>   ql/src/test/results/clientpositive/smb_mapjoin_1.q.out 9ab334b 
>   ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
>   ql/src/test/results/clientpositive/smb_mapjoin_2.q.out 379dc0d 
>   ql/src/test/results/clientpositive/smb_mapjoin_25.q.out c0a8959 
>   ql/src/test/results/clientpositive/smb_mapjoin_3.q.out 26fa5d4 
>   ql/src/test/results/clientpositive/smb_mapjoin_4.q.out 9fc7f93 
>   ql/src/test/results/clientpositive/smb_mapjoin_5.q.out 6e6882a 
>   ql/src/test/results/clientpositive/smb_mapjoin_7.q.out 82f5804 
>   ql/src/test/results/clientpositive/spark/auto_join32.q.out 361a968 
>   ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 8102ec1 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 3eceb0b 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out 4e8ce0d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out c0a3c3d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out 44f4d0c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out a09904e 
>   ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out cfbce61 
>   ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 9343805 
>   ql/src/test/results/clientpositive/spark/skewjoinopt19.q.out eb9bb84 
>   ql/src/test/results/clientpositive/spark/skewjoinopt20.q.out 22de156 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 1ff1262 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out a0d51f3 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_25.q.out cb811ed 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out f46b833 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out a421a42 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out af65010 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out 622b950 
>   ql/src/test/results/clientpositive/stats11.q.out e51f049 
>   ql/src/test/results/clientpositive/tez/auto_join_filters.q.out 8fde41d 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out 8c8a3bf 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_5.q.out adcc1fa 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out f84524b 
>   ql/src/test/results/clientpositive/tez/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/tez/mergejoin.q.out 97df12a 
>   ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
>   ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
>   ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 6183390 
>   ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 14a6874 
> 
> Diff: https://reviews.apache.org/r/34576/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>


Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by Gopal V <go...@hortonworks.com>.

> On May 24, 2015, 2:03 a.m., Xuefu Zhang wrote:
> > Have you thought about what happens if the client is not interactive, such as JDBC or Thrift?
> 
> pengcheng xiong wrote:
>     I am sorry that we have not thought about it yet. We admit that the patch does not cover the case when the client is not interactive. Do you have any good ideas that you can share with us? Do you think logging this, in addition to printing a warning message, is good enough? Thanks.
> 
> Xuefu Zhang wrote:
>     There are all kinds of issues with data loading into bucketed tables. While advanced users might be able to load data correctly, I think that's really rare. The data in a bucketed table needs to be generated by Hive. Therefore, I think we should disable "insert into" and "load data into|overwrite" for a bucketed table. We should also disallow external tables for the same reason.
>     
>     To allow the advanced user to achieve what they used to do, we can have a flag, such as "hive.enforce.strict.bucketing", which defaults to true. Those users can proceed by turning this off.
>     
>     Another option for "insert into" would be supporting appending new data, such as proposed in HIVE-3244.

Why would you disable "insert into" bucketed tables? How else would ACID work?


- Gopal


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85082
-----------------------------------------------------------




Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by Xuefu Zhang <xz...@cloudera.com>.

> On May 24, 2015, 2:03 a.m., Xuefu Zhang wrote:
> > Have you thought about what happens if the client is not interactive, such as JDBC or Thrift?
> 
> pengcheng xiong wrote:
>     I am sorry that we have not thought about it yet. We admit that the patch does not cover the case where the client is not interactive. Do you have any good ideas that you can share with us? Do you think logging this, in addition to printing a warning message, is good enough? Thanks.
> 
> Xuefu Zhang wrote:
>     There are all kinds of issues with loading data into bucketed tables. While advanced users might be able to load data correctly, I think that's really rare. The data in a bucketed table needs to be generated by Hive. Therefore, I think we should disable "insert into" and "load data into|overwrite" for bucketed tables. We should also disallow external tables for the same reason.
>     
>     To allow the advanced user to achieve what they used to do, we can have a flag, such as "hive.enforce.strict.bucketing", which defaults to true. Those users can proceed by turning this off.
>     
>     Another option for "insert into" would be supporting appending new data, such as proposed in HIVE-3244.
> 
> Gopal V wrote:
>     Why would you disable "insert into" bucketed tables? How else would ACID work?

Yeah, but I guess we were talking about things outside the context of ACID. Even before ACID, a user could run "insert into" on a bucketed table, which can be very harmful.


- Xuefu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85082
-----------------------------------------------------------


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34576/
> -----------------------------------------------------------
> 
> (Updated May 23, 2015, 5:47 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Bucketized Table feature fails in some cases. If the source and destination are bucketed on the same key, and the actual data in the source is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed when written to the destination.
> Example
> ----------------------------------------------------------------------
> CREATE TABLE P1(key STRING, val STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
> -- perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE P1 select key, val from P1;
> --------------------------------------------------
> This is not a regression. This has never worked.
> It was only discovered due to Hadoop2 changes.
> In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the application requests. Hadoop2 now honors the number-of-reducers setting in local mode (by spawning threads).
> The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out f4522d2 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d 
>   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 9220c8e 
>   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
>   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
>   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
>   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
>   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
>   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
>   ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
>   ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
>   ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
>   ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
>   ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
>   ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
>   ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
>   ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
>   ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
>   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
>   ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
>   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
>   ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
>   ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
>   ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
>   ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
>   ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
>   ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
>   ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
>   ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
>   ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 34cd1ff 
>   ql/src/test/results/clientpositive/groupby_sort_2.q.out b5e52f1 
>   ql/src/test/results/clientpositive/groupby_sort_3.q.out c16911a 
>   ql/src/test/results/clientpositive/groupby_sort_4.q.out a6b1c3d 
>   ql/src/test/results/clientpositive/groupby_sort_5.q.out 369e2b5 
>   ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
>   ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
>   ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
>   ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 0d631ce 
>   ql/src/test/results/clientpositive/groupby_sort_test_1.q.out 8c1765d 
>   ql/src/test/results/clientpositive/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/insert_values_orig_table.q.out 684cd1b 
>   ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
>   ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
>   ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
>   ql/src/test/results/clientpositive/skewjoin_mapjoin11.q.out dd084e8 
>   ql/src/test/results/clientpositive/skewjoinopt19.q.out fd43409 
>   ql/src/test/results/clientpositive/skewjoinopt20.q.out a28e433 
>   ql/src/test/results/clientpositive/smb_mapjoin_1.q.out 9ab334b 
>   ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
>   ql/src/test/results/clientpositive/smb_mapjoin_2.q.out 379dc0d 
>   ql/src/test/results/clientpositive/smb_mapjoin_25.q.out c0a8959 
>   ql/src/test/results/clientpositive/smb_mapjoin_3.q.out 26fa5d4 
>   ql/src/test/results/clientpositive/smb_mapjoin_4.q.out 9fc7f93 
>   ql/src/test/results/clientpositive/smb_mapjoin_5.q.out 6e6882a 
>   ql/src/test/results/clientpositive/smb_mapjoin_7.q.out 82f5804 
>   ql/src/test/results/clientpositive/spark/auto_join32.q.out 361a968 
>   ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 8102ec1 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 3eceb0b 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out 4e8ce0d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out c0a3c3d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out 44f4d0c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out a09904e 
>   ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out cfbce61 
>   ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 9343805 
>   ql/src/test/results/clientpositive/spark/skewjoinopt19.q.out eb9bb84 
>   ql/src/test/results/clientpositive/spark/skewjoinopt20.q.out 22de156 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 1ff1262 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out a0d51f3 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_25.q.out cb811ed 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out f46b833 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out a421a42 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out af65010 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out 622b950 
>   ql/src/test/results/clientpositive/stats11.q.out e51f049 
>   ql/src/test/results/clientpositive/tez/auto_join_filters.q.out 8fde41d 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out 8c8a3bf 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_5.q.out adcc1fa 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out f84524b 
>   ql/src/test/results/clientpositive/tez/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/tez/mergejoin.q.out 97df12a 
>   ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
>   ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
>   ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 6183390 
>   ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 14a6874 
> 
> Diff: https://reviews.apache.org/r/34576/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>


Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by Xuefu Zhang <xz...@cloudera.com>.

> On May 24, 2015, 2:03 a.m., Xuefu Zhang wrote:
> > Have you thought about what happens if the client is not interactive, such as JDBC or Thrift?
> 
> pengcheng xiong wrote:
>     I am sorry that we have not thought about it yet. We admit that the patch does not cover the case where the client is not interactive. Do you have any good ideas that you can share with us? Do you think logging this, in addition to printing a warning message, is good enough? Thanks.

There are all kinds of issues with loading data into bucketed tables. While advanced users might be able to load data correctly, I think that's really rare. The data in a bucketed table needs to be generated by Hive. Therefore, I think we should disable "insert into" and "load data into|overwrite" for bucketed tables. We should also disallow external tables for the same reason.

To allow the advanced user to achieve what they used to do, we can have a flag, such as "hive.enforce.strict.bucketing", which defaults to true. Those users can proceed by turning this off.

Another option for "insert into" would be supporting appending new data, such as proposed in HIVE-3244.
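
To illustrate why files placed by LOAD DATA can break the bucketing assumption, here is a minimal Python sketch. It assumes a row belongs to bucket (hash & Integer.MAX_VALUE) % numBuckets and uses a Java-style string hash as a stand-in for Hive's own hash function. The helper names (java_string_hash, bucket_for, misplaced_rows) are hypothetical and not part of Hive.

```python
def java_string_hash(s: str) -> int:
    """Java-style String.hashCode, used here only as a stand-in (assumption)
    for the hash Hive applies to a STRING bucketing column."""
    h = 0
    for ch in s:
        # Keep the running value within 32 bits, as Java int arithmetic does.
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int.
    return h - (1 << 32) if h >= (1 << 31) else h


def bucket_for(key: str, num_buckets: int) -> int:
    """Bucket a key is expected to land in (assumed formula)."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_buckets


def misplaced_rows(files, num_buckets):
    """files[i] holds the keys found in the file treated as bucket i.
    Returns (file_index, key) pairs whose key hashes to a different bucket."""
    return [
        (i, key)
        for i, keys in enumerate(files)
        for key in keys
        if bucket_for(key, num_buckets) != i
    ]
```

For a single loaded file holding the keys "a" and "b" in a 2-bucket table, misplaced_rows([["a", "b"]], 2) flags "a" as sitting in the wrong file, which is the kind of layout a plain file copy can leave behind.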


- Xuefu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85082
-----------------------------------------------------------




Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by Xuefu Zhang <xz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85082
-----------------------------------------------------------


Have you thought about what happens if the client is not interactive, such as JDBC or Thrift?

- Xuefu Zhang


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34576/
> -----------------------------------------------------------
> 
> (Updated May 23, 2015, 5:47 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Bucketized Table feature fails in some cases. If the source and destination are bucketed on the same key, and the actual data in the source is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed when written to the destination.
> Example
> ----------------------------------------------------------------------
> CREATE TABLE P1(key STRING, val STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
> -- perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE P1 select key, val from P1;
> --------------------------------------------------
> This is not a regression. This has never worked.
> It was only discovered due to Hadoop2 changes.
> In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the application requests. Hadoop2 now honors the number-of-reducers setting in local mode (by spawning threads).
> The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out f4522d2 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d 
>   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 9220c8e 
>   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
>   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
>   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
>   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
>   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
>   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
>   ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
>   ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
>   ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
>   ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
>   ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
>   ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
>   ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
>   ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
>   ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
>   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
>   ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
>   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
>   ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
>   ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
>   ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
>   ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
>   ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
>   ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
>   ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
>   ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
>   ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 34cd1ff 
>   ql/src/test/results/clientpositive/groupby_sort_2.q.out b5e52f1 
>   ql/src/test/results/clientpositive/groupby_sort_3.q.out c16911a 
>   ql/src/test/results/clientpositive/groupby_sort_4.q.out a6b1c3d 
>   ql/src/test/results/clientpositive/groupby_sort_5.q.out 369e2b5 
>   ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
>   ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
>   ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
>   ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 0d631ce 
>   ql/src/test/results/clientpositive/groupby_sort_test_1.q.out 8c1765d 
>   ql/src/test/results/clientpositive/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/insert_values_orig_table.q.out 684cd1b 
>   ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
>   ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
>   ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
>   ql/src/test/results/clientpositive/skewjoin_mapjoin11.q.out dd084e8 
>   ql/src/test/results/clientpositive/skewjoinopt19.q.out fd43409 
>   ql/src/test/results/clientpositive/skewjoinopt20.q.out a28e433 
>   ql/src/test/results/clientpositive/smb_mapjoin_1.q.out 9ab334b 
>   ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
>   ql/src/test/results/clientpositive/smb_mapjoin_2.q.out 379dc0d 
>   ql/src/test/results/clientpositive/smb_mapjoin_25.q.out c0a8959 
>   ql/src/test/results/clientpositive/smb_mapjoin_3.q.out 26fa5d4 
>   ql/src/test/results/clientpositive/smb_mapjoin_4.q.out 9fc7f93 
>   ql/src/test/results/clientpositive/smb_mapjoin_5.q.out 6e6882a 
>   ql/src/test/results/clientpositive/smb_mapjoin_7.q.out 82f5804 
>   ql/src/test/results/clientpositive/spark/auto_join32.q.out 361a968 
>   ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 8102ec1 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 3eceb0b 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out 4e8ce0d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out c0a3c3d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out 44f4d0c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out a09904e 
>   ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out cfbce61 
>   ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 9343805 
>   ql/src/test/results/clientpositive/spark/skewjoinopt19.q.out eb9bb84 
>   ql/src/test/results/clientpositive/spark/skewjoinopt20.q.out 22de156 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 1ff1262 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out a0d51f3 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_25.q.out cb811ed 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out f46b833 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out a421a42 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out af65010 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out 622b950 
>   ql/src/test/results/clientpositive/stats11.q.out e51f049 
>   ql/src/test/results/clientpositive/tez/auto_join_filters.q.out 8fde41d 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out 8c8a3bf 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_5.q.out adcc1fa 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out f84524b 
>   ql/src/test/results/clientpositive/tez/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/tez/mergejoin.q.out 97df12a 
>   ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
>   ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
>   ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 6183390 
>   ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 14a6874 
> 
> Diff: https://reviews.apache.org/r/34576/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>


Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by John Pullokkaran <jp...@hortonworks.com>.

> On May 24, 2015, 1:50 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java, line 226
> > <https://reviews.apache.org/r/34576/diff/2/?file=971006#file971006line226>
> >
> >     Warning is proper, but I think the words should say "might" because the source data might be already bucketed and matches the target, in which case, there is no problem.

The LOAD command doesn't exercise bucketizing. IMO "will not" is correct.
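
To illustrate the point under discussion, here is a minimal sketch of the difference between a load that copies a file as-is and an insert that routes rows by key hash. It is not Hive's code: the class BucketSketch and all of its methods are hypothetical, and the bucket rule (non-negative hash modulo the bucket count, using Java's String.hashCode) is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hive.
class BucketSketch {
    // Assumed bucket rule: non-negative hash of the key modulo the bucket count.
    static int bucketFor(String key, int numBuckets) {
        return (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    // INSERT-like path: every row is routed to the file of its bucket.
    static List<List<String>> insertBucketed(List<String> keys, int numBuckets) {
        List<List<String>> files = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            files.add(new ArrayList<>());
        }
        for (String key : keys) {
            files.get(bucketFor(key, numBuckets)).add(key);
        }
        return files;
    }

    // LOAD-like path: the source file is taken as-is; no hashing happens.
    static List<List<String>> loadAsIs(List<String> keys) {
        List<List<String>> files = new ArrayList<>();
        files.add(new ArrayList<>(keys));
        return files;
    }

    // True only if there is one file per bucket and file i holds only keys of bucket i.
    static boolean isBucketed(List<List<String>> files, int numBuckets) {
        if (files.size() != numBuckets) {
            return false;
        }
        for (int i = 0; i < files.size(); i++) {
            for (String key : files.get(i)) {
                if (bucketFor(key, numBuckets) != i) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d");
        System.out.println(isBucketed(loadAsIs(keys), 2));
        System.out.println(isBucketed(insertBucketed(keys, 2), 2));
    }
}
```

In this sketch the loaded layout satisfies the table's declared bucketing only if the source file already happened to match it, whereas the insert path produces a matching layout by construction.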


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85081
-----------------------------------------------------------




Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by pengcheng xiong <px...@hortonworks.com>.

> On May 24, 2015, 1:50 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java, line 226
> > <https://reviews.apache.org/r/34576/diff/2/?file=971006#file971006line226>
> >
> >     Warning is proper, but I think the words should say "might" because the source data might be already bucketed and matches the target, in which case, there is no problem.
> 
> John Pullokkaran wrote:
>     The LOAD command doesn't exercise bucketizing. IMO "will not" is correct.

So I will add logging for this. After discussing it with the Hive JDBC developer, we found that the current infrastructure does not support passing warning messages through JDBC. We acknowledge that this is something we need to improve in the future.


- pengcheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85081
-----------------------------------------------------------




Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by Xuefu Zhang <xz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85081
-----------------------------------------------------------


Could you also link the JIRA number in the review request?


ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
<https://reviews.apache.org/r/34576/#comment136557>

    nit: remove the tab/space



ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java
<https://reviews.apache.org/r/34576/#comment136558>

    A warning is appropriate, but I think the wording should say "might", because the source data might already be bucketed in a way that matches the target, in which case there is no problem.


- Xuefu Zhang


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34576/
> -----------------------------------------------------------
> 
> (Updated May 23, 2015, 5:47 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The bucketized table feature fails in some cases. If the source and destination are bucketed on the same key, and the actual data in the source is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed when it is written to the destination.
> Example
> ----------------------------------------------------------------------
> CREATE TABLE P1(key STRING, val STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
> -- perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE P1 select key, val from P1;
> --------------------------------------------------
> This is not a regression; it has never worked.
> It was only discovered because of the Hadoop2 changes.
> In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the application requests. Hadoop2 now honors the configured number of reducers in local mode (by spawning threads).
> The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out f4522d2 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d 
>   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 9220c8e 
>   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
>   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
>   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
>   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
>   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
>   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
>   ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
>   ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
>   ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
>   ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
>   ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
>   ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
>   ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
>   ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
>   ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
>   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
>   ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
>   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
>   ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
>   ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
>   ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
>   ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
>   ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
>   ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
>   ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
>   ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
>   ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 34cd1ff 
>   ql/src/test/results/clientpositive/groupby_sort_2.q.out b5e52f1 
>   ql/src/test/results/clientpositive/groupby_sort_3.q.out c16911a 
>   ql/src/test/results/clientpositive/groupby_sort_4.q.out a6b1c3d 
>   ql/src/test/results/clientpositive/groupby_sort_5.q.out 369e2b5 
>   ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
>   ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
>   ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
>   ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 0d631ce 
>   ql/src/test/results/clientpositive/groupby_sort_test_1.q.out 8c1765d 
>   ql/src/test/results/clientpositive/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/insert_values_orig_table.q.out 684cd1b 
>   ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
>   ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
>   ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
>   ql/src/test/results/clientpositive/skewjoin_mapjoin11.q.out dd084e8 
>   ql/src/test/results/clientpositive/skewjoinopt19.q.out fd43409 
>   ql/src/test/results/clientpositive/skewjoinopt20.q.out a28e433 
>   ql/src/test/results/clientpositive/smb_mapjoin_1.q.out 9ab334b 
>   ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
>   ql/src/test/results/clientpositive/smb_mapjoin_2.q.out 379dc0d 
>   ql/src/test/results/clientpositive/smb_mapjoin_25.q.out c0a8959 
>   ql/src/test/results/clientpositive/smb_mapjoin_3.q.out 26fa5d4 
>   ql/src/test/results/clientpositive/smb_mapjoin_4.q.out 9fc7f93 
>   ql/src/test/results/clientpositive/smb_mapjoin_5.q.out 6e6882a 
>   ql/src/test/results/clientpositive/smb_mapjoin_7.q.out 82f5804 
>   ql/src/test/results/clientpositive/spark/auto_join32.q.out 361a968 
>   ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 8102ec1 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 3eceb0b 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out 4e8ce0d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out c0a3c3d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out 44f4d0c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out a09904e 
>   ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out cfbce61 
>   ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 9343805 
>   ql/src/test/results/clientpositive/spark/skewjoinopt19.q.out eb9bb84 
>   ql/src/test/results/clientpositive/spark/skewjoinopt20.q.out 22de156 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 1ff1262 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out a0d51f3 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_25.q.out cb811ed 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out f46b833 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out a421a42 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out af65010 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out 622b950 
>   ql/src/test/results/clientpositive/stats11.q.out e51f049 
>   ql/src/test/results/clientpositive/tez/auto_join_filters.q.out 8fde41d 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out 8c8a3bf 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_5.q.out adcc1fa 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out f84524b 
>   ql/src/test/results/clientpositive/tez/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/tez/mergejoin.q.out 97df12a 
>   ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
>   ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
>   ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 6183390 
>   ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 14a6874 
> 
> Diff: https://reviews.apache.org/r/34576/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>


Re: Review Request 34576: Bucketized Table feature fails in some cases

Posted by pengcheng xiong <px...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/
-----------------------------------------------------------

(Updated May 23, 2015, 5:47 p.m.)


Review request for hive and John Pullokkaran.


Repository: hive-git


Description
-------

The bucketized table feature fails in some cases. If the source and destination are bucketed on the same key, and the actual data in the source is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed when it is written to the destination.
Example
----------------------------------------------------------------------
CREATE TABLE P1(key STRING, val STRING)
CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
-- perform an insert to make sure there are 2 files
INSERT OVERWRITE TABLE P1 select key, val from P1;
--------------------------------------------------
This is not a regression; it has never worked.
It was only discovered because of the Hadoop2 changes.
In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the application requests. Hadoop2 now honors the configured number of reducers in local mode (by spawning threads).
The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
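
To make the failure mode concrete, here is a minimal Python sketch of the idea, not Hive code. It assumes a bucket is chosen as a non-negative hash of the key modulo the bucket count, and it uses Java's String.hashCode as a stand-in hash; Hive's actual hash for a given column type may differ. The names bucket_for and misplaced_rows are hypothetical helpers introduced only for this illustration.

```python
def java_string_hash(s: str) -> int:
    """Stand-in hash: Java's String.hashCode for ASCII input (an assumption)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def bucket_for(key: str, num_buckets: int) -> int:
    # Mask to a non-negative value, then take the remainder.
    return (java_string_hash(key) & 0x7FFFFFFF) % num_buckets


def misplaced_rows(files, num_buckets):
    """Return (file_index, key) pairs whose key hashes to a different bucket."""
    return [
        (i, key)
        for i, keys in enumerate(files)
        for key in keys
        if bucket_for(key, num_buckets) != i
    ]


# A plain file load keeps the file as is: every key ends up in file 0,
# even though the table metadata claims 2 buckets.
loaded = [["a", "b", "c", "d"]]

# Routing each row by its bucket id gives the layout the metadata promises.
bucketed = [[], []]
for key in loaded[0]:
    bucketed[bucket_for(key, 2)].append(key)

print(misplaced_rows(loaded, 2))    # rows sitting in the wrong file
print(misplaced_rows(bucketed, 2))  # empty when the layout matches
```

Under these assumptions, any optimization that trusts the table metadata and skips re-bucketing when source and destination share the same bucketing key would copy the misplaced rows straight through, which matches the behavior described above.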


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
  ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out f4522d2 
  ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d 
  ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 9220c8e 
  ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
  ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
  ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
  ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
  ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
  ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
  ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
  ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
  ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
  ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
  ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
  ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
  ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
  ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
  ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
  ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
  ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
  ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
  ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
  ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
  ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
  ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
  ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
  ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
  ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
  ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
  ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
  ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
  ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
  ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 34cd1ff 
  ql/src/test/results/clientpositive/groupby_sort_2.q.out b5e52f1 
  ql/src/test/results/clientpositive/groupby_sort_3.q.out c16911a 
  ql/src/test/results/clientpositive/groupby_sort_4.q.out a6b1c3d 
  ql/src/test/results/clientpositive/groupby_sort_5.q.out 369e2b5 
  ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
  ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
  ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
  ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 0d631ce 
  ql/src/test/results/clientpositive/groupby_sort_test_1.q.out 8c1765d 
  ql/src/test/results/clientpositive/insert_orig_table.q.out 5eea74d 
  ql/src/test/results/clientpositive/insert_values_orig_table.q.out 684cd1b 
  ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
  ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
  ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
  ql/src/test/results/clientpositive/skewjoin_mapjoin11.q.out dd084e8 
  ql/src/test/results/clientpositive/skewjoinopt19.q.out fd43409 
  ql/src/test/results/clientpositive/skewjoinopt20.q.out a28e433 
  ql/src/test/results/clientpositive/smb_mapjoin_1.q.out 9ab334b 
  ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
  ql/src/test/results/clientpositive/smb_mapjoin_2.q.out 379dc0d 
  ql/src/test/results/clientpositive/smb_mapjoin_25.q.out c0a8959 
  ql/src/test/results/clientpositive/smb_mapjoin_3.q.out 26fa5d4 
  ql/src/test/results/clientpositive/smb_mapjoin_4.q.out 9fc7f93 
  ql/src/test/results/clientpositive/smb_mapjoin_5.q.out 6e6882a 
  ql/src/test/results/clientpositive/smb_mapjoin_7.q.out 82f5804 
  ql/src/test/results/clientpositive/spark/auto_join32.q.out 361a968 
  ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 8102ec1 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 3eceb0b 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
  ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out 4e8ce0d 
  ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out c0a3c3d 
  ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
  ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
  ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
  ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
  ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
  ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out 44f4d0c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
  ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
  ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
  ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
  ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
  ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
  ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
  ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
  ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
  ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
  ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out a09904e 
  ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out cfbce61 
  ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 9343805 
  ql/src/test/results/clientpositive/spark/skewjoinopt19.q.out eb9bb84 
  ql/src/test/results/clientpositive/spark/skewjoinopt20.q.out 22de156 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 1ff1262 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out a0d51f3 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_25.q.out cb811ed 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out f46b833 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out a421a42 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out af65010 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out 622b950 
  ql/src/test/results/clientpositive/stats11.q.out e51f049 
  ql/src/test/results/clientpositive/tez/auto_join_filters.q.out 8fde41d 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out 8c8a3bf 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_5.q.out adcc1fa 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
  ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
  ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
  ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
  ql/src/test/results/clientpositive/tez/explainuser_2.q.out f84524b 
  ql/src/test/results/clientpositive/tez/insert_orig_table.q.out 5eea74d 
  ql/src/test/results/clientpositive/tez/mergejoin.q.out 97df12a 
  ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
  ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
  ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 6183390 
  ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 14a6874 

Diff: https://reviews.apache.org/r/34576/diff/


Testing
-------


Thanks,

pengcheng xiong