Posted to user@hive.apache.org by we...@sina.com on 2019/08/13 01:28:51 UTC

hive BucketedTables

Hello users,
(apache-hive-3.1.1-bin)
I have a problem. The bucket files have been produced, but their contents do not follow hash_function(bucketing_column) mod num_buckets; the rows appear to be allocated to buckets at random.
Below are the statements to create the table, the bucket files, and the query results.
+----------------------------------------------------+
|                   createtab_stmt                   |
+----------------------------------------------------+
| CREATE TABLE `students`(                           |
|   `name` varchar(64),                              |
|   `age` int,                                       |
|   `gpa` decimal(3,2))                              |
| CLUSTERED BY (                                     |
|   age)                                             |
| INTO 3 BUCKETS                                     |
| ROW FORMAT SERDE                                   |
|   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'      |
| STORED AS INPUTFORMAT                              |
|   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
| OUTPUTFORMAT                                       |
|   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
| LOCATION                                           |
|   'hdfs://sicluster/hive/warehouse/inputdb.db/students' |
| TBLPROPERTIES (                                    |
|   'bucketing_version'='2',                         |
|   'transient_lastDdlTime'='1564647129')            |
+----------------------------------------------------+
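To make the expectation above concrete, here is a minimal Python sketch of the rule hash_function(bucketing_column) mod num_buckets for the INT column `age` and 3 buckets. It assumes the legacy scheme, in which the hash of an INT is the value itself and the bucket is (hash & Integer.MAX_VALUE) % num_buckets. The function name `legacy_bucket` and the sample ages are hypothetical, chosen only for illustration. The table above carries 'bucketing_version'='2', which as far as I understand switches to a Murmur3-based hash, so the real file layout may well differ from what this sketch predicts.

```python
JAVA_INT_MAX = 0x7FFFFFFF  # Integer.MAX_VALUE in Java


def legacy_bucket(age: int, num_buckets: int = 3) -> int:
    """Bucket index under the assumed legacy rule.

    For an INT column the hash is taken to be the value itself;
    masking with Integer.MAX_VALUE keeps the result non-negative.
    """
    return (age & JAVA_INT_MAX) % num_buckets


# Hypothetical ages, not taken from the actual table.
for age in (18, 19, 20, 21):
    print(age, legacy_bucket(age))
```

Comparing these indexes with the file each row actually lands in (for example via the INPUT__FILE__NAME virtual column) would show whether the layout is truly random or simply follows a different hash.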

My English is not very good; I hope you can understand it.

Best regards,
Li Wei