Posted to user@hive.apache.org by Tim Robertson <ti...@gmail.com> on 2016/04/19 20:43:05 UTC

Writing HFiles using Hive for an HBase bulk load - possible bug?

Hi folks,

I am trying to create HFiles from a Hive table to bulk load into HBase and
am following the HWX [1] tutorial.
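
For context, the procedure I am following is roughly the one from the tutorial, condensed below. This is a sketch: the table name, column family 'o', source table `source_coords`, and the `/tmp` path are from my setup and illustrative only; the full details are in the JIRA [2].

```sql
-- HBase-backed target table, mapping id to the row key and x/y
-- into column family 'o' (names from my setup, not prescriptive)
CREATE TABLE coords_hbase(id INT, x DOUBLE, y DOUBLE)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping' = ':key,o:x,o:y',
  'hbase.table.default.storage.type' = 'binary');

-- Switch the storage handler into HFile-writing mode; the family
-- directory name must match the HBase column family ('o' here)
SET hive.hbase.generatehfiles=true;
SET hfile.family.path=/tmp/coords_hfiles/o;

-- HFiles require rows sorted by key, hence the ORDER BY
INSERT OVERWRITE TABLE coords_hbase
SELECT id, x, y
FROM source_coords
ORDER BY id;
```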

It creates the HFiles correctly but then fails when closing the
RecordWriter with the following stack trace.

Error: java.lang.RuntimeException: Hive Runtime Error while closing operators:
java.io.IOException: Multiple family directories found in
hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:295)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

The reason appears to be that the HFiles are written into the task attempt
directory, but the close operation looks for them two directories above the
task attempt.

Does anyone else see this behaviour?

I have logged it as a bug [2], which details my exact procedure. Could
someone confirm that they see the same, or point out if I am simply doing
something wrong and it works for them?

Thanks all,
Tim



[1]
https://community.hortonworks.com/articles/2745/creating-hbase-hfiles-from-an-existing-hive-table.html

[2] https://issues.apache.org/jira/browse/HIVE-13539