Posted to dev@hive.apache.org by "Luke Lovett (JIRA)" <ji...@apache.org> on 2015/02/24 20:27:05 UTC

[jira] [Created] (HIVE-9771) HiveCombineInputFormat does not appropriately call getSplits on InputFormats for native tables

Luke Lovett created HIVE-9771:
---------------------------------

             Summary: HiveCombineInputFormat does not appropriately call getSplits on InputFormats for native tables
                 Key: HIVE-9771
                 URL: https://issues.apache.org/jira/browse/HIVE-9771
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.12.0
         Environment: Hive 0.12.0
Hadoop 2.4.1
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
            Reporter: Luke Lovett


{{CombineHiveInputFormat}} never calls {{getSplits}} on a custom {{InputFormat}} when that InputFormat is used by a native table. If I {{set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;}}, then {{getSplits}} is called. I'm not the first user to have experienced this either; see [this post from the hive-user mailing list|https://mail-archives.apache.org/mod_mbox/hive-user/201410.mbox/%3CCAENxBwy+XB1OB2ZOjz=4=NxKNMsWA==O0iBRD+gOpXGQRJ2T4A@mail.gmail.com%3E].
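
For reference, the workaround can be sketched as a Hive session like the one below. This is only an illustration: {{my_table}} is a hypothetical native table declared with the custom InputFormat, and the {{SET}} applies to the current session only.

```sql
-- Switch this session from the combining input format to the plain one.
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

-- my_table is a hypothetical native table that uses the custom InputFormat.
-- With the setting above, the custom InputFormat's getSplits is invoked
-- when this query is planned.
SELECT COUNT(*) FROM my_table;
```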

The purpose of this ticket is to discover:
- Is this difference in behavior between CombineHiveInputFormat and HiveInputFormat intentional?
- Is there any way of forcing CombineHiveInputFormat to call getSplits on my own InputFormat? I was reading through the code for CombineHiveInputFormat, and it looks like it might only call my own InputFormat's getSplits method if the table is non-native. I'm not sure if I'm interpreting this correctly.
- For the purpose of creating an InputFormat to be used by other users, is it better to set {{hive.input.format}} to work around this, or to create a StorageHandler and make non-native tables?
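
The two options in the last question correspond to two ways of declaring a table. The sketch below shows the difference in DDL only; {{com.example.MyInputFormat}} and {{com.example.MyStorageHandler}} are hypothetical class names used for illustration and are not part of Hive.

```sql
-- Native table: the custom InputFormat is named directly in the DDL.
-- com.example.MyInputFormat is a hypothetical class.
CREATE TABLE native_example (line STRING)
STORED AS
  INPUTFORMAT 'com.example.MyInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

-- Non-native table: a storage handler supplies the InputFormat instead.
-- com.example.MyStorageHandler is a hypothetical class.
CREATE TABLE non_native_example (line STRING)
STORED BY 'com.example.MyStorageHandler';
```

With the first form, users would also need the {{hive.input.format}} setting described above; whether the second form avoids that is exactly what this ticket is asking.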



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)