Posted to dev@hive.apache.org by "Brian Bloniarz (JIRA)" <ji...@apache.org> on 2012/06/25 21:13:44 UTC
[jira] [Updated] (HIVE-3198) StorageHandler properties not passed to InputFormat (?)
[ https://issues.apache.org/jira/browse/HIVE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Bloniarz updated HIVE-3198:
---------------------------------
Status: Patch Available (was: Open)
> StorageHandler properties not passed to InputFormat (?)
> -------------------------------------------------------
>
> Key: HIVE-3198
> URL: https://issues.apache.org/jira/browse/HIVE-3198
> Project: Hive
> Issue Type: Bug
> Environment: trunk r1352973
> Reporter: Brian Bloniarz
>
> I'm working on a custom StorageHandler implementation. I use configureTableJobProperties to pass properties on to a SerDe and an InputFormat, but as far as I can tell the properties are not present inside the InputFormat.
> I found the following code which looks like it's supposed to propagate JobProperties:
> {code}
> public class HiveInputFormat<K extends WritableComparable, V extends Writable>
> ...
>   public RecordReader getRecordReader(InputSplit split, JobConf job,
>       Reporter reporter) throws IOException {
>     HiveInputSplit hsplit = (HiveInputSplit) split;
>     ...
>     boolean nonNative = false;
>     PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
>     if ((part != null) && (part.getTableDesc() != null)) {
>       Utilities.copyTableJobPropertiesToConf(part.getTableDesc(), cloneJobConf);
>       nonNative = part.getTableDesc().isNonNative();
>     }
> {code}
> In the debugger, I see that part == null, so copyTableJobPropertiesToConf is never called. The cause shows up with this table:
> {code}
> create external table test3 () STORED BY 'foo' location '/data/bar';
> {code}
> The InputSplit path is the *file* (i.e. "/data/bar/part-00000"), but pathToPartitionInfo only has an entry for the *dir* (i.e. "/data/bar"), so the exact-match lookup returns null.
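The file-versus-directory mismatch can be reproduced outside Hive with an ordinary map. The following is only a minimal sketch: `PathLookupSketch` and `findPartition` are hypothetical names, the map values are plain strings standing in for `PartitionDesc`, and none of this is Hive's actual code. It shows that an exact-match `get` on the file path misses, while walking up the parent directories finds the entry.

```java
import java.util.HashMap;
import java.util.Map;

public class PathLookupSketch {

    // Hypothetical helper: try the path itself, then each parent directory in turn.
    // Returns null if no ancestor of the path has an entry (the root "/" is not tried).
    static String findPartition(Map<String, String> pathToPartition, String path) {
        String current = path;
        while (current != null && !current.isEmpty()) {
            String hit = pathToPartition.get(current);
            if (hit != null) {
                return hit;
            }
            int slash = current.lastIndexOf('/');
            if (slash <= 0) {
                break; // no parent left below the root
            }
            current = current.substring(0, slash);
        }
        return null;
    }

    public static void main(String[] args) {
        // The map is keyed by the table directory, as in the report.
        Map<String, String> pathToPartition = new HashMap<>();
        pathToPartition.put("/data/bar", "test3");

        // The split path names a file inside that directory.
        String splitPath = "/data/bar/part-00000";

        // Exact-match lookup misses, which is the part == null case.
        System.out.println(pathToPartition.get(splitPath));

        // Walking up the parents finds the directory entry.
        System.out.println(findPartition(pathToPartition, splitPath));
    }
}
```

Walking up the parents is one possible workaround, shown here only to make the mismatch concrete; it still has to infer which directory a file belongs to.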
> I attached a patch which fixes the problem for me. It makes things explicit by passing the directory name along inside the HiveInputSplit, which means we don't have to figure out which files are part of which partition.
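The idea of carrying the directory inside the split can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: `SplitDirSketch`, `DirAwareSplit`, and `partitionDir` are hypothetical names, and the map values are plain strings standing in for `PartitionDesc`.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitDirSketch {

    // Hypothetical stand-in for a split that remembers the directory it was created from.
    static final class DirAwareSplit {
        final String filePath;     // e.g. "/data/bar/part-00000"
        final String partitionDir; // e.g. "/data/bar", recorded when the split is generated

        DirAwareSplit(String filePath, String partitionDir) {
            this.filePath = filePath;
            this.partitionDir = partitionDir;
        }
    }

    // Look up by the recorded directory, so no file-to-partition inference is needed.
    static String lookup(Map<String, String> pathToPartition, DirAwareSplit split) {
        return pathToPartition.get(split.partitionDir);
    }

    public static void main(String[] args) {
        Map<String, String> pathToPartition = new HashMap<>();
        pathToPartition.put("/data/bar", "test3");

        DirAwareSplit split = new DirAwareSplit("/data/bar/part-00000", "/data/bar");

        // The key now matches the map entry exactly.
        System.out.println(lookup(pathToPartition, split));
    }
}
```

Because the key is recorded at the point where the split is produced, the reader side never has to guess which directory a file came from.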