Posted to dev@hive.apache.org by "Ron Bodkin (Created) (JIRA)" <ji...@apache.org> on 2011/12/09 10:25:40 UTC

[jira] [Created] (HIVE-2639) Allow Configuring Input Formats for Hive

Allow Configuring Input Formats for Hive
----------------------------------------

                 Key: HIVE-2639
                 URL: https://issues.apache.org/jira/browse/HIVE-2639
             Project: Hive
          Issue Type: New Feature
            Reporter: Ron Bodkin


If you want to use a parameterizable input format in Hive, you currently have to use SET to pass the configuration to everything (on each appropriate query). It would be nice to have a kind of property in Hive that's associated with the input format.

The best option for this today seems to be the storage handler mechanism, but that seems a bit heavyweight for this purpose (requiring you to also specify an output format and a serde).


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2639) Allow Configuring Input Formats for Hive

Posted by "Ron Bodkin (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166049#comment-13166049 ] 

Ron Bodkin commented on HIVE-2639:
----------------------------------

I tried to use a storage handler to allow passing configuration data down, but the limitations are too severe (e.g., you can't have a partitioned table with a custom storage handler).

Here's a way I'd like to see Hive's default storage handler implement this:

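import java.util.Map;

import org.apache.hadoop.hive.ql.plan.TableDesc;

// Sketch of the proposed change; the rest of Hive's DefaultStorageHandler is not shown.
public class DefaultStorageHandler {

    // Table properties carrying this prefix are passed down to the input format.
    public static final String INPUTFORMAT = "inputformat.";

    @Override
    public void configureTableJobProperties(TableDesc tableDesc, Map<String, String> jobProperties) {
        for (Map.Entry<Object, Object> entry : tableDesc.getProperties().entrySet()) {
            String key = entry.getKey().toString();
            if (key.startsWith(INPUTFORMAT)) {
                // Strip the prefix so the input format sees its own property name.
                jobProperties.put(key.substring(INPUTFORMAT.length()), entry.getValue().toString());
            }
        }
    }

}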
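
For anyone who wants to try the prefix-stripping idea without a Hive build, here is a minimal standalone sketch of the same logic. It uses only java.util types; the class name InputFormatPropertyCopier, the method extractJobProperties, and the property key "inputformat.example.split.size" are made up for illustration and are not part of Hive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical standalone illustration; not part of Hive.
public class InputFormatPropertyCopier {

    // Prefix marking table properties intended for the input format.
    public static final String INPUTFORMAT = "inputformat.";

    // Copies every prefixed table property into a new map, with the prefix stripped.
    public static Map<String, String> extractJobProperties(Properties tableProperties) {
        Map<String, String> jobProperties = new HashMap<>();
        for (String key : tableProperties.stringPropertyNames()) {
            if (key.startsWith(INPUTFORMAT)) {
                jobProperties.put(key.substring(INPUTFORMAT.length()),
                                  tableProperties.getProperty(key));
            }
        }
        return jobProperties;
    }

    public static void main(String[] args) {
        Properties tableProperties = new Properties();
        // Assumed example keys: one prefixed, one ordinary table property.
        tableProperties.setProperty("inputformat.example.split.size", "1024");
        tableProperties.setProperty("columns", "id,name");

        // Only the prefixed property is passed through, without its prefix.
        System.out.println(extractJobProperties(tableProperties));
    }
}
```

Unprefixed table properties are left out, so only settings explicitly marked for the input format would reach the job configuration.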

                


        

[jira] [Commented] (HIVE-2639) Allow Configuring Input Formats for Hive

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283878#comment-13283878 ] 

Edward Capriolo commented on HIVE-2639:
---------------------------------------

This makes sense. We should do this: have some system to turn table properties into job conf properties.
                
