Posted to common-dev@hadoop.apache.org by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org> on 2008/09/14 09:43:44 UTC

[jira] Updated: (HADOOP-4169) 'compressed' keyword in DDL syntax misleading and does not compress

     [ https://issues.apache.org/jira/browse/HADOOP-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HADOOP-4169:
--------------------------------------

    Description: 
Hive produces two types of data files: flat files and sequencefiles. The DDL syntax should reflect this. Currently the 'compressed' keyword selects the sequencefile format but does not actually compress the files, which is misleading. In addition, flat files can also be compressed.

The proposal is to replace 'compressed' with 'sequencefile'. Compression should instead be controlled the standard Hadoop way, through the option that specifies whether output is compressed ('mapred.output.compress'), i.e. through session options. (Session options will also define the codec etc.) The default file format and compression options can be specified in the conf file.
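
The separation proposed above can be sketched as a small model. This is only an illustration, not Hive code: the function resolve_output, the "file.format" key and the DEFAULTS dict are hypothetical names made up for this sketch. 'mapred.output.compress' comes from the description; 'mapred.output.compression.codec' is assumed here to be the Hadoop property that names the codec.

```python
# Hypothetical model of the proposal; none of these names are Hive APIs.
# The DDL keyword picks the file format, session options pick compression,
# and the conf file supplies defaults for both.

DEFAULTS = {
    "file.format": "flat",              # hypothetical key: default file format
    "mapred.output.compress": "false",  # standard Hadoop output-compression switch
}


def resolve_output(ddl_keywords, session_options, conf_defaults=DEFAULTS):
    """Return the effective file format and compression settings."""
    # Session options override the defaults from the conf file.
    settings = {**conf_defaults, **session_options}

    # The 'sequencefile' keyword selects the format and nothing else.
    if "sequencefile" in ddl_keywords:
        file_format = "sequencefile"
    else:
        file_format = settings["file.format"]

    # Compression is decided independently of the format.
    compressed = settings["mapred.output.compress"].lower() == "true"
    # Assumed codec property; only meaningful when compression is on.
    codec = settings.get("mapred.output.compression.codec") if compressed else None

    return {"format": file_format, "compressed": compressed, "codec": codec}
```

Under this model a table declared with 'sequencefile' and no session options is an uncompressed sequencefile, while a flat-file table written in a session with 'mapred.output.compress' set to true is a compressed flat file.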

  was:
Hive two types of data files - flat files and sequencefiles. Syntax should reflect this. Currently the 'compressed' keyword is used to choose sequencefile format - but does not actually compress the files. this is misleading. In addition - flat files can also be compressed.

The proposal is to replace 'compressed' with 'sequencefile'. Compression should instead be controlled the standard Hadoop way, through the option that specifies whether output is compressed ('mapred.output.compress'), i.e. through session options. (Session options will also define the codec etc.) The default file format and compression options can be specified in the conf file.


> 'compressed' keyword in DDL syntax misleading and does not compress
> -------------------------------------------------------------------
>
>                 Key: HADOOP-4169
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4169
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/hive
>            Reporter: Joydeep Sen Sarma
>            Assignee: Joydeep Sen Sarma
>
> Hive produces two types of data files: flat files and sequencefiles. The DDL syntax should reflect this. Currently the 'compressed' keyword selects the sequencefile format but does not actually compress the files, which is misleading. In addition, flat files can also be compressed.
> The proposal is to replace 'compressed' with 'sequencefile'. Compression should instead be controlled the standard Hadoop way, through the option that specifies whether output is compressed ('mapred.output.compress'), i.e. through session options. (Session options will also define the codec etc.) The default file format and compression options can be specified in the conf file.
