Posted to common-user@hadoop.apache.org by SP <sa...@gmail.com> on 2015/07/30 00:34:24 UTC
Need command to compress the files
Hi All,
I am comparing the compression ratios of different codecs.
I have these files in Avro format. How can I compress them using Snappy or
gzip?
-rw-r--r-- 3 hdfs supergroup 3080866838 2015-07-29 18:16 /tmp/fact_splitby_date_id/part-m-00000.avro
-rw-r--r-- 3 hdfs supergroup 3021258762 2015-07-29 18:15 /tmp/fact_splitby_date_id/part-m-00001.avro
-rw-r--r-- 3 hdfs supergroup 3164101762 2015-07-29 18:17 /tmp/fact_splitby_date_id/part-m-00002.avro
-rw-r--r-- 3 hdfs supergroup 3251578205 2015-07-29 18:16 /tmp/fact_splitby_date_id/part-m-00003.avro
Thanks
Sp
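Any ratio comparison needs the uncompressed total as a baseline. The following is a minimal Python sketch, not something posted in the thread, that totals the sizes from the listing above. The helper name parse_ls_sizes is hypothetical, and it assumes the eight-column layout shown in the listing (permissions, replication, owner, group, size, date, time, path) with no spaces in the paths.

```python
def parse_ls_sizes(listing):
    """Map each file path to its size in bytes, given `ls`-style listing text."""
    sizes = {}
    for line in listing.splitlines():
        fields = line.split()
        # Assumed columns: perms, replication, owner, group, size, date, time, path.
        # Skip anything that is not a regular-file entry (summary lines, directories).
        if len(fields) < 8 or not fields[0].startswith("-"):
            continue
        sizes[fields[7]] = int(fields[4])
    return sizes


LISTING = """\
-rw-r--r-- 3 hdfs supergroup 3080866838 2015-07-29 18:16 /tmp/fact_splitby_date_id/part-m-00000.avro
-rw-r--r-- 3 hdfs supergroup 3021258762 2015-07-29 18:15 /tmp/fact_splitby_date_id/part-m-00001.avro
-rw-r--r-- 3 hdfs supergroup 3164101762 2015-07-29 18:17 /tmp/fact_splitby_date_id/part-m-00002.avro
-rw-r--r-- 3 hdfs supergroup 3251578205 2015-07-29 18:16 /tmp/fact_splitby_date_id/part-m-00003.avro
"""

sizes = parse_ls_sizes(LISTING)
total_bytes = sum(sizes.values())
print(f"{len(sizes)} files, {total_bytes} bytes")
```

Running the same helper over a listing of the recompressed output gives the second number, and dividing the two totals gives the overall ratio.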
Re: Need command to compress the files
Posted by Hadoop User <kj...@gmail.com>.
I already have the data in HDFS. I want to test the compression ratio with gzip and Snappy.
Thanks
Sajid
> On Jul 29, 2015, at 5:37 PM, Ron Gonzalez <zl...@yahoo.com> wrote:
>
> I think you can pick the compression algorithm when using sqoop - either deflate or snappy when specifying the --compress option.
> Is that what you were asking?
>
> Thanks,
> Ron
>
>> On 07/29/2015 03:40 PM, Ted Yu wrote:
>> You can use the following command to see options for gzip:
>> gzip -h
>>
>> For snappy, see:
>> https://github.com/kubo/snzip
>> https://code.google.com/p/snappy/issues/detail?id=34
>>
>> FYI
>>
>>> On Wed, Jul 29, 2015 at 3:34 PM, SP <sa...@gmail.com> wrote:
>>> Hi All,
>>>
>>> I am working on comparing different compression ratios.
>>>
>>> I have these files in AVRO format. How can I compress them using snappy or gzip.
>>>
>>> -rw-r--r-- 3 hdfs supergroup 3080866838 2015-07-29 18:16 /tmp/fact_splitby_date_id/part-m-00000.avro
>>> -rw-r--r-- 3 hdfs supergroup 3021258762 2015-07-29 18:15 /tmp/fact_splitby_date_id/part-m-00001.avro
>>> -rw-r--r-- 3 hdfs supergroup 3164101762 2015-07-29 18:17 /tmp/fact_splitby_date_id/part-m-00002.avro
>>> -rw-r--r-- 3 hdfs supergroup 3251578205 2015-07-29 18:16 /tmp/fact_splitby_date_id/part-m-00003.avro
>>>
>>>
>>>
>>>
>>> Thanks
>>> Sp
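One way to measure a gzip ratio for data that is already stored is sketched below in Python. This is an illustration under stated assumptions, not a method proposed in the thread: it assumes one file has first been copied to the local filesystem (for example with hdfs dfs -get), and the function name gzip_ratio is hypothetical. It streams the file through the standard zlib module in gzip mode and only counts output bytes, so nothing is written to disk. Snappy is not in the Python standard library, so a comparable Snappy measurement would need a third-party binding or a tool such as snzip. Note also that an Avro container file may already be block-compressed internally, in which case this measures gzip applied on top of that codec.

```python
import zlib

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time so multi-gigabyte files are never held in memory


def gzip_ratio(path, level=6):
    """Return (original_bytes, compressed_bytes, ratio) for a local file."""
    # wbits=31 asks zlib for a gzip-format stream, including header and trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    original = 0
    compressed = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(CHUNK_SIZE)
            if not chunk:
                break
            original += len(chunk)
            compressed += len(compressor.compress(chunk))
    # flush() emits any buffered data plus the gzip trailer.
    compressed += len(compressor.flush())
    return original, compressed, original / compressed
```

A call such as gzip_ratio("part-m-00000.avro") returns the two byte counts and their quotient; a ratio close to 1 suggests the file was already compressed.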
Re: Need command to compress the files
Posted by Ron Gonzalez <zl...@yahoo.com>.
I think you can pick the compression algorithm (either deflate or snappy)
when using Sqoop with the --compress option.
Is that what you were asking?
Thanks,
Ron
On 07/29/2015 03:40 PM, Ted Yu wrote:
> You can use the following command to see options for gzip:
> gzip -h
>
> For snappy, see:
> https://github.com/kubo/snzip
> https://code.google.com/p/snappy/issues/detail?id=34
>
> FYI
>
> On Wed, Jul 29, 2015 at 3:34 PM, SP <sa...@gmail.com> wrote:
>
> Hi All,
>
> I am working on comparing different compression ratios.
>
> I have these files in AVRO format. How can I compress them using
> snappy or gzip.
>
> -rw-r--r-- 3 hdfs supergroup 3080866838 2015-07-29 18:16
> /tmp/fact_splitby_date_id/part-m-00000.avro
> -rw-r--r-- 3 hdfs supergroup 3021258762 2015-07-29 18:15
> /tmp/fact_splitby_date_id/part-m-00001.avro
> -rw-r--r-- 3 hdfs supergroup 3164101762 2015-07-29 18:17
> /tmp/fact_splitby_date_id/part-m-00002.avro
> -rw-r--r-- 3 hdfs supergroup 3251578205 2015-07-29 18:16
> /tmp/fact_splitby_date_id/part-m-00003.avro
>
>
>
>
> Thanks
> Sp
>
>
Re: Need command to compress the files
Posted by Ted Yu <yu...@gmail.com>.
You can use the following command to see options for gzip:
gzip -h
For snappy, see:
https://github.com/kubo/snzip
https://code.google.com/p/snappy/issues/detail?id=34
FYI
On Wed, Jul 29, 2015 at 3:34 PM, SP <sa...@gmail.com> wrote:
> Hi All,
>
> I am working on comparing different compression ratios.
>
> I have these files in AVRO format. How can I compress them using snappy or
> gzip.
>
> -rw-r--r-- 3 hdfs supergroup 3080866838 2015-07-29 18:16
> /tmp/fact_splitby_date_id/part-m-00000.avro
> -rw-r--r-- 3 hdfs supergroup 3021258762 2015-07-29 18:15
> /tmp/fact_splitby_date_id/part-m-00001.avro
> -rw-r--r-- 3 hdfs supergroup 3164101762 2015-07-29 18:17
> /tmp/fact_splitby_date_id/part-m-00002.avro
> -rw-r--r-- 3 hdfs supergroup 3251578205 2015-07-29 18:16
> /tmp/fact_splitby_date_id/part-m-00003.avro
>
>
>
>
> Thanks
> Sp
>