Posted to mapreduce-user@hadoop.apache.org by r K <ap...@gmail.com> on 2014/03/30 23:52:38 UTC

Compression Problem

Hello Everyone,

I'm new to Hadoop and tried to compress a few files using a streaming job.
I used the streaming job described in this post:
http://stackoverflow.com/questions/7153087/hadoop-compress-file-in-hdfs

I used the bzip2 format instead. After the files were compressed, because
there were too many small files, I ran

hadoop fs -cat /input/* | hadoop fs -put - /output/day1
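
To make clear what that command does at the byte level, here is a minimal
local sketch in Python rather than Hadoop. It assumes the files being
concatenated are each valid bzip2 output; the sample rows and variable names
are made up for illustration. It only shows that several bzip2 streams joined
end to end can still be decoded as one file by a multi-stream-aware reader,
not that Hadoop or Hive will read /output/day1 the same way.

```python
import bz2

# Hypothetical stand-ins for two small compressed files from the streaming job.
part_a = bz2.compress(b"row1\nrow2\n")
part_b = bz2.compress(b"row3\n")

# `hadoop fs -cat ... | hadoop fs -put - ...` joins the raw bytes of its inputs,
# so the result is one file holding several bzip2 streams back to back.
merged = part_a + part_b

# bz2.decompress accepts multi-stream input (Python 3.3 and later).
recovered = bz2.decompress(merged)
print(recovered.decode(), end="")
```

If the real file decodes this way after being copied locally, the data itself
is probably intact and the problem lies in how the table reads it.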

I tried to build an external table from this day1 file and I get weird
characters. I renamed the file to day1.bz and then tried to rebuild the table,
but I still get weird characters.

I realize I messed up pretty badly. Is there any way to salvage this data?
Is there any way to use this compressed file /output/day1 at all?

Thanks in advance.