Posted to user@hive.apache.org by Igor Tatarinov <ig...@decide.com> on 2011/04/21 19:56:14 UTC

merging small output files with compression

I have about 5K input files, so running a Hive job creates as many (small)
output files. Small-file merging seems to be enabled by default
(hive.merge.mapfiles=true), but it doesn't seem to work unless output
compression is disabled (hive.exec.compress.output=false). If I do that, I
get only 30 (uncompressed) output files, which is much more manageable.

Is there a way to enable both compression and small-file merge?

If not, I am thinking about saving into an uncompressed temp table first,
then enabling compression and saving into the output table. Is there an
easier way?
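
To make the temp-table idea concrete, here is a rough sketch of what I have in mind. The table names (source_table, tmp_output, final_output) are made up for illustration, and the SELECT is a stand-in for the real query. I am also assuming the second step writes roughly one compressed file per merged input file, which I have not verified.

```sql
-- Step 1: merge small files while compression is off.
-- (tmp_output, final_output and source_table are hypothetical names.)
SET hive.merge.mapfiles=true;
SET hive.exec.compress.output=false;

CREATE TABLE tmp_output LIKE final_output;

INSERT OVERWRITE TABLE tmp_output
SELECT * FROM source_table;   -- stand-in for the real query

-- Step 2: rewrite the (now few) merged files with compression on.
SET hive.exec.compress.output=true;

INSERT OVERWRITE TABLE final_output
SELECT * FROM tmp_output;

-- Clean up the uncompressed intermediate copy.
DROP TABLE tmp_output;
```

The obvious downside is that the data gets written twice and the uncompressed copy needs extra space until the temp table is dropped.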

Thanks.