Posted to user@hive.apache.org by Steven Wong <sw...@netflix.com> on 2011/07/02 01:38:58 UTC
RE: how to disable mapred.reduce.tasks
Try -1, judging from this:
<property>
<name>mapred.reduce.tasks</name>
<value>-1</value>
<description>The default number of reduce tasks per job. Typically set
to a prime close to the number of available hosts. Ignored when
mapred.job.tracker is "local". Hadoop sets this to 1 by default, whereas Hive uses -1 as its default value.
By setting this property to -1, Hive will automatically figure out what should be the number of reducers.
</description>
</property>
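In a Hive session, that would look roughly like the sketch below. It assumes the value is overridden per session with `set` rather than in the site configuration, and the bytes-per-reducer figure is only an illustrative number, not a recommendation.

```sql
-- Hand the reducer count back to Hive (-1 means "let Hive decide").
set mapred.reduce.tasks=-1;

-- Illustrative value only: a larger bytes-per-reducer setting should mean
-- fewer reducers, and so fewer output files.
set hive.exec.reducers.bytes.per.reducer=1000000000;
```

With mapred.reduce.tasks back at -1, hive.exec.reducers.bytes.per.reducer should take effect again, since an explicit reducer count is what overrides it.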
From: Igor Tatarinov [mailto:igor@decide.com]
Sent: Wednesday, June 29, 2011 4:16 PM
To: user@hive.apache.org
Subject: how to disable mapred.reduce.tasks
I set mapred.reduce.tasks manually to have a single wave of reducers (does that make sense, by the way?)
When I save the data, I often end up with a bunch of small files because we use compression and Hive doesn't seem to merge small compressed files.
So my question is: can I somehow disable mapred.reduce.tasks and make Hive use hive.exec.reducers.bytes.per.reducer instead, to reduce the number of output files? It seems the former overrides the latter.