Posted to common-user@hadoop.apache.org by john smith <js...@gmail.com> on 2011/09/22 01:52:19 UTC

Reducer hanging ( swapping? )

Hi Folks,

I am running Hive on a 10-node cluster. Since my Hive queries have joins in
them, their reduce phases are a bit heavy.

I have 2GB RAM on each TT. The problem is that my reducer hangs at 76% for
a long time. I guess this is due to excessive swapping between memory and
disk. My vmstat output (on one of the TTs) shows:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0   1860  34884 189948 1997644    0    0     2     1    0    1  0  0 100  0

My related config params are pasted below. (I turned off speculative
execution for both maps and reduces.) Can anyone suggest
some improvements to make my reduce phase a bit faster?
(I've allotted 900MB to each task and reduced other params. Even then it is not
showing any improvement.) Any suggestions?

========================================

<property>
    <name>mapred.min.split.size</name>
    <value>65536</value>
</property>

<property>
    <name>mapred.reduce.copy.backoff</name>
    <value>5</value>
</property>

<property>
    <name>io.sort.factor</name>
    <value>60</value>
</property>

<property>
    <name>mapred.reduce.parallel.copies</name>
    <value>25</value>
</property>

<property>
    <name>io.sort.mb</name>
    <value>70</value>
</property>

<property>
    <name>io.file.buffer.size</name>
    <value>32768</value>
</property>

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx900m</value>
</property>

===================================
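
For readers sizing a similar node: whether -Xmx900m fits in 2GB depends on how many task slots each TaskTracker runs, which the post does not state. As a rough sketch, assuming the Hadoop 0.20-era slot properties and their usual defaults of 2 map and 2 reduce slots, four concurrent 900MB heaps could ask for about 3600MB, well above 2GB. The slot counts below are illustrative assumptions, not settings taken from this thread.

```xml
<configuration>
  <!-- Hypothetical mapred-site.xml fragment; slot counts are illustrative only -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>
```

With one map slot and one reduce slot, the worst-case task heap is 2 x 900MB = 1800MB. That is still tight next to the TaskTracker and DataNode daemons, so a smaller -Xmx may also be worth trying.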

Re: Reducer hanging ( swapping? )

Posted by john smith <js...@gmail.com>.

Hi,

I am CC'ing this to hive-user as well.

I tried a simple join between two tables of 2.2GB and 137MB:

select count(*) from A JOIN B ON (A.a = B.b);

The query ran for 7 hours. I am sure this is not normal. The reducer gets
stuck in the reduce > reduce phase. The map and copy phases complete in a
matter of minutes, and then the job gets stuck in the reducer. Please see my
previous mail below for my config and vmstat output.

My job has 40 Maps and 7 reduces.

My JT and TT logs don't show any warnings, except that one of my nodes got
blacklisted because of "Too many fetch failures".

Initially there was an error in that node's hosts file. I corrected it and
restarted the cluster. Even then that node gets blacklisted frequently.
Should I restart the node after changing the hosts file?

Any help? 7 hours is too long for such a simple query.

On Thu, Sep 22, 2011 at 5:43 AM, Raj V <ra...@yahoo.com> wrote:

> 2GB for a task tracker? Here are some possible thoughts.
> Compress  map output.
> Change  mapred.reduce.slowstart.completed.maps
>
>
> By the way I see no swapping.  Anything interesting from the task tracker
> log? System log?
>
> Raj
>
>
>
>
>
> >________________________________
> >From: john smith <js...@gmail.com>
> >To: common-user@hadoop.apache.org
> >Sent: Wednesday, September 21, 2011 4:52 PM
> >Subject: Reducer hanging ( swapping? )
> >
> >Hi Folks,
> >
> >I am running hive on a 10 node cluster. Since my hive queries have joins
> in
> >them, their reduce phases are a bit heavy.
> >
> >I have 2GB RAM on each TT . The problem is that my reducer hangs at 76%
> for
> >a large amount of time.  I guess this is due to excessive swapping from
> disk
> >to memory. My vmstat shows  (on one of the TTs)
> >
> >procs -----------memory---------- ---swap-- -----io---- -system--
> >----cpu----
> >r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
> >wa
> >1  0   1860  34884 189948 1997644    0    0     2     1    0    1  0  0
> 100
> >0
> >
> >My related config parms are pasted below. (I turned off speculative
> >execution for both maps and reduces). Can anyone suggest me
> >some improvements so as to make my reduce a bit faster?
> >(I've allotted 900MB to task and reduced other params. Even then it is not
> >showing any improvments.) . Any suggestions?
> >
> >========================================
> >
> ><property>
> ><name>mapred.min.split.size</name>
> ><value>65536</value>
> ></property>
> >
> >        <property>
> >                <name>mapred.reduce.copy.backoff</name>
> >                <value>5</value>
> >        </property>
> >
> >
> >    <property>
> >        <name>io.sort.factor</name>
> >        <value>60</value>
> >    </property>
> >
> >    <property>
> >        <name>mapred.reduce.parallel.copies</name>
> >        <value>25</value>
> >    </property>
> >
> >        <property>
> >                <name>io.sort.mb</name>
> >                <value>70</value>
> >        </property>
> >
> ><property>
> >        <name>io.file.buffer.size</name>
> >        <value>32768</value>
> >    </property>
> >
> ><property>
> >    <name>mapred.child.java.opts</name>
> >    <value>-Xmx900m</value>
> >  </property>
> >
> >===================================
> >
> >
> >
>

Re: Reducer hanging ( swapping? )

Posted by Raj V <ra...@yahoo.com>.

2GB for a task tracker? Here are some possible thoughts:
Compress map output.
Change mapred.reduce.slowstart.completed.maps.


By the way, I see no swapping. Anything interesting in the task tracker log? System log?

Raj
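
For reference, the two suggestions above could be written in mapred-site.xml roughly as follows. This is only a sketch assuming Hadoop 0.20-era property names; the codec choice and the 0.80 slowstart value are illustrative assumptions, since no specific values were given in the reply.

```xml
<configuration>
  <!-- Compress intermediate map output to reduce shuffle traffic and disk I/O -->
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <!-- Illustrative codec choice; DefaultCodec ships with Hadoop -->
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  </property>
  <!-- Fraction of maps that must complete before reducers are scheduled.
       The stock default is 0.05; 0.80 here is an assumed example value. -->
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.80</value>
  </property>
</configuration>
```

A higher slowstart fraction keeps reducers from holding slots and memory while most maps are still running, which may matter on nodes with only 2GB of RAM.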





>________________________________
>From: john smith <js...@gmail.com>
>To: common-user@hadoop.apache.org
>Sent: Wednesday, September 21, 2011 4:52 PM
>Subject: Reducer hanging ( swapping? )
>
>Hi Folks,
>
>I am running hive on a 10 node cluster. Since my hive queries have joins in
>them, their reduce phases are a bit heavy.
>
>I have 2GB RAM on each TT . The problem is that my reducer hangs at 76% for
>a large amount of time.  I guess this is due to excessive swapping from disk
>to memory. My vmstat shows  (on one of the TTs)
>
>procs -----------memory---------- ---swap-- -----io---- -system--
>----cpu----
>r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
>wa
>1  0   1860  34884 189948 1997644    0    0     2     1    0    1  0  0 100
>0
>
>My related config parms are pasted below. (I turned off speculative
>execution for both maps and reduces). Can anyone suggest me
>some improvements so as to make my reduce a bit faster?
>(I've allotted 900MB to task and reduced other params. Even then it is not
>showing any improvments.) . Any suggestions?
>
>========================================
>
><property>
><name>mapred.min.split.size</name>
><value>65536</value>
></property>
>
>        <property>
>                <name>mapred.reduce.copy.backoff</name>
>                <value>5</value>
>        </property>
>
>
>    <property>
>        <name>io.sort.factor</name>
>        <value>60</value>
>    </property>
>
>    <property>
>        <name>mapred.reduce.parallel.copies</name>
>        <value>25</value>
>    </property>
>
>        <property>
>                <name>io.sort.mb</name>
>                <value>70</value>
>        </property>
>
><property>
>        <name>io.file.buffer.size</name>
>        <value>32768</value>
>    </property>
>
><property>
>    <name>mapred.child.java.opts</name>
>    <value>-Xmx900m</value>
>  </property>
>
>===================================
>
>
>