Posted to common-user@hadoop.apache.org by Piotr Kubaj <pk...@riseup.net> on 2014/10/09 23:24:24 UTC
MapReduce jobs start only on the PC they are typed on
Hi. I'm trying to run Hadoop on a two-PC cluster (I need to run some
benchmarks for my bachelor thesis) and it works, but jobs start only on
the PC I type the command on (it doesn't matter which machine has better
specs or where the data physically lives, since I'm just computing Pi).
My mapred-site.xml is:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>10.0.0.1:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<property>
<name>mapred.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>20</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>20</value>
</property>
<property>
<name>mapreduce.tasktracker.map.tasks.maximum</name>
<value>20</value>
</property>
<property>
<name>mapreduce.tasktracker.reduce.tasks.maximum</name>
<value>20</value>
</property>
<property>
<name>mapreduce.tasktracker.map.tasks.maximum</name>
<value>30</value>
<final>true</final>
</property>
<property>
<name>mapreduce.tasktracker.reduce.tasks.maximum</name>
<value>30</value>
</property>
<property>
<name>mapreduce.job.maps</name>
<value>3500</value>
</property>
<property>
<name>mapreduce.job.reduces</name>
<value>3500</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx2048m</value>
</property>
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>10</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>DESKTOP1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>DESKTOP1:19888</value>
</property>
</configuration>
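Editor's note: a likely culprit in the file above is the property name itself. In Hadoop 2, the setting that routes jobs to YARN is spelled `mapreduce.framework.name`, not `mapred.framework.name`. Misspelled properties are silently ignored, and `mapreduce.framework.name` defaults to `local`, which runs every job in-process on the machine that submitted it. A minimal corrected fragment would be:

```xml
<!-- mapred-site.xml: note the "mapreduce." prefix; the misspelled
     "mapred.framework.name" above is ignored, so the framework
     silently falls back to its default value, "local". -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```

Separately, `mapred.job.tracker` and the `mapreduce.tasktracker.*` limits apply only to classic MRv1 (JobTracker/TaskTracker) and have no effect on a YARN deployment.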
And yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/var/cache/hadoop-hdfs/hdfs</value>
<description>Comma separated list of paths. Use the list of directories
from $YARN_LOCAL_DIR.
For example,
/grid/hadoop/hdfs/yarn,/grid1/hadoop/hdfs/yarn.</description>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/var/log/hadoop/yarn</value>
<description>Use the list of directories from $YARN_LOG_DIR.
For example, /var/log/hadoop/yarn.</description>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>10.0.0.1</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>${yarn.resourcemanager.hostname}:8033</value>
</property>
<property>
<description>The address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>131072</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>131072</value>
</property>
<property>
<description>Number of CPU cores that can be allocated
for containers.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>8</value>
</property>
<property>
<name>yarn.resourcemanager.am.max-attempts</name>
<value>3</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
</configuration>
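As a rough sanity check on the resource settings above: a quick back-of-the-envelope calculation shows how many containers YARN could run per node. The per-container sizes below are assumptions (`mapreduce.map.memory.mb` defaults to 1024 MB in Hadoop 2 and is not set in the posted config):

```python
# Rough per-node container capacity implied by the yarn-site.xml above.
# The per-container requests are assumptions, not values from the config.
node_mem_mb = 131072       # yarn.nodemanager.resource.memory-mb
node_vcores = 8            # yarn.nodemanager.resource.cpu-vcores
container_mem_mb = 1024    # assumed: mapreduce.map.memory.mb default
container_vcores = 1       # assumed: default vcores per container

by_mem = node_mem_mb // container_mem_mb   # 128 containers fit by memory
by_cpu = node_vcores // container_vcores   # 8 containers fit by CPU
print(min(by_mem, by_cpu))                 # prints 8: CPU is the bottleneck
```

With these assumptions, the 128 GB memory budget is never the limiting factor; the 8 vcores cap each node at roughly 8 concurrent tasks, so `mapreduce.job.maps` of 3500 would run in many waves.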
Re: MapReduce jobs start only on the PC they are typed on
Posted by Piotr Kubaj <pk...@riseup.net>.
On 10/09/2014 23:44, SF Hadoop wrote:
> What is in /etc/hadoop/conf/slaves?
>
> Something tells me it just says 'localhost'. You need to specify your
> slaves in that file.
Nope, my slaves file is as follows:
10.0.0.1
10.0.0.2
Re: MapReduce jobs start only on the PC they are typed on
Posted by SF Hadoop <sf...@gmail.com>.
What is in /etc/hadoop/conf/slaves?
Something tells me it just says 'localhost'. You need to specify your
slaves in that file.
On Thu, Oct 9, 2014 at 2:24 PM, Piotr Kubaj <pk...@riseup.net> wrote:
> [original message and configuration files quoted in full above; snipped]