Posted to user@hadoop.apache.org by Arindam Choudhury <ar...@gmail.com> on 2013/03/07 13:31:01 UTC

error while running reduce

Hi,

I am trying to do a performance analysis of Hadoop on a virtual machine. When
I try to run TeraSort with 2 GB of input data with 1 map and 1 reduce, the
map finishes properly, but the reduce gives an error. I cannot understand why.
Any help?

I have a single-node Hadoop deployment in a virtual machine. The F18
virtual machine has 1 core and 2 GB of memory.

my configuration:
core-site.xml
<configuration>
<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoopa.arindam.com:54310</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/${user.name}</value>
</property>
<property>
  <name>fs.inmemory.size.mb</name>
  <value>20</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
</configuration>

hdfs-site.xml
<configuration>
<property>
  <name>dfs.name.dir</name>
  <value>/home/hadoop/hadoop-dir/name-dir</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/home/hadoop/hadoop-dir/data-dir</value>
</property>
<property>
  <name>dfs.block.size</name>
  <value>2048000000</value>
  <final>true</final>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
</configuration>


mapred-site.xml
<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>hadoopa.arindam.com:54311</value>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/home/hadoop/hadoop-dir/system-dir</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/home/hadoop/hadoop-dir/local-dir</value>
</property>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>
</configuration>

I created 2 GB of data to run TeraSort.
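
(For reference, the run looks roughly like the commands below, assuming the stock Hadoop 1.x examples jar; the jar name and the input path are illustrative, not the exact ones used.)

# generate ~2 GB of TeraGen input: 20,000,000 rows x 100 bytes/row, with a single map
hadoop jar hadoop-examples-1.0.4.jar teragen -Dmapred.map.tasks=1 20000000 /user/hadoop/terasort-input

# sort it with a single reducer
hadoop jar hadoop-examples-1.0.4.jar terasort -Dmapred.reduce.tasks=1 /user/hadoop/terasort-input /user/hadoop/output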

hadoop dfsadmin -report
Configured Capacity: 21606146048 (20.12 GB)
Present Capacity: 14480427242 (13.49 GB)
DFS Remaining: 12416368640 (11.56 GB)
DFS Used: 2064058602 (1.92 GB)
DFS Used%: 14.25%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 192.168.122.32:50010
Decommission Status : Normal
Configured Capacity: 21606146048 (20.12 GB)
DFS Used: 2064058602 (1.92 GB)
Non DFS Used: 7125718806 (6.64 GB)
DFS Remaining: 12416368640(11.56 GB)
DFS Used%: 9.55%
DFS Remaining%: 57.47%


But when I run TeraSort, I get the following error:

13/03/04 17:56:16 INFO mapred.JobClient: Task Id :
attempt_201303041741_0002_r_000000_0, Status : FAILED
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/user/hadoop/output/_temporary/_attempt_201303041741_0002_r_000000_0/part-00000
could only be replicated to 0 nodes, instead of 1

hadoop dfsadmin -report
Configured Capacity: 21606146048 (20.12 GB)
Present Capacity: 10582014209 (9.86 GB)
DFS Remaining: 8517738496 (7.93 GB)
DFS Used: 2064275713 (1.92 GB)
DFS Used%: 19.51%
Under replicated blocks: 2
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 192.168.122.32:50010
Decommission Status : Normal
Configured Capacity: 21606146048 (20.12 GB)
DFS Used: 2064275713 (1.92 GB)
Non DFS Used: 11024131839 (10.27 GB)
DFS Remaining: 8517738496(7.93 GB)
DFS Used%: 9.55%
DFS Remaining%: 39.42%


Thanks,

Re: error while running reduce

Posted by Arindam Choudhury <ar...@gmail.com>.
How can I fix this?
When I run the same job with 1 GB of input with 1 map and 1 reducer, it
works fine.


On Thu, Mar 7, 2013 at 11:14 PM, Jagmohan Chauhan <
simplefundumnnit@gmail.com> wrote:

> Hi
>
> I think the problem is in replication factor.. As, you are using
> replication factor of 1 and you have a single node the data cannot be
> replicated anywhere else.
>
> On Thu, Mar 7, 2013 at 4:31 AM, Arindam Choudhury <
> arindamchoudhury0@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to do a performance analysis of hadoop on virtual machine.
>> When I try to run terasort with 2GB of input data with 1 map and 1 reduce,
>> the map finishes properly, but reduce gives error. I can not understand
>> why? any help?
>>
>> I have a single node hadoop deployment in a virtual machine. The F18
>> virtual machine have 1 core and 2 GB of memory.
>>
>> my configuration:
>> core-site.xml
>> <configuration>
>> <property>
>>   <name>fs.default.name</name>
>>   <value>hdfs://hadoopa.arindam.com:54310</value>
>> </property>
>> <property>
>>   <name>hadoop.tmp.dir</name>
>>   <value>/tmp/${user.name}</value>
>> </property>
>> <property>
>>   <name>fs.inmemory.size.mb</name>
>>   <value>20</value>
>> </property>
>> <property>
>>   <name>io.file.buffer.size</name>
>>   <value>131072</value>
>> </property>
>> </configuration>
>>
>> hdfs-site.xml
>> <configuration>
>> <property>
>>   <name>dfs.name.dir</name>
>>   <value>/home/hadoop/hadoop-dir/name-dir</value>
>> </property>
>> <property>
>>   <name>dfs.data.dir</name>
>>   <value>/home/hadoop/hadoop-dir/data-dir</value>
>> </property>
>> <property>
>>   <name>dfs.block.size</name>
>>   <value>2048000000</value>
>>   <final>true</final>
>> </property>
>> <property>
>>   <name>dfs.replication</name>
>>   <value>1</value>
>> </property>
>> </configuration>
>>
>>
>> mapred-site.xml
>> <configuration>
>> <property>
>>   <name>mapred.job.tracker</name>
>>   <value>hadoopa.arindam.com:54311</value>
>> </property>
>> <property>
>>   <name>mapred.system.dir</name>
>>   <value>/home/hadoop/hadoop-dir/system-dir</value>
>> </property>
>> <property>
>>   <name>mapred.local.dir</name>
>>   <value>/home/hadoop/hadoop-dir/local-dir</value>
>> </property>
>> <property>
>>   <name>mapred.map.child.java.opts</name>
>>   <value>-Xmx1024M</value>
>> </property>
>> <property>
>>   <name>mapred.reduce.child.java.opts</name>
>>   <value>-Xmx1024M</value>
>> </property>
>> </configuration>
>>
>> I created 2GB of data to run tera sort.
>>
>> hadoop dfsadmin -report
>> Configured Capacity: 21606146048 (20.12 GB)
>> Present Capacity: 14480427242 (13.49 GB)
>> DFS Remaining: 12416368640 (11.56 GB)
>> DFS Used: 2064058602 (1.92 GB)
>> DFS Used%: 14.25%
>> Under replicated blocks: 0
>> Blocks with corrupt replicas: 0
>> Missing blocks: 0
>>
>> -------------------------------------------------
>> Datanodes available: 1 (1 total, 0 dead)
>>
>> Name: 192.168.122.32:50010
>> Decommission Status : Normal
>> Configured Capacity: 21606146048 (20.12 GB)
>> DFS Used: 2064058602 (1.92 GB)
>> Non DFS Used: 7125718806 (6.64 GB)
>> DFS Remaining: 12416368640(11.56 GB)
>> DFS Used%: 9.55%
>> DFS Remaining%: 57.47%
>>
>>
>> But when I run the terasort, i am getting the following error:
>>
>> 13/03/04 17:56:16 INFO mapred.JobClient: Task Id :
>> attempt_201303041741_0002_r_000000_0, Status : FAILED
>> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
>> /user/hadoop/output/_temporary/_attempt_201303041741_0002_r_000000_0/part-00000
>> could only be replicated to 0 nodes, instead of 1
>>
>> hadoop dfsadmin -report
>> Configured Capacity: 21606146048 (20.12 GB)
>> Present Capacity: 10582014209 (9.86 GB)
>> DFS Remaining: 8517738496 (7.93 GB)
>> DFS Used: 2064275713 (1.92 GB)
>> DFS Used%: 19.51%
>> Under replicated blocks: 2
>> Blocks with corrupt replicas: 0
>> Missing blocks: 0
>>
>> -------------------------------------------------
>> Datanodes available: 1 (1 total, 0 dead)
>>
>> Name: 192.168.122.32:50010
>> Decommission Status : Normal
>> Configured Capacity: 21606146048 (20.12 GB)
>> DFS Used: 2064275713 (1.92 GB)
>> Non DFS Used: 11024131839 (10.27 GB)
>> DFS Remaining: 8517738496(7.93 GB)
>> DFS Used%: 9.55%
>> DFS Remaining%: 39.42%
>>
>>
>> Thanks,
>>
>
>
>
> --
> Thanks and Regards
> Jagmohan Chauhan
> MSc student,CS
> Univ. of Saskatchewan
> IEEE Graduate Student Member
>
> http://homepage.usask.ca/~jac735/
>

Re: error while running reduce

Posted by Jagmohan Chauhan <si...@gmail.com>.
Hi

I think the problem is the replication factor. Since you are using a
replication factor of 1 and you have a single node, the data cannot be
replicated anywhere else.
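
As a quick sanity check (illustrative only; the output path is an assumption), fsck will show how many replicas each block of the job output actually has:

hadoop fsck /user/hadoop/output -files -blocks -locations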

On Thu, Mar 7, 2013 at 4:31 AM, Arindam Choudhury <
arindamchoudhury0@gmail.com> wrote:

> Hi,
>
> I am trying to do a performance analysis of hadoop on virtual machine.
> When I try to run terasort with 2GB of input data with 1 map and 1 reduce,
> the map finishes properly, but reduce gives error. I can not understand
> why? any help?
>
> I have a single node hadoop deployment in a virtual machine. The F18
> virtual machine have 1 core and 2 GB of memory.
>
> my configuration:
> core-site.xml
> <configuration>
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://hadoopa.arindam.com:54310</value>
> </property>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/tmp/${user.name}</value>
> </property>
> <property>
>   <name>fs.inmemory.size.mb</name>
>   <value>20</value>
> </property>
> <property>
>   <name>io.file.buffer.size</name>
>   <value>131072</value>
> </property>
> </configuration>
>
> hdfs-site.xml
> <configuration>
> <property>
>   <name>dfs.name.dir</name>
>   <value>/home/hadoop/hadoop-dir/name-dir</value>
> </property>
> <property>
>   <name>dfs.data.dir</name>
>   <value>/home/hadoop/hadoop-dir/data-dir</value>
> </property>
> <property>
>   <name>dfs.block.size</name>
>   <value>2048000000</value>
>   <final>true</final>
> </property>
> <property>
>   <name>dfs.replication</name>
>   <value>1</value>
> </property>
> </configuration>
>
>
> mapred-site.xml
> <configuration>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>hadoopa.arindam.com:54311</value>
> </property>
> <property>
>   <name>mapred.system.dir</name>
>   <value>/home/hadoop/hadoop-dir/system-dir</value>
> </property>
> <property>
>   <name>mapred.local.dir</name>
>   <value>/home/hadoop/hadoop-dir/local-dir</value>
> </property>
> <property>
>   <name>mapred.map.child.java.opts</name>
>   <value>-Xmx1024M</value>
> </property>
> <property>
>   <name>mapred.reduce.child.java.opts</name>
>   <value>-Xmx1024M</value>
> </property>
> </configuration>
>
> I created 2GB of data to run tera sort.
>
> hadoop dfsadmin -report
> Configured Capacity: 21606146048 (20.12 GB)
> Present Capacity: 14480427242 (13.49 GB)
> DFS Remaining: 12416368640 (11.56 GB)
> DFS Used: 2064058602 (1.92 GB)
> DFS Used%: 14.25%
> Under replicated blocks: 0
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 1 (1 total, 0 dead)
>
> Name: 192.168.122.32:50010
> Decommission Status : Normal
> Configured Capacity: 21606146048 (20.12 GB)
> DFS Used: 2064058602 (1.92 GB)
> Non DFS Used: 7125718806 (6.64 GB)
> DFS Remaining: 12416368640(11.56 GB)
> DFS Used%: 9.55%
> DFS Remaining%: 57.47%
>
>
> But when I run the terasort, i am getting the following error:
>
> 13/03/04 17:56:16 INFO mapred.JobClient: Task Id :
> attempt_201303041741_0002_r_000000_0, Status : FAILED
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /user/hadoop/output/_temporary/_attempt_201303041741_0002_r_000000_0/part-00000
> could only be replicated to 0 nodes, instead of 1
>
> hadoop dfsadmin -report
> Configured Capacity: 21606146048 (20.12 GB)
> Present Capacity: 10582014209 (9.86 GB)
> DFS Remaining: 8517738496 (7.93 GB)
> DFS Used: 2064275713 (1.92 GB)
> DFS Used%: 19.51%
> Under replicated blocks: 2
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 1 (1 total, 0 dead)
>
> Name: 192.168.122.32:50010
> Decommission Status : Normal
> Configured Capacity: 21606146048 (20.12 GB)
> DFS Used: 2064275713 (1.92 GB)
> Non DFS Used: 11024131839 (10.27 GB)
> DFS Remaining: 8517738496(7.93 GB)
> DFS Used%: 9.55%
> DFS Remaining%: 39.42%
>
>
> Thanks,
>



-- 
Thanks and Regards
Jagmohan Chauhan
MSc student,CS
Univ. of Saskatchewan
IEEE Graduate Student Member

http://homepage.usask.ca/~jac735/

RE: error while running reduce

Posted by Samir Kumar Das Mohapatra <da...@adobe.com>.
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>

You have to increase the memory given by the parameter above, because it is currently only 1 GB. Keep in mind that a VM also consumes more memory than a comparable physical system.

Try to increase the VM memory first, then increase the reduce memory.
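
A rough sketch of both steps; the 4 GB guest size, the libvirt domain name "hadoopa" and the 2 GB heap are example values, not tested settings:

# 1) if the guest runs under libvirt/KVM, grow its RAM (values are in KiB)
virsh setmaxmem hadoopa 4194304 --config
virsh setmem hadoopa 4194304 --config

<!-- 2) then raise the reducer heap in mapred-site.xml -->
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx2048M</value>
</property>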



From: Arindam Choudhury [mailto:arindamchoudhury0@gmail.com]
Sent: 07 March 2013 18:01
To: user@hadoop.apache.org
Subject: error while running reduce

Hi,
I am trying to do a performance analysis of Hadoop on a virtual machine. When I try to run TeraSort with 2 GB of input data with 1 map and 1 reduce, the map finishes properly, but the reduce gives an error. I cannot understand why. Any help?

I have a single-node Hadoop deployment in a virtual machine. The F18 virtual machine has 1 core and 2 GB of memory.
my configuration:
core-site.xml
<configuration>
<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoopa.arindam.com:54310</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/${user.name}</value>
</property>
<property>
  <name>fs.inmemory.size.mb</name>
  <value>20</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
</configuration>

hdfs-site.xml
<configuration>
<property>
  <name>dfs.name.dir</name>
  <value>/home/hadoop/hadoop-dir/name-dir</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/home/hadoop/hadoop-dir/data-dir</value>
</property>
<property>
  <name>dfs.block.size</name>
  <value>2048000000</value>
  <final>true</final>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
</configuration>


mapred-site.xml
<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>hadoopa.arindam.com:54311</value>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>/home/hadoop/hadoop-dir/system-dir</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/home/hadoop/hadoop-dir/local-dir</value>
</property>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>
</configuration>
I created 2 GB of data to run TeraSort.

hadoop dfsadmin -report
Configured Capacity: 21606146048 (20.12 GB)
Present Capacity: 14480427242 (13.49 GB)
DFS Remaining: 12416368640 (11.56 GB)
DFS Used: 2064058602 (1.92 GB)
DFS Used%: 14.25%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 192.168.122.32:50010
Decommission Status : Normal
Configured Capacity: 21606146048 (20.12 GB)
DFS Used: 2064058602 (1.92 GB)
Non DFS Used: 7125718806 (6.64 GB)
DFS Remaining: 12416368640(11.56 GB)
DFS Used%: 9.55%
DFS Remaining%: 57.47%

But when I run TeraSort, I get the following error:

13/03/04 17:56:16 INFO mapred.JobClient: Task Id : attempt_201303041741_0002_r_000000_0, Status : FAILED
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/output/_temporary/_attempt_201303041741_0002_r_000000_0/part-00000 could only be replicated to 0 nodes, instead of 1

hadoop dfsadmin -report
Configured Capacity: 21606146048 (20.12 GB)
Present Capacity: 10582014209 (9.86 GB)
DFS Remaining: 8517738496 (7.93 GB)
DFS Used: 2064275713 (1.92 GB)
DFS Used%: 19.51%
Under replicated blocks: 2
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 192.168.122.32:50010
Decommission Status : Normal
Configured Capacity: 21606146048 (20.12 GB)
DFS Used: 2064275713 (1.92 GB)
Non DFS Used: 11024131839 (10.27 GB)
DFS Remaining: 8517738496(7.93 GB)
DFS Used%: 9.55%
DFS Remaining%: 39.42%


Thanks,
