Posted to mapreduce-user@hadoop.apache.org by Manuel de Ferran <ma...@gmail.com> on 2013/07/03 18:14:30 UTC

Could not get additional block while writing hundreds of files

Greetings all,

We are trying to import data into an HDFS cluster, but we hit a seemingly
random exception. We are trying to figure out the root cause (misconfiguration,
too much load, ...) and how to solve it.

The client writes hundreds of files with a replication factor of 3. The import
crashes sometimes at the beginning, sometimes close to the end, and in rare
cases it succeeds.
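
For reference, the write pattern is essentially the loop below. This is not
our actual importer, just a minimal equivalent sketch (paths, payload size and
file count are made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ImportSketch {
        public static void main(String[] args) throws Exception {
            // Picks up core-site.xml / hdfs-site.xml from the classpath.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            byte[] payload = new byte[64 * 1024]; // placeholder content
            for (int i = 0; i < 500; i++) {
                Path path = new Path("/log/" + System.currentTimeMillis() + "-" + i);
                // create(path, overwrite, bufferSize, replication, blockSize)
                FSDataOutputStream out =
                        fs.create(path, true, 4096, (short) 3, 64L * 1024 * 1024);
                out.write(payload);
                // The "could only be replicated to 0 nodes" error shows up while
                // writing or closing, when the DataStreamer asks for a new block.
                out.close();
            }
            fs.close();
        }
    }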

On failure, we see the following on the client side:
 DataStreamer Exception: org.apache.hadoop.ipc.RemoteException:
java.io.IOException: File /log/1372863795616 could only be replicated to 0
nodes, instead of 1
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
         ....

which seems to be a well-known error. We have followed the hints from the
Troubleshooting page, but we're still stuck: plenty of disk space available on
the datanodes, free inodes, well below the open-files limit, and all datanodes
up and running.
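
In case the details matter, these are the kinds of checks we ran on each
datanode (the mount point below is just an example, adjust to your layout):

    # free space and free inodes on the dfs.data.dir volume (example mount)
    df -h /data/hdfs
    df -i /data/hdfs

    # open file descriptor limit for the user running the datanode
    ulimit -n

    # datanodes as seen by the namenode (capacity, last contact, dead nodes)
    hadoop dfsadmin -report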

Note that other HDFS clients are still able to write files while the import is
running.

Here is the corresponding extract of the namenode log file:

2013-07-03 15:03:15,951 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
transactions: 46009 Total time for transactions(ms): 153Number of
transactions batched in Syncs: 5428 Number of syncs: 32889 SyncTimes(ms):
139555
2013-07-03 15:03:16,427 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
enough replicas, still in need of 3
2013-07-03 15:03:16,427 ERROR
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:root cause:java.io.IOException: File /log/1372863795616 could only be
replicated to 0 nodes, instead of 1
2013-07-03 15:03:16,427 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 9 on 9002, call addBlock(/log/1372863795616, DFSClient_1875494617,
null) from 192.168.1.141:41376: error: java.io.IOException: File
/log/1372863795616 could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /log/1372863795616 could only be replicated to 0
nodes, instead of 1
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)


During the process, fsck reports about 300 open files. The cluster is running
hadoop-1.0.3.

Any advice about the configuration? We have tried lowering
dfs.heartbeat.interval and we raised dfs.datanode.max.xcievers to 4k; should we
also raise dfs.datanode.handler.count?
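
For completeness, here is a sketch of the corresponding hdfs-site.xml entries
on the datanodes; the 4096 value is the one mentioned above, the other two
values are illustrative only (and, as far as I know, the datanodes need a
restart to pick up changes):

    <!-- hdfs-site.xml (sketch, values partly illustrative) -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>   <!-- raised, as mentioned above -->
    </property>
    <property>
      <name>dfs.datanode.handler.count</name>
      <value>10</value>     <!-- example value; default is 3 -->
    </property>
    <property>
      <name>dfs.heartbeat.interval</name>
      <value>1</value>      <!-- example of a lowered interval; default is 3 seconds -->
    </property>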


Thanks for your help

Re: Could not get additional block while writing hundreds of files

Posted by Manuel de Ferran <ma...@gmail.com>.
Hi Azuryy,

During the import, dfsadmin -report shows:

DFS Used%: 17.72%

Moreover, the import succeeds from time to time with the same data load. It
seems that some datanodes appear to be down from the namenode's point of view,
but why?
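
We are trying to confirm that from the namenode side with something like the
commands below (from memory, so the exact grep patterns and log paths may need
adjusting):

    # datanode view from the namenode: capacity, last contact, dead nodes
    hadoop dfsadmin -report

    # files currently open for write (we see around 300 during the import)
    hadoop fsck / -openforwrite -files | grep -c OPENFORWRITE

    # block placement failures on the namenode side (log path is an example)
    grep "Not able to place enough replicas" /var/log/hadoop/*namenode*.log | tail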



On Thu, Jul 4, 2013 at 3:31 AM, Azuryy Yu <az...@gmail.com> wrote:

> Hi Manuel,
>
> 2013-07-03 15:03:16,427 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
> enough replicas, still in need of 3
> 2013-07-03 15:03:16,427 ERROR
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:root cause:java.io.IOException: File /log/1372863795616 could only be
> replicated to 0 nodes, instead of 1
>
>
> This indicates you don't have enough space on HDFS. Can you check the
> cluster capacity used?
>
>
>
>
> On Thu, Jul 4, 2013 at 12:14 AM, Manuel de Ferran <
> manuel.deferran@gmail.com> wrote:
>
>> [original message quoted in full; trimmed here]
>>
>
>

Re: Could not get additional block while writing hundreds of files

Posted by Azuryy Yu <az...@gmail.com>.
Hi Manuel,

2013-07-03 15:03:16,427 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
enough replicas, still in need of 3
2013-07-03 15:03:16,427 ERROR
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:root cause:java.io.IOException: File /log/1372863795616 could only be
replicated to 0 nodes, instead of 1


This indicates you don't have enough space on HDFS. Can you check the
cluster capacity used?
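
For example, something along these lines should be enough (from memory):

    hadoop dfsadmin -report | grep "DFS Used%"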




On Thu, Jul 4, 2013 at 12:14 AM, Manuel de Ferran <manuel.deferran@gmail.com
> wrote:

> [original message quoted in full; trimmed here]
>
