Posted to hdfs-dev@hadoop.apache.org by James Kinley <ja...@gmail.com> on 2012/06/14 18:16:40 UTC

NameNode rollEdidLog: removing storage

Hi,

Our production cluster started reporting "too many open files" this
afternoon and subsequently was unable to save any snapshots to disk.
We have been able to recover it ok, but I would have expected the NN
to complain more if it cannot save a snapshot. All I saw in the log
was...

"WARN org.apache.hadoop.hdfs.server.common.Storage: rollEdidLog:
removing storage <local dir>"
"WARN org.apache.hadoop.hdfs.server.common.Storage: rollEdidLog:
removing storage <nfs dir>"

Do you think this should trigger the NN to enter safe mode? The longer
this goes unnoticed, the more data could be lost if the NN cannot be
recovered.
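
Until the NN itself reacts more loudly, one way to keep this from going
unnoticed is to watch the NN log for the warning quoted above. The
following is only a minimal sketch in Python, not anything that ships with
Hadoop: the function name, the sample lines, their timestamps and their
paths are all made up for illustration, and only the message text
("rollEdidLog: removing storage", spelled as it is logged) is taken from
the excerpt above.

```python
import re

# Message text copied from the log excerpt; "rollEdidLog" is the spelling as logged.
REMOVED_STORAGE = re.compile(
    r"WARN\s+\S*Storage:\s+rollEdidLog:\s+removing storage\s+(\S+)"
)


def removed_storage_dirs(lines):
    """Return the directories named in 'removing storage' warnings, in order."""
    dirs = []
    for line in lines:
        match = REMOVED_STORAGE.search(line)
        if match:
            dirs.append(match.group(1))
    return dirs


# Hypothetical sample lines: timestamps, paths and the INFO line are invented.
sample = [
    "2012-06-14 15:02:11,123 WARN org.apache.hadoop.hdfs.server.common.Storage: "
    "rollEdidLog: removing storage /data/nn",
    "2012-06-14 15:02:11,124 INFO some.other.Logger: unrelated message",
    "2012-06-14 15:02:11,125 WARN org.apache.hadoop.hdfs.server.common.Storage: "
    "rollEdidLog: removing storage /mnt/nfs/nn",
]

if __name__ == "__main__":
    for path in removed_storage_dirs(sample):
        print("NameNode dropped storage directory:", path)
```

In practice the lines would come from the NN log file rather than a list,
and any match would be fed into whatever alerting is already in place.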

Regards,

James.

Re: NameNode rollEdidLog: removing storage

Posted by James Kinley <ja...@gmail.com>.
Hi Todd,

Yes, sorry, I should have said: we are running CDH3u2.

Thanks, James.

On 14 Jun 2012, at 17:48, Todd Lipcon <to...@cloudera.com> wrote:

> Hi James,
> 
> Could you please let us know exactly what version of Hadoop you're
> running? This is an area that has had some bug fixes throughout the
> last year, so identifying the particular version is important.
> 
> -Todd

Re: NameNode rollEdidLog: removing storage

Posted by Todd Lipcon <to...@cloudera.com>.
Hi James,

Could you please let us know exactly what version of Hadoop you're
running? This is an area that has had some bug fixes throughout the
last year, so identifying the particular version is important.

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera