Posted to hdfs-user@hadoop.apache.org by Marcin Cylke <mc...@touk.pl> on 2013/01/30 11:37:52 UTC

JournalNode desynchronized

Hi

One of the machines running a JournalNode failed. I've restored that
machine's setup and would like to attach it to the existing JournalNode
quorum.

When I try to run it I get the following error:

ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE) cause:org.apache.hadoop.hdfs.qjournal.protocol.JournalNotFormattedException: Journal Storage Directory /hadoop/dfs/journalnode/hadoop-cluster not formatted
2013-01-28 12:13:45,050 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.getEditLogManifest from 10.10.105.5:57604: error: org.apache.hadoop.hdfs.qjournal.protocol.JournalNotFormattedException: Journal Storage Directory /hadoop/dfs/journalnode/hadoop-cluster not formatted
org.apache.hadoop.hdfs.qjournal.protocol.JournalNotFormattedException: Journal Storage Directory /hadoop/dfs/journalnode/hadoop-cluster not formatted
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkFormatted(Journal.java:442)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.getEditLogManifest(Journal.java:625)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.getEditLogManifest(JournalNodeRpcServer.java:177)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.getEditLogManifest(QJournalProtocolServerSideTranslatorPB.java:196)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:14028)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)


How can I fix this kind of issue? The JournalNode directory looks as follows:

bash-4.1$ ls -R /hadoop/dfs/journalnode/
/hadoop/dfs/journalnode/:
hadoop-cluster

/hadoop/dfs/journalnode/hadoop-cluster:
current

/hadoop/dfs/journalnode/hadoop-cluster/current:
committed-txid  last-promised-epoch

So there are no edits files in there, and the most reasonable fix would
be to sync them over somehow. The best solution that comes to mind is
stopping the cluster, copying all the edits to the "new" server, and
then starting the JournalNodes again.

Is there an easier, online way to do that?
I'd appreciate a solution that does not require formatting the NameNode :)

Regards
Marcin

Re: JournalNode desynchronized

Posted by Colin McCabe <cm...@alumni.cmu.edu>.
You can rsync from one of your other JournalNodes into the correct
directory on the new one. You can safely do this while the old one is
running.  In the future, there will be a way to do this without using rsync.

cheers,
Colin
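
The rsync approach above could be sketched roughly as follows. This is only
an illustration, not a tested procedure: the host name jn2.example.com, the
function names, and the RSYNC override are hypothetical. The journal path is
the one from the error message. The check for current/VERSION assumes that a
formatted journal directory carries a VERSION file, as other HDFS storage
directories do.

```shell
#!/bin/sh
# Sketch: pull a journal directory from a healthy JournalNode onto the
# rebuilt machine. Run it on the rebuilt machine as the user that owns
# the journal directory (hdfs in the log above).

# Copy the journal directory from a healthy JournalNode to the same path
# locally. -a preserves permissions and timestamps; the trailing slashes
# copy the directory contents instead of nesting the directory.
# Set RSYNC=echo to print the arguments instead of copying (hypothetical knob).
sync_journal() {
    src_host="$1"
    journal_dir="$2"
    ${RSYNC:-rsync} -a "${src_host}:${journal_dir}/" "${journal_dir}/"
}

# Succeed if the journal directory has a current/VERSION file.
# Assumption: this file is what marks the directory as formatted.
looks_formatted() {
    [ -f "$1/current/VERSION" ]
}
```

Possible usage, with a hypothetical source host:

    sync_journal jn2.example.com /hadoop/dfs/journalnode/hadoop-cluster
    looks_formatted /hadoop/dfs/journalnode/hadoop-cluster && echo "journal copied"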

