Posted to hdfs-user@hadoop.apache.org by Suresh V <ve...@gmail.com> on 2015/08/10 03:12:46 UTC
Active Namenode keeps crashing
In our HA setup, the active NameNode keeps crashing about once a week. The
cluster is mostly idle, with few jobs running and little user activity.
Below are logs from the journal nodes. Can someone help us with this, please?
2015-08-04 13:00:20,054 INFO server.Journal
(Journal.java:updateLastPromisedEpoch(315)) - Updating lastPromisedEpoch
from 9 to 10 for client /172.26.44.133
2015-08-04 13:00:20,175 INFO server.Journal
(Journal.java:scanStorageForLatestEdits(188)) - Scanning storage
FileJournalManager(root=/hadoop/hdfs/journal/HDPPROD)
2015-08-04 13:00:20,220 INFO server.Journal
(Journal.java:scanStorageForLatestEdits(194)) - Latest log is
EditLogFile(file=/hadoop/hdfs/journal/HDPPROD/current/edits_inprogress_0000000000000523903,first=0000000000000523903,last=0000000000000523925,inProgress=true,hasCorruptHeader=false)
2015-08-04 13:00:20,891 INFO server.Journal
(Journal.java:getSegmentInfo(687)) - getSegmentInfo(523903):
EditLogFile(file=/hadoop/hdfs/journal/HDPPROD/current/edits_inprogress_0000000000000523903,first=0000000000000523903,last=0000000000000523925,inProgress=true,hasCorruptHeader=false)
-> startTxId: 523903 endTxId: 523925 isInProgress: true
2015-08-04 13:00:20,891 INFO server.Journal
(Journal.java:prepareRecovery(731)) - Prepared recovery for segment 523903:
segmentState { startTxId: 523903 endTxId: 523925 isInProgress: true }
lastWriterEpoch: 9 lastCommittedTxId: 523924
2015-08-04 13:00:20,956 INFO server.Journal
(Journal.java:getSegmentInfo(687)) - getSegmentInfo(523903):
EditLogFile(file=/hadoop/hdfs/journal/HDPPROD/current/edits_inprogress_0000000000000523903,first=0000000000000523903,last=0000000000000523925,inProgress=true,hasCorruptHeader=false)
-> startTxId: 523903 endTxId: 523925 isInProgress: true
2015-08-04 13:00:20,956 INFO server.Journal
(Journal.java:acceptRecovery(817)) - Skipping download of log startTxId:
523903 endTxId: 523925 isInProgress: true: already have up-to-date logs
2015-08-04 13:00:20,989 INFO server.Journal
(Journal.java:acceptRecovery(850)) - Accepted recovery for segment 523903:
segmentState { startTxId: 523903 endTxId: 523925 isInProgress: true }
acceptedInEpoch: 10
2015-08-04 13:00:21,791 INFO server.Journal
(Journal.java:finalizeLogSegment(584)) - Validating log segment
/hadoop/hdfs/journal/HDPPROD/current/edits_inprogress_0000000000000523903
about to be finalized
2015-08-04 13:00:21,805 INFO namenode.FileJournalManager
(FileJournalManager.java:finalizeLogSegment(133)) - Finalizing edits file
/hadoop/hdfs/journal/HDPPROD/current/edits_inprogress_0000000000000523903
->
/hadoop/hdfs/journal/HDPPROD/current/edits_0000000000000523903-0000000000000523925
2015-08-04 13:00:22,257 INFO server.Journal
(Journal.java:startLogSegment(532)) - Updating lastWriterEpoch from 9 to 10
for client /172.26.44.133
2015-08-04 13:00:23,699 INFO ipc.Server (Server.java:run(2060)) - IPC
Server handler 4 on 8485, call
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.journal from
172.26.44.135:43678 Call#304302 Retry#0
java.io.IOException: IPC's epoch 9 is less than the last promised epoch 10
at
org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:414)
at
org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:442)
at
org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:342)
at
org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:148)
at
org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
at
org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
2015-08-06 19:13:14,012 INFO httpclient.HttpMethodDirector
(HttpMethodDirector.java:executeWithRetry(439)) - I/O exception
(org.apache.commons.httpclient.NoHttpResponseException) caught when
processing request: The server az-easthdpmnp02.metclouduseast.com failed to
respond
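For context on the IOException above: "IPC's epoch 9 is less than the last promised epoch 10" is the quorum-journal fencing check. Once a JournalNode has promised a newer epoch to a NameNode taking over as writer, it rejects edits from the previous writer, which then aborts. So this trace usually means a failover (or failover attempt) happened around 13:00:20. A rough shell sketch for pulling epoch transitions out of a JournalNode log, with the line format taken from the excerpt above (the log path in the comment is a placeholder; adjust for your install):

```shell
#!/usr/bin/env bash
# Sketch: list epoch transitions recorded by a JournalNode, so failover
# events can be lined up with NameNode crash times.

list_epoch_changes() {
  # Prints "TIMESTAMP epochName old -> new" for each epoch update line.
  grep -E 'Updating last(Promised|Writer)Epoch' "$1" |
    sed -E 's/^([0-9-]+ [0-9:,]+).*- Updating (last[A-Za-z]+Epoch) from ([0-9]+) to ([0-9]+).*/\1 \2 \3 -> \4/'
}

# Example (placeholder path):
# list_epoch_changes /var/log/hadoop/hdfs/hadoop-hdfs-journalnode.log
```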
Thank you
Suresh.
Re: Active Namenode keeps crashing
Posted by Artem Ervits <ar...@gmail.com>.
Check whether connectivity between the servers is stable; the error says it
can't reach one node. Also check that time is synchronized between the nodes.
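To make those checks concrete, here is a rough sketch assuming three JournalNodes on the default QJM RPC port 8485 (the port seen in the log); the hostnames below are placeholders, substitute your own:

```shell
#!/usr/bin/env bash
# Sketch: basic reachability checks for a QJM setup.
# Hostnames are placeholders; 8485 is the default JournalNode RPC port.
JOURNAL_NODES="jn1.example.com jn2.example.com jn3.example.com"
JN_PORT=8485

# Returns 0 if a TCP connection to host:port succeeds within 2 seconds.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

for h in $JOURNAL_NODES; do
  if port_open "$h" "$JN_PORT"; then
    echo "$h:$JN_PORT reachable"
  else
    echo "$h:$JN_PORT NOT reachable"
  fi
done

# For clock drift: with ntpd, 'ntpq -p' shows the offset per peer on each
# node; offsets should stay well under a second across the cluster.
```

Run this from each NameNode host so you catch one-directional network problems as well.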