Posted to common-dev@hadoop.apache.org by "stack (JIRA)" <ji...@apache.org> on 2007/06/25 21:53:25 UTC
[jira] Created: (HADOOP-1527) Region server won't start because logdir exists
Region server won't start because logdir exists
-----------------------------------------------
Key: HADOOP-1527
URL: https://issues.apache.org/jira/browse/HADOOP-1527
Project: Hadoop
Issue Type: Bug
Components: contrib/hbase
Reporter: stack
Assignee: stack
Starting and then impolitely stopping a cluster I came across the following:
2007-06-25 19:43:31,449 ERROR org.apache.hadoop.hbase.HRegionServer: Can not start region server because org.apache.hadoop.hbase.RegionServerRunningException: region server already running at 208.76.44.140:60010 because logdir exists
at org.apache.hadoop.hbase.HRegionServer.<init>(HRegionServer.java:447)
at org.apache.hadoop.hbase.HRegionServer.<init>(HRegionServer.java:372)
at org.apache.hadoop.hbase.HRegionServer.main(HRegionServer.java:1233)
Region server should recover or offer a recovery path when we run into this condition.
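To make the reported failure concrete, here is a minimal sketch of a startup check that refuses to run when a per-server log directory is already present. It does not reproduce the actual HRegionServer code: the class name, the `log_<host>_<port>` directory naming, and the use of the local filesystem via java.nio in place of the Hadoop FileSystem API are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; names and layout are assumptions, not HBase code.
class LogDirStartupCheck {
    /** Stand-in for RegionServerRunningException in the report. */
    static class ServerAlreadyRunningException extends IOException {
        ServerAlreadyRunningException(String message) {
            super(message);
        }
    }

    /** Assumed naming scheme: one log directory per host and port. */
    static Path logDirFor(Path root, String host, int port) {
        return root.resolve("log_" + host + "_" + port);
    }

    /**
     * Creates the log directory for this server, or fails if one already
     * exists (for example, left behind by an unclean shutdown).
     */
    static Path claimLogDir(Path root, String host, int port) throws IOException {
        Files.createDirectories(root);
        Path logDir = logDirFor(root, host, port);
        try {
            // createDirectory fails if the directory exists, so the
            // existence check and the creation happen in one step.
            return Files.createDirectory(logDir);
        } catch (FileAlreadyExistsException e) {
            throw new ServerAlreadyRunningException(
                "region server already running at " + host + ":" + port
                    + " because logdir exists");
        }
    }
}
```

With a check like this, a directory left over from a killed server is indistinguishable from one owned by a live server, which is why a restart after an impolite stop is refused.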
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HADOOP-1527:
----------------------------------
Attachment: HADOOP-1527-patch.txt
Revise for recent commits
[jira] Commented: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521516 ]
Jim Kellerman commented on HADOOP-1527:
---------------------------------------
This really is an abnormal condition, because if a region server dies, the master should split the region server's log (placing the records in the regions' directories) and then remove the region server log.
If a region server is starting up and discovers that a log directory exists which should belong exclusively to that server, that means that either:
- the master has not cleaned up the log yet (or perhaps never will if the master crashed before it could)
- another region server started and grabbed that port, so the starting region server should shut down.
In the former case, if the master crashed, we should provide a tool that can split the log so we can recover the regions that the previous region server instance was serving.
Otherwise I think that what is happening is the correct behavior.
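As a rough illustration of what "splitting the log" means here, the sketch below groups a dead server's log records by region, writes each group into that region's directory, and only then removes the server log. The record format (region name, a tab, then the edit), the `recovered.log` file name, and the use of local files via java.nio are assumptions for the example; this is not the HBase implementation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; record format and file names are assumptions.
class LogSplitSketch {
    /** Groups "region<TAB>edit" records by region, preserving order. */
    static Map<String, List<String>> groupByRegion(List<String> lines) {
        Map<String, List<String>> byRegion = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                // Refuse to continue rather than silently drop an edit.
                throw new IllegalArgumentException("malformed record: " + line);
            }
            byRegion.computeIfAbsent(line.substring(0, tab), k -> new ArrayList<>())
                    .add(line.substring(tab + 1));
        }
        return byRegion;
    }

    /** Writes each region's edits under its directory, then removes the log. */
    static void splitLog(Path serverLog, Path regionsRoot) throws IOException {
        List<String> lines = Files.readAllLines(serverLog, StandardCharsets.UTF_8);
        for (Map.Entry<String, List<String>> e : groupByRegion(lines).entrySet()) {
            Path regionDir = Files.createDirectories(regionsRoot.resolve(e.getKey()));
            Files.write(regionDir.resolve("recovered.log"), e.getValue(),
                    StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
        // Delete only after every region's edits have been written out.
        Files.delete(serverLog);
    }
}
```

Deleting the server log last means a crash partway through leaves the log in place, so the split can be retried.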
[jira] Updated: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HADOOP-1527:
----------------------------------
Status: Patch Available (was: Open)
[jira] Updated: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HADOOP-1527:
----------------------------------
Fix Version/s: 0.15.0
Affects Version/s: 0.15.0
Status: Patch Available (was: Open)
[jira] Commented: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521525 ]
Jim Kellerman commented on HADOOP-1527:
---------------------------------------
Even better, if the master discovers a stale entry in the root or meta regions, it should check whether the log file exists and split it before assigning the region to a new server.
This would even handle the case where the region server serving the root region died, because it is highly unlikely that a region server would have been serving only the root region.
So the plan of attack is to add a check in the master upon discovery of a stale entry in the root and meta regions, and to create a separate utility to recover a region server log in the unlikely event that a region server was serving only the root region.
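The master-side check in that plan might look roughly like the sketch below: before a region with a stale entry is handed to a new server, the master looks for the dead server's log and splits it if present. The `LogStore` interface, the method names, and the returned list of actions are all hypothetical, introduced only to show the ordering; the attached patch may be structured quite differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the ordering only; not the code in the patch.
class StaleEntrySketch {
    /** Assumed abstraction over wherever region server logs are kept. */
    interface LogStore {
        boolean logExists(String serverAddress);

        void splitAndRemove(String serverAddress);
    }

    /**
     * Handles a stale root/meta entry: recover the dead server's log
     * first, then assign the region. Returns the actions taken, in order.
     */
    static List<String> reassign(String regionName, String deadServer,
                                 String newServer, LogStore logs) {
        List<String> actions = new ArrayList<>();
        if (logs.logExists(deadServer)) {
            // Split before assignment so the new server sees the edits.
            logs.splitAndRemove(deadServer);
            actions.add("split log of " + deadServer);
        }
        actions.add("assign " + regionName + " to " + newServer);
        return actions;
    }
}
```

Because the log is removed as part of the split, a second stale entry pointing at the same dead server finds no log and goes straight to assignment.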
[jira] Updated: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HADOOP-1527:
----------------------------------
Attachment: HADOOP-1527-patch.txt
[jira] Assigned: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman reassigned HADOOP-1527:
-------------------------------------
Assignee: Jim Kellerman (was: stack)
[jira] Updated: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HADOOP-1527:
----------------------------------
Status: Open (was: Patch Available)
[jira] Commented: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521709 ]
Hadoop QA commented on HADOOP-1527:
-----------------------------------
+1
http://issues.apache.org/jira/secure/attachment/12364281/HADOOP-1527-patch.txt applied and successfully tested against trunk revision r568404.
Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/594/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/594/console
[jira] Updated: (HADOOP-1527) Region server won't start because logdir exists
Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HADOOP-1527:
----------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Passes tests. Committed.