You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by "mikhail (JIRA)" <ji...@apache.org> on 2009/09/03 22:32:57 UTC
[jira] Commented: (HBASE-1603) MR failed
"RetriesExhaustedException: Trying to contact region server Some server for
region TestTable..."
[ https://issues.apache.org/jira/browse/HBASE-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751141#action_12751141 ]
mikhail commented on HBASE-1603:
--------------------------------
I am getting what looks like a reappearance of this issue. I am trying to insert approximately 70 million pieces into an HBase instance (trunk version 808546 checked out in late August, so I assume it contains the fix from 1615). My mappers fail on flushCommit with the following:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=false, tries=9, numtries=10, i=234, listsize=1504, location=address: 172.18.56.122:60020, regioninfo: REGION => {NAME => 'dailies__exact_dup_stats.PER_SRC_ENTITY_TS,\x0DDennis\x20Bonnen,1251999874064', STARTKEY => '\x0DDennis\x20Bonnen', ENDKEY => '\x0ECHARITY\x20BASKET', ENCODED => 535563176, TABLE => {{NAME => 'dailies__exact_dup_stats.PER_SRC_ENTITY_TS', FAMILIES => [{NAME => 'd', VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'date_ranges', VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}, region=dailies__exact_dup_stats.PER_SRC_ENTITY_TS,\x0DDennis\x20Bonnen,1251999874064 for region dailies__exact_dup_stats.PER_SRC_ENTITY_TS,\x0DDennis\x20Bonnen,1251999874064, row '\x0DWalt\x20Stanfill', but failed after 10 attempts.
Exceptions:
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1040)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:582)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:448)
at com.textmap.freedonia.db.hbase.ArtifactHBaseExporter$HBaseArtifactExportMapper.map(ArtifactHBaseExporter.java:172)
at com.textmap.freedonia.db.hbase.ArtifactHBaseExporter$HBaseArtifactExportMapper.map(ArtifactHBaseExporter.java:120)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
On the master, I'm getting the following:
2009-09-03 16:23:28,331 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {server: 172.18.56.115:60020, regionname: -ROOT-,,0, startKey:
<>}
2009-09-03 16:23:38,583 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of 1 row(s) of meta region {server: 172.18.56.115:60020, regionname: -ROOT-,,0,
startKey: <>} complete
2009-09-03 16:24:28,147 INFO org.apache.hadoop.hbase.master.ServerManager: 14 region servers, 0 dead, average load 1.2
2009-09-03 16:24:28,257 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {server: 172.18.56.107:60020, regionname: .META.,,1, startKey:
<>}
2009-09-03 16:24:28,261 WARN org.apache.hadoop.hbase.master.BaseScanner: Scan one META region: {server: 172.18.56.107:60020, regionname: .META.,,1, startKey: <>}
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2261)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1877)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
at sun.reflect.GeneratedConstructorAccessor15.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
at org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:191)
at org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
at org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
at org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:135)
at org.apache.hadoop.hbase.Chore.run(Chore.java:68)
On some regionservers, the following repeats in the logs:
2009-09-03 16:25:28,255 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, call openScanner([B@ef0bc8a, startRow=, stopRow=, maxVersions=1, caching=-1, timeRange=[0,9223372036854775807), families={(family=info, columns={}}) from 172.18.56.105:35045: error: org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2261)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1877)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-09-03 16:25:30,371 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=3.2688217MB (3427608), Free=395.06866MB (414259528), Max=398.3375MB (417687136), Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0, Evicted=0, Ratios: Hit Ratio=NaN%, Miss Ratio=NaN%, Evicted/Run=NaN
On other regionservers, there are no exceptions, only messages like this:
2009-09-03 16:26:36,647 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=3.2688217MB (3427608), Free=395.06866MB (414259528), Max=398.3375MB (417687136)
, Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0, Evicted=0, Ratios: Hit Ratio=NaN%, Miss Ratio=NaN%, Evicted/Run=NaN
The web UI also goes down when the job fails -- master.jsp gives me the following:
Trying to contact region server 172.18.56.107:60020 for region .META.,,1, row '', but failed after 3 attempts.
Exceptions:
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2261)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1877)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2261)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1877)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2261)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1877)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
RequestURI=/master.jsp
Caused by:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 172.18.56.107:60020 for region .META.,,1, row '', but failed after 3 attempts.
Exceptions:
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2261)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1877)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2261)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1877)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2261)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1877)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:936)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:54)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:28)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.listTables(HConnectionManager.java:369)
at org.apache.hadoop.hbase.client.HBaseAdmin.listTables(HBaseAdmin.java:127)
at org.apache.hadoop.hbase.generated.master.master_jsp._jspService(master_jsp.java:125)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:324)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
Please let me know if this looks like the same problem or a different problem / API misuse on my part.
> MR failed "RetriesExhaustedException: Trying to contact region server Some server for region TestTable..."
> ----------------------------------------------------------------------------------------------------------
>
> Key: HBASE-1603
> URL: https://issues.apache.org/jira/browse/HBASE-1603
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Attachments: debugging-v4.patch
>
>
> Here is the master. Region TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246462685358 was split at 16:11:42,865. My MR job failed at 18:12:26,462 with this:
> {code}
> 2009-07-01 18:12:26,462 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region TestTable,�,1246464670313, row '�� ', but failed after 10 attempts.
> Exceptions:
> ...
> {code}
> Why after ten attempts did the client not find the region?
> {code}
> 2009-07-01 16:11:42,865 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246462685358: Daughters; TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313, TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3
> 2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 to aa0-000-15.u.powerset.com,60020,1246461673026
> 2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 to aa0-000-15.u.powerset.com,60020,1246461673026
> 2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3
> 2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 2 of 3
> 2009-07-01 16:11:45,906 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 3 of 3
> 2009-07-01 16:11:45,906 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 open on 208.76.44.142:60020
> 2009-07-01 16:11:45,906 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: updating row TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 in region .META.,,1 with startcode 1246461673026 and server 208.76.44.142:60020
> 2009-07-01 16:11:45,908 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 open on 208.76.44.142:60020
> 2009-07-01 16:11:45,908 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: updating row TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 in region .META.,,1 with startcode 1246461673026 and server 208.76.44.142:60020
> 2009-07-01 17:46:42,670 [IPC Server handler 0 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313: Daughters; TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246470379467, TestTable,\x00\x00\x08\x04\x05\x07\x02\x05\x04\x08,1246470379467 from aa0-000-15.u.powerset.com,60020,1246461673026; 5 of 7
> {code}
> Here is over on the regionserver:
> {code}
> 2009-07-01 16:11:42,865 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246462685358: Daughters; TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313, TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3
> 2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 to aa0-000-15.u.powerset.com,60020,1246461673026
> 2009-07-01 16:11:42,866 [IPC Server handler 2 on 60001] INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 to aa0-000-15.u.powerset.com,60020,1246461673026
> 2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 1 of 3
> 2009-07-01 16:11:45,905 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 2 of 3
> 2009-07-01 16:11:45,906 [IPC Server handler 8 on 60001] INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 from aa0-000-15.u.powerset.com,60020,1246461673026; 3 of 3
> 2009-07-01 16:11:45,906 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 open on X.X.X.142:60020
> 2009-07-01 16:11:45,906 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: updating row TestTable,\x00\x00\x06\x05\x01\x05\x07\x09\x08\x00,1246464670313 in region .META.,,1 with startcode 1246461673026 and server X.X.X4.142:60020
> 2009-07-01 16:11:45,908 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 open on X.X.X.142:60020
> 2009-07-01 16:11:45,908 [HMaster] INFO org.apache.hadoop.hbase.master.RegionServerOperation: updating row TestTable,\x00\x01\x01\x04\x04\x07\x02\x08\x08\x03,1246464670313 in region .META.,,1 with startcode 1246461673026 and server X.X.X.142:60020
> {code}
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.