Posted to user@hbase.apache.org by Chenxi Tong <tc...@gmail.com> on 2018/04/28 06:45:47 UTC

A machine running an RS and DN crashed, then the HBase handlers filled up and cluster requests dropped rapidly

Dear all,

I'm writing to ask for help; many thanks in advance.

Some days ago our production HBase cluster ran into trouble. At about 21:52
on Apr 23, a machine suddenly crashed, taking down the RegionServer and
DataNode running on it.

After that, the handlers on about six RegionServers filled up, and the
cluster's request throughput dropped rapidly. The cluster did not recover
until we stopped those six RegionServers.

We guessed that the RPC call queue (numCallsInGeneralQueue) had backed up,
but when we increased numCallsInGeneralQueue, decreased the handler count,
and cut off the network of one machine, we could not reproduce the scenario.
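For context, the knobs we changed live in hbase-site.xml. A minimal sketch of the relevant properties (the values shown are illustrative, not our production settings):

```xml
<!-- hbase-site.xml: RPC handler and call-queue sizing (illustrative values) -->
<property>
  <!-- number of RPC handler threads per RegionServer -->
  <name>hbase.regionserver.handler.count</name>
  <value>30</value>
</property>
<property>
  <!-- upper bound on queued calls; roughly what numCallsInGeneralQueue counts against -->
  <name>hbase.ipc.server.max.callqueue.length</name>
  <value>300</value>
</property>
```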

So I would like to ask: how can we handle this scenario gracefully, and how
can we reproduce it for testing?
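One detail that may explain why a plain network cut did not reproduce it: silently dropping packets makes connections hang until the socket timeout fires, whereas a genuinely crashed host typically yields an immediate NoRouteToHostException via ICMP host-unreachable. A sketch of simulating both on a test cluster (hypothetical address; run as root on every other node):

```shell
DEAD_HOST=10.0.0.42   # hypothetical address of the "crashed" RS/DN machine

# Variant 1: blackhole the host -- connections hang until the socket timeout,
# which is the mode that ties up handler threads the longest.
iptables -A OUTPUT -d "$DEAD_HOST" -j DROP

# Variant 2: fail fast with ICMP host-unreachable -- clients see
# java.net.NoRouteToHostException, matching the log above.
# iptables -A OUTPUT -d "$DEAD_HOST" -j REJECT --reject-with icmp-host-unreachable

# Clean up afterwards:
# iptables -D OUTPUT -d "$DEAD_HOST" -j DROP
```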

Error log:

2018-04-23 21:57:11,472 INFO [StoreFileOpenerThread-info-1] regionserver.StoreFile$Reader: Loaded Delete Family Bloom (CompoundBloomFilter) metadata for 1bade4c9adb6444baebf9e797a037f78
2018-04-23 21:57:14,151 WARN [StoreFileOpenerThread-info-1] hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.net.NoRouteToHostException: No route to host
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2884)
    at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:747)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:662)
    at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:326)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:570)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:793)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:840)
    at java.io.DataInputStream.readFully(DataInputStream.java:195)
    at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:390)
    at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:482)
    at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:525)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1164)
    at org.apache.hadoop.hbase.regionserver.StoreFileInfo.open(StoreFileInfo.java:259)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:427)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:528)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:518)
    at org.apache.hadoop.hbase.regionserver.HStore.createStoreFileAndReader(HStore.java:667)
    at org.apache.hadoop.hbase.regionserver.HStore.access$000(HStore.java:119)
    at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:534)
    at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:531)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2018-04-23 21:57:14,151 WARN [StoreFileOpenerThread-info-1] hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.net.NoRouteToHostException: No route to host

Re: A machine running an RS and DN crashed, then the HBase handlers filled up and cluster requests dropped rapidly

Posted by Stack <st...@duboce.net>.
On Fri, Apr 27, 2018 at 11:45 PM, Chenxi Tong <tc...@gmail.com> wrote:

>
>
> Dear all,
>
> I'm writing to ask for help; many thanks in advance.
>
> Some days ago our production HBase cluster ran into trouble. At about 21:52
> on Apr 23, a machine suddenly crashed, taking down the RegionServer and
> DataNode running on it.
>
> After that, the handlers on about six RegionServers filled up, and the
> cluster's request throughput dropped rapidly. The cluster did not recover
> until we stopped those six RegionServers.
>
> We guessed that the RPC call queue (numCallsInGeneralQueue) had backed up,
> but when we increased numCallsInGeneralQueue, decreased the handler count,
> and cut off the network of one machine, we could not reproduce the scenario.
>
> So I would like to ask: how can we handle this scenario gracefully, and how
> can we reproduce it for testing?
>
> Error log:
>
> 2018-04-23 21:57:11,472 INFO [StoreFileOpenerThread-info-1] regionserver.StoreFile$Reader: Loaded Delete Family Bloom (CompoundBloomFilter) metadata for 1bade4c9adb6444baebf9e797a037f78
>
> 2018-04-23 21:57:14,151 WARN [StoreFileOpenerThread-info-1] hdfs.BlockReaderFactory: I/O error constructing remote block reader.
>
> java.net.NoRouteToHostException: No route to host
>


You need to fix this. Incidence of the above will cause havoc on your
clusters.
Thanks,
S
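A quick way to check whether any RegionServer host is in this state, as a sketch using standard Linux tools (the address is hypothetical; 50010 is the default DataNode data-transfer port, adjust for your cluster):

```shell
DN_HOST=10.0.0.42        # hypothetical DataNode address to probe

ip route get "$DN_HOST"              # does the kernel have a route at all?
ping -c 1 -W 2 "$DN_HOST"            # basic reachability
nc -z -w 2 "$DN_HOST" 50010          # is the DataNode port reachable?
```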



