Posted to issues@hbase.apache.org by "Tommy Li (JIRA)" <ji...@apache.org> on 2019/01/26 00:38:00 UTC

[jira] [Comment Edited] (HBASE-21775) The BufferedMutator doesn't ever refresh region location cache

    [ https://issues.apache.org/jira/browse/HBASE-21775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752856#comment-16752856 ] 

Tommy Li edited comment on HBASE-21775 at 1/26/19 12:37 AM:
------------------------------------------------------------

[~stack] It definitely needs to go to branch-2. I haven't tested this on version 1, but I took a brief look at the code and that condition is [the same|https://github.com/apache/hbase/blob/branch-1.4/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java#L1259], so yes, this can also go to branch-1.

 



> The BufferedMutator doesn't ever refresh region location cache
> --------------------------------------------------------------
>
>                 Key: HBASE-21775
>                 URL: https://issues.apache.org/jira/browse/HBASE-21775
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>            Reporter: Tommy Li
>            Assignee: Tommy Li
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HBASE-21775.master.001.patch
>
>
> I noticed in some of my writing jobs that the BufferedMutator would get stuck retrying writes against a dead server.
> {code:java}
> 19/01/18 15:15:47 INFO [Executor task launch worker for task 0] client.AsyncRequestFutureImpl: #2, waiting for 1  actions to finish on table: dummy_table
> 19/01/18 15:15:54 WARN [htable-pool3-t56] client.AsyncRequestFutureImpl: id=2, table=dummy_table, attempt=15/21, failureCount=1ops, last exception=org.apache.hadoop.hbase.DoNotRetryIOException: Operation rpcTimeout on <SERVER>,17020,1547848193782, tracking started Fri Jan 18 14:55:37 PST 2019; NOT retrying, failed=1 -- final attempt!
> 19/01/18 15:15:54 ERROR [Executor task launch worker for task 0] IngestRawData.map(): [B@258bc2c7: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: Operation rpcTimeout: 1 time, servers with issues: <SERVER>,17020,1547848193782
> {code}
>  
> After the single remaining action permanently failed, it would resume progress only to get stuck again retrying against the same dead server:
> {code:java}
> 19/01/18 15:21:18 INFO [Executor task launch worker for task 0] client.AsyncRequestFutureImpl: #2, waiting for 1  actions to finish on table: dummy_table
> 19/01/18 15:21:18 INFO [Executor task launch worker for task 0] client.AsyncRequestFutureImpl: #2, waiting for 1  actions to finish on table: dummy_table
> 19/01/18 15:21:20 INFO [htable-pool3-t55] client.AsyncRequestFutureImpl: id=2, table=dummy_table, attempt=6/21, failureCount=1ops, last exception=java.net.ConnectException: Call to <SERVER> failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out: <SERVER> on <SERVER>,17020,1547848193782, tracking started null, retrying after=20089ms, operationsToReplay=1
> {code}
>  
> Only restarting the client process to generate a new BufferedMutator instance would fix the issue, and only until the next regionserver crash.
>  The logs I've pasted show the issue happening with a ConnectTimeoutException, but we've also seen it with NotServingRegionException and some others.
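
To illustrate the symptom described above, here is a minimal toy model, not HBase code. The class `StaleLocationCacheSketch`, its fields, and the `invalidateOnFailure` flag are all hypothetical names made up for this sketch. It assumes a client that caches a region's server and only looks it up again when the cache entry is missing, and it compares retrying with and without dropping the cached entry after a failed attempt.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a client-side region location cache. Hypothetical; not HBase code. */
public class StaleLocationCacheSketch {

    /** Stand-in for cluster metadata: region name -> server currently hosting it. */
    final Map<String, String> cluster = new HashMap<>();

    /** Client-side cache: region name -> server the client believes hosts it. */
    final Map<String, String> cache = new HashMap<>();

    /** Returns the cached server, consulting the cluster only on a cache miss. */
    String locate(String region) {
        return cache.computeIfAbsent(region, cluster::get);
    }

    /**
     * Tries to write to a region. An attempt succeeds only when the located
     * server is the one currently hosting the region.
     *
     * @return the attempt number that succeeded, or -1 if retries were exhausted
     */
    int write(String region, int maxAttempts, boolean invalidateOnFailure) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String server = locate(region);
            if (server != null && server.equals(cluster.get(region))) {
                return attempt;
            }
            if (invalidateOnFailure) {
                // Drop the stale entry so the next attempt looks the region up again.
                cache.remove(region);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        StaleLocationCacheSketch client = new StaleLocationCacheSketch();
        client.cluster.put("region-1", "server-a");
        System.out.println(client.write("region-1", 21, false)); // first write warms the cache

        // server-a dies and the region is reassigned to server-b.
        client.cluster.put("region-1", "server-b");
        System.out.println(client.write("region-1", 21, false)); // every retry hits the stale entry
        System.out.println(client.write("region-1", 21, true));  // recovers once the entry is dropped
    }
}
```

In this model, the write without invalidation exhausts all 21 attempts against the old server, which mirrors the "stuck retrying against the same dead server" behaviour in the logs. Whether the real client fails for exactly this reason depends on the condition in AsyncProcess referenced in the comment above.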



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)