Posted to issues@kudu.apache.org by "Mike Percy (JIRA)" <ji...@apache.org> on 2019/03/04 21:44:00 UTC

[jira] [Commented] (KUDU-1868) Java client mishandles socket read timeouts for scans

    [ https://issues.apache.org/jira/browse/KUDU-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783805#comment-16783805 ] 

Mike Percy commented on KUDU-1868:
----------------------------------

Merged as part of these patches from Will:
 * [https://gerrit.cloudera.org/c/12338/]
 * [https://gerrit.cloudera.org/c/12363/]


> Java client mishandles socket read timeouts for scans
> -----------------------------------------------------
>
>                 Key: KUDU-1868
>                 URL: https://issues.apache.org/jira/browse/KUDU-1868
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.2.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Will Berkeley
>            Priority: Major
>              Labels: backup
>
> Scan calls from the Java client that take more than the socket read timeout get retried (unless the operation timeout has expired) instead of being killed. Users will see this:
> {code}
> org.apache.kudu.client.NonRecoverableException: Invalid call sequence ID in scan request
> {code}
> Note that the right behavior here would still end up killing the scanner, so this is really a problem the user has to deal with! It's usually caused by slow IO, combined with very selective scans.
> Workaround: set defaultSocketReadTimeoutMs higher, ideally equal to defaultOperationTimeoutMs (the defaults are 10 and 30 seconds respectively). But really the user should investigate why single scans are so slow.
> One potentially easy fix is to handle retries differently for scanners so that the user gets a nicer exception. A harder fix is to handle socket read timeouts completely differently: the timeout should be per-RPC and not per-TabletClient as it is right now.
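
To illustrate why the workaround in the description helps, here is a minimal sketch of the timeout interaction as the issue describes it. It is a hypothetical model, not Kudu client code: the class ScanTimeoutSketch, the Outcome enum, and both helper methods are invented for illustration. In the real Java client the two values are set on KuduClient.KuduClientBuilder through builder methods named after the settings quoted above.

```java
/**
 * Hypothetical model of the timeout interaction described in KUDU-1868.
 * None of these names come from the Kudu client; they are assumptions
 * made purely for illustration.
 */
final class ScanTimeoutSketch {

    /** What this model does with a scan RPC that has been outstanding for a while. */
    enum Outcome { KEEP_WAITING, RETRY_SCAN, FAIL_OPERATION }

    /**
     * Models the reported behavior: once the socket read timeout passes, the
     * scan call is retried unless the operation timeout has already expired.
     * Per the issue, that retry is what surfaces to the user as
     * "Invalid call sequence ID in scan request".
     */
    static Outcome onElapsed(long elapsedMs, long socketReadTimeoutMs, long operationTimeoutMs) {
        if (elapsedMs >= operationTimeoutMs) {
            return Outcome.FAIL_OPERATION;   // the operation deadline wins
        }
        if (elapsedMs >= socketReadTimeoutMs) {
            return Outcome.RETRY_SCAN;       // the problematic retry window
        }
        return Outcome.KEEP_WAITING;
    }

    /** In this model a retry window exists only when the socket timeout is the shorter one. */
    static boolean canTriggerRetry(long socketReadTimeoutMs, long operationTimeoutMs) {
        return socketReadTimeoutMs < operationTimeoutMs;
    }

    public static void main(String[] args) {
        long slowScanMs = 12_000;

        // Defaults quoted in the issue: 10 s socket read timeout, 30 s operation timeout.
        System.out.println(onElapsed(slowScanMs, 10_000, 30_000));

        // Workaround: socket read timeout raised to match the operation timeout.
        System.out.println(onElapsed(slowScanMs, 30_000, 30_000));
    }
}
```

With the default values, a scan call that takes 12 seconds falls into the retry window. With the two timeouts equal, the same call keeps waiting and the only remaining failure mode in this model is the operation timeout itself.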


