Posted to dev@hbase.apache.org by "Jim Kellerman (JIRA)" <ji...@apache.org> on 2009/06/02 04:45:07 UTC
[jira] Commented: (HBASE-1177) Delay when client is located on the same node as the regionserver
[ https://issues.apache.org/jira/browse/HBASE-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715346#action_12715346 ]
Jim Kellerman commented on HBASE-1177:
--------------------------------------
Based on the graph https://issues.apache.org/jira/secure/attachment/12409623/zoom+of+columns+vs+round-trip+blowup.jpg
the blow-up in round-trip time happens between 8 and 23 columns (approximately - different runs will have different end points), but the amount of time spent in getRow remains pretty constant through this range.
For larger numbers of columns (as depicted in https://issues.apache.org/jira/secure/attachment/12409622/getRow+%2B+round-trip+vs+%23+columns.jpg ), getRow and round-trip time seem to scale pretty linearly.
This is pretty strange, but at least it doesn't seem related to server-side I/O.
> Delay when client is located on the same node as the regionserver
> -----------------------------------------------------------------
>
> Key: HBASE-1177
> URL: https://issues.apache.org/jira/browse/HBASE-1177
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.19.0
> Environment: Linux 2.6.25 x86_64
> Reporter: Jonathan Gray
> Assignee: Jim Kellerman
> Priority: Blocker
> Fix For: 0.20.0
>
> Attachments: Contribution of getClosest to getRow time.jpg, Contribution of next to getRow time.jpg, Contribution of seekTo to getClosest time.jpg, getRow + round-trip vs # columns.jpg, getRow times.jpg, ReadDelayTest.java, screenshot-1.jpg, screenshot-2.jpg, screenshot-3.jpg, screenshot-4.jpg, zoom of columns vs round-trip blowup.jpg
>
>
> During testing of HBASE-80, we uncovered a strange 40ms delay for random reads. We ran a series of tests and found that it only happens when the client is on the same node as the regionserver (RS) and for a certain range of payloads (not tied to the number or size of columns, only the total payload). It appears to be precisely 40ms every time.
> We are unsure whether this is particular to our architecture, but it happens on every node we have tried. The issue goes away completely with very large payloads or when the client is moved to another node.
> Will post a test program tomorrow in case anyone can test on a different architecture.
> Marking this a blocker for 0.20. Since this happens when an MR task runs local to the RS, which is exactly what we try to do, we might also consider making it a blocker for 0.19.1.
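A repeatable, precisely-40ms stall on Linux is the classic signature of TCP delayed ACKs interacting with Nagle's algorithm, which bites write-write-read patterns with small payloads; that would also fit the delay vanishing for very large payloads. The sketch below is hypothetical (it is not the attached ReadDelayTest.java, and the names are illustrative): it reproduces the write-write-read shape against a local echo server and applies the usual mitigation, Socket.setTcpNoDelay(true).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo of the write-write-read pattern that can trigger a
// ~40ms Nagle/delayed-ACK stall, plus the usual TCP_NODELAY mitigation.
public class NagleDemo {

    // Performs one write-write-read round trip against a local echo
    // server and returns the echoed bytes as a String.
    static String roundTrip() throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            // Minimal echo server running in a background thread.
            Thread echo = new Thread(() -> {
                try (Socket s = server.accept()) {
                    InputStream in = s.getInputStream();
                    OutputStream out = s.getOutputStream();
                    byte[] buf = new byte[64];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        out.write(buf, 0, n);
                        out.flush();
                    }
                } catch (IOException ignored) {
                }
            });
            echo.start();

            try (Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
                // Mitigation: disable Nagle so small writes are not held
                // back waiting for the peer's (possibly delayed) ACK.
                client.setTcpNoDelay(true);
                OutputStream out = client.getOutputStream();
                DataInputStream in = new DataInputStream(client.getInputStream());

                // Two small writes followed by a read: the shape that
                // interacts badly with Nagle + delayed ACKs.
                out.write("request-".getBytes());
                out.write("payload".getBytes());
                out.flush();

                byte[] reply = new byte[15];
                in.readFully(reply);
                return new String(reply);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        String reply = roundTrip();
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(reply + " echoed in " + ms + " ms");
    }
}
```

If Nagle is indeed the culprit, removing the setTcpNoDelay(true) call and running on Linux would be expected to push the measured round trip from well under a millisecond to roughly 40ms.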
--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.