Posted to dev@hbase.apache.org by "Dave Latham (JIRA)" <ji...@apache.org> on 2010/02/23 00:44:27 UTC

[jira] Commented: (HBASE-2248) New MemStoreScanner copies memstore for each scan, makes short scans slow

    [ https://issues.apache.org/jira/browse/HBASE-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836994#action_12836994 ] 

Dave Latham commented on HBASE-2248:
------------------------------------

After doing a flush on the table, the scans are about 100x faster.

> New MemStoreScanner copies memstore for each scan, makes short scans slow
> -------------------------------------------------------------------------
>
>                 Key: HBASE-2248
>                 URL: https://issues.apache.org/jira/browse/HBASE-2248
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3
>            Reporter: Dave Latham
>             Fix For: 0.20.4
>
>         Attachments: threads.txt
>
>
> HBASE-2037 introduced a new MemStoreScanner which triggers a ConcurrentSkipListMap.buildFromSorted clone of the memstore and snapshot when starting a scan.
> After upgrading to 0.20.3, we noticed a big slowdown in our use of short scans. Some of our data represents a time series. The data is stored in time series order, MR jobs often insert/update new data at the end of the series, and queries usually have to pick up some or all of the series. These are often scans of 0-100 rows at a time. To load one page, we'll observe about 20 such scans being triggered concurrently, and they take 2 seconds to complete. Doing a thread dump of a region server shows many threads in ConcurrentSkipListMap.buildFromSorted, which traverses the entire map of key values to copy it.
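
The cost described above can be reproduced outside HBase with a plain ConcurrentSkipListMap. The following is only a minimal sketch, not the MemStoreScanner code: the class ShortScanSketch, its methods, and the row-key format are hypothetical names made up for illustration. It contrasts a scan that clones the whole map first (work proportional to the map size, however few rows are read) with a scan over a tailMap view (a seek followed by reading only the requested rows). The view is weakly consistent rather than a point-in-time copy, so it is not a drop-in replacement where snapshot semantics are required.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class ShortScanSketch {

    // Hypothetical stand-in for a memstore: sorted row key -> value.
    static ConcurrentSkipListMap<String, String> buildStore(int rows) {
        ConcurrentSkipListMap<String, String> store = new ConcurrentSkipListMap<>();
        for (int i = 0; i < rows; i++) {
            store.put(String.format("row-%08d", i), "v" + i);
        }
        return store;
    }

    // Copy-per-scan: clone() copies every entry before the scan reads anything,
    // so even a 5-row scan pays for the full map.
    static List<String> scanWithCopy(ConcurrentSkipListMap<String, String> store,
                                     String startRow, int limit) {
        ConcurrentSkipListMap<String, String> copy = store.clone();
        return firstKeys(copy.tailMap(startRow, true), limit);
    }

    // View-per-scan: tailMap() returns a live view, so the scan only seeks to
    // startRow and reads `limit` keys. The view is weakly consistent.
    static List<String> scanWithView(ConcurrentSkipListMap<String, String> store,
                                     String startRow, int limit) {
        return firstKeys(store.tailMap(startRow, true), limit);
    }

    // Collects at most `limit` keys from the start of the given view.
    static List<String> firstKeys(ConcurrentNavigableMap<String, String> view, int limit) {
        List<String> out = new ArrayList<>();
        for (String key : view.keySet()) {
            if (out.size() >= limit) {
                break;
            }
            out.add(key);
        }
        return out;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> store = buildStore(500_000);
        String startRow = "row-00499990";

        long t0 = System.nanoTime();
        List<String> viaCopy = scanWithCopy(store, startRow, 5);
        long t1 = System.nanoTime();
        List<String> viaView = scanWithView(store, startRow, 5);
        long t2 = System.nanoTime();

        // Timings vary by machine and JVM warm-up; they are printed only for comparison.
        System.out.println("copy scan: " + (t1 - t0) / 1_000 + " us, rows=" + viaCopy.size());
        System.out.println("view scan: " + (t2 - t1) / 1_000 + " us, rows=" + viaView.size());
    }
}
```

Both methods return the same rows when the map is not modified during the scan; only the amount of work done before the first row is returned differs.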

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.