Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2011/03/23 18:42:05 UTC
[jira] [Commented] (HBASE-3693) isMajorCompaction() check triggers
lots of listStatus DFS RPC calls from HBase
[ https://issues.apache.org/jira/browse/HBASE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010250#comment-13010250 ]
Jean-Daniel Cryans commented on HBASE-3693:
-------------------------------------------
Wow, good on you for finding this! +1
> isMajorCompaction() check triggers lots of listStatus DFS RPC calls from HBase
> ------------------------------------------------------------------------------
>
> Key: HBASE-3693
> URL: https://issues.apache.org/jira/browse/HBASE-3693
> Project: HBase
> Issue Type: Improvement
> Reporter: Kannan Muthukkaruppan
> Assignee: Liyin Tang
>
> We noticed that there are lots of listStatus calls on the ColumnFamily directories within each region, coming from this code path:
> {code}
> compactionSelection()
> --> isMajorCompaction
> --> getLowestTimestamp()
> --> FileStatus[] stats = fs.listStatus(p);
> {code}
> So every compactionSelection() call takes this hit. While not immediately an issue, log inspection shows that this currently accounts for quite a large number of RPCs to the namenode, which seems like unnecessary load to be sending there.
> Seems like it would be easy to cache the timestamp for each opened/created StoreFile, in memory, in the region server, and avoid going to DFS each time for this information.
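
The caching idea described above could look roughly like the following. This is only a sketch under stated assumptions, not the patch for this issue: the class name StoreFileTimestampCache, its methods, and the dfsLookup callback are all hypothetical, and the callback stands in for whatever DFS call (such as the listStatus shown in the code path) would otherwise run on every check.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToLongFunction;

/**
 * Hypothetical sketch: remembers one timestamp per store file path so that
 * repeated compaction checks do not go back to DFS for the same file.
 */
class StoreFileTimestampCache {
    // Path of an opened/created store file -> its timestamp (millis).
    private final Map<String, Long> timestamps = new ConcurrentHashMap<>();

    // Assumption: stands in for the real DFS lookup, i.e. the expensive RPC.
    private final ToLongFunction<String> dfsLookup;

    StoreFileTimestampCache(ToLongFunction<String> dfsLookup) {
        this.dfsLookup = dfsLookup;
    }

    /** Returns the cached timestamp, calling the DFS lookup only on a miss. */
    long getTimestamp(String path) {
        return timestamps.computeIfAbsent(path, p -> dfsLookup.applyAsLong(p));
    }

    /** Drops the entry for a file that has been closed or compacted away. */
    void remove(String path) {
        timestamps.remove(path);
    }

    /** Lowest timestamp across the given files, served from memory when possible. */
    long lowestTimestamp(Iterable<String> paths) {
        long lowest = Long.MAX_VALUE;
        for (String path : paths) {
            lowest = Math.min(lowest, getTimestamp(path));
        }
        return lowest;
    }
}
```

With a cache like this, repeated calls to lowestTimestamp over the same set of files would cost one lookup per file in total rather than one per file per check. Entries would need to be removed when a file is closed or replaced by a compaction, which is what remove is meant to cover.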