Posted to common-dev@hadoop.apache.org by "stack (JIRA)" <ji...@apache.org> on 2007/12/22 07:11:43 UTC
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval configurable
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Attachment: index.patch
Adds the ability to configure the mapfile index interval. Leave it at the default for now; hdfs in TRUNK is 50% slower doing PerformanceEvaluation. Let me figure out why before changing the default from 128 to something like 16 or 32.
M src/contrib/hbase/conf/hbase-default.xml
Add hbase.io.index.interval
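The new property would be declared in hbase-default.xml roughly as follows (a sketch: the property name and the 128 default come from this issue, but the description text is mine, not the patch's):

```xml
<property>
  <name>hbase.io.index.interval</name>
  <value>128</value>
  <description>The interval at which mapfile entries are added to the
  in-memory index: every Nth entry is indexed. Smaller values speed up
  random reads at the cost of a larger in-memory index.</description>
</property>
```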
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java
Add an HbaseMapFile. Move the hbase'isms into it and have the bloom filter
etc. subclass it. Reads hbase.io.index.interval when writing the index file.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
Count hstorefile entries. Emit the count when logging at DEBUG.
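For readers unfamiliar with why the index interval matters: a MapFile keeps only every Nth key in its in-memory index, so a random read binary-searches that sparse index and then scans forward at most N data entries. The following self-contained sketch (not HBase or Hadoop code; class and method names are hypothetical) illustrates the mechanism:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model of a MapFile-style sparse index. Only every interval-th key
 * is held in memory; a lookup binary-searches the sparse index, then
 * scans forward ("next()") at most `interval` entries in the data.
 */
public class SparseIndexSketch {
    private final List<String> keys;   // all keys, sorted (the "data file")
    private final List<Integer> index; // positions of every interval-th key
    private final int interval;

    public SparseIndexSketch(List<String> sortedKeys, int indexInterval) {
        this.keys = sortedKeys;
        this.interval = indexInterval;
        this.index = new ArrayList<>();
        for (int i = 0; i < sortedKeys.size(); i += indexInterval) {
            index.add(i); // index every interval-th entry only
        }
    }

    /** Returns the position of key, or -1; scans at most `interval` entries. */
    public int seek(String key) {
        // Binary-search the sparse index for the last indexed key <= key.
        int lo = 0, hi = index.size() - 1, start = 0;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            if (keys.get(index.get(mid)).compareTo(key) <= 0) {
                start = index.get(mid);
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        // Linear scan from the indexed position, bounded by the interval.
        for (int i = start; i < keys.size() && i < start + interval; i++) {
            if (keys.get(i).equals(key)) {
                return i;
            }
        }
        return -1;
    }
}
```

With interval 1 every key is indexed and the scan is free; with interval 128 each random read may scan up to 127 entries past the indexed position, which is the cost the numbers below are measuring.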
> [hbase] Make mapfile index interval configurable
> ------------------------------------------------
>
> Key: HADOOP-2485
> URL: https://issues.apache.org/jira/browse/HADOOP-2485
> Project: Hadoop
> Issue Type: Improvement
> Components: contrib/hbase
> Reporter: stack
> Priority: Minor
> Fix For: 0.16.0
>
> Attachments: index.patch
>
>
> The default mapfile index interval is every 128 entries. Basic tests show the PerformanceEvaluation mapfile test random-reading 100k records in 60-plus seconds. If the index interval is set to 1, so we don't have to call next() repeatedly hunting for our record, 100k random reads take 7 seconds. This is using the local filesystem. If I set it to 16, it takes 12 seconds.
> Testing PerformanceEvaluation random reads against hbase with the interval set to 16, we run 50% faster (hdfs is in the picture).
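The tradeoff behind those numbers can be put in back-of-the-envelope form (my arithmetic, not anything from the patch): with interval N, a uniformly random read lands on average (N-1)/2 entries past the nearest indexed key, while the in-memory index holds about one entry per N records.

```java
/** Illustrative cost model for a sparse mapfile index (hypothetical helper). */
public class IntervalCost {
    /** Average entries scanned past the indexed position per random read. */
    public static double avgScan(int interval) {
        return (interval - 1) / 2.0;
    }

    /** In-memory index entries needed for a file of `entries` records. */
    public static long indexEntries(long entries, int interval) {
        return (entries + interval - 1) / interval; // ceiling division
    }
}
```

So interval 128 means roughly 63.5 scanned entries per random read against 782 index entries for 100k records, while interval 16 cuts the scan to about 7.5 at the cost of a 6,250-entry index; interval 1 indexes everything.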
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.