Posted to common-dev@hadoop.apache.org by "stack (JIRA)" <ji...@apache.org> on 2007/12/22 05:57:44 UTC
[jira] Created: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
[hbase] Make mapfile index interval configurable
------------------------------------------------
Key: HADOOP-2485
URL: https://issues.apache.org/jira/browse/HADOOP-2485
Project: Hadoop
Issue Type: Improvement
Components: contrib/hbase
Reporter: stack
Priority: Minor
The default mapfile index interval is 128 entries. In basic tests, the PerformanceEvaluation mapfile test takes 60-plus seconds to random-read 100k records. If the index interval is set to 1, so we don't have to next our way through entries looking for our record, 100k random reads take 7 seconds. This is on the local filesystem. If I set it to 16, it takes 12 seconds.
Running PerformanceEvaluation random reads against hbase with the interval set to 16, we run 50% faster (hdfs is in the picture).
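As a rough illustration of why a smaller interval cuts random-read time, here is a toy Java model. It is not Hadoop's MapFile code, and the class and method names are made up for this sketch. It assumes a reader seeks to the nearest indexed entry at or before the target and then steps forward one entry at a time, and it reports the index size and the average number of forward steps for intervals of 1, 16, and 128 over 100k entries.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a sorted file with a sparse index; not Hadoop's MapFile. */
final class SparseIndexDemo {

    /** Positions of the entries that get an index record: 0, interval, 2*interval, ... */
    static List<Integer> buildIndex(int entryCount, int interval) {
        List<Integer> index = new ArrayList<>();
        for (int pos = 0; pos < entryCount; pos += interval) {
            index.add(pos);
        }
        return index;
    }

    /** Forward steps needed after seeking to the nearest indexed entry at or before target. */
    static int scanSteps(int target, int interval) {
        int nearestIndexed = (target / interval) * interval;
        return target - nearestIndexed;
    }

    /** Average forward steps per lookup when every entry is looked up once. */
    static double averageScanSteps(int entryCount, int interval) {
        long total = 0;
        for (int target = 0; target < entryCount; target++) {
            total += scanSteps(target, interval);
        }
        return (double) total / entryCount;
    }

    public static void main(String[] args) {
        int entries = 100_000;
        for (int interval : new int[] {1, 16, 128}) {
            System.out.printf("interval=%d indexEntries=%d avgScanSteps=%.2f%n",
                interval,
                buildIndex(entries, interval).size(),
                averageScanSteps(entries, interval));
        }
    }
}
```

The model ignores I/O and caching, so it only shows the shape of the trade-off: a smaller interval means a larger index to hold in memory but fewer entries to step over per read.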
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Status: Patch Available (was: In Progress)
> [hbase] Make mapfile index interval configurable
> ------------------------------------------------
>
> Key: HADOOP-2485
> URL: https://issues.apache.org/jira/browse/HADOOP-2485
> Project: Hadoop
> Issue Type: Improvement
> Components: contrib/hbase
> Reporter: stack
> Assignee: stack
> Priority: Minor
> Fix For: 0.16.0
>
> Attachments: 2485.patch, index.patch
>
>
> The default mapfile index interval is 128 entries. In basic tests, the PerformanceEvaluation mapfile test takes 60-plus seconds to random-read 100k records. If the index interval is set to 1, so we don't have to next our way through entries looking for our record, 100k random reads take 7 seconds. This is on the local filesystem. If I set it to 16, it takes 12 seconds.
> Running PerformanceEvaluation random reads against hbase with the interval set to 16, we run 50% faster (hdfs is in the picture).
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Status: In Progress (was: Patch Available)
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Status: Patch Available (was: In Progress)
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Status: In Progress (was: Patch Available)
[jira] Commented: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554689 ]
Hudson commented on HADOOP-2485:
--------------------------------
Integrated in Hadoop-Nightly #348 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/348/])
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Attachment: 2485.patch
Hudson won't pick up my patch. Trying again with a differently named patch to see if that makes a difference.
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Status: In Progress (was: Patch Available)
Retrying after Hudson restart.
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Status: Patch Available (was: In Progress)
Hudson is idle. Requeue.
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Attachment: index.patch
Add the ability to configure the mapfile index interval. Leave it at the default for now. hdfs in TRUNK is 50% slower doing PerformanceEvaluation. Let me figure out why before changing the default from 128 to something like 16 or 32.
M src/contrib/hbase/conf/hbase-default.xml
    Add hbase.io.index.interval.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java
    Add an HbaseMapFile. Move the hbase-isms into it. Have bloom filter
    etc. subclass it. Read hbase.io.index.interval when writing the index file.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    Count hstorefile entries. Emit the count when logging at DEBUG.
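
A minimal sketch of the writer-side idea in the notes above, assuming a plain Map stands in for the Hadoop configuration object and a hypothetical ToyWriter stands in for the real map file writer. Neither is the patch's actual code; only the property name hbase.io.index.interval and the default of 128 come from this issue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration only; not the code from index.patch. */
final class IndexIntervalWriterDemo {
    static final String INTERVAL_KEY = "hbase.io.index.interval"; // name from the patch notes
    static final int DEFAULT_INTERVAL = 128;                      // default stated in the issue

    /** Reads the interval from the stand-in config, falling back to the default. */
    static int readInterval(Map<String, String> conf) {
        String raw = conf.get(INTERVAL_KEY);
        if (raw == null) {
            return DEFAULT_INTERVAL;
        }
        int interval = Integer.parseInt(raw.trim());
        if (interval < 1) {
            throw new IllegalArgumentException(INTERVAL_KEY + " must be >= 1, got " + interval);
        }
        return interval;
    }

    /** Hypothetical writer that records an index entry for every interval-th appended key. */
    static final class ToyWriter {
        private final int interval;
        private final List<String> indexedKeys = new ArrayList<>();
        private long appended = 0;

        ToyWriter(int interval) {
            this.interval = interval;
        }

        void append(String key) {
            if (appended % interval == 0) {
                indexedKeys.add(key); // the first key and every interval-th key after it
            }
            appended++;
        }

        List<String> indexedKeys() {
            return indexedKeys;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(INTERVAL_KEY, "16");
        ToyWriter writer = new ToyWriter(readInterval(conf));
        for (int i = 0; i < 100; i++) {
            writer.append(String.format("row-%03d", i));
        }
        System.out.println(writer.indexedKeys());
    }
}
```

Leaving the property unset keeps the 128 default, which matches the plan stated above of shipping the knob without changing behavior yet.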
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Fix Version/s: 0.16.0
Status: Patch Available (was: Open)
Builds locally.
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Status: Patch Available (was: In Progress)
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Status: In Progress (was: Patch Available)
Patch is not being built by hudson. Retrying.
[jira] Updated: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-2485:
--------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Committed. Resolving.
[jira] Assigned: (HADOOP-2485) [hbase] Make mapfile index interval
configurable
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack reassigned HADOOP-2485:
-----------------------------
Assignee: stack