Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/03/17 07:04:50 UTC
[jira] Updated: (HBASE-1234) Change HBase StoreKey format
[ https://issues.apache.org/jira/browse/HBASE-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1234:
-------------------------
Attachment: 1234.patch
First cut at a patch. Still has issues and tests don't pass yet. Ryan Rawson and Jon Gray contributed to this patch.
Here are notes on what's there so far:
{code}
Removing HStoreKey (Still present but will be deprecated). Added in its stead,
KeyValue. KeyValue is a wrapper around a byte array, offset and length.
KeyValue has comparators that do byte array compares cognizant of our whacky
meta and root key formats.
M b/src/java/org/apache/hadoop/hbase/HRegionInfo.java
(getComparator): Added. Returns the comparator to use switching off
the HRegionInfo type.
M b/src/java/org/apache/hadoop/hbase/HTableDescriptor.java
Use new families comparator. Should save us on getting of family
delimiter, then comparing up to the family delimiter only.
A b/src/java/org/apache/hadoop/hbase/KeyValue.java
Effectively the new Key. Key format has changed too to introduce
Key Type. See class comment for detail on new format.
Mostly a bunch of static creates and utility making KeyValues.
Comparators have stuff like compareRow, matchingRow, etc. Use
the comparator to do stuff we used to do on the fly in past.
Has stuff for getting row, column and timestamp offsets and lengths
so we don't have to copy. Comparators also take offset and lengths
so can compare without copying.
M b/src/java/org/apache/hadoop/hbase/filter/RowFilterInterface.java
Added overrides that take offset and length and deprecated old methods.
M b/src/java/org/apache/hadoop/hbase/io/Cell.java
M b/src/java/org/apache/hadoop/hbase/io/RowResult.java
Added utility methods to go from KeyValue Lists, etc., to Maps of
column to Cells. This is the stuff we'd remove when the client/server
API changes. These will be hotspots when we profile. Did it for Cells
and RowResults.
M b/src/java/org/apache/hadoop/hbase/io/HalfHFileReader.java
Bring over HalfHFileReader to use KeyValue.
M b/src/java/org/apache/hadoop/hbase/io/hfile/HFile.java
M b/src/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java
Changes by Ryan to bring hfile over to KeyValue.
M b/src/java/org/apache/hadoop/hbase/regionserver/HAbstractScanner.java
Convert to KeyValue. Make column match take a KeyValue so don't have
to copy column out to compare it.
M b/src/java/org/apache/hadoop/hbase/regionserver/HLog.java
Convert HLog over to KeyValue.
M b/src/java/org/apache/hadoop/hbase/regionserver/HLogEdit.java
Redo an edit. Make the value a KeyValue. Move row out of key.
We should redo the log file so it's an hfile.
M b/src/java/org/apache/hadoop/hbase/regionserver/HLogKey.java
Format has changed. Moved out row.
M b/src/java/org/apache/hadoop/hbase/regionserver/HRegion.java
Use KeyValue. Return lists of them rather than MapWritables and
Cell []. Use comparators. Added new Counter class. Needed in
getFull to keep up a running list of versions (fixed bugs in here,
particularly in memcache where we were doing the count all wrong).
Added utility function to help with getFull; okToAddResult, addResult,
hasEnoughVersions. Cleanup. We were doing things like checking
for read only in two methods; parent check was enough.
Removed all of that localput stuff -- it was weird. May have made
sense once (prompted by jgray).
M b/src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Bring over to new regime. In here we do some of the conversions from
new style list of KeyValue to old style RowResult.
M b/src/java/org/apache/hadoop/hbase/regionserver/InternalScanner.java
Convert to KeyValue.
M b/src/java/org/apache/hadoop/hbase/regionserver/Memcache.java
Redo as a Set of KeyValues. Use ConcurrentSkipListSet so can
undo all synchronization. Means that iterators no longer fail
fast but I think that's fine -- you get a view on the data as of the
time the Iterator was taken out.
M b/src/java/org/apache/hadoop/hbase/regionserver/Store.java
Move over to KeyValue. Bunch of refactoring to make it work.
M b/src/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
Move over to KeyValue.
M b/src/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
Move over to KeyValue. Removed ViableRow class. Not needed any
more it looks like.
M b/src/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
Move over to KeyValue.
M b/src/java/org/apache/hadoop/hbase/util/Bytes.java
Fix SIZEOF_BYTES. Added facility. Most in here was done by jgray,
in particular the stuff that provides ByteBuffer facility such as
putInt, putLong, etc. Ryan added the binarySearch of byte array,
offset, and length tuples.
M b/src/java/org/apache/hadoop/hbase/util/MetaUtils.java
M b/src/java/org/apache/hadoop/hbase/util/Merge.java
Still to do.
A b/src/test/org/apache/hadoop/hbase/TestKeyValue.java
Test for keyvalue.
M b/src/test/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java
Mods for changes in HFile.
M b/src/test/org/apache/hadoop/hbase/regionserver/TestHMemcache.java
Fixup of tests for Memcache.
M b/src/test/org/apache/hadoop/hbase/util/TestBytes.java
Tests for Bytes.
{code}
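To illustrate two ideas from the notes, comparing keys by offset and length without copying and keeping them in a ConcurrentSkipListSet, here is a minimal, self-contained sketch. It is not code from 1234.patch: ByteRangeSketch, ByteRange and the plain unsigned lexicographic compare are hypothetical stand-ins, and the real KeyValue comparators also have to handle the meta and root key formats, which this ignores.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListSet;

final class ByteRangeSketch {
    /** Hypothetical stand-in for KeyValue: a view onto a shared byte array. */
    static final class ByteRange {
        final byte[] bytes;
        final int offset;
        final int length;

        ByteRange(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Unsigned lexicographic compare of two ranges; nothing is copied. */
    static int compare(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // Equal up to the shorter length: the shorter range sorts first.
        return aLen - bLen;
    }

    static final Comparator<ByteRange> COMPARATOR =
        (l, r) -> compare(l.bytes, l.offset, l.length, r.bytes, r.offset, r.length);

    public static void main(String[] args) {
        // Three keys share one backing array; each entry is only an offset and a length.
        byte[] backing = "bbbaaaccc".getBytes(StandardCharsets.US_ASCII);
        ConcurrentSkipListSet<ByteRange> set = new ConcurrentSkipListSet<>(COMPARATOR);
        set.add(new ByteRange(backing, 0, 3));
        set.add(new ByteRange(backing, 3, 3));

        Iterator<ByteRange> it = set.iterator();
        // Adding while an iterator is open does not throw ConcurrentModificationException.
        set.add(new ByteRange(backing, 6, 3));

        while (it.hasNext()) {
            ByteRange r = it.next();
            System.out.println(new String(r.bytes, r.offset, r.length, StandardCharsets.US_ASCII));
        }
    }
}
```

Per the java.util.concurrent documentation, iterators over a ConcurrentSkipListSet are weakly consistent: they never throw ConcurrentModificationException, and they may or may not reflect elements added after the iterator was created. The sketch therefore makes no claim about whether the late-added key shows up in the loop.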
> Change HBase StoreKey format
> ----------------------------
>
> Key: HBASE-1234
> URL: https://issues.apache.org/jira/browse/HBASE-1234
> Project: Hadoop HBase
> Issue Type: Improvement
> Reporter: stack
> Assignee: stack
> Fix For: 0.20.0
>
> Attachments: 1234.patch
>
>
> HBASE-859 cleaned up keys, removing the need for HRegionInfo to be in context when comparing keys. This issue is about changing the format. Work done in HBASE-859 means changes have been localized to HStoreKey, in particular to comparators and parse routines. We should do this now since 0.20.0 will require rewriting all data.
> Things to consider:
> <row> <columnfamily> <columnqualifier> <timestamp> <keytype>
> Or leave off columnfamily altogether and just write it once into the hfile metadata (All key compares are done in the Store context so columnfamily can be safely left out of the equation; it's only when the key rises above Store that the columnfamily needs appending).
> keytype is probably a byte. Types are delete cell, delete row, delete family, delete column? What else? Where should we put it? At the end? How should type sort? Or should it not be part of sort so it's just the order in which we encounter the key?
> How are we going to support keys that go in out of chronological order?
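As a rough illustration of the first layout floated in the description, <row> <columnfamily> <columnqualifier> <timestamp> <keytype>, the sketch below serializes a key into a single byte array and reads the row length, timestamp and type back by offset. The length prefixes (2-byte row length, 1-byte family length), the type codes and all names are assumptions made for the example, not the format the patch settles on.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class KeyLayoutSketch {
    // Hypothetical type codes; the issue only lists candidate types.
    static final byte TYPE_PUT = 0;
    static final byte TYPE_DELETE_CELL = 1;

    /** Assumed layout: rowLen(2) row famLen(1) family qualifier timestamp(8) type(1). */
    static byte[] encode(byte[] row, byte[] family, byte[] qualifier, long timestamp, byte type) {
        if (row.length > Short.MAX_VALUE || family.length > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("row or family too long for this sketch");
        }
        ByteBuffer buf = ByteBuffer.allocate(
            2 + row.length + 1 + family.length + qualifier.length + 8 + 1);
        buf.putShort((short) row.length);
        buf.put(row);
        buf.put((byte) family.length);
        buf.put(family);
        buf.put(qualifier);
        buf.putLong(timestamp);
        buf.put(type);
        return buf.array();
    }

    /** Row length is the big-endian short at the start of the key. */
    static int rowLengthOf(byte[] key, int offset) {
        return ((key[offset] & 0xff) << 8) | (key[offset + 1] & 0xff);
    }

    /** Timestamp is the 8 bytes just before the trailing type byte. */
    static long timestampOf(byte[] key, int offset, int length) {
        return ByteBuffer.wrap(key).getLong(offset + length - 9);
    }

    /** Type is the last byte of the key. */
    static byte typeOf(byte[] key, int offset, int length) {
        return key[offset + length - 1];
    }

    public static void main(String[] args) {
        byte[] key = encode(
            "row1".getBytes(StandardCharsets.UTF_8),
            "info".getBytes(StandardCharsets.UTF_8),
            "name".getBytes(StandardCharsets.UTF_8),
            1237273490000L,
            TYPE_PUT);
        System.out.println("row length = " + rowLengthOf(key, 0));
        System.out.println("timestamp  = " + timestampOf(key, 0, key.length));
        System.out.println("type       = " + typeOf(key, 0, key.length));
    }
}
```

Keeping the fixed-width timestamp and type at the tail means both can be located from the key's offset and length alone, which is one argument for putting keytype at the end. Whether type should take part in the sort is left open here, as it is in the description.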