Posted to dev@hbase.apache.org by ramkrishna vasudevan <ra...@gmail.com> on 2013/04/17 19:16:23 UTC

Cell Encoders and usage of Cell

Hi

With the introduction of the new Cell interface, we are providing a way to
unify (abstract) the RPC usage of Cell and the usage of Cell in HFile.

The current block encoder, which encodes the KVs into HFile blocks, will be
enhanced (maybe as BlockEncode2) to deal with Cell encoding, and the
same will be written to HFile.

Does that mean that there are going to be changes to the HFile format as well?
I just want to check whether my understanding here is correct.

I ask because in the Cell interface the row, family, and qualifier are all
treated as individual byte arrays.  Also, it does not provide a way to access
getKeyOffset() and getKeyLength().

This is in line with HBASE-7448 (adding tags to the Cell interface), which
will then be used in HBASE-7663 (visibility labels).

Further queries/doubts can perhaps be posted on those relevant JIRAs so that
work can proceed there.

Regards
Ram

Re: Cell Encoders and usage of Cell

Posted by ramkrishna vasudevan <ra...@gmail.com>.
>Until then, you are left w/ some sort of hack on KV.

Yes.
> If all was Cell on the read/write path, I'd imagine
> it'd be easy enough getting your tags in the mix.
I am checking on this as well.  It may take some time until I finalise the
changes.

I will discuss with Andy offline and get back with an initial version of the
changes that may be needed.

Regards
Ram


On Tue, Apr 23, 2013 at 3:46 AM, Andrew Purtell <ap...@apache.org> wrote:

> On HBASE-7544 I added encryption to HFile starting with the encoder and
> decoder contexts and working outward, so I can attest it is possible to add
> features that wrap *around* KV serialization without changing the KV
> serialization itself. On trunk the block encoder stuff has been at least
> partially factored out so it is not so bad. The HFileV2 code in 0.94 is
> more of a hairball.
>
> To pursue an incremental development approach, one option to consider is
> making it possible to stack block encoders. Then we might be able to
> implement tag support as a block encoder without losing the ability to do
> real block encoding. Something else to undo is the inflexible enum that
> specifies block encoder types. We should use a short int signature or
> something instead. Think of treating block encoders like filters. Then
> encoder/decoder stacking could be assembled at runtime from metadata read
> out of hfile.

Re: Cell Encoders and usage of Cell

Posted by Andrew Purtell <ap...@apache.org>.
On HBASE-7544 I added encryption to HFile starting with the encoder and
decoder contexts and working outward, so I can attest it is possible to add
features that wrap *around* KV serialization without changing the KV
serialization itself. On trunk the block encoder stuff has been at least
partially factored out so it is not so bad. The HFileV2 code in 0.94 is
more of a hairball.

To pursue an incremental development approach, one option to consider is
making it possible to stack block encoders. Then we might be able to
implement tag support as a block encoder without losing the ability to do
real block encoding. Something else to undo is the inflexible enum that
specifies block encoder types. We should use a short int signature or
something instead. Think of treating block encoders like filters. Then
encoder/decoder stacking could be assembled at runtime from metadata read
out of hfile.
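
As a rough illustration of the stacking idea above, here is a minimal Java sketch. It is not code from HBase: StackableEncoder, MarkerEncoder, XorEncoder, EncoderRegistry and EncoderStack are all hypothetical names, and the two encoders are toy stand-ins (a marker byte in place of tag support, XOR in place of real block encoding). It only shows encoders identified by a short signature, looked up at runtime, applied in order on write and in reverse on read.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical encoder contract; none of these names exist in HBase. */
interface StackableEncoder {
    short signature();            // short int id instead of an enum constant
    byte[] encode(byte[] block);
    byte[] decode(byte[] block);
}

/** Toy stand-in for a feature layer (e.g. tags): prepends one marker byte. */
final class MarkerEncoder implements StackableEncoder {
    private static final byte MARKER = 0x7F;

    @Override public short signature() { return (short) 1; }

    @Override public byte[] encode(byte[] block) {
        byte[] out = new byte[block.length + 1];
        out[0] = MARKER;
        System.arraycopy(block, 0, out, 1, block.length);
        return out;
    }

    @Override public byte[] decode(byte[] block) {
        if (block.length == 0 || block[0] != MARKER) {
            throw new IllegalArgumentException("missing marker byte");
        }
        return Arrays.copyOfRange(block, 1, block.length);
    }
}

/** Toy stand-in for "real" block encoding: XOR every byte with a fixed key. */
final class XorEncoder implements StackableEncoder {
    private static final byte KEY = 0x5A;

    @Override public short signature() { return (short) 2; }
    @Override public byte[] encode(byte[] block) { return xor(block); }
    @Override public byte[] decode(byte[] block) { return xor(block); }

    private static byte[] xor(byte[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = (byte) (in[i] ^ KEY);
        }
        return out;
    }
}

/** Maps signatures to encoders so a stack can be rebuilt from file metadata. */
final class EncoderRegistry {
    private final Map<Short, StackableEncoder> bySignature = new HashMap<>();

    void register(StackableEncoder encoder) {
        bySignature.put(encoder.signature(), encoder);
    }

    /** Builds the stack in the order the signatures were recorded. */
    List<StackableEncoder> assemble(short[] signatures) {
        List<StackableEncoder> stack = new ArrayList<>();
        for (short sig : signatures) {
            StackableEncoder encoder = bySignature.get(sig);
            if (encoder == null) {
                throw new IllegalArgumentException("unknown encoder signature " + sig);
            }
            stack.add(encoder);
        }
        return stack;
    }
}

final class EncoderStack {
    /** Applies encoders first to last. */
    static byte[] encode(List<StackableEncoder> stack, byte[] block) {
        byte[] out = block;
        for (StackableEncoder encoder : stack) {
            out = encoder.encode(out);
        }
        return out;
    }

    /** Undoes the encoders last to first. */
    static byte[] decode(List<StackableEncoder> stack, byte[] block) {
        byte[] out = block;
        for (int i = stack.size() - 1; i >= 0; i--) {
            out = stack.get(i).decode(out);
        }
        return out;
    }

    public static void main(String[] args) {
        EncoderRegistry registry = new EncoderRegistry();
        registry.register(new MarkerEncoder());
        registry.register(new XorEncoder());
        // Pretend these signatures were read from the file's metadata.
        List<StackableEncoder> stack = registry.assemble(new short[] {1, 2});
        byte[] encoded = encode(stack, "row1".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decode(stack, encoded), StandardCharsets.UTF_8));
    }
}
```

The design point is only that the list of signatures, rather than a fixed enum value, decides which layers are applied, so a new layer can be registered without touching the existing ones.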


On Mon, Apr 22, 2013 at 12:28 PM, Stack <st...@duboce.net> wrote:

> On Thu, Apr 18, 2013 at 8:19 PM, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
> > My questions were mainly because, if i have the current code  and i would
> > want to introduce tags in it, where would i do it?
> >
>
>
> Good question.  If the read/write path currently does not support Cell-only
> access, you are in a bit of a bind.
>
> It would take some effort converting all to pure Cell.  Most would be just
> grunt work but there are a few tricky spots (according to a quick survey
> done by our LarsH).  If all was Cell on the read/write path, I'd imagine
> it'd be easy enough getting your tags in the mix.
>
> Until then, you are left w/ some sort of hack on KV.
>
>
>
> > So if i need tags to be introduced should i start changing the HFile
> > formats also and only then i would be getting the tags to work?
> > What do you think here?
> >
> >
> hfiles are versioned so we can up the version when we put a Cell API on it.
>  Might be a bit of work though since will have to bring along the
> compressors and block encoders and currently these are hard-coded to expect
> KV and in particular KVs current serialization.
>
> St.Ack
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: Cell Encoders and usage of Cell

Posted by Stack <st...@duboce.net>.
On Thu, Apr 18, 2013 at 8:19 PM, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> My questions were mainly because, if i have the current code  and i would
> want to introduce tags in it, where would i do it?
>


Good question.  If the read/write path currently does not support Cell-only
access, you are in a bit of a bind.

It would take some effort converting all to pure Cell.  Most would be just
grunt work but there are a few tricky spots (according to a quick survey
done by our LarsH).  If all was Cell on the read/write path, I'd imagine
it'd be easy enough getting your tags in the mix.

Until then, you are left w/ some sort of hack on KV.



> So if i need tags to be introduced should i start changing the HFile
> formats also and only then i would be getting the tags to work?
> What do you think here?
>
>
hfiles are versioned, so we can up the version when we put a Cell API on them.
It might be a bit of work though, since we will have to bring along the
compressors and block encoders, and currently these are hard-coded to expect
KV and in particular KV's current serialization.

St.Ack

Re: Cell Encoders and usage of Cell

Posted by ramkrishna vasudevan <ra...@gmail.com>.
Just adding to what Matt said:
Cell and KeyValue are logically the same.

The difference is that with Cell the row, family, qualifier, type and
timestamp can be carried as individual byte arrays.
That basically saves us the space those fields occupy inside each KeyValue.
It also lets us use the same interface for both RPC and HFile.


To make HFile understand these Cells, we need to do some work here.

Regards
Ram


On Mon, Apr 22, 2013 at 6:06 AM, Matt Corgan <mc...@hotpads.com> wrote:

> I'm not 100% clear what you're asking Nick.  My understanding is that Cell
> and KeyValue are identical with regards to the timestamp.  Timestamp is
> part of the identity of the Cell/KeyValue, and each has 1 and only 1
> timestamp from a logical perspective.
>
> From a physical/memory perspective, KeyValue is one implementation of Cell
> where all fields are fully expanded into a single continuous byte[].  The
> Cell interface adds the ability for a timestamp to be shared behind the
> scenes to save memory.  In the case where there are 100 KeyValues in an RPC
> result or disk block, the KeyValue implementation will require 800b of
> memory, but the Cell interface will de-duplicate them and store as little
> as ~8b for the whole RPC or disk block.

Re: Cell Encoders and usage of Cell

Posted by Matt Corgan <mc...@hotpads.com>.
I'm not 100% clear what you're asking, Nick.  My understanding is that Cell
and KeyValue are identical with regard to the timestamp.  Timestamp is
part of the identity of the Cell/KeyValue, and each has 1 and only 1
timestamp from a logical perspective.

From a physical/memory perspective, KeyValue is one implementation of Cell
where all fields are fully expanded into a single contiguous byte[].  The
Cell interface adds the ability for a timestamp to be shared behind the
scenes to save memory.  In the case where there are 100 KeyValues in an RPC
result or disk block, the KeyValue implementation will require 800 bytes of
memory for timestamps (100 x 8 bytes), but a Cell implementation can
de-duplicate them and store as little as ~8 bytes for the whole RPC or disk
block.
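
To make the arithmetic concrete, here is a small Java sketch. It assumes the best case where all 100 cells carry the same timestamp; TimestampedCell, FlatCell and SharedTimestampBlock are hypothetical names made up for this example, not HBase classes, and only the timestamp field is modelled.

```java
/** Hypothetical, cut-down stand-in for the Cell interface: timestamp only. */
interface TimestampedCell {
    long getTimestamp();
}

/** KeyValue-style cell: every instance stores its own 8-byte timestamp. */
final class FlatCell implements TimestampedCell {
    private final long timestamp;

    FlatCell(long timestamp) { this.timestamp = timestamp; }

    @Override public long getTimestamp() { return timestamp; }
}

/** Block-style container: one timestamp stored once, shared by all cell views. */
final class SharedTimestampBlock {
    private final long sharedTimestamp;
    private final int cellCount;

    SharedTimestampBlock(long sharedTimestamp, int cellCount) {
        this.sharedTimestamp = sharedTimestamp;
        this.cellCount = cellCount;
    }

    /** Returns a lightweight view; logically each cell still has exactly one timestamp. */
    TimestampedCell cellAt(int index) {
        if (index < 0 || index >= cellCount) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return () -> sharedTimestamp;
    }

    /** Bytes spent on timestamps for the whole block: a single long. */
    long timestampBytes() { return Long.BYTES; }
}

final class TimestampSharingDemo {
    /** Bytes spent on timestamps when each cell stores its own long. */
    static long flatTimestampBytes(int cellCount) {
        return (long) cellCount * Long.BYTES;
    }

    public static void main(String[] args) {
        int cells = 100;
        SharedTimestampBlock block = new SharedTimestampBlock(1366564000000L, cells);
        System.out.println("flat layout:   " + flatTimestampBytes(cells) + " bytes");
        System.out.println("shared layout: " + block.timestampBytes() + " bytes");
        System.out.println("cell 42 timestamp: " + block.cellAt(42).getTimestamp());
    }
}
```

A caller going through the interface cannot tell which layout is behind it, which is the point of the comparison; the counts cover timestamp storage only and ignore object overhead.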


On Sun, Apr 21, 2013 at 5:08 PM, Nick Dimiduk <nd...@gmail.com> wrote:

> A related question. Can you clarify the distinction between a Cell and a
> KeyValue as pertains to the timestamp? That is, which of these two concepts
> carries the timestamp as a component of its coordinates? Does a Cell
> contain multiple KeyValue versions or does a KeyValue contain multiple Cell
> versions?
>
> In HBASE-7233, patch v9, I see KeyValue is replaced by Cell in the Get
> result, which implies to me that a Cell contains multiple KeyValue
> versions. I don't see the imported Cell.proto. Presumably that's the same
> Cell type defined in hbase.proto currently on trunk.
>
> Thanks,
> Nick

Re: Cell Encoders and usage of Cell

Posted by Nick Dimiduk <nd...@gmail.com>.
A related question. Can you clarify the distinction between a Cell and a
KeyValue as pertains to the timestamp? That is, which of these two concepts
carries the timestamp as a component of its coordinates? Does a Cell
contain multiple KeyValue versions or does a KeyValue contain multiple Cell
versions?

In HBASE-7233, patch v9, I see KeyValue is replaced by Cell in the Get
result, which implies to me that a Cell contains multiple KeyValue
versions. I don't see the imported Cell.proto. Presumably that's the same
Cell type defined in hbase.proto currently on trunk.

Thanks,
Nick

On Sun, Apr 21, 2013 at 2:47 PM, Matt Corgan <mc...@hotpads.com> wrote:

> fyi Ram - i started adding the Cell interface to the read path of the delta
> encoders in HBASE-7323 <https://issues.apache.org/jira/browse/HBASE-7323>.
>  It's one possible place to start working on it.

Re: Cell Encoders and usage of Cell

Posted by Matt Corgan <mc...@hotpads.com>.
fyi Ram - i started adding the Cell interface to the read path of the delta
encoders in HBASE-7323 <https://issues.apache.org/jira/browse/HBASE-7323>.
 It's one possible place to start working on it.


On Thu, Apr 18, 2013 at 8:19 PM, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> Thanks for your reply Stack.
> >I think so.  hfile APIs are about KVs.  Should be about Cell I'd think.
> Yes.  This is what i too think.
>
> >If you need the above, you are no doing Cell right I'd argue.  The very
> idea of Cell is a disconnect between how it is stored and Cell use.
>
> Yes Stack.  I understand this.  I am not introducing the getKeyOffset and
> getKeyLength over there.
> My questions were mainly because, if i have the current code  and i would
> want to introduce tags in it, where would i do it?
> So if i need tags to be introduced should i start changing the HFile
> formats also and only then i would be getting the tags to work?
> What do you think here?
>
> > I think the Cell
> Interface needs methods added to allow access to "labels".
> Yes.  You are right.

Re: Cell Encoders and usage of Cell

Posted by ramkrishna vasudevan <ra...@gmail.com>.
Thanks for your reply Stack.
> I think so.  hfile APIs are about KVs.  Should be about Cell I'd think.
Yes.  This is what I think too.

> If you need the above, you are not doing Cell right I'd argue.  The very
> idea of Cell is a disconnect between how it is stored and Cell use.

Yes Stack.  I understand this.  I am not introducing getKeyOffset and
getKeyLength there.
My questions were mainly because, if I have the current code and I
want to introduce tags in it, where would I do it?
So if I need tags to be introduced, should I start changing the HFile
format as well, and only then would I get the tags to work?
What do you think here?

> I think the Cell
> Interface needs methods added to allow access to "labels".
Yes.  You are right.



On Fri, Apr 19, 2013 at 6:58 AM, Stack <st...@duboce.net> wrote:

> On Wed, Apr 17, 2013 at 10:16 AM, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
> > Hi
> >
> > With the introduction of the new Cell Interface we are providing a way
> > where both the RPC usage of cell and the usage of Cell in HFile are
> > unified.(abstracted)
> >
> > The current block encoder which encodes the kvs into hfile blocks will be
> > enhanced may be BlockEncode2 which will deal with Cell encoding and the
> > same will be written to HFile.
> >
> >
> That is the idea.  Current block encoders are unusable for anything but
> hfile with their presumption of a particular KeyValue serialization and
>  with hfile context sprinkled throughout.
>
>
>
> > Does that mean that there are going to be changes to the HFile format also?
> >  Just to understand is my understanding here correct or not.
> >
> >
> I think so.  hfile APIs are about KVs.  Should be about Cell I'd think.
>
>
>
> > Because as the Cell interface the row, family, qualifier all are treated as
> > individual byte arrays.  Also it does not provide a way to access the
> > getKeyOffset() and getKeyLength().
> >
> >
> If you need the above, you are no doing Cell right I'd argue.  The very
> idea of Cell is a disconnect between how it is stored and Cell use.
>
>
>
>
> > This is in lieu with HBASE-7448 - Adding tags to cell interface and then
> > the same will  be used in
> > HBASE-7663 - Visibility labels.
> >
> >
> I am not sure I follow what you are asking above Ram.  I think the Cell
> Interface needs methods added to allow access to "labels".
>
> St.Ack
>
>
> > May be further queries/doubts can be posted on those relevant JIRAs to
> > proceed work on that.
> >
> > Regards
> > Ram
> >
>

Re: Cell Encoders and usage of Cell

Posted by Stack <st...@duboce.net>.
On Wed, Apr 17, 2013 at 10:16 AM, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> Hi
>
> With the introduction of the new Cell Interface we are providing a way
> where both the RPC usage of cell and the usage of Cell in HFile are
> unified.(abstracted)
>
> The current block encoder which encodes the kvs into hfile blocks will be
> enhanced may be BlockEncode2 which will deal with Cell encoding and the
> same will be written to HFile.
>
>
That is the idea.  Current block encoders are unusable for anything but
hfile with their presumption of a particular KeyValue serialization and
 with hfile context sprinkled throughout.



> Does that mean that there are going to be changes to the HFile format also?
>  Just to understand is my understanding here correct or not.
>
>
I think so.  hfile APIs are about KVs.  Should be about Cell I'd think.



> Because as the Cell interface the row, family, qualifier all are treated as
> individual byte arrays.  Also it does not provide a way to access the
> getKeyOffset() and getKeyLength().
>
>
If you need the above, you are not doing Cell right I'd argue.  The very
idea of Cell is a disconnect between how it is stored and Cell use.
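
A small Java sketch of that disconnect, under loud assumptions: SimpleCell, FlatBackedCell, FieldBackedCell and CellPrinter are hypothetical names, the interface is cut down to three fields, and the one-buffer layout with 1-byte length prefixes is invented for the example (it is not the KeyValue serialization). The consumer only calls the field accessors, so it never needs a key offset or key length.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical, cut-down cell view; the real Cell interface has more fields. */
interface SimpleCell {
    byte[] row();
    byte[] family();
    byte[] qualifier();
}

/** KeyValue-style storage: all fields packed into one contiguous byte[]. */
final class FlatBackedCell implements SimpleCell {
    // Illustrative layout: [rowLen][row][familyLen][family][qualifier...]
    private final byte[] buf;

    private FlatBackedCell(byte[] buf) { this.buf = buf; }

    static FlatBackedCell pack(byte[] row, byte[] family, byte[] qualifier) {
        if (row.length > Byte.MAX_VALUE || family.length > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("toy layout supports fields up to 127 bytes");
        }
        byte[] buf = new byte[2 + row.length + family.length + qualifier.length];
        int pos = 0;
        buf[pos++] = (byte) row.length;
        System.arraycopy(row, 0, buf, pos, row.length);
        pos += row.length;
        buf[pos++] = (byte) family.length;
        System.arraycopy(family, 0, buf, pos, family.length);
        pos += family.length;
        System.arraycopy(qualifier, 0, buf, pos, qualifier.length);
        return new FlatBackedCell(buf);
    }

    /** Offsets stay private to the implementation. */
    private int familyLengthOffset() { return 1 + buf[0]; }

    @Override public byte[] row() {
        return Arrays.copyOfRange(buf, 1, 1 + buf[0]);
    }

    @Override public byte[] family() {
        int off = familyLengthOffset();
        return Arrays.copyOfRange(buf, off + 1, off + 1 + buf[off]);
    }

    @Override public byte[] qualifier() {
        int off = familyLengthOffset();
        return Arrays.copyOfRange(buf, off + 1 + buf[off], buf.length);
    }
}

/** Alternative storage: each field kept in its own array. */
final class FieldBackedCell implements SimpleCell {
    private final byte[] row;
    private final byte[] family;
    private final byte[] qualifier;

    FieldBackedCell(byte[] row, byte[] family, byte[] qualifier) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
    }

    @Override public byte[] row() { return row; }
    @Override public byte[] family() { return family; }
    @Override public byte[] qualifier() { return qualifier; }
}

/** Consumer that depends only on the interface, never on the storage layout. */
final class CellPrinter {
    static String describe(SimpleCell cell) {
        return new String(cell.row(), StandardCharsets.UTF_8) + "/"
            + new String(cell.family(), StandardCharsets.UTF_8) + ":"
            + new String(cell.qualifier(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] row = "r1".getBytes(StandardCharsets.UTF_8);
        byte[] family = "cf".getBytes(StandardCharsets.UTF_8);
        byte[] qualifier = "q".getBytes(StandardCharsets.UTF_8);
        System.out.println(describe(FlatBackedCell.pack(row, family, qualifier)));
        System.out.println(describe(new FieldBackedCell(row, family, qualifier)));
    }
}
```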




> This is in lieu with HBASE-7448 - Adding tags to cell interface and then
> the same will  be used in
> HBASE-7663 - Visibility labels.
>
>
I am not sure I follow what you are asking above Ram.  I think the Cell
Interface needs methods added to allow access to "labels".

St.Ack


> May be further queries/doubts can be posted on those relevant JIRAs to
> proceed work on that.
>
> Regards
> Ram
>

Re: Cell Encoders and usage of Cell

Posted by ramkrishna vasudevan <ra...@gmail.com>.
Ping?
Could someone please clarify this?

Regards
Ram


On Wed, Apr 17, 2013 at 10:46 PM, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> Hi
>
> With the introduction of the new Cell Interface we are providing a way
> where both the RPC usage of cell and the usage of Cell in HFile are
> unified.(abstracted)
>
> The current block encoder which encodes the kvs into hfile blocks will be
> enhanced may be BlockEncode2 which will deal with Cell encoding and the
> same will be written to HFile.
>
> Does that mean that there are going to be changes to the HFile format
> also?  Just to understand is my understanding here correct or not.
>
> Because as the Cell interface the row, family, qualifier all are treated
> as individual byte arrays.  Also it does not provide a way to access the
> getKeyOffset() and getKeyLength().
>
> This is in lieu with HBASE-7448 - Adding tags to cell interface and then
> the same will  be used in
> HBASE-7663 - Visibility labels.
>
> May be further queries/doubts can be posted on those relevant JIRAs to
> proceed work on that.
>
> Regards
> Ram
>