Posted to dev@hbase.apache.org by Josh Elser <el...@apache.org> on 2017/10/26 22:40:09 UTC

Struggles around Cell#getType()

Hiya,

(Background: see HBASE-19002)

In trying to write some example Observers, I found myself in a pickle: 
how do I tell if a Cell is a Put?

* Cell#getType() returns a byte which corresponds to a KeyValue.Type
* KeyValue.Type has API to convert a byte to Type
* KeyValue (and thus KeyValue.Type) is IA.Private
* DataType (o.a.h.h.types.DataType) _appears to me_ to be the replacement
for the KeyValue.Type

Best as I can tell, Cell#getType() should be deprecated and we should 
have some kind of API (method on Cell or CellUtil) which returns a 
DataType instead of Type. The details of the byte and the KeyValue.Type 
should be hidden inside the implementation.
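To make the proposal concrete, here is a minimal sketch of an API that hides the byte behind a public enum. Everything in it is hypothetical: MutationType and fromCode are illustrative names, not existing HBase classes or methods, and the byte values are assumed to mirror the codes that KeyValue.Type uses (Put = 4, Delete = 8, and so on).

```java
import java.util.Optional;

// Hypothetical public enum; NOT an existing HBase class.
// Byte codes are assumed to match KeyValue.Type (assumption for illustration).
enum MutationType {
  PUT((byte) 4),
  DELETE((byte) 8),
  DELETE_FAMILY_VERSION((byte) 10),
  DELETE_COLUMN((byte) 12),
  DELETE_FAMILY((byte) 14);

  private final byte code;

  MutationType(byte code) {
    this.code = code;
  }

  // Maps a raw type byte to a user-facing type. Internal codes
  // (for example an internal "minimum" marker) yield Optional.empty().
  static Optional<MutationType> fromCode(byte code) {
    for (MutationType type : values()) {
      if (type.code == code) {
        return Optional.of(type);
      }
    }
    return Optional.empty();
  }
}

final class MutationTypeSketch {
  public static void main(String[] args) {
    System.out.println(MutationType.fromCode((byte) 4)); // Optional[PUT]
    System.out.println(MutationType.fromCode((byte) 0)); // Optional.empty
  }
}
```

With something along these lines, an Observer would compare against MutationType.PUT and never touch the byte or the IA.Private class.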

My hunch is that this is an accidental omission, but Stack recommended 
that I "ask the class" ;). What have I missed? I think this is trivial 
to fix; obviously, I don't want to make a fix if I just didn't look hard 
enough.

Thanks!

- Josh

Re: Struggles around Cell#getType()

Posted by Josh Elser <el...@apache.org>.
Filed the following and tentatively tagged them for beta-1 (sorry, I 
know that's crappy as these are API things):

https://issues.apache.org/jira/browse/HBASE-19111
https://issues.apache.org/jira/browse/HBASE-19112

On 10/27/17 1:40 PM, Chia-Ping Tsai wrote:
> bq. You agree with Ram's suggestion for helper methods as a way forward?
> Adding the CellUtil#isPut() is ok to me as the PUT is a basic operation in hbase.
> 
> On 2017-10-28 00:58, Josh Elser <el...@apache.org> wrote:
>> Re-reading https://issues.apache.org/jira/browse/HBASE-8693 that Sergey
>> pointed out, I more think that the maybe getType() was misintepreted
>> from what Nick originally meant it to be. Maybe intentional, maybe not.
>>
>> I don't think getTimestamp() should be removed -- when we store multiple
>> versions of a Key, users should be able to reconcile the Cells client
>> side (e.g. consider a CP which performs some custom merging logic).
>>
>> getSequenceId() I'd agree probably doesn't belong. getTag() I'll hold
>> off judgement because I'm constantly biased into thinking the feature is
>> something that it isn't :)
>>
>> You agree with Ram's suggestion for helper methods as a way forward?
>>
>> On 10/27/17 7:29 AM, Chia-Ping Tsai wrote:
>>> The CellBuilder#Data type is introduced to make sure all components used to builder cell are IA.Public.
>>>
>>> bq. Best as I can tell, Cell#getType() should be deprecated
>>> As i see it, the Cell#getType, #getTimestamp, #getSequenceId, and #getTag should be deprecated as these methods is some kind of internal info of storage engine. As a key-value store, the key  consisting of row, family, and qualifier is enough to the general purpose. Other fields belong to the specific storage engine, and they should not be in the Cell which is our "frontline" interface of data.
>>>
>>>
>>> On 2017-10-27 06:40, Josh Elser <el...@apache.org> wrote:
>>>> Hiya,
>>>>
>>>> (Background: see HBASE-19002)
>>>>
>>>> In trying to write some example Observers, I found myself in a pickle:
>>>> how do I tell if a Cell is a Put?
>>>>
>>>> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
>>>> * KeyValue.Type has API to convert a byte to Type
>>>> * KeyValue (and thus KeyValue.Type) is IA.Private
>>>> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
>>>> for the KeyValue.Type
>>>>
>>>> Best as I can tell, Cell#getType() should be deprecated and we should
>>>> have some kind of API (method on Cell or CellUtil) which returns a
>>>> DataType instead of Type. The details of the byte and the KeyValue.Type
>>>> should be hidden inside the implementation.
>>>>
>>>> My hunch is that this is an accidental omission, but Stack recommended
>>>> that I "ask the class" ;). What have I missed? I think this is trivial
>>>> to fix; obviously, I don't want to make a fix if I just didn't look hard
>>>> enough.
>>>>
>>>> Thanks!
>>>>
>>>> - Josh
>>>>
>>

Re: Struggles around Cell#getType()

Posted by Chia-Ping Tsai <ch...@apache.org>.
bq. CellBuilder#DataType that produces Cell that has getTypeByte method which
returns "The byte representation of the KeyValue.TYPE".
That is a workaround so that every parameter needed to create a cell is IA.Public. (-_-) The root cause is that the Cell interface, which is expected to represent general data, has many methods that belong to the current LSM storage engine.

On 2017-10-28 08:21, Sergey Soldatov <se...@gmail.com> wrote: 
> bq. Here is the DataType I was talking about:
> 
> Ah, thanks, Ted! Forgot to make the pull. According to the HBASE-18927 as
> well as the discussion that had a place before that, It addresses exactly
> the topic we are discussing. But actually, it didn't solve the problem and
> just created more ambiguity. We have CellBuilder that requires
> CellBuilder#DataType that produces Cell that has getTypeByte method which
> returns "The byte representation of the KeyValue.TYPE".
> 
> bq. I more think that the maybe getType() was misintepreted from what Nick
> originally meant it to be. Maybe intentional, maybe not.
> 
> If I remember correctly HBASE-8693 was about the encoding for commonly used
> data types to keep them sorted in native order. Similar to PData types in
> Phoenix.
> 
> bq. You agree with Ram's suggestion for helper methods as a way forward?
> 
> Well, we already have helpers for all types except Put/Minimum. Adding one
> more is not a big deal. But deprecating getters sounds like a bad idea. For
> example, the timestamp is used by many 3rd parties to do their own
> transactional/versioning support and actually, it's a part of the public
> API. If we may specify timestamp for the cell, why we should restrict users
> from reading it? Others fields may be useful for creating a modified copies
> of the KV like we do in our custom StoreFileReader for local indexes.
> 
> Thanks,
> Sergey
> 
> On Fri, Oct 27, 2017 at 10:40 AM, Chia-Ping Tsai <ch...@apache.org>
> wrote:
> 
> > bq. You agree with Ram's suggestion for helper methods as a way forward?
> > Adding the CellUtil#isPut() is ok to me as the PUT is a basic operation in
> > hbase.
> >
> > On 2017-10-28 00:58, Josh Elser <el...@apache.org> wrote:
> > > Re-reading https://issues.apache.org/jira/browse/HBASE-8693 that Sergey
> > > pointed out, I more think that the maybe getType() was misintepreted
> > > from what Nick originally meant it to be. Maybe intentional, maybe not.
> > >
> > > I don't think getTimestamp() should be removed -- when we store multiple
> > > versions of a Key, users should be able to reconcile the Cells client
> > > side (e.g. consider a CP which performs some custom merging logic).
> > >
> > > getSequenceId() I'd agree probably doesn't belong. getTag() I'll hold
> > > off judgement because I'm constantly biased into thinking the feature is
> > > something that it isn't :)
> > >
> > > You agree with Ram's suggestion for helper methods as a way forward?
> > >
> > > On 10/27/17 7:29 AM, Chia-Ping Tsai wrote:
> > > > The CellBuilder#Data type is introduced to make sure all components
> > used to builder cell are IA.Public.
> > > >
> > > > bq. Best as I can tell, Cell#getType() should be deprecated
> > > > As i see it, the Cell#getType, #getTimestamp, #getSequenceId, and
> > #getTag should be deprecated as these methods is some kind of internal info
> > of storage engine. As a key-value store, the key  consisting of row,
> > family, and qualifier is enough to the general purpose. Other fields belong
> > to the specific storage engine, and they should not be in the Cell which is
> > our "frontline" interface of data.
> > > >
> > > >
> > > > On 2017-10-27 06:40, Josh Elser <el...@apache.org> wrote:
> > > >> Hiya,
> > > >>
> > > >> (Background: see HBASE-19002)
> > > >>
> > > >> In trying to write some example Observers, I found myself in a pickle:
> > > >> how do I tell if a Cell is a Put?
> > > >>
> > > >> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
> > > >> * KeyValue.Type has API to convert a byte to Type
> > > >> * KeyValue (and thus KeyValue.Type) is IA.Private
> > > >> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
> > > >> for the KeyValue.Type
> > > >>
> > > >> Best as I can tell, Cell#getType() should be deprecated and we should
> > > >> have some kind of API (method on Cell or CellUtil) which returns a
> > > >> DataType instead of Type. The details of the byte and the
> > KeyValue.Type
> > > >> should be hidden inside the implementation.
> > > >>
> > > >> My hunch is that this is an accidental omission, but Stack recommended
> > > >> that I "ask the class" ;). What have I missed? I think this is trivial
> > > >> to fix; obviously, I don't want to make a fix if I just didn't look
> > hard
> > > >> enough.
> > > >>
> > > >> Thanks!
> > > >>
> > > >> - Josh
> > > >>
> > >
> >
> 

Re: Struggles around Cell#getType()

Posted by Sergey Soldatov <se...@gmail.com>.
bq. Here is the DataType I was talking about:

Ah, thanks, Ted! I forgot to pull. HBASE-18927, as well as the discussion
that took place before it, addresses exactly the topic we are discussing.
But it didn't actually solve the problem and just created more ambiguity.
We have CellBuilder that requires
CellBuilder#DataType that produces Cell that has getTypeByte method which
returns "The byte representation of the KeyValue.TYPE".

bq. I'm more inclined to think that getType() was maybe misinterpreted from
what Nick originally meant it to be. Maybe intentional, maybe not.

If I remember correctly HBASE-8693 was about the encoding for commonly used
data types to keep them sorted in native order. Similar to PData types in
Phoenix.
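To illustrate what "keeping values sorted in native order" means, here is a small sketch of an order-preserving encoding for a signed int. It is not the actual o.a.h.h.types implementation; OrderedIntSketch and its methods are hypothetical names, and the sketch assumes byte arrays are compared lexicographically as unsigned bytes.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper; NOT part of o.a.h.h.types.
final class OrderedIntSketch {
  // Flip the sign bit and write big-endian, so that unsigned lexicographic
  // byte order matches signed numeric order.
  static byte[] encode(int value) {
    return ByteBuffer.allocate(4).putInt(value ^ Integer.MIN_VALUE).array();
  }

  // Reverse of encode(): read big-endian and flip the sign bit back.
  static int decode(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getInt() ^ Integer.MIN_VALUE;
  }

  public static void main(String[] args) {
    byte[] negative = encode(-5);
    byte[] positive = encode(7);
    // -5 sorts before 7 when the encoded bytes are compared as unsigned values.
    System.out.println(Arrays.compareUnsigned(negative, positive) < 0); // true
    System.out.println(decode(negative)); // -5
  }
}
```

This is about the type of application data stored in a cell, which is a separate concern from whether the cell is a Put or a Delete.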

bq. You agree with Ram's suggestion for helper methods as a way forward?

Well, we already have helpers for all types except Put/Minimum. Adding one
more is not a big deal. But deprecating getters sounds like a bad idea. For
example, the timestamp is used by many 3rd parties to do their own
transactional/versioning support, and it is actually part of the public
API. If we can specify the timestamp for a cell, why should we restrict users
from reading it? Other fields may be useful for creating modified copies
of the KV, like we do in our custom StoreFileReader for local indexes.

Thanks,
Sergey

On Fri, Oct 27, 2017 at 10:40 AM, Chia-Ping Tsai <ch...@apache.org>
wrote:

> bq. You agree with Ram's suggestion for helper methods as a way forward?
> Adding the CellUtil#isPut() is ok to me as the PUT is a basic operation in
> hbase.
>
> On 2017-10-28 00:58, Josh Elser <el...@apache.org> wrote:
> > Re-reading https://issues.apache.org/jira/browse/HBASE-8693 that Sergey
> > pointed out, I more think that the maybe getType() was misintepreted
> > from what Nick originally meant it to be. Maybe intentional, maybe not.
> >
> > I don't think getTimestamp() should be removed -- when we store multiple
> > versions of a Key, users should be able to reconcile the Cells client
> > side (e.g. consider a CP which performs some custom merging logic).
> >
> > getSequenceId() I'd agree probably doesn't belong. getTag() I'll hold
> > off judgement because I'm constantly biased into thinking the feature is
> > something that it isn't :)
> >
> > You agree with Ram's suggestion for helper methods as a way forward?
> >
> > On 10/27/17 7:29 AM, Chia-Ping Tsai wrote:
> > > The CellBuilder#Data type is introduced to make sure all components
> used to builder cell are IA.Public.
> > >
> > > bq. Best as I can tell, Cell#getType() should be deprecated
> > > As i see it, the Cell#getType, #getTimestamp, #getSequenceId, and
> #getTag should be deprecated as these methods is some kind of internal info
> of storage engine. As a key-value store, the key  consisting of row,
> family, and qualifier is enough to the general purpose. Other fields belong
> to the specific storage engine, and they should not be in the Cell which is
> our "frontline" interface of data.
> > >
> > >
> > > On 2017-10-27 06:40, Josh Elser <el...@apache.org> wrote:
> > >> Hiya,
> > >>
> > >> (Background: see HBASE-19002)
> > >>
> > >> In trying to write some example Observers, I found myself in a pickle:
> > >> how do I tell if a Cell is a Put?
> > >>
> > >> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
> > >> * KeyValue.Type has API to convert a byte to Type
> > >> * KeyValue (and thus KeyValue.Type) is IA.Private
> > >> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
> > >> for the KeyValue.Type
> > >>
> > >> Best as I can tell, Cell#getType() should be deprecated and we should
> > >> have some kind of API (method on Cell or CellUtil) which returns a
> > >> DataType instead of Type. The details of the byte and the
> KeyValue.Type
> > >> should be hidden inside the implementation.
> > >>
> > >> My hunch is that this is an accidental omission, but Stack recommended
> > >> that I "ask the class" ;). What have I missed? I think this is trivial
> > >> to fix; obviously, I don't want to make a fix if I just didn't look
> hard
> > >> enough.
> > >>
> > >> Thanks!
> > >>
> > >> - Josh
> > >>
> >
>

Re: Struggles around Cell#getType()

Posted by Chia-Ping Tsai <ch...@apache.org>.
bq. You agree with Ram's suggestion for helper methods as a way forward?
Adding CellUtil#isPut() is OK with me, as PUT is a basic operation in HBase.
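
A rough sketch of what such helpers could look like follows. TypedCell is a hypothetical stand-in that exposes only the type byte, CellTypeChecks is an illustrative name rather than the real CellUtil, and the byte codes are assumed to match KeyValue.Type.

```java
// Hypothetical helper class; NOT the real CellUtil.
final class CellTypeChecks {
  // Hypothetical minimal stand-in for Cell, exposing only the type byte.
  interface TypedCell {
    byte getTypeByte();
  }

  // Assumed to mirror the KeyValue.Type codes (assumption for illustration).
  private static final byte PUT = 4;
  private static final byte DELETE = 8;
  private static final byte DELETE_FAMILY_VERSION = 10;
  private static final byte DELETE_COLUMN = 12;
  private static final byte DELETE_FAMILY = 14;

  private CellTypeChecks() {
  }

  // True only for cells that carry the Put type code.
  static boolean isPut(TypedCell cell) {
    return cell.getTypeByte() == PUT;
  }

  // True for any of the delete flavours.
  static boolean isDelete(TypedCell cell) {
    byte type = cell.getTypeByte();
    return type == DELETE
        || type == DELETE_FAMILY_VERSION
        || type == DELETE_COLUMN
        || type == DELETE_FAMILY;
  }
}
```

A caller would then write CellTypeChecks.isPut(cell) and never see the byte constants.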

On 2017-10-28 00:58, Josh Elser <el...@apache.org> wrote: 
> Re-reading https://issues.apache.org/jira/browse/HBASE-8693 that Sergey 
> pointed out, I more think that the maybe getType() was misintepreted 
> from what Nick originally meant it to be. Maybe intentional, maybe not.
> 
> I don't think getTimestamp() should be removed -- when we store multiple 
> versions of a Key, users should be able to reconcile the Cells client 
> side (e.g. consider a CP which performs some custom merging logic).
> 
> getSequenceId() I'd agree probably doesn't belong. getTag() I'll hold 
> off judgement because I'm constantly biased into thinking the feature is 
> something that it isn't :)
> 
> You agree with Ram's suggestion for helper methods as a way forward?
> 
> On 10/27/17 7:29 AM, Chia-Ping Tsai wrote:
> > The CellBuilder#Data type is introduced to make sure all components used to builder cell are IA.Public.
> > 
> > bq. Best as I can tell, Cell#getType() should be deprecated
> > As i see it, the Cell#getType, #getTimestamp, #getSequenceId, and #getTag should be deprecated as these methods is some kind of internal info of storage engine. As a key-value store, the key  consisting of row, family, and qualifier is enough to the general purpose. Other fields belong to the specific storage engine, and they should not be in the Cell which is our "frontline" interface of data.
> > 
> > 
> > On 2017-10-27 06:40, Josh Elser <el...@apache.org> wrote:
> >> Hiya,
> >>
> >> (Background: see HBASE-19002)
> >>
> >> In trying to write some example Observers, I found myself in a pickle:
> >> how do I tell if a Cell is a Put?
> >>
> >> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
> >> * KeyValue.Type has API to convert a byte to Type
> >> * KeyValue (and thus KeyValue.Type) is IA.Private
> >> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
> >> for the KeyValue.Type
> >>
> >> Best as I can tell, Cell#getType() should be deprecated and we should
> >> have some kind of API (method on Cell or CellUtil) which returns a
> >> DataType instead of Type. The details of the byte and the KeyValue.Type
> >> should be hidden inside the implementation.
> >>
> >> My hunch is that this is an accidental omission, but Stack recommended
> >> that I "ask the class" ;). What have I missed? I think this is trivial
> >> to fix; obviously, I don't want to make a fix if I just didn't look hard
> >> enough.
> >>
> >> Thanks!
> >>
> >> - Josh
> >>
> 

Re: Struggles around Cell#getType()

Posted by Josh Elser <el...@apache.org>.
Thanks, Nick.

I think some of my confusion here was around overlapping nomenclature 
("type" meaning both the type of application data but also the type of 
mutation). I appreciate the clarification!

On 10/29/17 1:28 PM, Nick Dimiduk wrote:
> On Fri, Oct 27, 2017 at 9:58 AM, Josh Elser <el...@apache.org> wrote:
> 
>> Re-reading https://issues.apache.org/jira/browse/HBASE-8693 that Sergey
>> pointed out, I more think that the maybe getType() was misintepreted from
>> what Nick originally meant it to be. Maybe intentional, maybe not.
>>
> 
> The classes under o.a.h.h.types are for helping users building applications
> to represent their application data in HBase byte arrays. See the overview
> in the package description [0]. One uses implementations of
> o.a.h.h.types.DataType to encode values that can then be stored in
> o.a.h.h.Cells -- as the row, the column qualifier, or column value. The
> type of the Cell is its use within the KeyValue storage engine that is
> HBase while the DataType describes a means of getting some Java object
> value into or out of a byte array.
> 
> [0]:
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/types/package-summary.html#package.description
> 
> On 10/27/17 7:29 AM, Chia-Ping Tsai wrote:
>>
>>> The CellBuilder#Data type is introduced to make sure all components used
>>> to builder cell are IA.Public.
>>>
>>> bq. Best as I can tell, Cell#getType() should be deprecated
>>> As i see it, the Cell#getType, #getTimestamp, #getSequenceId, and #getTag
>>> should be deprecated as these methods is some kind of internal info of
>>> storage engine. As a key-value store, the key  consisting of row, family,
>>> and qualifier is enough to the general purpose. Other fields belong to the
>>> specific storage engine, and they should not be in the Cell which is our
>>> "frontline" interface of data.
>>>
>>>
>>> On 2017-10-27 06:40, Josh Elser <el...@apache.org> wrote:
>>>
>>>> Hiya,
>>>>
>>>> (Background: see HBASE-19002)
>>>>
>>>> In trying to write some example Observers, I found myself in a pickle:
>>>> how do I tell if a Cell is a Put?
>>>>
>>>> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
>>>> * KeyValue.Type has API to convert a byte to Type
>>>> * KeyValue (and thus KeyValue.Type) is IA.Private
>>>> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
>>>> for the KeyValue.Type
>>>>
>>>> Best as I can tell, Cell#getType() should be deprecated and we should
>>>> have some kind of API (method on Cell or CellUtil) which returns a
>>>> DataType instead of Type. The details of the byte and the KeyValue.Type
>>>> should be hidden inside the implementation.
>>>>
>>>> My hunch is that this is an accidental omission, but Stack recommended
>>>> that I "ask the class" ;). What have I missed? I think this is trivial
>>>> to fix; obviously, I don't want to make a fix if I just didn't look hard
>>>> enough.
>>>>
>>>> Thanks!
>>>>
>>>> - Josh
>>>>
>>>>
> 

Re: Struggles around Cell#getType()

Posted by Nick Dimiduk <nd...@gmail.com>.
On Fri, Oct 27, 2017 at 9:58 AM, Josh Elser <el...@apache.org> wrote:

> Re-reading https://issues.apache.org/jira/browse/HBASE-8693 that Sergey
> pointed out, I more think that the maybe getType() was misintepreted from
> what Nick originally meant it to be. Maybe intentional, maybe not.
>

The classes under o.a.h.h.types help users who are building applications
represent their application data as HBase byte arrays. See the overview
in the package description [0]. One uses implementations of
o.a.h.h.types.DataType to encode values that can then be stored in
o.a.h.h.Cells -- as the row, the column qualifier, or column value. The
type of the Cell is its use within the KeyValue storage engine that is
HBase, while the DataType describes a means of getting some Java object
value into or out of a byte array.

[0]:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/types/package-summary.html#package.description

On 10/27/17 7:29 AM, Chia-Ping Tsai wrote:
>
>> The CellBuilder#Data type is introduced to make sure all components used
>> to builder cell are IA.Public.
>>
>> bq. Best as I can tell, Cell#getType() should be deprecated
>> As i see it, the Cell#getType, #getTimestamp, #getSequenceId, and #getTag
>> should be deprecated as these methods is some kind of internal info of
>> storage engine. As a key-value store, the key  consisting of row, family,
>> and qualifier is enough to the general purpose. Other fields belong to the
>> specific storage engine, and they should not be in the Cell which is our
>> "frontline" interface of data.
>>
>>
>> On 2017-10-27 06:40, Josh Elser <el...@apache.org> wrote:
>>
>>> Hiya,
>>>
>>> (Background: see HBASE-19002)
>>>
>>> In trying to write some example Observers, I found myself in a pickle:
>>> how do I tell if a Cell is a Put?
>>>
>>> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
>>> * KeyValue.Type has API to convert a byte to Type
>>> * KeyValue (and thus KeyValue.Type) is IA.Private
>>> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
>>> for the KeyValue.Type
>>>
>>> Best as I can tell, Cell#getType() should be deprecated and we should
>>> have some kind of API (method on Cell or CellUtil) which returns a
>>> DataType instead of Type. The details of the byte and the KeyValue.Type
>>> should be hidden inside the implementation.
>>>
>>> My hunch is that this is an accidental omission, but Stack recommended
>>> that I "ask the class" ;). What have I missed? I think this is trivial
>>> to fix; obviously, I don't want to make a fix if I just didn't look hard
>>> enough.
>>>
>>> Thanks!
>>>
>>> - Josh
>>>
>>>

Re: Struggles around Cell#getType()

Posted by Josh Elser <el...@apache.org>.
Re-reading https://issues.apache.org/jira/browse/HBASE-8693 that Sergey
pointed out, I'm more inclined to think that getType() was maybe
misinterpreted from what Nick originally meant it to be. Maybe intentional,
maybe not.

I don't think getTimestamp() should be removed -- when we store multiple 
versions of a Key, users should be able to reconcile the Cells client 
side (e.g. consider a CP which performs some custom merging logic).
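
As a small illustration of why the timestamp needs to stay readable, here is a sketch of one possible reconciliation policy: keep the version with the highest timestamp. Version and VersionMergeSketch are hypothetical stand-ins, not the HBase Cell interface, and "newest wins" is only an example of custom merging logic.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class VersionMergeSketch {
  // Hypothetical stand-in for one stored version of a cell.
  static final class Version {
    final long timestamp;
    final String value;

    Version(long timestamp, String value) {
      this.timestamp = timestamp;
      this.value = value;
    }
  }

  // Example merge policy: the version with the highest timestamp wins.
  static Optional<Version> newest(List<Version> versions) {
    return versions.stream().max(Comparator.comparingLong(v -> v.timestamp));
  }

  public static void main(String[] args) {
    List<Version> versions = List.of(
        new Version(100L, "old"),
        new Version(250L, "new"),
        new Version(175L, "mid"));
    System.out.println(newest(versions).map(v -> v.value).orElse("<none>")); // new
  }
}
```

None of this is possible on the client or in a CP if the timestamp is not readable from the cell.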

getSequenceId() I'd agree probably doesn't belong. getTag() I'll hold 
off judgement because I'm constantly biased into thinking the feature is 
something that it isn't :)

You agree with Ram's suggestion for helper methods as a way forward?

On 10/27/17 7:29 AM, Chia-Ping Tsai wrote:
> The CellBuilder#Data type is introduced to make sure all components used to builder cell are IA.Public.
> 
> bq. Best as I can tell, Cell#getType() should be deprecated
> As i see it, the Cell#getType, #getTimestamp, #getSequenceId, and #getTag should be deprecated as these methods is some kind of internal info of storage engine. As a key-value store, the key  consisting of row, family, and qualifier is enough to the general purpose. Other fields belong to the specific storage engine, and they should not be in the Cell which is our "frontline" interface of data.
> 
> 
> On 2017-10-27 06:40, Josh Elser <el...@apache.org> wrote:
>> Hiya,
>>
>> (Background: see HBASE-19002)
>>
>> In trying to write some example Observers, I found myself in a pickle:
>> how do I tell if a Cell is a Put?
>>
>> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
>> * KeyValue.Type has API to convert a byte to Type
>> * KeyValue (and thus KeyValue.Type) is IA.Private
>> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
>> for the KeyValue.Type
>>
>> Best as I can tell, Cell#getType() should be deprecated and we should
>> have some kind of API (method on Cell or CellUtil) which returns a
>> DataType instead of Type. The details of the byte and the KeyValue.Type
>> should be hidden inside the implementation.
>>
>> My hunch is that this is an accidental omission, but Stack recommended
>> that I "ask the class" ;). What have I missed? I think this is trivial
>> to fix; obviously, I don't want to make a fix if I just didn't look hard
>> enough.
>>
>> Thanks!
>>
>> - Josh
>>

Re: Struggles around Cell#getType()

Posted by Chia-Ping Tsai <ch...@apache.org>.
CellBuilder#DataType was introduced to make sure all components used to build a cell are IA.Public.

bq. Best as I can tell, Cell#getType() should be deprecated
As I see it, Cell#getType, #getTimestamp, #getSequenceId, and #getTag should all be deprecated, as these methods expose internal details of the storage engine. For a key-value store, a key consisting of row, family, and qualifier is enough for general-purpose use. The other fields belong to the specific storage engine, and they should not be in Cell, which is our "frontline" data interface.


On 2017-10-27 06:40, Josh Elser <el...@apache.org> wrote: 
> Hiya,
> 
> (Background: see HBASE-19002)
> 
> In trying to write some example Observers, I found myself in a pickle: 
> how do I tell if a Cell is a Put?
> 
> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
> * KeyValue.Type has API to convert a byte to Type
> * KeyValue (and thus KeyValue.Type) is IA.Private
> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement 
> for the KeyValue.Type
> 
> Best as I can tell, Cell#getType() should be deprecated and we should 
> have some kind of API (method on Cell or CellUtil) which returns a 
> DataType instead of Type. The details of the byte and the KeyValue.Type 
> should be hidden inside the implementation.
> 
> My hunch is that this is an accidental omission, but Stack recommended 
> that I "ask the class" ;). What have I missed? I think this is trivial 
> to fix; obviously, I don't want to make a fix if I just didn't look hard 
> enough.
> 
> Thanks!
> 
> - Josh
> 

Re: Struggles around Cell#getType()

Posted by Ted Yu <yu...@gmail.com>.
Sergey:

Here is the DataType I was talking about:
https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/CellBuilder.java#L33


bq. missed in doing DataType is not having a type state and make it align
with those in KeyValue#Type

My earlier comment was in line with Anoop's.
The alignment can start with aligning the ordinals of CellBuilder#DataType
to those of KeyValue#Type

From Josh's example:

+            cellBuilder.setType(DataType.Put);

This is what we have in the CellBuilder:

  CellBuilder setType(final DataType type);

If a variant of setType() is added which accepts a byte (say, the return
value of Cell#getType()),
the example would be closer to a real-life scenario.
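
A sketch of what such a variant could look like is below. TypedCellBuilderSketch and its nested DataType are hypothetical and only loosely modelled on CellBuilder; the byte codes are assumed to match KeyValue#Type, and rejecting unknown codes is one possible design, not something the current API does.

```java
// Hypothetical builder; NOT the real CellBuilder.
final class TypedCellBuilderSketch {
  // User-facing types only. Codes are assumed to match KeyValue#Type.
  enum DataType {
    Put((byte) 4),
    Delete((byte) 8),
    DeleteFamilyVersion((byte) 10),
    DeleteColumn((byte) 12),
    DeleteFamily((byte) 14);

    final byte code;

    DataType(byte code) {
      this.code = code;
    }
  }

  private DataType type;

  // Shape of the existing setter quoted above.
  TypedCellBuilderSketch setType(DataType type) {
    this.type = type;
    return this;
  }

  // Proposed variant: accept the raw byte, e.g. the value a cell reports.
  // Internal codes have no public DataType and are rejected here.
  TypedCellBuilderSketch setType(byte code) {
    for (DataType candidate : DataType.values()) {
      if (candidate.code == code) {
        return setType(candidate);
      }
    }
    throw new IllegalArgumentException("Not a user-facing type code: " + code);
  }

  DataType getType() {
    return type;
  }
}
```

Copying the type from an existing cell would then be a single call such as builder.setType(typeByte), without exposing any internal type values.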

bq. move Type out of KeyValue and keep it as a part of public API

The reason we may not want to do the above is that KeyValue#Type has
several internal values such as Minimum.
Not exposing more than CellBuilder#DataType gives hbase freedom in future
enhancements.

Cheers

On Fri, Oct 27, 2017 at 1:35 AM, Sergey Soldatov <se...@gmail.com>
wrote:

> .bq Also we can have DataType#toType(Cell) or so for the conversion
> purpose.
>
> Let me repeat. DataType is the serialization interface for the values and
> has no relations to the type of KV.
>
> .bq This would imply having as many isXX() methods as the number of
> elements
> in CellBuilder#DataType
>
> Have I missed something, but I don't see any DataType nor Type in
> CellBuilder (checked branch-2 and master). The only place where we define
> the mutations type is KeyValue#Type.
> One more note. The public API for Cell#getTypeByte explicitly says "The
> byte representation of the KeyValue.TYPE of this cell".  For me, it sounds
> like we need to move Type out of KeyValue and keep it as a part of public
> API instead of adding checkers for all types to CellUtil. Any other ideas?
>
> Thanks,
> Sergey
>
> On Fri, Oct 27, 2017 at 12:30 AM, Anoop John <an...@gmail.com>
> wrote:
>
> > I think what we missed in doing DataType is not having a type state
> > and make it align with those in KeyValue#Type.  Relying on ordinal
> > might be problematic.   Also we can have DataType#toType(Cell) or so
> > for the conversion purpose.  This is needed for CPs as noted by Josh's
> > CP eg:s.  Thanks for the nice find Josh.  Doing more and more CP eg:s
> > reveal this kind of misses.
> >
> > -Anoop-
> >
> > On Fri, Oct 27, 2017 at 10:17 AM, Ted Yu <yu...@gmail.com> wrote:
> > > bq. you may need CellUtil#isPut(Cell) sort of API
> > >
> > > This would imply having as many isXX() methods as the number of
> > > elements in CellBuilder#DataType
> > > , right ?
> > >
> > > On Thu, Oct 26, 2017 at 9:29 PM, ramkrishna vasudevan <
> > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > >
> > >> Sorry just to clarify I mean deprecating the getType in Cell can we
> try
> > >> doing it in 2.0-alpha 4.
> > >>
> > >> On Fri, Oct 27, 2017 at 9:45 AM, ramkrishna vasudevan <
> > >> ramkrishna.s.vasudevan@gmail.com> wrote:
> > >>
> > >> > bq.Cell#getType()
> > >> > We had this discussion. So getType should only be used for user
> > exposed
> > >> > types like Put and Deletes. All others are internal. So having it in
> > >> public
> > >> > interface may not be needed. Shall we do this in 2.0 alpha-4? Am +1
> > to do
> > >> > this.
> > >> >
> > >> > How ever to solve your problem I think you may need
> > CellUtil#isPut(Cell)
> > >> > sort of API in CellUtl like you already have isDelete(Cell).
> > >> >
> > >> > Regards
> > >> > Ram
> > >> >
> > >> > On Fri, Oct 27, 2017 at 9:08 AM, Ted Yu <yu...@gmail.com>
> wrote:
> > >> >
> > >> >> There is also CellBuilder#DataType which is public. However, the
> > >> ordinals
> > >> >> of CellBuilder#DataType are different from KeyValue.Type .
> > >> >>
> > >> >> What if we align the ordinals of CellBuilder#DataType to be the
> same
> > as
> > >> >> those from KeyValue.Type ?
> > >> >>
> > >> >> On Thu, Oct 26, 2017 at 4:34 PM, Sergey Soldatov <
> > >> >> sergeysoldatov@gmail.com>
> > >> >> wrote:
> > >> >>
> > >> >> > DataType class was introduced as part of HBASE-8693 which is more
> > >> about
> > >> >> the
> > >> >> > type of data in the cell rather than the type of mutation.
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Sergey
> > >> >> >
> > >> >> > On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <el...@apache.org>
> > >> wrote:
> > >> >> >
> > >> >> > > Hiya,
> > >> >> > >
> > >> >> > > (Background: see HBASE-19002)
> > >> >> > >
> > >> >> > > In trying to write some example Observers, I found myself in a
> > >> pickle:
> > >> >> > how
> > >> >> > > do I tell if a Cell is a Put?
> > >> >> > >
> > >> >> > > * Cell#getType() returns a byte which corresponds to a
> > KeyValue.Type
> > >> >> > > * KeyValue.Type has API to convert a byte to Type
> > >> >> > > * KeyValue (and thus KeyValue.Type) is IA.Private
> > >> >> > > * DataType o.a.h.h.typesDataType _appears to me_ to be the
> > >> replacement
> > >> >> > for
> > >> >> > > the KeyValue.Type
> > >> >> > >
> > >> >> > > Best as I can tell, Cell#getType() should be deprecated and we
> > >> should
> > >> >> > have
> > >> >> > > some kind of API (method on Cell or CellUtil) which returns a
> > >> DataType
> > >> >> > > instead of Type. The details of the byte and the KeyValue.Type
> > >> should
> > >> >> be
> > >> >> > > hidden inside the implementation.
> > >> >> > >
> > >> >> > > My hunch is that this is an accidental omission, but Stack
> > >> recommended
> > >> >> > > that I "ask the class" ;). What have I missed? I think this is
> > >> >> trivial to
> > >> >> > > fix; obviously, I don't want to make a fix if I just didn't
> look
> > >> hard
> > >> >> > > enough.
> > >> >> > >
> > >> >> > > Thanks!
> > >> >> > >
> > >> >> > > - Josh
> > >> >> > >
> > >> >> >
> > >> >>
> > >> >
> > >> >
> > >>
> >
>

Re: Struggles around Cell#getType()

Posted by Sergey Soldatov <se...@gmail.com>.
.bq Also we can have DataType#toType(Cell) or so for the conversion
purpose.

Let me repeat. DataType is the serialization interface for the values and
has no relations to the type of KV.

.bq This would imply having as many isXX() methods as the number of elements
in CellBuilder#DataType

Maybe I have missed something, but I don't see any DataType or Type in
CellBuilder (checked branch-2 and master). The only place where we define
the mutation types is KeyValue#Type.
One more note. The public API for Cell#getTypeByte explicitly says "The
byte representation of the KeyValue.TYPE of this cell". To me, it sounds
like we need to move Type out of KeyValue and keep it as part of the public
API instead of adding checkers for all types to CellUtil. Any other ideas?
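
To make the "move Type out of KeyValue" idea concrete, here is a minimal sketch of what a standalone public enum might look like. MutationType, getCode() and fromCode() are hypothetical names invented for illustration, not existing HBase API, and the byte codes are assumptions chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical public mutation-type enum; not the actual HBase API. */
enum MutationType {
    // Byte codes are illustrative assumptions for this sketch.
    PUT((byte) 4),
    DELETE((byte) 8),
    DELETE_FAMILY_VERSION((byte) 10),
    DELETE_COLUMN((byte) 12),
    DELETE_FAMILY((byte) 14);

    // Lookup table from the serialized byte to the enum constant.
    private static final Map<Byte, MutationType> BY_CODE = new HashMap<>();

    static {
        for (MutationType t : values()) {
            BY_CODE.put(t.code, t);
        }
    }

    private final byte code;

    MutationType(byte code) {
        this.code = code;
    }

    /** Returns the byte stored with the cell for this type. */
    byte getCode() {
        return code;
    }

    /** Converts a raw type byte to a constant, rejecting unknown codes. */
    static MutationType fromCode(byte code) {
        MutationType t = BY_CODE.get(code);
        if (t == null) {
            throw new IllegalArgumentException("Unknown type code: " + code);
        }
        return t;
    }
}
```

With something of this shape, a caller could write MutationType.fromCode(cell.getTypeByte()) == MutationType.PUT without touching an IA.Private class, and a single enum would cover every type without one checker method per type.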

Thanks,
Sergey

On Fri, Oct 27, 2017 at 12:30 AM, Anoop John <an...@gmail.com> wrote:

> I think what we missed in doing DataType is not having a type state
> and make it align with those in KeyValue#Type.  Relying on ordinal
> might be problematic.   Also we can have DataType#toType(Cell) or so
> for the conversion purpose.  This is needed for CPs as noted by Josh's
> CP eg:s.  Thanks for the nice find Josh.  Doing more and more CP eg:s
> reveal this kind of misses.
>
> -Anoop-
>
> On Fri, Oct 27, 2017 at 10:17 AM, Ted Yu <yu...@gmail.com> wrote:
> > bq. you may need CellUtil#isPut(Cell) sort of API
> >
> > This would imply having as many isXX() methods as the number of
> > elements in CellBuilder#DataType
> > , right ?
> >
> > On Thu, Oct 26, 2017 at 9:29 PM, ramkrishna vasudevan <
> > ramkrishna.s.vasudevan@gmail.com> wrote:
> >
> >> Sorry just to clarify I mean deprecating the getType in Cell can we try
> >> doing it in 2.0-alpha 4.
> >>
> >> On Fri, Oct 27, 2017 at 9:45 AM, ramkrishna vasudevan <
> >> ramkrishna.s.vasudevan@gmail.com> wrote:
> >>
> >> > bq.Cell#getType()
> >> > We had this discussion. So getType should only be used for user
> exposed
> >> > types like Put and Deletes. All others are internal. So having it in
> >> public
> >> > interface may not be needed. Shall we do this in 2.0 alpha-4? Am +1
> to do
> >> > this.
> >> >
> >> > How ever to solve your problem I think you may need
> CellUtil#isPut(Cell)
> >> > sort of API in CellUtl like you already have isDelete(Cell).
> >> >
> >> > Regards
> >> > Ram
> >> >
> >> > On Fri, Oct 27, 2017 at 9:08 AM, Ted Yu <yu...@gmail.com> wrote:
> >> >
> >> >> There is also CellBuilder#DataType which is public. However, the
> >> ordinals
> >> >> of CellBuilder#DataType are different from KeyValue.Type .
> >> >>
> >> >> What if we align the ordinals of CellBuilder#DataType to be the same
> as
> >> >> those from KeyValue.Type ?
> >> >>
> >> >> On Thu, Oct 26, 2017 at 4:34 PM, Sergey Soldatov <
> >> >> sergeysoldatov@gmail.com>
> >> >> wrote:
> >> >>
> >> >> > DataType class was introduced as part of HBASE-8693 which is more
> >> about
> >> >> the
> >> >> > type of data in the cell rather than the type of mutation.
> >> >> >
> >> >> > Thanks,
> >> >> > Sergey
> >> >> >
> >> >> > On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <el...@apache.org>
> >> wrote:
> >> >> >
> >> >> > > Hiya,
> >> >> > >
> >> >> > > (Background: see HBASE-19002)
> >> >> > >
> >> >> > > In trying to write some example Observers, I found myself in a
> >> pickle:
> >> >> > how
> >> >> > > do I tell if a Cell is a Put?
> >> >> > >
> >> >> > > * Cell#getType() returns a byte which corresponds to a
> KeyValue.Type
> >> >> > > * KeyValue.Type has API to convert a byte to Type
> >> >> > > * KeyValue (and thus KeyValue.Type) is IA.Private
> >> >> > > * DataType o.a.h.h.typesDataType _appears to me_ to be the
> >> replacement
> >> >> > for
> >> >> > > the KeyValue.Type
> >> >> > >
> >> >> > > Best as I can tell, Cell#getType() should be deprecated and we
> >> should
> >> >> > have
> >> >> > > some kind of API (method on Cell or CellUtil) which returns a
> >> DataType
> >> >> > > instead of Type. The details of the byte and the KeyValue.Type
> >> should
> >> >> be
> >> >> > > hidden inside the implementation.
> >> >> > >
> >> >> > > My hunch is that this is an accidental omission, but Stack
> >> recommended
> >> >> > > that I "ask the class" ;). What have I missed? I think this is
> >> >> trivial to
> >> >> > > fix; obviously, I don't want to make a fix if I just didn't look
> >> hard
> >> >> > > enough.
> >> >> > >
> >> >> > > Thanks!
> >> >> > >
> >> >> > > - Josh
> >> >> > >
> >> >> >
> >> >>
> >> >
> >> >
> >>
>

Re: Struggles around Cell#getType()

Posted by Anoop John <an...@gmail.com>.
I think what we missed when adding DataType is giving it an explicit type
state that aligns with the values in KeyValue#Type.  Relying on the ordinal
might be problematic.  We could also have DataType#toType(Cell) or similar
for conversion.  This is needed for CPs, as Josh's CP examples show.
Thanks for the nice find, Josh.  Writing more and more CP examples
reveals this kind of miss.
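
A small sketch of why relying on the ordinal is fragile, and what an explicit "type state" could look like instead. InternalType, PublicType and TypeMapping are hypothetical stand-ins written for this illustration; they are not the HBase classes, and the byte codes are assumptions.

```java
/** Hypothetical stand-in for the internal mutation type; byte codes are illustrative. */
enum InternalType {
    PUT((byte) 4),
    DELETE((byte) 8),
    DELETE_COLUMN((byte) 12),
    DELETE_FAMILY((byte) 14);

    final byte code;

    InternalType(byte code) {
        this.code = code;
    }
}

/** Hypothetical public-facing type whose declaration order differs from InternalType. */
enum PublicType {
    DELETE_FAMILY(InternalType.DELETE_FAMILY),
    DELETE_COLUMN(InternalType.DELETE_COLUMN),
    DELETE(InternalType.DELETE),
    PUT(InternalType.PUT);

    // Explicit state: each constant names its internal counterpart.
    final InternalType internal;

    PublicType(InternalType internal) {
        this.internal = internal;
    }
}

final class TypeMapping {
    private TypeMapping() {}

    /** Fragile: silently depends on both enums sharing a declaration order. */
    static InternalType byOrdinal(PublicType t) {
        return InternalType.values()[t.ordinal()];
    }

    /** Robust: uses the explicit state carried by each constant. */
    static InternalType byState(PublicType t) {
        return t.internal;
    }
}
```

In this sketch TypeMapping.byOrdinal(PublicType.PUT) returns DELETE_FAMILY, because the two enums declare their constants in different orders, while byState returns PUT regardless of how either enum is reordered later.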

-Anoop-

On Fri, Oct 27, 2017 at 10:17 AM, Ted Yu <yu...@gmail.com> wrote:
> bq. you may need CellUtil#isPut(Cell) sort of API
>
> This would imply having as many isXX() methods as the number of
> elements in CellBuilder#DataType
> , right ?
>
> On Thu, Oct 26, 2017 at 9:29 PM, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
>> Sorry just to clarify I mean deprecating the getType in Cell can we try
>> doing it in 2.0-alpha 4.
>>
>> On Fri, Oct 27, 2017 at 9:45 AM, ramkrishna vasudevan <
>> ramkrishna.s.vasudevan@gmail.com> wrote:
>>
>> > bq.Cell#getType()
>> > We had this discussion. So getType should only be used for user exposed
>> > types like Put and Deletes. All others are internal. So having it in
>> public
>> > interface may not be needed. Shall we do this in 2.0 alpha-4? Am +1 to do
>> > this.
>> >
>> > How ever to solve your problem I think you may need CellUtil#isPut(Cell)
>> > sort of API in CellUtl like you already have isDelete(Cell).
>> >
>> > Regards
>> > Ram
>> >
>> > On Fri, Oct 27, 2017 at 9:08 AM, Ted Yu <yu...@gmail.com> wrote:
>> >
>> >> There is also CellBuilder#DataType which is public. However, the
>> ordinals
>> >> of CellBuilder#DataType are different from KeyValue.Type .
>> >>
>> >> What if we align the ordinals of CellBuilder#DataType to be the same as
>> >> those from KeyValue.Type ?
>> >>
>> >> On Thu, Oct 26, 2017 at 4:34 PM, Sergey Soldatov <
>> >> sergeysoldatov@gmail.com>
>> >> wrote:
>> >>
>> >> > DataType class was introduced as part of HBASE-8693 which is more
>> about
>> >> the
>> >> > type of data in the cell rather than the type of mutation.
>> >> >
>> >> > Thanks,
>> >> > Sergey
>> >> >
>> >> > On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <el...@apache.org>
>> wrote:
>> >> >
>> >> > > Hiya,
>> >> > >
>> >> > > (Background: see HBASE-19002)
>> >> > >
>> >> > > In trying to write some example Observers, I found myself in a
>> pickle:
>> >> > how
>> >> > > do I tell if a Cell is a Put?
>> >> > >
>> >> > > * Cell#getType() returns a byte which corresponds to a KeyValue.Type
>> >> > > * KeyValue.Type has API to convert a byte to Type
>> >> > > * KeyValue (and thus KeyValue.Type) is IA.Private
>> >> > > * DataType o.a.h.h.typesDataType _appears to me_ to be the
>> replacement
>> >> > for
>> >> > > the KeyValue.Type
>> >> > >
>> >> > > Best as I can tell, Cell#getType() should be deprecated and we
>> should
>> >> > have
>> >> > > some kind of API (method on Cell or CellUtil) which returns a
>> DataType
>> >> > > instead of Type. The details of the byte and the KeyValue.Type
>> should
>> >> be
>> >> > > hidden inside the implementation.
>> >> > >
>> >> > > My hunch is that this is an accidental omission, but Stack
>> recommended
>> >> > > that I "ask the class" ;). What have I missed? I think this is
>> >> trivial to
>> >> > > fix; obviously, I don't want to make a fix if I just didn't look
>> hard
>> >> > > enough.
>> >> > >
>> >> > > Thanks!
>> >> > >
>> >> > > - Josh
>> >> > >
>> >> >
>> >>
>> >
>> >
>>

Re: Struggles around Cell#getType()

Posted by Ted Yu <yu...@gmail.com>.
bq. you may need CellUtil#isPut(Cell) sort of API

This would imply having as many isXX() methods as the number of
elements in CellBuilder#DataType, right?

On Thu, Oct 26, 2017 at 9:29 PM, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> Sorry just to clarify I mean deprecating the getType in Cell can we try
> doing it in 2.0-alpha 4.
>
> On Fri, Oct 27, 2017 at 9:45 AM, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
> > bq.Cell#getType()
> > We had this discussion. So getType should only be used for user exposed
> > types like Put and Deletes. All others are internal. So having it in
> public
> > interface may not be needed. Shall we do this in 2.0 alpha-4? Am +1 to do
> > this.
> >
> > How ever to solve your problem I think you may need CellUtil#isPut(Cell)
> > sort of API in CellUtl like you already have isDelete(Cell).
> >
> > Regards
> > Ram
> >
> > On Fri, Oct 27, 2017 at 9:08 AM, Ted Yu <yu...@gmail.com> wrote:
> >
> >> There is also CellBuilder#DataType which is public. However, the
> ordinals
> >> of CellBuilder#DataType are different from KeyValue.Type .
> >>
> >> What if we align the ordinals of CellBuilder#DataType to be the same as
> >> those from KeyValue.Type ?
> >>
> >> On Thu, Oct 26, 2017 at 4:34 PM, Sergey Soldatov <
> >> sergeysoldatov@gmail.com>
> >> wrote:
> >>
> >> > DataType class was introduced as part of HBASE-8693 which is more
> about
> >> the
> >> > type of data in the cell rather than the type of mutation.
> >> >
> >> > Thanks,
> >> > Sergey
> >> >
> >> > On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <el...@apache.org>
> wrote:
> >> >
> >> > > Hiya,
> >> > >
> >> > > (Background: see HBASE-19002)
> >> > >
> >> > > In trying to write some example Observers, I found myself in a
> pickle:
> >> > how
> >> > > do I tell if a Cell is a Put?
> >> > >
> >> > > * Cell#getType() returns a byte which corresponds to a KeyValue.Type
> >> > > * KeyValue.Type has API to convert a byte to Type
> >> > > * KeyValue (and thus KeyValue.Type) is IA.Private
> >> > > * DataType o.a.h.h.typesDataType _appears to me_ to be the
> replacement
> >> > for
> >> > > the KeyValue.Type
> >> > >
> >> > > Best as I can tell, Cell#getType() should be deprecated and we
> should
> >> > have
> >> > > some kind of API (method on Cell or CellUtil) which returns a
> DataType
> >> > > instead of Type. The details of the byte and the KeyValue.Type
> should
> >> be
> >> > > hidden inside the implementation.
> >> > >
> >> > > My hunch is that this is an accidental omission, but Stack
> recommended
> >> > > that I "ask the class" ;). What have I missed? I think this is
> >> trivial to
> >> > > fix; obviously, I don't want to make a fix if I just didn't look
> hard
> >> > > enough.
> >> > >
> >> > > Thanks!
> >> > >
> >> > > - Josh
> >> > >
> >> >
> >>
> >
> >
>

Re: Struggles around Cell#getType()

Posted by Josh Elser <el...@apache.org>.
Thanks, Ram and others.

I think this is also the same thing that Sergey is pointing out. I am
trying to find something to help me understand where a Cell "came from"
(Put, Delete, etc), but the only thing exposed to me is only somewhat
related (DataType and KeyValue.Type have a relationship, but they're for
two different things).

CellUtil#isPut(Cell) (or broadly isXXX(Cell) methods) sounds like the 
right approach to me. What happens to #getDataType() can/should be a 
separate discussion, I think.
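
For illustration, a rough sketch of what isXXX(Cell)-style helpers could look like. SimpleCell and SimpleCellUtil are hypothetical stand-ins for Cell and CellUtil, the byte constants are assumptions, and treating every code between DELETE and DELETE_FAMILY as a delete is also an assumption of this sketch rather than a statement about the real implementation.

```java
/** Hypothetical minimal stand-in for Cell: exposes only the raw type byte. */
interface SimpleCell {
    byte getTypeByte();
}

/** Hypothetical stand-in for CellUtil; keeps the byte codes out of caller code. */
final class SimpleCellUtil {
    // Illustrative type codes; assumptions for this sketch.
    private static final byte PUT = 4;
    private static final byte DELETE = 8;
    private static final byte DELETE_FAMILY = 14;

    private SimpleCellUtil() {}

    /** True if the cell was written by a Put. */
    static boolean isPut(SimpleCell cell) {
        return cell.getTypeByte() == PUT;
    }

    /** True if the cell carries any delete marker (codes DELETE through DELETE_FAMILY). */
    static boolean isDelete(SimpleCell cell) {
        byte type = cell.getTypeByte();
        return DELETE <= type && type <= DELETE_FAMILY;
    }
}
```

An Observer written against helpers like these would only ever ask isPut(cell) or isDelete(cell), so the byte and the private Type enum stay hidden inside the utility class.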

On 10/27/17 12:29 AM, ramkrishna vasudevan wrote:
> Sorry just to clarify I mean deprecating the getType in Cell can we try
> doing it in 2.0-alpha 4.
> 
> On Fri, Oct 27, 2017 at 9:45 AM, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
> 
>> bq.Cell#getType()
>> We had this discussion. So getType should only be used for user exposed
>> types like Put and Deletes. All others are internal. So having it in public
>> interface may not be needed. Shall we do this in 2.0 alpha-4? Am +1 to do
>> this.
>>
>> How ever to solve your problem I think you may need CellUtil#isPut(Cell)
>> sort of API in CellUtl like you already have isDelete(Cell).
>>
>> Regards
>> Ram
>>
>> On Fri, Oct 27, 2017 at 9:08 AM, Ted Yu <yu...@gmail.com> wrote:
>>
>>> There is also CellBuilder#DataType which is public. However, the ordinals
>>> of CellBuilder#DataType are different from KeyValue.Type .
>>>
>>> What if we align the ordinals of CellBuilder#DataType to be the same as
>>> those from KeyValue.Type ?
>>>
>>> On Thu, Oct 26, 2017 at 4:34 PM, Sergey Soldatov <
>>> sergeysoldatov@gmail.com>
>>> wrote:
>>>
>>>> DataType class was introduced as part of HBASE-8693 which is more about
>>> the
>>>> type of data in the cell rather than the type of mutation.
>>>>
>>>> Thanks,
>>>> Sergey
>>>>
>>>> On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <el...@apache.org> wrote:
>>>>
>>>>> Hiya,
>>>>>
>>>>> (Background: see HBASE-19002)
>>>>>
>>>>> In trying to write some example Observers, I found myself in a pickle:
>>>> how
>>>>> do I tell if a Cell is a Put?
>>>>>
>>>>> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
>>>>> * KeyValue.Type has API to convert a byte to Type
>>>>> * KeyValue (and thus KeyValue.Type) is IA.Private
>>>>> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
>>>> for
>>>>> the KeyValue.Type
>>>>>
>>>>> Best as I can tell, Cell#getType() should be deprecated and we should
>>>> have
>>>>> some kind of API (method on Cell or CellUtil) which returns a DataType
>>>>> instead of Type. The details of the byte and the KeyValue.Type should
>>> be
>>>>> hidden inside the implementation.
>>>>>
>>>>> My hunch is that this is an accidental omission, but Stack recommended
>>>>> that I "ask the class" ;). What have I missed? I think this is
>>> trivial to
>>>>> fix; obviously, I don't want to make a fix if I just didn't look hard
>>>>> enough.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> - Josh
>>>>>
>>>>
>>>
>>
>>
> 

Re: Struggles around Cell#getType()

Posted by ramkrishna vasudevan <ra...@gmail.com>.
Sorry, just to clarify: I mean deprecating getType in Cell. Can we try
doing it in 2.0 alpha-4?

On Fri, Oct 27, 2017 at 9:45 AM, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> bq.Cell#getType()
> We had this discussion. So getType should only be used for user exposed
> types like Put and Deletes. All others are internal. So having it in public
> interface may not be needed. Shall we do this in 2.0 alpha-4? Am +1 to do
> this.
>
> How ever to solve your problem I think you may need CellUtil#isPut(Cell)
> sort of API in CellUtl like you already have isDelete(Cell).
>
> Regards
> Ram
>
> On Fri, Oct 27, 2017 at 9:08 AM, Ted Yu <yu...@gmail.com> wrote:
>
>> There is also CellBuilder#DataType which is public. However, the ordinals
>> of CellBuilder#DataType are different from KeyValue.Type .
>>
>> What if we align the ordinals of CellBuilder#DataType to be the same as
>> those from KeyValue.Type ?
>>
>> On Thu, Oct 26, 2017 at 4:34 PM, Sergey Soldatov <
>> sergeysoldatov@gmail.com>
>> wrote:
>>
>> > DataType class was introduced as part of HBASE-8693 which is more about
>> the
>> > type of data in the cell rather than the type of mutation.
>> >
>> > Thanks,
>> > Sergey
>> >
>> > On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <el...@apache.org> wrote:
>> >
>> > > Hiya,
>> > >
>> > > (Background: see HBASE-19002)
>> > >
>> > > In trying to write some example Observers, I found myself in a pickle:
>> > how
>> > > do I tell if a Cell is a Put?
>> > >
>> > > * Cell#getType() returns a byte which corresponds to a KeyValue.Type
>> > > * KeyValue.Type has API to convert a byte to Type
>> > > * KeyValue (and thus KeyValue.Type) is IA.Private
>> > > * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
>> > for
>> > > the KeyValue.Type
>> > >
>> > > Best as I can tell, Cell#getType() should be deprecated and we should
>> > have
>> > > some kind of API (method on Cell or CellUtil) which returns a DataType
>> > > instead of Type. The details of the byte and the KeyValue.Type should
>> be
>> > > hidden inside the implementation.
>> > >
>> > > My hunch is that this is an accidental omission, but Stack recommended
>> > > that I "ask the class" ;). What have I missed? I think this is
>> trivial to
>> > > fix; obviously, I don't want to make a fix if I just didn't look hard
>> > > enough.
>> > >
>> > > Thanks!
>> > >
>> > > - Josh
>> > >
>> >
>>
>
>

Re: Struggles around Cell#getType()

Posted by ramkrishna vasudevan <ra...@gmail.com>.
bq. Cell#getType()
We had this discussion. getType should only be used for user-exposed
types like Put and Delete. All others are internal, so having it in the
public interface may not be needed. Shall we do this in 2.0 alpha-4? I am
+1 to do this.

However, to solve your problem I think you may need a CellUtil#isPut(Cell)
sort of API in CellUtil, like the isDelete(Cell) you already have.

Regards
Ram

On Fri, Oct 27, 2017 at 9:08 AM, Ted Yu <yu...@gmail.com> wrote:

> There is also CellBuilder#DataType which is public. However, the ordinals
> of CellBuilder#DataType are different from KeyValue.Type .
>
> What if we align the ordinals of CellBuilder#DataType to be the same as
> those from KeyValue.Type ?
>
> On Thu, Oct 26, 2017 at 4:34 PM, Sergey Soldatov <sergeysoldatov@gmail.com
> >
> wrote:
>
> > DataType class was introduced as part of HBASE-8693 which is more about
> the
> > type of data in the cell rather than the type of mutation.
> >
> > Thanks,
> > Sergey
> >
> > On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <el...@apache.org> wrote:
> >
> > > Hiya,
> > >
> > > (Background: see HBASE-19002)
> > >
> > > In trying to write some example Observers, I found myself in a pickle:
> > how
> > > do I tell if a Cell is a Put?
> > >
> > > * Cell#getType() returns a byte which corresponds to a KeyValue.Type
> > > * KeyValue.Type has API to convert a byte to Type
> > > * KeyValue (and thus KeyValue.Type) is IA.Private
> > > * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
> > for
> > > the KeyValue.Type
> > >
> > > Best as I can tell, Cell#getType() should be deprecated and we should
> > have
> > > some kind of API (method on Cell or CellUtil) which returns a DataType
> > > instead of Type. The details of the byte and the KeyValue.Type should
> be
> > > hidden inside the implementation.
> > >
> > > My hunch is that this is an accidental omission, but Stack recommended
> > > that I "ask the class" ;). What have I missed? I think this is trivial
> to
> > > fix; obviously, I don't want to make a fix if I just didn't look hard
> > > enough.
> > >
> > > Thanks!
> > >
> > > - Josh
> > >
> >
>

Re: Struggles around Cell#getType()

Posted by Ted Yu <yu...@gmail.com>.
There is also CellBuilder#DataType, which is public. However, the ordinals
of CellBuilder#DataType are different from those of KeyValue.Type.

What if we align the ordinals of CellBuilder#DataType to be the same as
those from KeyValue.Type?

On Thu, Oct 26, 2017 at 4:34 PM, Sergey Soldatov <se...@gmail.com>
wrote:

> DataType class was introduced as part of HBASE-8693 which is more about the
> type of data in the cell rather than the type of mutation.
>
> Thanks,
> Sergey
>
> On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <el...@apache.org> wrote:
>
> > Hiya,
> >
> > (Background: see HBASE-19002)
> >
> > In trying to write some example Observers, I found myself in a pickle:
> how
> > do I tell if a Cell is a Put?
> >
> > * Cell#getType() returns a byte which corresponds to a KeyValue.Type
> > * KeyValue.Type has API to convert a byte to Type
> > * KeyValue (and thus KeyValue.Type) is IA.Private
> > * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement
> for
> > the KeyValue.Type
> >
> > Best as I can tell, Cell#getType() should be deprecated and we should
> have
> > some kind of API (method on Cell or CellUtil) which returns a DataType
> > instead of Type. The details of the byte and the KeyValue.Type should be
> > hidden inside the implementation.
> >
> > My hunch is that this is an accidental omission, but Stack recommended
> > that I "ask the class" ;). What have I missed? I think this is trivial to
> > fix; obviously, I don't want to make a fix if I just didn't look hard
> > enough.
> >
> > Thanks!
> >
> > - Josh
> >
>

Re: Struggles around Cell#getType()

Posted by Sergey Soldatov <se...@gmail.com>.
DataType class was introduced as part of HBASE-8693 which is more about the
type of data in the cell rather than the type of mutation.

Thanks,
Sergey

On Thu, Oct 26, 2017 at 3:40 PM, Josh Elser <el...@apache.org> wrote:

> Hiya,
>
> (Background: see HBASE-19002)
>
> In trying to write some example Observers, I found myself in a pickle: how
> do I tell if a Cell is a Put?
>
> * Cell#getType() returns a byte which corresponds to a KeyValue.Type
> * KeyValue.Type has API to convert a byte to Type
> * KeyValue (and thus KeyValue.Type) is IA.Private
> * DataType o.a.h.h.typesDataType _appears to me_ to be the replacement for
> the KeyValue.Type
>
> Best as I can tell, Cell#getType() should be deprecated and we should have
> some kind of API (method on Cell or CellUtil) which returns a DataType
> instead of Type. The details of the byte and the KeyValue.Type should be
> hidden inside the implementation.
>
> My hunch is that this is an accidental omission, but Stack recommended
> that I "ask the class" ;). What have I missed? I think this is trivial to
> fix; obviously, I don't want to make a fix if I just didn't look hard
> enough.
>
> Thanks!
>
> - Josh
>