Posted to dev@gora.apache.org by Alfonso Nishikawa <al...@gmail.com> on 2013/01/14 00:25:23 UTC

Document about GORA-174

Hello everybody.

I wrote an article [0] about GORA-174 in which I try to explain a
compatibility issue with old data in HBase.
I really don't know how it affects other backends, so I would welcome any
information from anyone who knows. (@Renato: maybe you can tell me how it
works in Cassandra :)
I would appreciate your thoughts :)

Thank you very much!

Alfonso Nishikawa

[0] - http://people.apache.org/~alfonsonishikawa/gora-174.html

Re: Document about GORA-174

Posted by Alfonso Nishikawa <al...@gmail.com>.
Hi, Renato.

Thank you very much for your feedback. After your comments I think I have
to rewrite that explanation (sorry, my English always needs to be
rewritten) because that "index" belongs to Avro. Actually, the value is
"\01This is the text" (\01 in hexadecimal). Tomorrow I will have time to
update all the information and put all my effort into that issue (at last,
after three weeks, I have finished restoring my servers :P).
I am looking forward to your module! =)
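
As an illustration of the value described above, the following Python sketch
builds "\01This is the text" by placing a single index byte in front of the
text. The ["null", "string"] branch order and the function names are
assumptions made for this example; it is not the actual Gora or Avro
serialization code.

```python
# Illustrative only: a hypothetical ["null", "string"] union stored as
# <one index byte><UTF-8 text>. Not Gora's real serializer.
UNION_BRANCHES = ("null", "string")  # assumed branch order


def encode_union(value):
    """Prefix the payload with the index of the union branch it belongs to."""
    if value is None:
        return bytes([UNION_BRANCHES.index("null")])
    return bytes([UNION_BRANCHES.index("string")]) + value.encode("utf-8")


def decode_union(raw):
    """Read the leading index byte and decode the rest accordingly."""
    branch = UNION_BRANCHES[raw[0]]
    return None if branch == "null" else raw[1:].decode("utf-8")


stored = encode_union("This is the text")
print(stored)  # b'\x01This is the text'
```

Decoding the stored bytes with decode_union gives back the original text, and
a null value is stored as the single byte 00 in this sketch.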

Best regards,

Alfonso Nishikawa

2013/2/6 Renato Marroquín Mogrovejo <re...@gmail.com>

> Hi all,
>
> I am really sorry it has taken me so long to get to this thread;
> anyway, let's get down to the important parts (:
> Having reviewed Alfonso's emails and some Avro documentation, I think
> Alfonso's proposal is the best approach right now.
> I mean we will end up paying a price for the chance to persist
> optional data types using an Avro union. The price this time is
> storing an extra column whenever we use the union data type, in
> order to keep track of what type of data we have stored.
> So, similar to Alfonso's example, we first stored:
>
>  col: This is the text
>
> After implementing this, we would be storing:
>
>  col_name  index  value
>  --------  -----  ----------------
>  col       \01    This is the text
>
> I think, though, that we shouldn't use the word "index" because it can be
> misleading. Maybe "colName_index" instead? I am not sure about this yet;
> we should reach a consensus on this one, my friends.
> I have made several changes to the Cassandra module, which I
> would like to discuss in a separate thread, but in general terms
> I also think this is the way we should go.
>
>
> Renato M.

-- 
"Drinking bloody marys all night will make you feel like a corpse in the
morning."

Re: Document about GORA-174

Posted by Renato Marroquín Mogrovejo <re...@gmail.com>.
Hi all,

I am really sorry it has taken me so long to get to this thread;
anyway, let's get down to the important parts (:
Having reviewed Alfonso's emails and some Avro documentation, I think
Alfonso's proposal is the best approach right now.
I mean we will end up paying a price for the chance to persist
optional data types using an Avro union. The price this time is
storing an extra column whenever we use the union data type, in
order to keep track of what type of data we have stored.
So, similar to Alfonso's example, we first stored:

 col: This is the text

After implementing this, we would be storing:

 col_name  index  value
 --------  -----  ----------------
 col       \01    This is the text
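
The difference between the two layouts is also where old data becomes a
problem. The Python sketch below is only an illustration under assumptions
(a ["null", "string"] union and a hypothetical read_prefixed function,
neither taken from Gora): a reader that expects the leading index byte
cannot interpret a cell that was written before the change.

```python
# Hypothetical reader for the new layout: <index byte><payload>.
# The branch order ("null", "string") is assumed for illustration.
UNION_BRANCHES = ("null", "string")


def read_prefixed(raw):
    """Decode a cell written in the new, index-prefixed layout."""
    index = raw[0]
    if index >= len(UNION_BRANCHES):
        raise ValueError(f"byte {index:#04x} is not a valid union index")
    return None if UNION_BRANCHES[index] == "null" else raw[1:].decode("utf-8")


new_cell = b"\x01This is the text"  # written after the change
old_cell = b"This is the text"      # written before the change

print(read_prefixed(new_cell))  # This is the text

try:
    read_prefixed(old_cell)
except ValueError as err:
    # The first byte of the old cell is 'T' (0x54), not an index.
    print("old data:", err)
```

In this sketch the old cell is rejected only because its first byte is not a
valid index; an old value that happened to start with byte 00 or 01 would be
misread silently.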

I think, though, that we shouldn't use the word "index" because it can be
misleading. Maybe "colName_index" instead? I am not sure about this yet;
we should reach a consensus on this one, my friends.
I have made several changes to the Cassandra module, which I
would like to discuss in a separate thread, but in general terms
I also think this is the way we should go.


Renato M.




2013/1/17 Alfonso Nishikawa <al...@gmail.com>:
> Hi, Lewis.
>
> It refers to both Gora and Avro.
> About Avro: the documentation [0] mentions, in a rather hidden place, the
> default value in unions.
> About Gora, and specifically HBase: the value of "default":"..." in the
> schema doesn't matter, because it is not used when reading or storing. For
> example, HBaseStore#newInstance() doesn't fill in values for "family:column"s
> that are not present in HBase.
>
> I am pretty sure the best option will be to implement the "possible solution"
> configuration option. But maybe it should not be deprecated, since sometimes
> it will be desirable to write raw data in top-level columns (not serialized
> records), as still happens in the /trunk revision (with the restriction about
> null shown in [0]), so that it can be read directly through the HBase interface.
> And maybe it should be something configurable when creating the DataStore.
> What do you think about this?
>
> Thank you very much for the feedback! :)
>
> Best,
>
> Alfonso Nishikawa
>
> [0] - http://avro.apache.org/docs/current/spec.html#schema_record
> (section "Records > fields > default")

Re: Document about GORA-174

Posted by Alfonso Nishikawa <al...@gmail.com>.
Hi, Lewis.

It refers to both Gora and Avro.
About Avro: the documentation [0] mentions, in a rather hidden place, the
default value in unions.
About Gora, and specifically HBase: the value of "default":"..." in the
schema doesn't matter, because it is not used when reading or storing. For
example, HBaseStore#newInstance() doesn't fill in values for "family:column"s
that are not present in HBase.

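To make the point about "default" concrete, here is a small Python sketch of
what filling absent columns from the schema could look like. According to the
paragraph above, HBaseStore#newInstance() does not do this; the record schema
and the fill_defaults helper are hypothetical and exist only for illustration.

```python
# Hypothetical helper: apply "default" values from an Avro-style record
# schema to columns that are absent from a fetched row.
SCHEMA = {
    "type": "record",
    "name": "Page",  # made-up record for illustration
    "fields": [
        {"name": "title", "type": ["null", "string"], "default": None},
        {"name": "visits", "type": "int", "default": 0},
        {"name": "url", "type": "string"},  # no default declared
    ],
}


def fill_defaults(row, schema):
    """Return a copy of row with declared defaults for missing fields."""
    filled = dict(row)
    for field in schema["fields"]:
        if field["name"] not in filled and "default" in field:
            filled[field["name"]] = field["default"]
    return filled


row_from_store = {"url": "http://example.org/"}  # only one column present
print(fill_defaults(row_from_store, SCHEMA))
```

Fields without a declared default stay absent, and columns that are already
present in the row are never overwritten.
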
I am pretty sure the best option will be to implement the "possible solution"
configuration option. But maybe it should not be deprecated, since sometimes
it will be desirable to write raw data in top-level columns (not serialized
records), as still happens in the /trunk revision (with the restriction about
null shown in [0]), so that it can be read directly through the HBase interface.
And maybe it should be something configurable when creating the DataStore.
What do you think about this?
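
A rough Python sketch of such a per-store setting follows. Everything in it
is hypothetical (the StoreOptions class, the raw_top_level_values flag, and
the fixed \01 index byte for the string branch); it only shows how a single
option chosen at creation time could select between raw and index-prefixed
values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreOptions:
    """Hypothetical per-store setting; the name is invented for this sketch."""
    raw_top_level_values: bool = False


def write_cell(text, options):
    """Serialize a top-level string column according to the chosen mode."""
    payload = text.encode("utf-8")
    if options.raw_top_level_values:
        return payload  # plain bytes, readable without any decoding step
    return b"\x01" + payload  # assumed index byte for the string branch


def read_cell(raw, options):
    """Inverse of write_cell for the same options."""
    data = raw if options.raw_top_level_values else raw[1:]
    return data.decode("utf-8")


raw_mode = StoreOptions(raw_top_level_values=True)
prefixed_mode = StoreOptions()

print(write_cell("This is the text", raw_mode))       # b'This is the text'
print(write_cell("This is the text", prefixed_mode))  # b'\x01This is the text'
```

Note that the raw mode in this sketch has no way to represent a null value,
which is in line with the restriction about null mentioned above.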

Thank you very much for the feedback! :)

Best,

Alfonso Nishikawa

[0] - http://avro.apache.org/docs/current/spec.html#schema_record
(section "Records > fields > default")

2013/1/17 Lewis John Mcgibbney <le...@gmail.com>

> Hi Alfonso,
> When you say that "...the first element in the union is considered as the
> default element, at this moment it is not implemented nor planned" does
> this refer to Avro?



-- 
"Drinking bloody marys all night will make you feel like a corpse in the
morning."

Document about GORA-174

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi Alfonso,
When you say that "...the first element in the union is considered as the
default element, at this moment it is not implemented nor planned" does
this refer to Avro?



On Sunday, January 13, 2013, Alfonso Nishikawa <al...@gmail.com>
wrote:
> Hello everybody.
>
> I wrote an article [0] about GORA-174 in which I try to explain a
> compatibility issue with old data in HBase.
> I really don't know how it affects other backends, so I would welcome any
> information from anyone who knows. (@Renato: maybe you can tell me how it
> works in Cassandra :)
> I would appreciate your thoughts :)
>
> Thank you very much!
>
> Alfonso Nishikawa
>
> [0] - http://people.apache.org/~alfonsonishikawa/gora-174.html
>

-- 
*Lewis*