Posted to user@hbase.apache.org by Arun Allamsetty <ar...@gmail.com> on 2014/07/22 02:43:41 UTC

HBase appends

Hi,

If I have a one-to-many relationship in a SQL database (an author might
have written many books), and I want to denormalize it for writing in
HBase, I'll have a table with the Author as the row key and a *list* of
books as values.

Now my question is how do I create a *list* such that I could just append
to it using the HBase Java API *Append* instead of doing a
read-modify-insert on a Java List object containing all the books.

Thanks,
Arun

Re: HBase appends

Posted by Arun Allamsetty <ar...@gmail.com>.
That's true. I never thought of it that way. Thanks for pointing it out.

Arun
On Jul 22, 2014 4:07 PM, "Ted Yu" <yu...@gmail.com> wrote:

> When storing new lists using new columns, a similar issue would arise, right?
> In Ishan's words:
>
> bq. read all the columns and combine when reading
>
> The combining process applies to the multi-version approach as well.
>
> Cheers

Re: HBase appends

Posted by Ted Yu <yu...@gmail.com>.
When storing new lists using new columns, a similar issue would arise, right?
In Ishan's words:

bq. read all the columns and combine when reading

The combining process applies to the multi-version approach as well.

Cheers


On Tue, Jul 22, 2014 at 12:32 PM, Arun Allamsetty <arun.allamsetty@gmail.com> wrote:

> Hi,
>
> Isn't versioning used for an entirely different purpose? What if I screw up
> a book name and then have to rewrite it? Then I'll have two versions for
> the same book. Also, AFAIK the default number of versions is 1 on table
> creation without additional parameters.
>
> Thanks,
> Arun

Re: HBase appends

Posted by Arun Allamsetty <ar...@gmail.com>.
Hi,

Isn't versioning used for an entirely different purpose? What if I screw up
a book name and then have to rewrite it? Then I'll have two versions for
the same book. Also, AFAIK the default number of versions is 1 on table
creation without additional parameters.

Thanks,
Arun
On Jul 22, 2014 12:11 PM, "yonghu" <yo...@gmail.com> wrote:

> Hi,
>
> If an author does not have hundreds of publications, you can write them
> directly to one column, so that the column holds multiple data versions.
> The default number of versions is 3, but you can set more.

Re: HBase appends

Posted by yonghu <yo...@gmail.com>.
Hi,

If an author does not have hundreds of publications, you can write them
directly to one column, so that the column holds multiple data versions.
The default number of versions is 3, but you can set more.
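
A toy model of the multi-version idea, in plain Java with no HBase
dependency. The VersionedCell class and its method names are hypothetical
and only illustrate the retention cap: once a cell holds more values than
its maximum number of versions, the oldest value is no longer kept. It does
not model timestamps or when HBase physically discards old versions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for one cell that retains at most maxVersions values.
final class VersionedCell {
    private final int maxVersions;
    private final Deque<String> versions = new ArrayDeque<>(); // newest first

    VersionedCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    // Writing a new value adds a version and drops the oldest ones beyond the cap.
    void put(String value) {
        versions.addFirst(value);
        while (versions.size() > maxVersions) {
            versions.removeLast();
        }
    }

    // Returns the retained versions, newest first.
    List<String> readAllVersions() {
        return new ArrayList<>(versions);
    }

    public static void main(String[] args) {
        VersionedCell books = new VersionedCell(3);
        for (String title : new String[] {"Book A", "Book B", "Book C", "Book D"}) {
            books.put(title);
        }
        // With a cap of 3, the first title written is no longer retained.
        System.out.println(books.readAllVersions());
    }
}
```

In this model the version cap is effectively the maximum list length, so it
would have to be set at least as high as the largest number of books
expected for any one author.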


On Tue, Jul 22, 2014 at 4:20 AM, Ishan Chhabra <ic...@rocketfuel.com>
wrote:

> Arun,
> You need to represent your data in a format such that you can simply add a
> byte[] to the end of the existing byte[] in a Cell and later read and
> decode it as a list.
>
> One way is to encode your data as protobuf. When you append a list
> of values in byte[] form in protobuf to an existing list byte[] and read
> the combined byte[], it is automatically recognized as one single list due
> to the way protobuf encodes lists.
>
> Another way to solve this problem is to write a new column for each appended
> list and read all the columns and combine when reading. (I prefer this
> approach because the way Append is implemented internally can lead to
> high memstore usage.)

Re: HBase appends

Posted by Ishan Chhabra <ic...@rocketfuel.com>.
Arun,
You need to represent your data in a format such that you can simply add a
byte[] to the end of the existing byte[] in a Cell and later read and
decode it as a list.

One way is to encode your data as protobuf. When you append a list
of values in byte[] form in protobuf to an existing list byte[] and read
the combined byte[], it is automatically recognized as one single list due
to the way protobuf encodes lists.
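
A minimal sketch of an append-friendly encoding, in plain Java with no
HBase or protobuf dependency. It does not use protobuf itself: as a
simplifying assumption, each title is stored as a 4-byte length followed by
its UTF-8 bytes, which has the same property that two encoded lists joined
end to end decode as one list. BookListCodec and its methods are
hypothetical names, and concat() only stands in for what an append would do
to the stored cell value.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical codec: a list is a sequence of [4-byte length][UTF-8 bytes] records.
final class BookListCodec {

    static byte[] encode(List<String> titles) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String title : titles) {
            byte[] utf8 = title.getBytes(StandardCharsets.UTF_8);
            byte[] length = ByteBuffer.allocate(4).putInt(utf8.length).array();
            out.write(length, 0, length.length);
            out.write(utf8, 0, utf8.length);
        }
        return out.toByteArray();
    }

    // Reads records until the buffer is exhausted; no list header is needed.
    static List<String> decode(byte[] cell) {
        List<String> titles = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(cell);
        while (buf.remaining() >= 4) {
            byte[] utf8 = new byte[buf.getInt()];
            buf.get(utf8);
            titles.add(new String(utf8, StandardCharsets.UTF_8));
        }
        return titles;
    }

    // Plain concatenation, standing in for an append to the existing cell value.
    static byte[] concat(byte[] existing, byte[] suffix) {
        byte[] joined = Arrays.copyOf(existing, existing.length + suffix.length);
        System.arraycopy(suffix, 0, joined, existing.length, suffix.length);
        return joined;
    }

    public static void main(String[] args) {
        byte[] cell = encode(Arrays.asList("Dune"));
        cell = concat(cell, encode(Arrays.asList("Emma", "Ulysses")));
        System.out.println(decode(cell));
    }
}
```

The bytes returned by encode() for the new titles are what would be passed
as the value of an Append; the existing value never has to be read first.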

Another way to solve this problem is to write a new column for each appended
list and read all the columns and combine when reading. (I prefer this
approach because the way Append is implemented internally can lead to
high memstore usage.)
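
A rough sketch of the column-per-append layout, again in plain Java rather
than against the HBase client. AuthorRow is a hypothetical in-memory
stand-in for one row, the sequence number used as the column qualifier is
an assumption (any unique, sortable qualifier would do), and each batch is
stored as newline-joined UTF-8, which assumes titles contain no newlines.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of one row: column qualifier -> cell value.
final class AuthorRow {
    private final NavigableMap<Long, byte[]> columns = new TreeMap<>();
    private long nextQualifier = 0; // assumed sequence number used as the qualifier

    // Each batch goes into a fresh column; existing cells are never rewritten.
    void addBatch(List<String> titles) {
        if (titles.isEmpty()) {
            return; // nothing to store
        }
        byte[] value = String.join("\n", titles).getBytes(StandardCharsets.UTF_8);
        columns.put(nextQualifier++, value);
    }

    // Reads every column in qualifier order and combines the batches into one list.
    List<String> readAll() {
        List<String> books = new ArrayList<>();
        for (byte[] value : columns.values()) {
            String batch = new String(value, StandardCharsets.UTF_8);
            books.addAll(Arrays.asList(batch.split("\n")));
        }
        return books;
    }

    public static void main(String[] args) {
        AuthorRow row = new AuthorRow();
        row.addBatch(Arrays.asList("Emma"));
        row.addBatch(Arrays.asList("Persuasion", "Sanditon"));
        System.out.println(row.readAll());
    }
}
```

The write path here is a plain put of a new column, so the combining cost
moves entirely to the read side.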


On Mon, Jul 21, 2014 at 5:43 PM, Arun Allamsetty <ar...@gmail.com>
wrote:

> Hi,
>
> If I have a one-to-many relationship in a SQL database (an author might
> have written many books), and I want to denormalize it for writing in
> HBase, I'll have a table with the Author as the row key and a *list* of
> books as values.
>
> Now my question is how do I create a *list* such that I could just append
> to it using the HBase Java API *Append* instead of doing a
> read-modify-insert on a Java List object containing all the books.
>
> Thanks,
> Arun
>



-- 
*Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.