Posted to dev@hbase.apache.org by Atri Sharma <at...@gmail.com> on 2014/03/01 21:01:42 UTC

Online merging of Regions

Hi all,

Just as a theoretical interest, do we support online merging of Regions in
any way? Are there ways to merge Regions while still supporting reads and
writes to them?

If not, can we do them in the following manner:

At each major delete, or on an explicit order from the user to merge Regions,
we can create a new empty memstore which will take any reads for the Region in
discussion. Then, we can build a leftist tree of the Region and keep it in
memory. We do not need to write it to disk yet. This should also allow us to
serve reads using the Region itself without any extra copies.

We do this for all the Regions being merged and then merge the resulting
leftist trees, ordered by the range of keys served by each Region. We then
traverse the final merged leftist tree, write its data to an HFile, and write
the data from the new memstore being used for writes into the HFile as well.
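
For illustration only, here is a minimal Python sketch of the leftist-tree
merge step described above. It is not how HBase implements online merge. The
names `Cell`, `Node`, `build`, `merge`, and `drain` are hypothetical, and a
cell is simplified to an in-memory (row key, value) pair.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical simplification: a cell is just (row key, value).
Cell = Tuple[bytes, bytes]


@dataclass
class Node:
    cell: Cell
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    rank: int = 1  # nodes on the shortest path down to a missing child


def merge(a: Optional[Node], b: Optional[Node]) -> Optional[Node]:
    """Merge two leftist min-heaps keyed by row key; reuses the input nodes."""
    if a is None:
        return b
    if b is None:
        return a
    if b.cell[0] < a.cell[0]:
        a, b = b, a  # keep the smaller row key at the root
    a.right = merge(a.right, b)
    # Restore the leftist property: the left child must not have lower rank.
    if a.left is None or a.left.rank < a.right.rank:
        a.left, a.right = a.right, a.left
    a.rank = (a.right.rank if a.right is not None else 0) + 1
    return a


def build(cells: Iterable[Cell]) -> Optional[Node]:
    """Build one leftist tree per Region by inserting its cells one by one."""
    root: Optional[Node] = None
    for cell in cells:
        root = merge(root, Node(cell))
    return root


def drain(root: Optional[Node]) -> Iterator[Cell]:
    """Yield cells in ascending row-key order by repeatedly removing the root."""
    while root is not None:
        yield root.cell
        root = merge(root.left, root.right)


# Two Regions with adjacent key ranges, given in arbitrary order.
region_a = build([(b"row3", b"c"), (b"row1", b"a")])
region_b = build([(b"row7", b"g"), (b"row5", b"e")])
sorted_cells = list(drain(merge(region_a, region_b)))
```

Merging two trees only walks their right spines, which is why the structure
is cheap to combine. Draining the merged tree yields the cells in row-key
order, which is the order in which they would have to be written out.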

It's just a thought. Please let me know your feedback and comments on it.

Regards,

Atri



-- 
Regards,

Atri
*l'apprenant*

Re: Online merging of Regions

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
You might want to take a look at this:

https://issues.apache.org/jira/browse/HBASE-7403



Re: Online merging of Regions

Posted by Atri Sharma <at...@gmail.com>.
Hi Ted,

I am looking for some sort of storage-level or query engine project. If
possible, I would like to add and improve aggregate and analytical
queries in HBase.

I am trying to set up an installation on my laptop again right now.

Regards,

Atri
On Sunday, March 2, 2014, Ted Yu <yu...@gmail.com> wrote:

> Atri:
> What level of project do you want to work on?
> Do you have access to a cluster?
>
> There are many open JIRAs in HBase.
> Recommendations can be made based on your interest.


-- 
Regards,

Atri
*l'apprenant*

Re: Online merging of Regions

Posted by Ted Yu <yu...@gmail.com>.
Atri:
What level of project do you want to work on?
Do you have access to a cluster?

There are many open JIRAs in HBase.
Recommendations can be made based on your interest.



On Sat, Mar 1, 2014 at 7:00 PM, Atri Sharma <at...@gmail.com> wrote:

> Thanks guys.
>
> Is there anything I could work on? I was searching for a project to
> contribute to but couldn't find one. Please help.
>
> Regards,
>
> Atri

Re: Online merging of Regions

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Atri,

You are very welcome to contribute. The best place to start is JIRA. Search
for open JIRAs that no one has worked on yet, and look at them. Some are
easier than others. Just take one that you feel comfortable with.

JM


2014-03-01 22:00 GMT-05:00 Atri Sharma <at...@gmail.com>:

> Thanks guys.
>
> Is there anything I could work on? I was searching for a project to
> contribute to but couldn't find one. Please help.
>
> Regards,
>
> Atri

Re: Online merging of Regions

Posted by Atri Sharma <at...@gmail.com>.
Thanks guys.

Is there anything I could work on? I was searching for a project to
contribute to but couldn't find one. Please help.

Regards,

Atri

On Sunday, March 2, 2014, Jonathan Hsieh <jo...@cloudera.com> wrote:

> It is in 0.95/0.96/0.98 releases.


-- 
Regards,

Atri
*l'apprenant*

Re: Online merging of Regions

Posted by Jonathan Hsieh <jo...@cloudera.com>.
It is in the 0.95/0.96/0.98 releases.



-- 
// Jonathan Hsieh (shay)
// HBase Tech Lead, Software Engineer, Cloudera
// jon@cloudera.com // @jmhsieh