Posted to user@hbase.apache.org by Upendra Yadav <up...@gmail.com> on 2014/02/24 20:35:29 UTC

Some questions to get clear view

1. One region server can host more than one region of the same table.

2. Which of the following is correct?

a) Each region has a single memstore, and all CFs of the region reside in
that memstore. When the memstore reaches its configured size limit, it is
snapshotted and flushed. Because the memstore is shared, all CFs are
flushed together.

b) Each region has n CFs, and each CF has its own memstore. When one CF's
memstore gets full, it is snapshotted and flushed, without forcing the
other CFs to flush.

I read the current documentation a few days ago, and now I have these
doubts again.

Re: Some questions to get clear view

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
People who have been on HBase longer than I have might correct me if they
know the history behind this, but each CF goes in its own directory and
creates its own store files. That might be related.


2014-02-24 15:11 GMT-05:00 Upendra Yadav <up...@gmail.com>:

> Thanks for your Reply...
>
> But what is the benefits of different memstore for different CF, when all
> of them are going to flush on the same time?

Re: Some questions to get clear view

Posted by Ted Yu <yu...@gmail.com>.
bq. HBase guarantees ACID semantics per-row

ACID guarantees are at region level.

bq. That's why all CF have to flush when any one of them got memstore limit.

See this comment where LSN means log sequence number:

https://issues.apache.org/jira/browse/HBASE-3149?focusedCommentId=13804537&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13804537



On Wed, Feb 26, 2014 at 10:35 AM, Upendra Yadav <up...@gmail.com> wrote:

> I think....
> HBase guarantees ACID semantics per-row
> That's why all CF have to flush when any one of them got memstore limit.

Re: Some questions to get clear view

Posted by Upendra Yadav <up...@gmail.com>.
I think HBase guarantees ACID semantics per-row.
That's why all CFs have to flush when any one of them hits the memstore limit.



On Tue, Feb 25, 2014 at 1:28 PM, Bharath Vissapragada <bharathv@cloudera.com> wrote:

> Hi Upendra,
>
> Your argument is correct, especially when there is an uneven data
> distribution across CFs in a region and this is what is discussed in
> HBASE-3149.
> See comments from Stack, Nicholas & Lars.
>
> - Bharath

Re: Some questions to get clear view

Posted by Bharath Vissapragada <bh...@cloudera.com>.
Hi Upendra,

Your argument is correct, especially when there is an uneven data
distribution across CFs in a region and this is what is discussed in
HBASE-3149.
See comments from Stack, Nicholas & Lars.
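
To make the uneven-distribution case concrete, here is a small worked example. It is only an illustration: the function name, the write rates, and the flush limit are made-up values, and the rule "flush everything once the fastest-growing CF reaches the limit" follows the description given in this thread rather than the HBase source.

```python
def flushed_file_sizes(bytes_per_round, flush_size):
    """Return the size of the file each CF writes at the first flush.

    bytes_per_round: bytes written to each CF in one round of writes
                     (hypothetical workload, largest rate must be > 0).
    flush_size: hypothetical limit; all CFs flush once any CF reaches it.
    """
    fastest = max(bytes_per_round.values())
    rounds = -(-flush_size // fastest)  # ceiling division
    return {cf: rate * rounds for cf, rate in bytes_per_round.items()}


sizes = flushed_file_sizes({"big": 1000, "small": 10}, flush_size=128_000)
print(sizes)  # {'big': 128000, 'small': 1280}
```

With these numbers the "small" CF writes a file one hundredth the size of the "big" one at every flush, which is the many-small-files effect that flushing all CFs together can produce.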

- Bharath


On Tue, Feb 25, 2014 at 12:24 PM, Upendra Yadav <up...@gmail.com> wrote:

> Thanks...
>
> but for a region, why hbase need to flush other CF when one of the CF got
> memstore limit...



-- 
Bharath Vissapragada
<http://www.cloudera.com>

Re: Some questions to get clear view

Posted by Upendra Yadav <up...@gmail.com>.
Thanks.

But for a region, why does HBase need to flush the other CFs when one of
the CFs reaches its memstore limit?


On Tue, Feb 25, 2014 at 2:50 AM, Ted Yu <yu...@gmail.com> wrote:

> Upendra:
> In 0.89-fb branch, the following JIRA has been integrated:
>
> HBASE-3149 Make flush decisions per column family
>
> FYI

Re: Some questions to get clear view

Posted by Ted Yu <yu...@gmail.com>.
Upendra:
In the 0.89-fb branch, the following JIRA has been integrated:

HBASE-3149 Make flush decisions per column family

FYI
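
As a rough idea of what a per-column-family flush decision could look like, here is a hypothetical selection rule. It is not taken from the JIRA or from any HBase branch; the function name and the `lower_bound` parameter are assumptions made for illustration only.

```python
def families_to_flush(memstore_sizes, lower_bound):
    """Pick the families to flush when a flush is triggered (toy policy)."""
    selected = [cf for cf, size in memstore_sizes.items() if size >= lower_bound]
    # Fall back to flushing everything if no family is large enough on its own.
    return selected or list(memstore_sizes)


only_big = families_to_flush({"big": 128_000, "small": 1_280}, lower_bound=16_000)
everything = families_to_flush({"a": 100, "b": 200}, lower_bound=16_000)
print(only_big)    # ['big']
print(everything)  # ['a', 'b']
```

Under such a rule the small family keeps accumulating in memory instead of being written out as a tiny file each time the large family fills up.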


On Mon, Feb 24, 2014 at 2:11 PM, Upendra Yadav <up...@gmail.com> wrote:

> Thanks for your Reply...
>
> But what is the benefits of different memstore for different CF, when all
> of them are going to flush on the same time?

Re: Some questions to get clear view

Posted by Upendra Yadav <up...@gmail.com>.
Thanks for your reply.

But what is the benefit of a separate memstore for each CF when all
of them are going to flush at the same time?


On Tue, Feb 25, 2014 at 1:23 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> 1. correct.
> 2. Regions doesn't have memstores. Regions servers have memstores. On per
> region per CF. all the memstores for a single regions are flush at the same
> time when one is full, even if the others are not.
>
> HTH.
>
> JM

Re: Some questions to get clear view

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
1. Correct.
2. Regions don't have memstores; region servers do, one per region per CF.
All the memstores of a single region are flushed at the same time when one
of them is full, even if the others are not.
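
The behavior described above can be sketched with a toy model. This is not HBase code: the `Region` class, its fields, and the `flush_size` value are hypothetical names chosen for illustration, and the trigger ("flush when one memstore is full") simply follows the description in this reply.

```python
class Region:
    """Toy model: one in-memory buffer (memstore) per column family."""

    def __init__(self, families, flush_size):
        self.flush_size = flush_size  # hypothetical per-memstore limit, in bytes
        self.memstores = {cf: 0 for cf in families}     # bytes buffered per CF
        self.store_files = {cf: [] for cf in families}  # flushed file sizes per CF

    def put(self, family, num_bytes):
        self.memstores[family] += num_bytes
        if self.memstores[family] >= self.flush_size:
            self.flush()

    def flush(self):
        # Every memstore of the region is flushed together, full or not.
        for cf, size in self.memstores.items():
            if size > 0:
                self.store_files[cf].append(size)
            self.memstores[cf] = 0


region = Region(["cf1", "cf2"], flush_size=100)
region.put("cf2", 5)
region.put("cf1", 100)  # cf1 reaches the limit, so cf2 is flushed as well
print(region.store_files)  # {'cf1': [100], 'cf2': [5]}
```

Each CF still gets its own store file, which matches the point that every CF has its own memstore even though the flush covers the whole region.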

HTH.

JM


2014-02-24 14:35 GMT-05:00 Upendra Yadav <up...@gmail.com>:

> 1. One region server can have more than one region for same table
>
> 2. Which one is correct:
>
> a) Each region has one memstore( and all CF for this region will reside in
> this single
> memstore) and if memstore size reached its configured limit it will
> snapshot and flush... due to single CF all CF have to flush.
>
> b) Each region has n no. of CF and each CF has its own memstore. And when
> one CF's memstore get full it will snapshot and flush. And will not force
> to flush other CF.
>
> I read the current document some days before and now once again i got that
> doubts...
>