Posted to user@hbase.apache.org by Lin Ma <li...@gmail.com> on 2012/08/08 03:28:36 UTC

consistency, availability and partition pattern of HBase

Hello guys,

According to the notes by Werner, "He presented the CAP theorem, which
states that of three properties of shared-data systems—data consistency,
system availability, and tolerance to network partition—only two can be
achieved at any given time." =>
http://www.allthingsdistributed.com/2008/12/eventually_consistent.html

But it seems HBase could achieve all three properties at the same time.
Does that mean HBase breaks the rule described by Werner? :-)

If not, which one is sacrificed -- consistency (by using HDFS),
availability (by using Zookeeper) or partition (by using region / column
family) ? And why?

regards,
Lin

Re: consistency, availability and partition pattern of HBase

Posted by Amandeep Khurana <am...@gmail.com>.

Please read the papers. You'll understand the architecture better that way.

On Aug 9, 2012, at 1:48 PM, Lin Ma <li...@gmail.com> wrote:

Thank you Amandeep,

So I can simply understand it this way (logically): there do exist multiple
region servers for the same region, but they work in active-passive mode,
so at any one time only one server is active? Correct?

regards,
Lin

On Thu, Aug 9, 2012 at 2:04 PM, Amandeep Khurana <am...@gmail.com> wrote:

> Correct. You are limited to the throughput of a single region server while
> interacting with a particular region. This throughput limitation is
> typically handled by designing your keys such that your data is distributed
> well across the cluster.
> Having multiple region servers serve a single region gets you into the land
> of maintaining consistency across copies, which is challenging. It might be
> doable but that's not the design choice Bigtable (and hence HBase) made
> initially.
>
> On Thu, Aug 9, 2012 at 11:04 AM, Lin Ma <li...@gmail.com> wrote:
>
> > Thanks
> >
> > "only a single RegionServer ever hosts a region at once" -- I know HDFS
> > have multiple copies for the same file. Is region server works in
> > active-passive way, i.e. even if there are multiple copies, only one
> region
> > server could serve? If so, will it be bottleneck, supposing the traffic
> to
> > that region is too high?
> >
> > regards,
> > Lin
> >
> > On Thu, Aug 9, 2012 at 11:09 AM, Bryan Beaudreault <
> > bbeaudreault@hubspot.com
> > > wrote:
> >
> > > Actual data backing hbase is replicated, but that is handled by HDFS.
> > > Yes, if you lose an hdfs datanode, clients (in this case the client is
> > > hbase) move to the next node in the pipeline.
> > >
> > > However, only a single RegionServer ever hosts a region at once.  If
> > > the RegionServer dies, there is a period where the master must notice
> > > the regions are unhosted and move them to other regionservers.  During
> > > that period, data is neither readable nor modifiable.
> > >
> > > On Wed, Aug 8, 2012 at 10:32 PM, Lin Ma <li...@gmail.com> wrote:
> > >
> > > > Thank you Lars.
> > > >
> > > > Is the same data store duplicated copy across region server? If so,
> if
> > > one
> > > > primary server for the region dies, client just need to read from the
> > > > secondary server for the same region. Why there is data is
> unavailable
> > > > time?
> > > >
> > > > BTW: please feel free to correct me for any wrong knowledge about
> > HBase.
> > > >
> > > > regards,
> > > > Lin
> > > >
> > > > On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <lh...@yahoo.com>
> > > wrote:
> > > >
> > > > > After a write completes the next read (regardless of the location
> > > > > it is issued from) will see the latest value.
> > > > > This is because at any given time exactly one RegionServer is
> > > > > responsible for a specific key
> > > > > (through assignment of key ranges to regions and regions to
> > > > > RegionServers).
> > > > >
> > > > >
> > > > > As Mohit said, the trade off is that data is unavailable if a
> > > > > RegionServer dies until another RegionServer picks up the regions
> > > > > (and by extension the key range)
> > > > >
> > > > > -- Lars
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > From: Lin Ma <li...@gmail.com>
> > > > > To: user@hbase.apache.org
> > > > > Cc:
> > > > > Sent: Wednesday, August 8, 2012 8:47 AM
> > > > > Subject: Re: consistency, availability and partition pattern of
> HBase
> > > > >
> > > > > And consistency is not sacrificed? i.e. all distributed clients'
> > update
> > > > > will results in sequential / real time update? Once update is done
> by
> > > one
> > > > > client, all other client could see results immediately?
> > > > >
> > > > > regards,
> > > > > Lin
> > > > >
> > > > > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia <
> > mohitanchlia@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > I think availability is sacrificed in the sense that if region
> > server
> > > > > > fails clients will have data inaccessible for the time region
> comes
> > > up
> > > > on
> > > > > > some other server, not to confuse with data loss.
> > > > > >
> > > > > > Sent from my iPad
> > > > > >
> > > > > > On Aug 7, 2012, at 11:56 PM, Lin Ma <li...@gmail.com> wrote:
> > > > > >
> > > > > > > Thank you Wei!
> > > > > > >
> > > > > > > Two more comments,
> > > > > > >
> > > > > > > 1. How about Hadoop's CAP characters do you think about?
> > > > > > > 2. For your comments, if HBase implements "per key sequential
> > > > > > consistency",
> > > > > > > what are the missing characters for consistency? Cross-key
> update
> > > > > > > sequences? Could you show me an example about what you think
> are
> > > > > missed?
> > > > > > > thanks.
> > > > > > >
> > > > > > > regards,
> > > > > > > Lin
> > > > > > >
> > > > > > > On Wed, Aug 8, 2012 at 12:18 PM, Wei Tan <wt...@us.ibm.com>
> > wrote:
> > > > > > >
> > > > > > >> Hi Lin,
> > > > > > >>
> > > > > > >> In the CAP theorem
> > > > > > >> Consistency stands for atomic consistency, i.e., each CRUD
> > > operation
> > > > > > >> occurs sequentially in a global, real-time clock
> > > > > > >> Availability means each server if not partitioned can accept
> > > > requests
> > > > > > >>
> > > > > > >> Partition means network partition
> > > > > > >>
> > > > > > >> As far as I understand (although I do not see any official
> > > > > > documentation),
> > > > > > >> HBase achieved "per key sequential consistency", i.e., for a
> > > > specific
> > > > > > key,
> > > > > > >> there is an agreed sequence, for all operations on it. This is
> > > > weaker
> > > > > > than
> > > > > > >> strong or sequential consistency, but stronger than "eventual
> > > > > > >> consistency".
> > > > > > >>
> > > > > > >> BTW: CAP was proposed by Prof. Eric Brewer...
> > > > > > >> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
> > > > > > >>
> > > > > > >> Best Regards,
> > > > > > >> Wei
> > > > > > >>
> > > > > > >> Wei Tan
> > > > > > >> Research Staff Member
> > > > > > >> IBM T. J. Watson Research Center
> > > > > > >> 19 Skyline Dr, Hawthorne, NY  10532
> > > > > > >> wtan@us.ibm.com; 914-784-6752
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> From:   Lin Ma <li...@gmail.com>
> > > > > > >> To:    user@hbase.apache.org,
> > > > > > >> Date:   08/07/2012 09:30 PM
> > > > > > >> Subject:        consistency, availability and partition
> pattern
> > of
> > > > > HBase
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> Hello guys,
> > > > > > >>
> > > > > > >> According to the notes by Werner*, "*He presented the CAP
> > theorem,
> > > > > which
> > > > > > >> states that of three properties of shared-data systems—data
> > > > > consistency,
> > > > > > >> system availability, and tolerance to network partition—only
> two
> > > can
> > > > > be
> > > > > > >> achieved at any given time." =>
> > > > > > >>
> > > > >
> > http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
> > > > > > >>
> > > > > > >> But it seems HBase could achieve all of the 3 features at the
> > same
> > > > > time.
> > > > > > >> Does it mean HBase breaks the rule by Werner. :-)
> > > > > > >>
> > > > > > >> If not, which one is sacrificed -- consistency (by using
> HDFS),
> > > > > > >> availability (by using Zookeeper) or partition (by using
> region
> > /
> > > > > column
> > > > > > >> family) ? And why?
> > > > > > >>
> > > > > > >> regards,
> > > > > > >> Lin
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: consistency, availability and partition pattern of HBase

Posted by Lin Ma <li...@gmail.com>.

Thank you Amandeep,

So I can simply understand it this way (logically): there do exist multiple
region servers for the same region, but they work in active-passive mode,
so at any one time only one server is active? Correct?

regards,
Lin

Re: consistency, availability and partition pattern of HBase

Posted by Amandeep Khurana <am...@gmail.com>.

Correct. You are limited to the throughput of a single region server while
interacting with a particular region. This throughput limitation is
typically handled by designing your keys such that your data is distributed
well across the cluster.
Having multiple region servers serve a single region gets you into the land
of maintaining consistency across copies, which is challenging. It might be
doable but that's not the design choice Bigtable (and hence HBase) made
initially.
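
To illustrate the key-design point above, here is a minimal sketch of one
common approach, prefixing ("salting") each row key with a stable hash
bucket so that sequential keys land in different regions. It is plain
Python and does not call any HBase API; the names salted_key and
NUM_BUCKETS, the bucket count, and the "NN|key" layout are assumptions
made up for this example.

```python
import hashlib

# Hypothetical value: roughly the number of regions the table is pre-split into.
NUM_BUCKETS = 16


def salted_key(row_key: str, buckets: int = NUM_BUCKETS) -> str:
    """Return row_key prefixed with a two-digit bucket derived from its hash.

    The bucket depends only on the key, so a reader can recompute it and
    still do a point lookup, while monotonically increasing keys (for
    example timestamps) no longer all hit the same region.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    return f"{bucket:02d}|{row_key}"


if __name__ == "__main__":
    # Consecutive timestamps get scattered over the buckets.
    for ts in range(1344400000, 1344400005):
        print(salted_key(f"event-{ts}"))
```

The trade-off of this kind of scheme is that a range scan over the
original keys has to be issued once per bucket and merged on the client.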

Re: consistency, availability and partition pattern of HBase

Posted by Lin Ma <li...@gmail.com>.

Thanks

"only a single RegionServer ever hosts a region at once" -- I know HDFS
has multiple copies of the same file. Does the region server work in an
active-passive way, i.e. even if there are multiple copies, only one region
server can serve? If so, will it be a bottleneck if the traffic to that
region is too high?

regards,
Lin

Re: consistency, availability and partition pattern of HBase

Posted by lars hofhansl <lh...@yahoo.com>.

What Bryan said :)



----- Original Message -----
From: Bryan Beaudreault <bb...@hubspot.com>
To: user@hbase.apache.org
Cc: lars hofhansl <lh...@yahoo.com>
Sent: Wednesday, August 8, 2012 8:09 PM
Subject: Re: consistency, availability and partition pattern of HBase

Actual data backing hbase is replicated, but that is handled by HDFS.  Yes,
if you lose an hdfs datanode, clients (in this case the client is hbase)
move to the next node in the pipeline.

However, only a single RegionServer ever hosts a region at once.  If the
RegionServer dies, there is a period where the master must notice the
regions are unhosted and move them to other regionservers.  During that
period, data is neither readable nor modifiable.
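
The behaviour described above can be sketched with a toy in-memory model.
This is not HBase code: ToyCluster, RegionUnavailable, master_reassign and
the round-robin assignment are all hypothetical names and simplifications
chosen for illustration. It only shows the shape of the trade-off: each
key maps to one region, each region has exactly one host, and a dead host
makes the region unavailable (not lost) until it is reassigned.

```python
import bisect


class RegionUnavailable(Exception):
    """Raised while a region's only host is down and not yet reassigned."""


class ToyCluster:
    def __init__(self, split_keys, servers):
        self.split_keys = sorted(split_keys)  # boundaries between regions
        self.live = set(servers)
        # region index -> the single server currently hosting it
        self.assignment = {
            region: servers[region % len(servers)]
            for region in range(len(self.split_keys) + 1)
        }
        self.store = {}  # stands in for data persisted (and replicated) in HDFS

    def region_for(self, key):
        """Map a key to its region via the sorted split keys."""
        return bisect.bisect_right(self.split_keys, key)

    def _check_host(self, key):
        server = self.assignment[self.region_for(key)]
        if server not in self.live:
            raise RegionUnavailable(key)
        return server

    def put(self, key, value):
        self._check_host(key)
        self.store[key] = value

    def get(self, key):
        self._check_host(key)
        return self.store.get(key)

    def kill(self, server):
        self.live.discard(server)

    def master_reassign(self):
        """Move regions whose host is dead onto a live server."""
        live = sorted(self.live)
        if not live:
            return
        for region, server in list(self.assignment.items()):
            if server not in self.live:
                self.assignment[region] = live[region % len(live)]


if __name__ == "__main__":
    cluster = ToyCluster(split_keys=["g", "n"], servers=["rs1", "rs2"])
    cluster.put("apple", "v1")
    cluster.kill("rs1")
    try:
        cluster.get("apple")
    except RegionUnavailable:
        print("region unavailable until the master reassigns it")
    cluster.master_reassign()
    print(cluster.get("apple"))
```

Because every read and write for a key goes through the same single host,
a read after a completed write sees that write; the price is the window
between kill() and master_reassign() in which the key cannot be served.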

On Wed, Aug 8, 2012 at 10:32 PM, Lin Ma <li...@gmail.com> wrote:

> Thank you Lars.
>
> Is the same data store duplicated copy across region server? If so, if one
> primary server for the region dies, client just need to read from the
> secondary server for the same region. Why there is data is unavailable
> time?
>
> BTW: please feel free to correct me for any wrong knowledge about HBase.
>
> regards,
> Lin
>
> On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <lh...@yahoo.com> wrote:
>
> > After a write completes the next read (regardless of the location it is
> > issued from) will see the latest value.
> > This is because at any given time exactly one RegionServer is responsible
> > for a specific key
> > (through assignment of key ranges to regions and regions to
> > RegionServers).
> >
> >
> > As Mohit said, the trade off is that data is unavailable if a
> > RegionServer dies until another RegionServer picks up the regions (and by
> > extension the key range)
> >
> > -- Lars
> >
> >
> > ----- Original Message -----
> > From: Lin Ma <li...@gmail.com>
> > To: user@hbase.apache.org
> > Cc:
> > Sent: Wednesday, August 8, 2012 8:47 AM
> > Subject: Re: consistency, availability and partition pattern of HBase
> >
> > And consistency is not sacrificed? i.e. all distributed clients' update
> > will results in sequential / real time update? Once update is done by one
> > client, all other client could see results immediately?
> >
> > regards,
> > Lin
> >
> > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia <mohitanchlia@gmail.com
> > >wrote:
> >
> > > I think availability is sacrificed in the sense that if region server
> > > fails clients will have data inaccessible for the time region comes up
> on
> > > some other server, not to confuse with data loss.
> > >
> > > Sent from my iPad
> > >
> > > On Aug 7, 2012, at 11:56 PM, Lin Ma <li...@gmail.com> wrote:
> > >
> > > > Thank you Wei!
> > > >
> > > > Two more comments,
> > > >
> > > > 1. How about Hadoop's CAP characters do you think about?
> > > > 2. For your comments, if HBase implements "per key sequential
> > > consistency",
> > > > what are the missing characters for consistency? Cross-key update
> > > > sequences? Could you show me an example about what you think are
> > missed?
> > > > thanks.
> > > >
> > > > regards,
> > > > Lin
> > > >
> > > > On Wed, Aug 8, 2012 at 12:18 PM, Wei Tan <wt...@us.ibm.com> wrote:
> > > >
> > > >> Hi Lin,
> > > >>
> > > >> In the CAP theorem
> > > >> Consistency stands for atomic consistency, i.e., each CRUD operation
> > > >> occurs sequentially in a global, real-time clock
> > > >> Availability means each server if not partitioned can accept
> requests
> > > >>
> > > >> Partition means network partition
> > > >>
> > > >> As far as I understand (although I do not see any official
> > > documentation),
> > > >> HBase achieved "per key sequential consistency", i.e., for a
> specific
> > > key,
> > > >> there is an agreed sequence, for all operations on it. This is
> weaker
> > > than
> > > >> strong or sequential consistency, but stronger than "eventual
> > > >> consistency".
> > > >>
> > > >> BTW: CAP was proposed by Prof. Eric Brewer...
> > > >> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
> > > >>
> > > >> Best Regards,
> > > >> Wei
> > > >>
> > > >> Wei Tan
> > > >> Research Staff Member
> > > >> IBM T. J. Watson Research Center
> > > >> 19 Skyline Dr, Hawthorne, NY  10532
> > > >> wtan@us.ibm.com; 914-784-6752
> > > >>
> > > >>
> > > >>
> > > >> From:   Lin Ma <li...@gmail.com>
> > > >> To:    user@hbase.apache.org,
> > > >> Date:   08/07/2012 09:30 PM
> > > >> Subject:        consistency, availability and partition pattern of
> > HBase
> > > >>
> > > >>
> > > >>
> > > >> Hello guys,
> > > >>
> > > >> According to the notes by Werner*, "*He presented the CAP theorem,
> > which
> > > >> states that of three properties of shared-data systems—data
> > consistency,
> > > >> system availability, and tolerance to network partition—only two can
> > be
> > > >> achieved at any given time." =>
> > > >>
> > http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
> > > >>
> > > >> But it seems HBase could achieve all of the 3 features at the same
> > time.
> > > >> Does it mean HBase breaks the rule by Werner. :-)
> > > >>
> > > >> If not, which one is sacrificed -- consistency (by using HDFS),
> > > >> availability (by using Zookeeper) or partition (by using region /
> > column
> > > >> family) ? And why?
> > > >>
> > > >> regards,
> > > >> Lin
> > > >>
> > > >>
> > > >>
> > >
> >
> >
>

Re: consistency, availability and partition pattern of HBase

Posted by Bryan Beaudreault <bb...@hubspot.com>.

The actual data backing HBase is replicated, but that is handled by HDFS.
Yes, if you lose an HDFS datanode, clients (in this case the client is
HBase) move to the next node in the pipeline.

However, only a single RegionServer hosts a given region at any one time.
If that RegionServer dies, there is a period during which the master must
notice that its regions are unhosted and reassign them to other
RegionServers. During that period, the data in those regions can be neither
read nor modified.
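
As a rough illustration of that failover window, here is a toy Python model.
It is not HBase code: ToyCluster, UnavailableError and every method name are
made up for this sketch, a plain dict stands in for HDFS, and the master's
reassignment is simulated by calling assign() by hand.

```python
class UnavailableError(Exception):
    """Raised when no server currently hosts the requested region."""


class ToyCluster:
    def __init__(self, servers):
        self.live = set(servers)
        self.storage = {}     # stands in for HDFS: outlives any one server
        self.assignment = {}  # region -> the single server hosting it

    def assign(self, region, server):
        if server not in self.live:
            raise ValueError(f"{server} is not live")
        self.assignment[region] = server

    def kill(self, server):
        self.live.discard(server)
        # Regions hosted by the dead server stay unhosted until reassigned.
        for region, host in list(self.assignment.items()):
            if host == server:
                del self.assignment[region]

    def _require_host(self, region):
        if region not in self.assignment:
            raise UnavailableError(region)

    def put(self, region, key, value):
        self._require_host(region)
        self.storage[(region, key)] = value

    def get(self, region, key):
        self._require_host(region)
        return self.storage.get((region, key))


cluster = ToyCluster(["rs1", "rs2"])
cluster.assign("region-a", "rs1")
cluster.put("region-a", "row1", "v1")
cluster.kill("rs1")  # region-a is now unhosted
try:
    cluster.get("region-a", "row1")
except UnavailableError:
    print("region-a is unavailable until it is reassigned")
cluster.assign("region-a", "rs2")  # simulated reassignment
print(cluster.get("region-a", "row1"))  # the stored value survived
```

The point of the sketch is only that the stored data survives the server
failure, while reads and writes fail in the gap between kill() and the next
assign().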

On Wed, Aug 8, 2012 at 10:32 PM, Lin Ma <li...@gmail.com> wrote:

> Thank you Lars.
>
> Is the same data store duplicated copy across region server? If so, if one
> primary server for the region dies, client just need to read from the
> secondary server for the same region. Why there is data is unavailable
> time?
>
> BTW: please feel free to correct me for any wrong knowledge about HBase.
>
> regards,
> Lin
>
> On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <lh...@yahoo.com> wrote:
>
> > After a write completes the next read (regardless of the location it is
> > issued from) will see the latest value.
> > This is because at any given time exactly one RegionServer is responsible for
> > a specific Key
> > (through assignment of key ranges to regions and regions to
> RegionServers).
> >
> >
> > As Mohit said, the trade off is that data is unavailable if a
> RegionServer
> > dies until another RegionServer picks up the regions (and by extension
> the
> > key range)
> >
> > -- Lars
> >
> >
> > ----- Original Message -----
> > From: Lin Ma <li...@gmail.com>
> > To: user@hbase.apache.org
> > Cc:
> > Sent: Wednesday, August 8, 2012 8:47 AM
> > Subject: Re: consistency, availability and partition pattern of HBase
> >
> > And consistency is not sacrificed? i.e. all distributed clients' update
> > will results in sequential / real time update? Once update is done by one
> > client, all other client could see results immediately?
> >
> > regards,
> > Lin
> >
> > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia <mohitanchlia@gmail.com
> > >wrote:
> >
> > > I think availability is sacrificed in the sense that if region server
> > > fails clients will have data inaccessible for the time region comes up
> on
> > > some other server, not to confuse with data loss.
> > >
> > > Sent from my iPad
> > >
> > > On Aug 7, 2012, at 11:56 PM, Lin Ma <li...@gmail.com> wrote:
> > >
> > > > Thank you Wei!
> > > >
> > > > Two more comments,
> > > >
> > > > 1. How about Hadoop's CAP characters do you think about?
> > > > 2. For your comments, if HBase implements "per key sequential
> > > consistency",
> > > > what are the missing characters for consistency? Cross-key update
> > > > sequences? Could you show me an example about what you think are
> > missed?
> > > > thanks.
> > > >
> > > > regards,
> > > > Lin
> > > >
> > > > On Wed, Aug 8, 2012 at 12:18 PM, Wei Tan <wt...@us.ibm.com> wrote:
> > > >
> > > >> Hi Lin,
> > > >>
> > > >> In the CAP theorem
> > > >> Consistency stands for atomic consistency, i.e., each CRUD operation
> > > >> occurs sequentially in a global, real-time clock
> > > >> Availability means each server if not partitioned can accept
> requests
> > > >>
> > > >> Partition means network partition
> > > >>
> > > >> As far as I understand (although I do not see any official
> > > documentation),
> > > >> HBase achieved "per key sequential consistency", i.e., for a
> specific
> > > key,
> > > >> there is an agreed sequence, for all operations on it. This is
> weaker
> > > than
> > > >> strong or sequential consistency, but stronger than "eventual
> > > >> consistency".
> > > >>
> > > >> BTW: CAP was proposed by Prof. Eric Brewer...
> > > >> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
> > > >>
> > > >> Best Regards,
> > > >> Wei
> > > >>
> > > >> Wei Tan
> > > >> Research Staff Member
> > > >> IBM T. J. Watson Research Center
> > > >> 19 Skyline Dr, Hawthorne, NY  10532
> > > >> wtan@us.ibm.com; 914-784-6752
> > > >>
> > > >>
> > > >>
> > > >> From:   Lin Ma <li...@gmail.com>
> > > >> To:    user@hbase.apache.org,
> > > >> Date:   08/07/2012 09:30 PM
> > > >> Subject:        consistency, availability and partition pattern of
> > HBase
> > > >>
> > > >>
> > > >>
> > > >> Hello guys,
> > > >>
> > > >> According to the notes by Werner*, "*He presented the CAP theorem,
> > which
> > > >> states that of three properties of shared-data systems—data
> > consistency,
> > > >> system availability, and tolerance to network partition—only two can
> > be
> > > >> achieved at any given time." =>
> > > >>
> > http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
> > > >>
> > > >> But it seems HBase could achieve all of the 3 features at the same
> > time.
> > > >> Does it mean HBase breaks the rule by Werner. :-)
> > > >>
> > > >> If not, which one is sacrificed -- consistency (by using HDFS),
> > > >> availability (by using Zookeeper) or partition (by using region /
> > column
> > > >> family) ? And why?
> > > >>
> > > >> regards,
> > > >> Lin
> > > >>
> > > >>
> > > >>
> > >
> >
> >
>

Re: consistency, availability and partition pattern of HBase

Posted by Lin Ma <li...@gmail.com>.

Thank you Amandeep,

I think HDFS keeps multiple copies (e.g. 3 copies) on different servers. If
one server goes down, the other 2 can still serve the data, so what
availability degradation are you referring to? It would be great if you
could show an example.

regards,
Lin

On Thu, Aug 9, 2012 at 1:41 PM, Amandeep Khurana <am...@gmail.com> wrote:

> HDFS also chooses to degrade availability in the face of partitions.
>
>
> On Thu, Aug 9, 2012 at 11:08 AM, Lin Ma <li...@gmail.com> wrote:
>
>> Amandeep, thanks for your comments, and I will definitely read the paper
>> you suggested.
>>
>> For Hadoop itself, what do you think its CAP features? Which one of the
>> CAP is sacrificed?
>>
>> regards,
>> Lin
>>
>> On Thu, Aug 9, 2012 at 1:34 PM, Amandeep Khurana <am...@gmail.com>wrote:
>>
>>> Firstly, I recommend you read the GFS and Bigtable papers. That'll give
>>> you
>>> a good understanding of the architecture. Adhoc question on the mailing
>>> list won't.
>>>
>>> I'll try to answer some of your questions briefly. Think of HBase as a
>>> database layer over an underlying filesystem (the same way MySQL is over
>>> ext2/3/4 etc). The filesystem for HBase in this case is HDFS. HDFS
>>> replicates data for redundancy and fault tolerance. HBase has region
>>> servers that serve the regions. Regions form tables. Region servers
>>> persist
>>> their data on HDFS. Now, every region is served by one and only one
>>> region
>>> server. So, HBase is not replicating anything. Replication is handled at
>>> the storage layer. If a region server goes down, all its regions now need
>>> to be served by some other region server. During this period of region
>>> assignment, the clients experience degraded availability if they try to
>>> interact with any of those regions.
>>>
>>> Coming back to CAP. HBase chooses to degrade availability in the face of
>>> partitions. "Partition" is a very general term here and does not
>>> necessarily mean network partitions. Any node falling off the HBase
>>> cluster
>>> can be considered to be a partition. So, when failures happen, HBase
>>> degrades availability but does not give up consistency. Consistency in
>>> this
>>> context is sort of the equivalent of atomicity in ACID. In the context of
>>> HBase, any copy of data that is written to HBase will be visible to all
>>> clients. There is no concept of multiple different versions that the
>>> clients need to reconcile between. When you read, you always get the same
>>> version of the row you are reading. In other words, HBase is strongly
>>> consistent.
>>>
>>> Hope that clears things up a bit.
>>>
>>> On Thu, Aug 9, 2012 at 8:02 AM, Lin Ma <li...@gmail.com> wrote:
>>>
>>> > Thank you Lars.
>>> >
>>> > Is the same data store duplicated copy across region server? If so, if
>>> one
>>> > primary server for the region dies, client just need to read from the
>>> > secondary server for the same region. Why there is data is unavailable
>>> > time?
>>> >
>>> > BTW: please feel free to correct me for any wrong knowledge about
>>> HBase.
>>> >
>>> > regards,
>>> > Lin
>>> >
>>> > On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <lh...@yahoo.com>
>>> wrote:
>>> >
>>> > > After a write completes the next read (regardless of the location it
>>> is
>>> > > issued from) will see the latest value.
>>> > > This is because at any given time exactly one RegionServer is
>>> responsible for
>>> > > a specific Key
>>> > > (through assignment of key ranges to regions and regions to
>>> > RegionServers).
>>> > >
>>> > >
>>> > > As Mohit said, the trade off is that data is unavailable if a
>>> > RegionServer
>>> > > dies until another RegionServer picks up the regions (and by
>>> extension
>>> > the
>>> > > key range)
>>> > >
>>> > > -- Lars
>>> > >
>>> > >
>>> > > ----- Original Message -----
>>> > > From: Lin Ma <li...@gmail.com>
>>> > > To: user@hbase.apache.org
>>> > > Cc:
>>> > > Sent: Wednesday, August 8, 2012 8:47 AM
>>> > > Subject: Re: consistency, availability and partition pattern of HBase
>>> > >
>>> > > And consistency is not sacrificed? i.e. all distributed clients'
>>> update
>>> > > will results in sequential / real time update? Once update is done
>>> by one
>>> > > client, all other client could see results immediately?
>>> > >
>>> > > regards,
>>> > > Lin
>>> > >
>>> > > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia <
>>> mohitanchlia@gmail.com
>>> > > >wrote:
>>> > >
>>> > > > I think availability is sacrificed in the sense that if region
>>> server
>>> > > > fails clients will have data inaccessible for the time region
>>> comes up
>>> > on
>>> > > > some other server, not to confuse with data loss.
>>> > > >
>>> > > > Sent from my iPad
>>> > > >
>>> > > > On Aug 7, 2012, at 11:56 PM, Lin Ma <li...@gmail.com> wrote:
>>> > > >
>>> > > > > Thank you Wei!
>>> > > > >
>>> > > > > Two more comments,
>>> > > > >
>>> > > > > 1. How about Hadoop's CAP characters do you think about?
>>> > > > > 2. For your comments, if HBase implements "per key sequential
>>> > > > consistency",
>>> > > > > what are the missing characters for consistency? Cross-key update
>>> > > > > sequences? Could you show me an example about what you think are
>>> > > missed?
>>> > > > > thanks.
>>> > > > >
>>> > > > > regards,
>>> > > > > Lin
>>> > > > >
>>> > > > > On Wed, Aug 8, 2012 at 12:18 PM, Wei Tan <wt...@us.ibm.com>
>>> wrote:
>>> > > > >
>>> > > > >> Hi Lin,
>>> > > > >>
>>> > > > >> In the CAP theorem
>>> > > > >> Consistency stands for atomic consistency, i.e., each CRUD
>>> operation
>>> > > > >> occurs sequentially in a global, real-time clock
>>> > > > >> Availability means each server if not partitioned can accept
>>> > requests
>>> > > > >>
>>> > > > >> Partition means network partition
>>> > > > >>
>>> > > > >> As far as I understand (although I do not see any official
>>> > > > documentation),
>>> > > > >> HBase achieved "per key sequential consistency", i.e., for a
>>> > specific
>>> > > > key,
>>> > > > >> there is an agreed sequence, for all operations on it. This is
>>> > weaker
>>> > > > than
>>> > > > >> strong or sequential consistency, but stronger than "eventual
>>> > > > >> consistency".
>>> > > > >>
>>> > > > >> BTW: CAP was proposed by Prof. Eric Brewer...
>>> > > > >> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
>>> > > > >>
>>> > > > >> Best Regards,
>>> > > > >> Wei
>>> > > > >>
>>> > > > >> Wei Tan
>>> > > > >> Research Staff Member
>>> > > > >> IBM T. J. Watson Research Center
>>> > > > >> 19 Skyline Dr, Hawthorne, NY  10532
>>> > > > >> wtan@us.ibm.com; 914-784-6752
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >> From:   Lin Ma <li...@gmail.com>
>>> > > > >> To:    user@hbase.apache.org,
>>> > > > >> Date:   08/07/2012 09:30 PM
>>> > > > >> Subject:        consistency, availability and partition pattern
>>> of
>>> > > HBase
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > > >> Hello guys,
>>> > > > >>
>>> > > > >> According to the notes by Werner*, "*He presented the CAP
>>> theorem,
>>> > > which
>>> > > > >> states that of three properties of shared-data systems—data
>>> > > consistency,
>>> > > > >> system availability, and tolerance to network partition—only
>>> two can
>>> > > be
>>> > > > >> achieved at any given time." =>
>>> > > > >>
>>> > >
>>> http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
>>> > > > >>
>>> > > > >> But it seems HBase could achieve all of the 3 features at the
>>> same
>>> > > time.
>>> > > > >> Does it mean HBase breaks the rule by Werner. :-)
>>> > > > >>
>>> > > > >> If not, which one is sacrificed -- consistency (by using HDFS),
>>> > > > >> availability (by using Zookeeper) or partition (by using region
>>> /
>>> > > column
>>> > > > >> family) ? And why?
>>> > > > >>
>>> > > > >> regards,
>>> > > > >> Lin
>>> > > > >>
>>> > > > >>
>>> > > > >>
>>> > > >
>>> > >
>>> > >
>>> >
>>>
>>
>>
>

Re: consistency, availability and partition pattern of HBase

Posted by Amandeep Khurana <am...@gmail.com>.

HDFS also chooses to degrade availability in the face of partitions.

On Thu, Aug 9, 2012 at 11:08 AM, Lin Ma <li...@gmail.com> wrote:

> Amandeep, thanks for your comments, and I will definitely read the paper
> you suggested.
>
> For Hadoop itself, what do you think its CAP features? Which one of the
> CAP is sacrificed?
>
> regards,
> Lin
>
> On Thu, Aug 9, 2012 at 1:34 PM, Amandeep Khurana <am...@gmail.com> wrote:
>
>> Firstly, I recommend you read the GFS and Bigtable papers. That'll give
>> you
>> a good understanding of the architecture. Adhoc question on the mailing
>> list won't.
>>
>> I'll try to answer some of your questions briefly. Think of HBase as a
>> database layer over an underlying filesystem (the same way MySQL is over
>> ext2/3/4 etc). The filesystem for HBase in this case is HDFS. HDFS
>> replicates data for redundancy and fault tolerance. HBase has region
>> servers that serve the regions. Regions form tables. Region servers
>> persist
>> their data on HDFS. Now, every region is served by one and only one region
>> server. So, HBase is not replicating anything. Replication is handled at
>> the storage layer. If a region server goes down, all its regions now need
>> to be served by some other region server. During this period of region
>> assignment, the clients experience degraded availability if they try to
>> interact with any of those regions.
>>
>> Coming back to CAP. HBase chooses to degrade availability in the face of
>> partitions. "Partition" is a very general term here and does not
>> necessarily mean network partitions. Any node falling off the HBase
>> cluster
>> can be considered to be a partition. So, when failures happen, HBase
>> degrades availability but does not give up consistency. Consistency in
>> this
>> context is sort of the equivalent of atomicity in ACID. In the context of
>> HBase, any copy of data that is written to HBase will be visible to all
>> clients. There is no concept of multiple different versions that the
>> clients need to reconcile between. When you read, you always get the same
>> version of the row you are reading. In other words, HBase is strongly
>> consistent.
>>
>> Hope that clears things up a bit.
>>
>> On Thu, Aug 9, 2012 at 8:02 AM, Lin Ma <li...@gmail.com> wrote:
>>
>> > Thank you Lars.
>> >
>> > Is the same data store duplicated copy across region server? If so, if
>> one
>> > primary server for the region dies, client just need to read from the
>> > secondary server for the same region. Why there is data is unavailable
>> > time?
>> >
>> > BTW: please feel free to correct me for any wrong knowledge about HBase.
>> >
>> > regards,
>> > Lin
>> >
>> > On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <lh...@yahoo.com>
>> wrote:
>> >
>> > > After a write completes the next read (regardless of the location it
>> is
>> > > issued from) will see the latest value.
>> > > This is because at any given time exactly one RegionServer is responsible
>> for
>> > > a specific Key
>> > > (through assignment of key ranges to regions and regions to
>> > RegionServers).
>> > >
>> > >
>> > > As Mohit said, the trade off is that data is unavailable if a
>> > RegionServer
>> > > dies until another RegionServer picks up the regions (and by extension
>> > the
>> > > key range)
>> > >
>> > > -- Lars
>> > >
>> > >
>> > > ----- Original Message -----
>> > > From: Lin Ma <li...@gmail.com>
>> > > To: user@hbase.apache.org
>> > > Cc:
>> > > Sent: Wednesday, August 8, 2012 8:47 AM
>> > > Subject: Re: consistency, availability and partition pattern of HBase
>> > >
>> > > And consistency is not sacrificed? i.e. all distributed clients'
>> update
>> > > will results in sequential / real time update? Once update is done by
>> one
>> > > client, all other client could see results immediately?
>> > >
>> > > regards,
>> > > Lin
>> > >
>> > > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia <
>> mohitanchlia@gmail.com
>> > > >wrote:
>> > >
>> > > > I think availability is sacrificed in the sense that if region
>> server
>> > > > fails clients will have data inaccessible for the time region comes
>> up
>> > on
>> > > > some other server, not to confuse with data loss.
>> > > >
>> > > > Sent from my iPad
>> > > >
>> > > > On Aug 7, 2012, at 11:56 PM, Lin Ma <li...@gmail.com> wrote:
>> > > >
>> > > > > Thank you Wei!
>> > > > >
>> > > > > Two more comments,
>> > > > >
>> > > > > 1. How about Hadoop's CAP characters do you think about?
>> > > > > 2. For your comments, if HBase implements "per key sequential
>> > > > consistency",
>> > > > > what are the missing characters for consistency? Cross-key update
>> > > > > sequences? Could you show me an example about what you think are
>> > > missed?
>> > > > > thanks.
>> > > > >
>> > > > > regards,
>> > > > > Lin
>> > > > >
>> > > > > On Wed, Aug 8, 2012 at 12:18 PM, Wei Tan <wt...@us.ibm.com> wrote:
>> > > > >
>> > > > >> Hi Lin,
>> > > > >>
>> > > > >> In the CAP theorem
>> > > > >> Consistency stands for atomic consistency, i.e., each CRUD
>> operation
>> > > > >> occurs sequentially in a global, real-time clock
>> > > > >> Availability means each server if not partitioned can accept
>> > requests
>> > > > >>
>> > > > >> Partition means network partition
>> > > > >>
>> > > > >> As far as I understand (although I do not see any official
>> > > > documentation),
>> > > > >> HBase achieved "per key sequential consistency", i.e., for a
>> > specific
>> > > > key,
>> > > > >> there is an agreed sequence, for all operations on it. This is
>> > weaker
>> > > > than
>> > > > >> strong or sequential consistency, but stronger than "eventual
>> > > > >> consistency".
>> > > > >>
>> > > > >> BTW: CAP was proposed by Prof. Eric Brewer...
>> > > > >> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
>> > > > >>
>> > > > >> Best Regards,
>> > > > >> Wei
>> > > > >>
>> > > > >> Wei Tan
>> > > > >> Research Staff Member
>> > > > >> IBM T. J. Watson Research Center
>> > > > >> 19 Skyline Dr, Hawthorne, NY  10532
>> > > > >> wtan@us.ibm.com; 914-784-6752
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> From:   Lin Ma <li...@gmail.com>
>> > > > >> To:    user@hbase.apache.org,
>> > > > >> Date:   08/07/2012 09:30 PM
>> > > > >> Subject:        consistency, availability and partition pattern
>> of
>> > > HBase
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> Hello guys,
>> > > > >>
>> > > > >> According to the notes by Werner*, "*He presented the CAP
>> theorem,
>> > > which
>> > > > >> states that of three properties of shared-data systems—data
>> > > consistency,
>> > > > >> system availability, and tolerance to network partition—only two
>> can
>> > > be
>> > > > >> achieved at any given time." =>
>> > > > >>
>> > >
>> http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
>> > > > >>
>> > > > >> But it seems HBase could achieve all of the 3 features at the
>> same
>> > > time.
>> > > > >> Does it mean HBase breaks the rule by Werner. :-)
>> > > > >>
>> > > > >> If not, which one is sacrificed -- consistency (by using HDFS),
>> > > > >> availability (by using Zookeeper) or partition (by using region /
>> > > column
>> > > > >> family) ? And why?
>> > > > >>
>> > > > >> regards,
>> > > > >> Lin
>> > > > >>
>> > > > >>
>> > > > >>
>> > > >
>> > >
>> > >
>> >
>>
>
>

Re: consistency, availability and partition pattern of HBase

Posted by Lin Ma <li...@gmail.com>.

Amandeep, thanks for your comments, and I will definitely read the papers
you suggested.

For Hadoop itself, what do you think its CAP characteristics are? Which one
of the three CAP properties is sacrificed?

regards,
Lin

On Thu, Aug 9, 2012 at 1:34 PM, Amandeep Khurana <am...@gmail.com> wrote:

> Firstly, I recommend you read the GFS and Bigtable papers. That'll give you
> a good understanding of the architecture. Adhoc question on the mailing
> list won't.
>
> I'll try to answer some of your questions briefly. Think of HBase as a
> database layer over an underlying filesystem (the same way MySQL is over
> ext2/3/4 etc). The filesystem for HBase in this case is HDFS. HDFS
> replicates data for redundancy and fault tolerance. HBase has region
> servers that serve the regions. Regions form tables. Region servers persist
> their data on HDFS. Now, every region is served by one and only one region
> server. So, HBase is not replicating anything. Replication is handled at
> the storage layer. If a region server goes down, all its regions now need
> to be served by some other region server. During this period of region
> assignment, the clients experience degraded availability if they try to
> interact with any of those regions.
>
> Coming back to CAP. HBase chooses to degrade availability in the face of
> partitions. "Partition" is a very general term here and does not
> necessarily mean network partitions. Any node falling off the HBase cluster
> can be considered to be a partition. So, when failures happen, HBase
> degrades availability but does not give up consistency. Consistency in this
> context is sort of the equivalent of atomicity in ACID. In the context of
> HBase, any copy of data that is written to HBase will be visible to all
> clients. There is no concept of multiple different versions that the
> clients need to reconcile between. When you read, you always get the same
> version of the row you are reading. In other words, HBase is strongly
> consistent.
>
> Hope that clears things up a bit.
>
> On Thu, Aug 9, 2012 at 8:02 AM, Lin Ma <li...@gmail.com> wrote:
>
> > Thank you Lars.
> >
> > Is the same data store duplicated copy across region server? If so, if
> one
> > primary server for the region dies, client just need to read from the
> > secondary server for the same region. Why there is data is unavailable
> > time?
> >
> > BTW: please feel free to correct me for any wrong knowledge about HBase.
> >
> > regards,
> > Lin
> >
> > On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <lh...@yahoo.com>
> wrote:
> >
> > > After a write completes the next read (regardless of the location it is
> > > issued from) will see the latest value.
> > > This is because at any given time exactly one RegionServer is responsible
> for
> > > a specific Key
> > > (through assignment of key ranges to regions and regions to
> > RegionServers).
> > >
> > >
> > > As Mohit said, the trade off is that data is unavailable if a
> > RegionServer
> > > dies until another RegionServer picks up the regions (and by extension
> > the
> > > key range)
> > >
> > > -- Lars
> > >
> > >
> > > ----- Original Message -----
> > > From: Lin Ma <li...@gmail.com>
> > > To: user@hbase.apache.org
> > > Cc:
> > > Sent: Wednesday, August 8, 2012 8:47 AM
> > > Subject: Re: consistency, availability and partition pattern of HBase
> > >
> > > And consistency is not sacrificed? i.e. all distributed clients' update
> > > will results in sequential / real time update? Once update is done by
> one
> > > client, all other client could see results immediately?
> > >
> > > regards,
> > > Lin
> > >
> > > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia <mohitanchlia@gmail.com
> > > >wrote:
> > >
> > > > I think availability is sacrificed in the sense that if region server
> > > > fails clients will have data inaccessible for the time region comes
> up
> > on
> > > > some other server, not to confuse with data loss.
> > > >
> > > > Sent from my iPad
> > > >
> > > > On Aug 7, 2012, at 11:56 PM, Lin Ma <li...@gmail.com> wrote:
> > > >
> > > > > Thank you Wei!
> > > > >
> > > > > Two more comments,
> > > > >
> > > > > 1. How about Hadoop's CAP characters do you think about?
> > > > > 2. For your comments, if HBase implements "per key sequential
> > > > consistency",
> > > > > what are the missing characters for consistency? Cross-key update
> > > > > sequences? Could you show me an example about what you think are
> > > missed?
> > > > > thanks.
> > > > >
> > > > > regards,
> > > > > Lin
> > > > >
> > > > > On Wed, Aug 8, 2012 at 12:18 PM, Wei Tan <wt...@us.ibm.com> wrote:
> > > > >
> > > > >> Hi Lin,
> > > > >>
> > > > >> In the CAP theorem
> > > > >> Consistency stands for atomic consistency, i.e., each CRUD
> operation
> > > > >> occurs sequentially in a global, real-time clock
> > > > >> Availability means each server if not partitioned can accept
> > requests
> > > > >>
> > > > >> Partition means network partition
> > > > >>
> > > > >> As far as I understand (although I do not see any official
> > > > documentation),
> > > > >> HBase achieved "per key sequential consistency", i.e., for a
> > specific
> > > > key,
> > > > >> there is an agreed sequence, for all operations on it. This is
> > weaker
> > > > than
> > > > >> strong or sequential consistency, but stronger than "eventual
> > > > >> consistency".
> > > > >>
> > > > >> BTW: CAP was proposed by Prof. Eric Brewer...
> > > > >> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
> > > > >>
> > > > >> Best Regards,
> > > > >> Wei
> > > > >>
> > > > >> Wei Tan
> > > > >> Research Staff Member
> > > > >> IBM T. J. Watson Research Center
> > > > >> 19 Skyline Dr, Hawthorne, NY  10532
> > > > >> wtan@us.ibm.com; 914-784-6752
> > > > >>
> > > > >>
> > > > >>
> > > > >> From:   Lin Ma <li...@gmail.com>
> > > > >> To:    user@hbase.apache.org,
> > > > >> Date:   08/07/2012 09:30 PM
> > > > >> Subject:        consistency, availability and partition pattern of
> > > HBase
> > > > >>
> > > > >>
> > > > >>
> > > > >> Hello guys,
> > > > >>
> > > > >> According to the notes by Werner*, "*He presented the CAP theorem,
> > > which
> > > > >> states that of three properties of shared-data systems—data
> > > consistency,
> > > > >> system availability, and tolerance to network partition—only two
> can
> > > be
> > > > >> achieved at any given time." =>
> > > > >>
> > > http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
> > > > >>
> > > > >> But it seems HBase could achieve all of the 3 features at the same
> > > time.
> > > > >> Does it mean HBase breaks the rule by Werner. :-)
> > > > >>
> > > > >> If not, which one is sacrificed -- consistency (by using HDFS),
> > > > >> availability (by using Zookeeper) or partition (by using region /
> > > column
> > > > >> family) ? And why?
> > > > >>
> > > > >> regards,
> > > > >> Lin
> > > > >>
> > > > >>
> > > > >>
> > > >
> > >
> > >
> >
>

Re: consistency, availability and partition pattern of HBase

Posted by Amandeep Khurana <am...@gmail.com>.

Firstly, I recommend you read the GFS and Bigtable papers. That'll give you
a good understanding of the architecture. Ad hoc questions on the mailing
list won't.

I'll try to answer some of your questions briefly. Think of HBase as a
database layer over an underlying filesystem (the same way MySQL is over
ext2/3/4 etc). The filesystem for HBase in this case is HDFS. HDFS
replicates data for redundancy and fault tolerance. HBase has region
servers that serve the regions. Regions form tables. Region servers persist
their data on HDFS. Now, every region is served by one and only one region
server. So, HBase is not replicating anything. Replication is handled at
the storage layer. If a region server goes down, all its regions now need
to be served by some other region server. During this period of region
assignment, the clients experience degraded availability if they try to
interact with any of those regions.
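
To make the "one and only one region server per region" point concrete, here
is a minimal Python sketch of how a row key could be routed to its single
owner. It is an illustration under assumptions, not HBase's client code:
RegionMap, its methods, the split keys and the server names are all
hypothetical. Regions are modelled as sorted, non-overlapping key ranges with
an inclusive start key.

```python
import bisect


class RegionMap:
    """Maps a row key to the one server that owns its key range."""

    def __init__(self, split_keys, servers):
        # N sorted split keys define N + 1 contiguous regions.
        if list(split_keys) != sorted(split_keys):
            raise ValueError("split keys must be sorted")
        if len(servers) != len(split_keys) + 1:
            raise ValueError("need exactly one server per region")
        self.split_keys = list(split_keys)
        self.servers = list(servers)

    def region_index(self, row_key):
        # A key equal to a split key belongs to the region starting there.
        return bisect.bisect_right(self.split_keys, row_key)

    def server_for(self, row_key):
        return self.servers[self.region_index(row_key)]


regions = RegionMap(["g", "p"], ["rs1", "rs2", "rs3"])
for key in ["apple", "g", "mango", "pear"]:
    print(key, "->", regions.server_for(key))
```

Because every key maps to exactly one entry in the list, there is never a
second copy of a region to keep in sync at this layer; the cost is that the
whole key range depends on that one server being up.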

Coming back to CAP. HBase chooses to degrade availability in the face of
partitions. "Partition" is a very general term here and does not
necessarily mean network partitions. Any node falling off the HBase cluster
can be considered to be a partition. So, when failures happen, HBase
degrades availability but does not give up consistency. Consistency in this
context is sort of the equivalent of atomicity in ACID. In the context of
HBase, any copy of data that is written to HBase will be visible to all
clients. There is no concept of multiple different versions that the
clients need to reconcile between. When you read, you always get the same
version of the row you are reading. In other words, HBase is strongly
consistent.
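
A small Python sketch of the difference, under stated assumptions: both
classes are invented for illustration and neither is HBase code.
SingleOwnerStore models one server owning a row, so there is only one copy to
read. LazyReplicatedStore models a hypothetical design where several servers
each hold a copy and updates reach the secondary copies only when sync() is
called.

```python
class SingleOwnerStore:
    """One owner per row: every read goes to the only copy."""

    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


class LazyReplicatedStore:
    """Several copies; writes land on replica 0 and propagate on sync()."""

    def __init__(self, replica_count=2):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []

    def put(self, key, value):
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def get(self, key, replica):
        return self.replicas[replica].get(key)

    def sync(self):
        # Apply queued writes, in order, to every secondary replica.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()


single = SingleOwnerStore()
single.put("row1", "v1")
single.put("row1", "v2")
print(single.get("row1"))  # always the latest write

lazy = LazyReplicatedStore()
lazy.put("row1", "v1")
lazy.sync()
lazy.put("row1", "v2")
print(lazy.get("row1", replica=0), lazy.get("row1", replica=1))
```

In the second store, a reader that hits replica 1 before sync() still sees
the old value, which is the kind of divergence clients would have to
reconcile. With a single owner per row that situation cannot arise.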

Hope that clears things up a bit.

On Thu, Aug 9, 2012 at 8:02 AM, Lin Ma <li...@gmail.com> wrote:

> Thank you Lars.
>
> Is the same data store duplicated copy across region server? If so, if one
> primary server for the region dies, client just need to read from the
> secondary server for the same region. Why there is data is unavailable
> time?
>
> BTW: please feel free to correct me for any wrong knowledge about HBase.
>
> regards,
> Lin

Re: consistency, availability and partition pattern of HBase

Posted by Mohit Anchlia <mo...@gmail.com>.

On Wed, Aug 8, 2012 at 7:32 PM, Lin Ma <li...@gmail.com> wrote:

> Thank you Lars.
>
> Is the same data store duplicated copy across region server? If so, if one
> primary server for the region dies, client just need to read from the
> secondary server for the same region. Why there is data is unavailable
> time?
>
>
To get a better understanding of this, I suggest looking at how the WAL
(write-ahead log) is stored. A region server's WAL holds edits for multiple
regions in one log. Before a region can come alive on another region
server, the master needs to split the log by region so that the edits can
be replayed by the new region server. This process causes downtime for each
region whose edits are being replayed.
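
A rough sketch of that split-and-replay step, in Python. This is only an
illustration of the idea: the (sequence, region, row, value) tuple layout
and the function names split_wal and replay are assumptions made for the
example, not the real WAL format or HBase API.

```python
from collections import defaultdict


def split_wal(wal_entries):
    """Group one shared log into an ordered list of edits per region."""
    per_region = defaultdict(list)
    for seq, region, row, value in wal_entries:
        per_region[region].append((seq, row, value))
    for edits in per_region.values():
        edits.sort(key=lambda edit: edit[0])  # replay in sequence order
    return dict(per_region)


def replay(edits):
    """Rebuild a region's in-memory state by applying its edits in order."""
    state = {}
    for _seq, row, value in edits:
        state[row] = value
    return state


# One log from the failed server, with edits for two regions interleaved.
wal = [
    (1, "region-a", "row1", "v1"),
    (2, "region-b", "row9", "x"),
    (3, "region-a", "row1", "v2"),
    (4, "region-b", "row7", "y"),
]

split = split_wal(wal)
recovered = {region: replay(edits) for region, edits in split.items()}
print(recovered)
```

Until its own edits have been split out and replayed, a region cannot be
served, which is where the downtime comes from.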


> BTW: please feel free to correct me for any wrong knowledge about HBase.
>
> regards,
> Lin

Re: consistency, availability and partition pattern of HBase

Posted by Lin Ma <li...@gmail.com>.

Thank you Lars.

Is the same data stored as duplicate copies across region servers? If so,
when the primary server for a region dies, the client just needs to read
from the secondary server for the same region. Why is there any time when
the data is unavailable?

BTW: please feel free to correct any misunderstanding I have about HBase.

regards,
Lin

On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <lh...@yahoo.com> wrote:

> After a write completes the next read (regardless of the location it is
> issued from) will see the latest value.
> This is because at any given time exactly RegionServer is responsible for
> a specific Key
> (through assignment of key ranges to regions and regions to RegionServers).
>
>
> As Mohit said, the trade off is that data is unavailable if a RegionServer
> dies until another RegionServer picks up the regions (and by extension the
> key range)
>
> -- Lars

Re: consistency, availability and partition pattern of HBase

Posted by lars hofhansl <lh...@yahoo.com>.

After a write completes, the next read (regardless of the location it is issued from) will see the latest value.
This is because at any given time exactly one RegionServer is responsible for a specific key
(through assignment of key ranges to regions and regions to RegionServers).


As Mohit said, the trade-off is that data is unavailable if a RegionServer dies until another RegionServer picks up its regions (and by extension the key range).
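
As an illustration of the key -> region -> RegionServer mapping, here is a
small Python sketch. ToyRouter, the start keys and the server names are all
made up for the example; the real client looks regions up differently. The
point is only that any client resolving the same key lands on the same
single server.

```python
import bisect


class ToyRouter:
    """Toy lookup: sorted region start keys, one server per region."""

    def __init__(self, start_keys, servers):
        # Region i covers keys from start_keys[i] up to, but not including,
        # start_keys[i + 1]. The first start key is "" so every key is covered.
        if start_keys != sorted(start_keys) or start_keys[0] != "":
            raise ValueError("start keys must be sorted and begin with ''")
        if len(start_keys) != len(servers):
            raise ValueError("need exactly one server per region")
        self.start_keys = start_keys
        self.servers = servers

    def locate(self, row_key):
        """Return (region index, server) responsible for row_key."""
        idx = bisect.bisect_right(self.start_keys, row_key) - 1
        return idx, self.servers[idx]


router = ToyRouter(["", "g", "p"], ["rs1", "rs2", "rs1"])
for key in ["apple", "g", "zebra"]:
    print(key, router.locate(key))
```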

-- Lars


----- Original Message -----
From: Lin Ma <li...@gmail.com>
To: user@hbase.apache.org
Cc: 
Sent: Wednesday, August 8, 2012 8:47 AM
Subject: Re: consistency, availability and partition pattern of HBase

And consistency is not sacrificed? i.e. all distributed clients' update
will results in sequential / real time update? Once update is done by one
client, all other client could see results immediately?

regards,
Lin


Re: consistency, availability and partition pattern of HBase

Posted by Lin Ma <li...@gmail.com>.

And consistency is not sacrificed? I.e., will updates from all distributed
clients be applied sequentially, in real time? Once an update is done by one
client, can all other clients see the result immediately?

regards,
Lin

On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia <mo...@gmail.com>wrote:

> I think availability is sacrificed in the sense that if region server
> fails clients will have data inaccessible for the time region comes up on
> some other server, not to confuse with data loss.
>
> Sent from my iPad

Re: consistency, availability and partition pattern of HBase

Posted by Mohit Anchlia <mo...@gmail.com>.

I think availability is sacrificed in the sense that if a region server fails, its data is inaccessible to clients until the regions come up on some other server. This is not to be confused with data loss.

Sent from my iPad

On Aug 7, 2012, at 11:56 PM, Lin Ma <li...@gmail.com> wrote:

> Thank you Wei!
> 
> Two more comments,
> 
> 1. How about Hadoop's CAP characters do you think about?
> 2. For your comments, if HBase implements "per key sequential consistency",
> what are the missing characters for consistency? Cross-key update
> sequences? Could you show me an example about what you think are missed?
> thanks.
> 
> regards,
> Lin

Re: consistency, availability and partition pattern of HBase

Posted by Lin Ma <li...@gmail.com>.

Thank you Wei!

Two more questions,

1. What do you think about Hadoop's CAP characteristics?
2. Regarding your comment, if HBase implements "per key sequential
consistency", which consistency characteristics are missing? Cross-key
update sequences? Could you show me an example of what you think is missing?
Thanks.

regards,
Lin

On Wed, Aug 8, 2012 at 12:18 PM, Wei Tan <wt...@us.ibm.com> wrote:

> Hi Lin,
>
> In the CAP theorem
> Consistency stands for atomic consistency, i.e., each CRUD operation
> occurs sequentially in a global, real-time clock
> Availability means each server if not partitioned can accept requests
>
> Partition means network partition
>
> As far as I understand (although I do not see any official documentation),
> HBase achieved "per key sequential consistency", i.e., for a specific key,
> there is an agreed sequence, for all operations on it. This is weaker than
> strong or sequential consistency, but stronger than "eventual
> consistency".
>
> BTW: CAP was proposed by Prof. Eric Brewer...
> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
>
> Best Regards,
> Wei

Re: consistency, availability and partition pattern of HBase

Posted by J Mohamed Zahoor <jm...@gmail.com>.

Hi Lin,

I would suggest reading this for more clarity.

http://www.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/

./zahoor

On Wed, Aug 8, 2012 at 9:48 AM, Wei Tan <wt...@us.ibm.com> wrote:

> Hi Lin,
>
> In the CAP theorem
> Consistency stands for atomic consistency, i.e., each CRUD operation
> occurs sequentially in a global, real-time clock
> Availability means each server if not partitioned can accept requests
>
> Partition means network partition
>
> As far as I understand (although I do not see any official documentation),
> HBase achieved "per key sequential consistency", i.e., for a specific key,
> there is an agreed sequence, for all operations on it. This is weaker than
> strong or sequential consistency, but stronger than "eventual
> consistency".
>
> BTW: CAP was proposed by Prof. Eric Brewer...
> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
>
> Best Regards,
> Wei

Re: consistency, availability and partition pattern of HBase

Posted by Wei Tan <wt...@us.ibm.com>.

Hi Lin,

In the CAP theorem:
Consistency stands for atomic consistency, i.e., each CRUD operation
appears to occur sequentially on a global, real-time clock.
Availability means each server, if not partitioned, can accept requests.

Partition means network partition.

As far as I understand (although I do not see any official documentation),
HBase achieves "per key sequential consistency", i.e., for a specific key,
there is an agreed sequence for all operations on it. This is weaker than
strong or sequential consistency, but stronger than "eventual
consistency".
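
One possible way to picture "an agreed sequence per key" is the Python
sketch below. It is only my illustration of the term, with a made-up
PerKeyLog class, and not how HBase is implemented. Every key has its own
ordered history, and the sequence numbers of different keys are
independent, so the model says nothing about how a write to one key is
ordered relative to a write to another.

```python
from collections import defaultdict


class PerKeyLog:
    """Toy store that keeps one agreed, ordered history per key."""

    def __init__(self):
        self.history = defaultdict(list)  # key -> values in agreed order

    def write(self, key, value):
        """Append to the key's history; return its per-key sequence number."""
        self.history[key].append(value)
        return len(self.history[key])

    def read(self, key):
        """Return the latest value in the key's history, or None."""
        values = self.history.get(key)
        return values[-1] if values else None

    def order_of(self, key):
        """Return the agreed order of all writes to this key."""
        return list(self.history.get(key, []))


log = PerKeyLog()
print(log.write("x", 1))   # first operation on x
print(log.write("y", 10))  # first operation on y; its counter is separate
print(log.write("x", 2))   # second operation on x
print(log.order_of("x"), log.read("x"))
```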

BTW: CAP was proposed by Prof. Eric Brewer...
http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29

Best Regards,
Wei

Wei Tan 
Research Staff Member 
IBM T. J. Watson Research Center
19 Skyline Dr, Hawthorne, NY  10532
wtan@us.ibm.com; 914-784-6752



From:   Lin Ma <li...@gmail.com>
To:     user@hbase.apache.org, 
Date:   08/07/2012 09:30 PM
Subject:        consistency, availability and partition pattern of HBase



Hello guys,

According to the notes by Werner*, "*He presented the CAP theorem, which
states that of three properties of shared-data systems—data consistency,
system availability, and tolerance to network partition—only two can be
achieved at any given time." =>
http://www.allthingsdistributed.com/2008/12/eventually_consistent.html

But it seems HBase could achieve all of the 3 features at the same time.
Does it mean HBase breaks the rule by Werner. :-)

If not, which one is sacrificed -- consistency (by using HDFS),
availability (by using Zookeeper) or partition (by using region / column
family) ? And why?

regards,
Lin