Posted to user@hbase.apache.org by "Jim R. Wilson" <wi...@gmail.com> on 2008/05/07 20:12:57 UTC

hbase on ec2 with s3 anyone?

Hi all,

I'm about to embark on a mystical journey through hosted web-services
with my trusted friend hbase.  Here are some questions for my fellow
travelers:

1) Has anyone done this before? If so, what lifesaving tips can you offer?
2) Should I attempt to build an hdfs out of ec2 persistent storage, or
just use S3?
3) How many images will I need? Just one, or master/slave?
4) What version of hadoop/hbase should I use?  (The hadoop/ec2
instructions[1] seem to favor the unreleased 0.17, but there doesn't
seem to be a public image with 0.17 at the ready)

Thanks in advance for any advice, I'm gearing up for quite a trip :)

[1] http://wiki.apache.org/hadoop/AmazonEC2

-- Jim R. Wilson (jimbojw)

Re: Does HBase support single-row transaction?

Posted by Zhou Wei <zh...@mails.tsinghua.edu.cn>.
SongJing Zhang wrote:
> long lockId = table.startUpdate(new Text("myRow"));
> ...
> ...
> ....
> table.commit(lockId);  ||   table.abort(lockId);
>   
You're right, but I just wrote a simplified version of yours.

>
>   


Re: Does HBase support single-row transaction?

Posted by SongJing Zhang <zh...@gmail.com>.
long lockId = table.startUpdate(new Text("myRow"));
...
...
....
table.commit(lockId);  ||   table.abort(lockId);




On Thu, May 8, 2008 at 10:48 AM, Zhou Wei
<zh...@mails.tsinghua.edu.cn> wrote:
> Hi
>  Does HBase support single-row transaction as described in Bigtable paper?
>
>  "Bigtable supports single-row transactions, which can be
>  used to perform atomic read-modify-write sequences on
>  data stored under a single row key."  --Bigtable paper
>
>  If so, how can I define a transaction in HBase,
>  does it look like this:
>
>  lid=startUpdate
>  get(lid)
>  ..
>  put(lid)
>  ...
>  commit(lid)
>
>  Are these transactions isolated from each other?
>  If not, is there a way to achieve that?
>
>  Thanks
>
>  Zhou
>

Re: Does HBase support single-row transaction?

Posted by Clint Morgan <cl...@gmail.com>.
> "When the application creates an entity, it can assign another entity as the
> parent of the new entity. Assigning a parent to a new entity puts the new
> entity in the same entity group as the parent entity."
>
> I think I need to sign up for app engine and use it to see if I can figure
> how the above is done.

Was thinking this may be done with a row key prefix. So all members of
an entity group have the same prefix and are collocated. Then the
regions (or tablets or datastore nodes) must know not to split in the
middle of such a prefix.
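
A minimal sketch of that prefix idea (the key layout below is invented for
illustration; it is not an actual datastore or HBase convention):

public class EntityGroupKeys {
  // All entities in a group share the root entity's key as a row-key prefix,
  // so they sort next to each other and land in the same region, provided
  // the region never splits inside the prefix.
  static String groupRow(String rootEntityKey, String kind, String entityId) {
    return rootEntityKey + "/" + kind + "/" + entityId;
  }

  public static void main(String[] args) {
    System.out.println(groupRow("user123", "Order", "0001"));  // user123/Order/0001
    System.out.println(groupRow("user123", "Order", "0002"));  // same prefix, same group
  }
}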

Also, it would make sense that they have one table per app engine
user, and each table stores all the kinds (types) that the application
uses...

> We'd need to have HBASE-493 in place before building any kind of OCC.
I see the value of 493 for OCC with single-row transactions, but for
multi-row transactions I think it's not useful. Basically we would have
to hold off on all row puts if any relevant row has conflicts.
cheers,
-clint

Re: Does HBase support single-row transaction?

Posted by stack <st...@duboce.net>.
Clint Morgan wrote:
> So if we wrote all operations for a transaction first to ZooKeeper, we
> still need something like a Distributed Transaction Manager to
> orchestrate the commit process: Send BatchUpdates to each
> RegionServer, ask them to commit, then commit or rollback based on
> results from all participating RegionServers. 
Yes.

> Or is there some more
> clever way to use ZooKeeper? Maybe encoding a commit protocol into the
> Zookeeper nodes...
>
>   
This page has an interesting discussion of how you can build various 
cluster-wide primitives such as locks and two-phase commit using 
zookeeper: http://zookeeper.wiki.sourceforge.net/ZooKeeperRecipes.  
We would still need a transaction orchestrator of some sort.
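
A rough sketch of the lock recipe from that page, assuming the plain
ZooKeeper Java client and a pre-created lock directory (simplified to
polling instead of watches; a sketch only, not production code):

import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class RowLock {
  // Create an ephemeral sequential node under the lock directory and wait
  // (by polling here, for brevity) until ours has the lowest sequence number.
  public static String acquire(ZooKeeper zk, String lockDir) throws Exception {
    String myNode = zk.create(lockDir + "/lock-", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    while (true) {
      List<String> children = zk.getChildren(lockDir, false);
      Collections.sort(children);
      if (myNode.endsWith(children.get(0))) {
        return myNode;            // lowest sequence number owns the lock
      }
      Thread.sleep(50);           // the real recipe watches the next-lowest node
    }
  }

  public static void release(ZooKeeper zk, String myNode) throws Exception {
    zk.delete(myNode, -1);        // -1 = ignore node version
  }
}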

> Looks like google's datastore has a mechanism for keeping groups of
> rows (entity groups) together on the same server (datastore node).
>   
From 
http://code.google.com/appengine/docs/datastore/keysandentitygroups.html:

"When the application creates an entity, it can assign another entity as 
the parent of the new entity. Assigning a parent to a new entity puts 
the new entity in the same entity group as the parent entity."

I think I need to sign up for app engine and use it to see if I can 
figure how the above is done.
> Then they allow transactions only on rows in the same group. This way
> they don't have to worry about distributed transactions. Rather than
> locking, they use optimistic concurrency control. This means they do
> the transaction in a sandbox, then check for conflicts from other
> transactions before committing.
We'd need to have HBASE-493 in place before building any kind of OCC.

St.Ack



> -clint
>
> On Tue, May 27, 2008 at 2:13 PM, stack <st...@duboce.net> wrote:
>   
>> Clint Morgan wrote:
>>     
>>> Zookeeper makes good sense for distributed locking to get isolation.
>>> But we still need transaction start, commit, and rollback to get
>>> atomicity. I think this properly belongs in hbase.
>>>
>>>       
>> Since all clients are going via zookeeper anyways ('isolation'), maybe
>> it'd be better to just run the whole transaction management out of
>> zookeeper? Clients would open a transaction on zookeeper and put their
>> edits there so they were available for rollback and/or commit. If client
>> died midway, could ask zookeeper for outstanding transactions and pick up
>> wherever it'd left off. Otherwise, on success (or rollback), clean up
>> the transaction log.
>>
>> Alternatively, all clients would have to go via the hbase master so it
>> could orchestrate row access. Master would need to hold outstanding
>> transactions somewhere either in an in-memory transactions catalog table
>> or itself over in zookeeper.
>>
>> St.Ack
>>
>>     


Re: Does HBase support single-row transaction?

Posted by Clint Morgan <cl...@gmail.com>.
So if we wrote all operations for a transaction first to ZooKeeper, we
still need something like a Distributed Transaction Manager to
orchestrate the commit process: Send BatchUpdates to each
RegionServer, ask them to commit, then commit or rollback based on
results from all participating RegionServers. Or is there some more
clever way to use ZooKeeper? Maybe encoding a commit protocol into the
Zookeeper nodes...

Looks like google's datastore has a mechanism for keeping groups of
rows (entity groups) together on the same server (datastore node).
Then they allow transactions only on rows in the same group. This way
they don't have to worry about distributed transactions. Rather than
locking, they use optimistic concurrency control. This means they do
the transaction in a sandbox, then check for conflicts from other
transactions before committing.

-clint

On Tue, May 27, 2008 at 2:13 PM, stack <st...@duboce.net> wrote:
> Clint Morgan wrote:
>> Zookeeper makes good sense for distributed locking to get isolation.
>> But we still need transaction start, commit, and rollback to get
>> atomicity. I think this properly belongs in hbase.
>>
> Since all clients are going via zookeeper anyways ('isolation'), maybe
> it'd be better to just run the whole transaction management out of
> zookeeper? Clients would open a transaction on zookeeper and put their
> edits there so they were available for rollback and/or commit. If client
> died midway, could ask zookeeper for outstanding transactions and pick up
> wherever it'd left off. Otherwise, on success (or rollback), clean up
> the transaction log.
>
> Alternatively, all clients would have to go via the hbase master so it
> could orchestrate row access. Master would need to hold outstanding
> transactions somewhere either in an in-memory transactions catalog table
> or itself over in zookeeper.
>
> St.Ack
>

Re: Does HBase support single-row transaction?

Posted by stack <st...@duboce.net>.
Clint Morgan wrote:
> Zookeeper makes good sense for distributed locking to get isolation.
> But we still need transaction start, commit, and rollback to get
> atomicity. I think this properly belongs in hbase.
>   
Since all clients are going via zookeeper anyways ('isolation'), maybe
it'd be better to just run the whole transaction management out of
zookeeper? Clients would open a transaction on zookeeper and put their
edits there so they were available for rollback and/or commit. If client
died midway, could ask zookeeper for outstanding transactions and pick up
wherever it'd left off. Otherwise, on success (or rollback), clean up
the transaction log.

Alternatively, all clients would have to go via the hbase master so it
could orchestrate row access. Master would need to hold outstanding
transactions somewhere either in an in-memory transactions catalog table
or itself over in zookeeper.

St.Ack

Re: Does HBase support single-row transaction?

Posted by stack <st...@duboce.net>.
Clint Morgan wrote:
> Responses inline:
>
> 2008/5/27 Bryan Duxbury <br...@rapleaf.com>:
>   
>> It seems like if you wanted to do some manner of multi-row transactional
>> put, the only real way to manage it is with deletes. That is, if the first
>> put succeeds but the second fails, you can "invert" the first put into a
>> bunch of deletes.
>>     
>
> Yes, this is what I was thinking by using the timestamp/multiple
> versions. To roll back you delete everything you wrote and then we get
> back to the previous version. Alternatively you could save the
> original values before they are overwritten.
>   

Deletes would be the way to go I'd say (what to do if we can't insert 
the delete for the very reason the transaction is failing?).

We'd have to do a bit of work to support this case first though.  IIRC, 
deletes X-out cells of the same timestamp when getting, but when scanning, if 
we encounter a delete, it blocks being able to see what's behind the delete.

St.Ack

Re: Does HBase support single-row transaction?

Posted by Bryan Duxbury <br...@rapleaf.com>.
I see what you're saying. I need to think on this. Stack, care to  
weigh in?

-Bryan

On May 27, 2008, at 1:56 PM, Clint Morgan wrote:

> Responses inline:
>
> 2008/5/27 Bryan Duxbury <br...@rapleaf.com>:
>> It seems like if you wanted to do some manner of multi-row  
>> transactional
>> put, the only real way to manage it is with deletes. That is, if  
>> the first
>> put succeeds but the second fails, you can "invert" the first put  
>> into a
>> bunch of deletes.
>
> Yes, this is what I was thinking by using the timestamp/multiple
> versions. To roll back you delete everything you wrote and then we get
> back to the previous version. Alternatively you could save the
> original values before they are overwritten.
>
>> Trying to make the regions themselves maintain the transactional  
>> state seems
>> like a terrible idea. You'd have to not allow a region to get  
>> migrated to
>> another server if it's serving a transaction. This would introduce  
>> a lot of
>> potential performance problems, I think.
>
> I'm envisioning transactions being relatively short-lived: 100 ms to a
> few seconds. I don't see this getting in the way of eg region
> migration any more than scanners do. But maybe I'm missing something.
>
> So the transactional state for a region is (roughly) a transaction
> lease, and a collection of the corresponding BatchUpdates.
>
>> Can you help me understand why atomic transactions are needed?  
>> Can't the
>> atomicity problems be sort of resolved by the whole row versioning  
>> thing?
>
> Simply, we need to ensure that all updates happen together. Otherwise,
> the data is in an inconsistent state. Take the standard example of
> debiting one account and crediting another. If only one of these rows
> gets updated, then the resulting table is corrupted and will not make
> sense to the application. (Money has been created or destroyed)
>
> So that is why one needs atomicity: the application-level semantics  
> demand it.
>
> When we encounter an exception midway through the transaction, we can
> recover the old state of the modified row(s) by reverting to the
> previous version. So the question is who recognizes this and does the
> rollback? I'd like hbase to do it because it seems like a logical
> place to put the behavior. So if the client crashed halfway through
> the transaction, then when his transaction lease expires, hbase will
> revert the relevant BatchUpdates. And the integrity of our table is
> preserved!
>
>> Other databases that do transactions and rollbacks use versioning to
>> accomplish that, I think.
>
> I don't know much about this. But however other (R)DBMS implement it,
> it is provided as a primitive rather than implemented on top of
> underlying versioning functionality (by users). This way the database
> will maintain the consistency rather than the user having to recognize
> problems and revert the state itself.
>
> -clint


Re: Does HBase support single-row transaction?

Posted by Clint Morgan <cl...@gmail.com>.
Responses inline:

2008/5/27 Bryan Duxbury <br...@rapleaf.com>:
> It seems like if you wanted to do some manner of multi-row transactional
> put, the only real way to manage it is with deletes. That is, if the first
> put succeeds but the second fails, you can "invert" the first put into a
> bunch of deletes.

Yes, this is what I was thinking by using the timestamp/multiple
versions. To roll back you delete everything you wrote and then we get
back to the previous version. Alternatively you could save the
original values before they are overwritten.
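
As a sketch of that (the deleteAll call and the WrittenCell holder class are
assumptions about what a client-side rollback could look like, not something
verified against TRUNK):

// Fragment: all cells in the transaction are written at one agreed timestamp
// txTimestamp; rolling back deletes exactly that version of each cell, which
// exposes the previous version again.
void rollback(HTable table, List<WrittenCell> cells, long txTimestamp)
    throws IOException {
  for (WrittenCell cell : cells) {
    // assumed signature: delete only the version written by this transaction
    table.deleteAll(cell.row, cell.column, txTimestamp);
  }
}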

> Trying to make the regions themselves maintain the transactional state seems
> like a terrible idea. You'd have to not allow a region to get migrated to
> another server if it's serving a transaction. This would introduce a lot of
> potential performance problems, I think.

I'm envisioning transactions being relatively short-lived: 100 ms to a
few seconds. I don't see this getting in the way of eg region
migration any more than scanners do. But maybe I'm missing something.

So the transactional state for a region is (roughly) a transaction
lease, and a collection of the corresponding BatchUpdates.
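
Roughly, something like this (names are illustrative only, not actual HBase
classes):

// Fragment: per-region bookkeeping for one open transaction.
class RegionTransactionState {
  long transactionId;
  long leaseExpiresAtMillis;          // if the client disappears, the lease lapses
                                      // and the pending updates are discarded
  List<BatchUpdate> pendingUpdates;   // buffered here, applied only at commit
}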

> Can you help me understand why atomic transactions are needed? Can't the
> atomicity problems be sort of resolved by the whole row versioning thing?

Simply, we need to ensure that all updates happen together. Otherwise,
the data is in an inconsistent state. Take the standard example of
debiting one account and crediting another. If only one of these rows
gets updated, then the resulting table is corrupted and will not make
sense to the application. (Money has been created or destroyed)

So that is why one needs atomicity: the application-level semantics demand it.

When we encounter an exception midway through the transaction, we can
recover the old state of the modified row(s) by reverting to the
previous version. So the question is who recognizes this and does the
rollback? I'd like hbase to do it because it seems like a logical
place to put the behavior. So if the client crashed halfway through
the transaction, then when his transaction lease expires, hbase will
revert the relevant BatchUpdates. And the integrity of our table is
preserved!

> Other databases that do transactions and rollbacks use versioning to
> accomplish that, I think.

I don't know much about this. But however other (R)DBMS implement it,
it is provided as a primitive rather than implemented on top of
underlying versioning functionality (by users). This way the database
will maintain the consistency rather than the user having to recognize
problems and revert the state itself.

-clint

Re: Does HBase support single-row transaction?

Posted by Bryan Duxbury <br...@rapleaf.com>.
It seems like if you wanted to do some manner of multi-row  
transactional put, the only real way to manage it is with deletes.  
That is, if the first put succeeds but the second fails, you can  
"invert" the first put into a bunch of deletes.

Trying to make the regions themselves maintain the transactional  
state seems like a terrible idea. You'd have to not allow a region to  
get migrated to another server if it's serving a transaction. This  
would introduce a lot of potential performance problems, I think.

Can you help me understand why atomic transactions are needed? Can't  
the atomicity problems be sort of resolved by the whole row  
versioning thing? Other databases that do transactions and rollbacks  
use versioning to accomplish that, I think.

-Bryan

On May 27, 2008, at 12:29 PM, Clint Morgan wrote:

> Zookeeper makes good sense for distributed locking to get isolation.
> But we still need transaction start, commit, and rollback to get
> atomicity. I think this properly belongs in hbase.
>
> So suppose I want to read two rows, and then update them as an
> isolated, atomic action:
>
> try {
>   getZookeeperLock(table)
>   tranId = table.beginTransaction();
>   row1 = table.get() // Normal get, but isolated due to distributed  
> lock
>   row2 = table.get()
>   BatchUpdate b1 = new BatchUpdate(row1)
>   b1.put(...)
>   table.addUpdate(tranId, b1);
>   BatchUpdate b2 = new BatchUpdate(row2)
>   b2.put(...);
>   table.addUpdate(tranId, b2);
>   table.commit(tranId);
> } catch(Exception e) {
>   table.rollback(tranId);
> } finally {
>   releaseZookeeperLock(table)
> }
>
> So then on the hbase side we hold on to the batchUpdates until the
> table.commit is called. Then we roll through and apply the updates.
>
> I'm sure rollback()/commit() is tricky to implement, as the updates
> could be on different region servers, so we need a failure on one to
> trigger a rollback on others. We could use timestamp/old versions to
> implement rollback on batchUpdates we have already applied.
>
> Alternatively, this may all be implemented above hbase. The client
> keeps track of updates, and tries to roll back using timestamps.
> Problem here is that if the client dies midway through, we have half the
> transaction committed and lose atomicity/consistency.
>
> We will eventually want/need atomic transactions on hbase, so I'll
> look into this further. Any input would be appreciated. Would be
> interesting to know how/what google provides...
>
> cheers,
> -clint
>
>
> On Sun, May 11, 2008 at 7:48 AM, Bryan Duxbury <br...@rapleaf.com>  
> wrote:
>> Currently, it's not on our list of things to do. There are a  
>> number of
>> reasons why it would be better to use Zookeeper here than to try  
>> and build
>> it into HBase.
>>
>> That said, I think you could get everything you need if you tried  
>> Zookeeper,
>> using that to acquire locks on the row you need a transaction on.  
>> It's
>> supposedly very high performance and supports your use case  
>> precisely.
>>
>> -Bryan
>>
>> On May 10, 2008, at 11:52 PM, Zhou Wei wrote:
>>
>>> Bryan Duxbury wrote:
>>>>
>>>> startUpdate is deprecated in TRUNK. Also, it doesn't do what you  
>>>> are
>>>> thinking it does. Committing a BatchUpdate is atomic across the  
>>>> whole row,
>>>> however. There is currently no way to make a get and a commit  
>>>> transactional,
>>>> though there is an issue open for write-if-not-modified-since  
>>>> support. If
>>>> this is something you need we can talk about how it might be  
>>>> supported.
>>>
>>> Thanks for answering my questions.
>>>
>>> So currently HBase is not suitable for transactional web  
>>> applications.
>>> A simple counting transaction cannot work under concurrent access:
>>> transaction{
>>> get(x);
>>> x++;
>>> write(x);
>>> }
>>>
>>> In my opinion, "write-if-not-modified-since" support may not be  
>>> the best way to implement single-row transactions.
>>> Because if the write cannot be performed, the application has to try  
>>> again and again, or just return an error and leave the user to retry or abort.
>>> Probably locking, waiting, and scheduling at the region server might be
>>> preferable in this case.
>>> Is the single-row transaction feature currently on the roadmap of  
>>> HBase?
>>>
>>> Zhou
>>>>
>>>> -Bryan
>>>>
>>>> On May 7, 2008, at 7:48 PM, Zhou Wei wrote:
>>>>
>>>>> Hi
>>>>> Does HBase support single-row transaction as described in Bigtable
>>>>> paper?
>>>>>
>>>>> "Bigtable supports single-row transactions, which can be
>>>>> used to perform atomic read-modify-write sequences on
>>>>> data stored under a single row key." --Bigtable paper
>>>>>
>>>>> If so, how can I define a transaction in HBase,
>>>>> does it look like this:
>>>>>
>>>>> lid=startUpdate
>>>>> get(lid)
>>>>> ..
>>>>> put(lid)
>>>>> ...
>>>>> commit(lid)
>>>>>
>>>>> Are these transactions isolated from each other?
>>>>> If not, is there a way to achieve that?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Zhou
>>>>
>>>>
>>>>
>>>
>>
>>


Re: Does HBase support single-row transaction?

Posted by Clint Morgan <cl...@gmail.com>.
Zookeeper makes good sense for distributed locking to get isolation.
But we still need transaction start, commit, and rollback to get
atomicity. I think this properly belongs in hbase.

So suppose I want to read two rows, and then update them as an
isolated, atomic action:

try {
  getZookeeperLock(table)
  tranId = table.beginTransaction();
  row1 = table.get() // Normal get, but isolated due to distributed lock
  row2 = table.get()
  BatchUpdate b1 = new BatchUpdate(row1)
  b1.put(...)
  table.addUpdate(tranId, b1);
  BatchUpdate b2 = new BatchUpdate(row2)
  b2.put(...);
  table.addUpdate(tranId, b2);
  table.commit(tranId);
} catch(Exception e) {
  table.rollback(tranId);
} finally {
  releaseZookeeperLock(table)
}

So then on the hbase side we hold on to the batchUpdates until the
table.commit is called. Then we roll through and apply the updates.

I'm sure rollback()/commit() is tricky to implement, as the updates
could be on different region servers, so we need a failure on one to
trigger a rollback on others. We could use timestamp/old versions to
implement rollback on batchUpdates we have already applied.

Alternatively, this may all be implemented above hbase. The client
keeps track of updates, and tries to roll back using timestamps.
Problem here is that if the client dies midway through, we have half the
transaction committed and lose atomicity/consistency.

We will eventually want/need atomic transactions on hbase, so I'll
look into this further. Any input would be appreciated. Would be
interesting to know how/what google provides...

cheers,
-clint


On Sun, May 11, 2008 at 7:48 AM, Bryan Duxbury <br...@rapleaf.com> wrote:
> Currently, it's not on our list of things to do. There are a number of
> reasons why it would be better to use Zookeeper here than to try and build
> it into HBase.
>
> That said, I think you could get everything you need if you tried Zookeeper,
> using that to acquire locks on the row you need a transaction on. It's
> supposedly very high performance and supports your use case precisely.
>
> -Bryan
>
> On May 10, 2008, at 11:52 PM, Zhou Wei wrote:
>
>> Bryan Duxbury wrote:
>>>
>>> startUpdate is deprecated in TRUNK. Also, it doesn't do what you are
>>> thinking it does. Committing a BatchUpdate is atomic across the whole row,
>>> however. There is currently no way to make a get and a commit transactional,
>>> though there is an issue open for write-if-not-modified-since support. If
>>> this is something you need we can talk about how it might be supported.
>>
>> Thanks for answering my questions.
>>
>> So currently HBase is not suitable for transactional web applications.
>> A simple counting transaction cannot work under concurrent access:
>> transaction{
>> get(x);
>> x++;
>> write(x);
>> }
>>
>> In my opinion, "write-if-not-modified-since" support may not be the best
>> way to implement single-row transactions.
>> Because if the write cannot be performed, the application has to try again and
>> again, or just return an error and leave the user to retry or abort.
>> Probably locking, waiting, and scheduling at the region server might be
>> preferable in this case.
>> Is the single-row transaction feature currently on the roadmap of HBase?
>>
>> Zhou
>>>
>>> -Bryan
>>>
>>> On May 7, 2008, at 7:48 PM, Zhou Wei wrote:
>>>
>>>> Hi
>>>> Does HBase support single-row transaction as described in Bigtable
>>>> paper?
>>>>
>>>> "Bigtable supports single-row transactions, which can be
>>>> used to perform atomic read-modify-write sequences on
>>>> data stored under a single row key." --Bigtable paper
>>>>
>>>> If so, how can I define a transaction in HBase,
>>>> does it look like this:
>>>>
>>>> lid=startUpdate
>>>> get(lid)
>>>> ..
>>>> put(lid)
>>>> ...
>>>> commit(lid)
>>>>
>>>> Are these transactions isolated from each other?
>>>> If not, is there a way to achieve that?
>>>>
>>>> Thanks
>>>>
>>>> Zhou
>>>
>>>
>>>
>>
>
>

Re: Does HBase support single-row transaction?

Posted by Zhou Wei <zh...@mails.tsinghua.edu.cn>.
Bryan Duxbury wrote:
> Currently, it's not on our list of things to do. There are a number of 
> reasons why it would be better to use Zookeeper here than to try and 
> build it into HBase.
>
> That said, I think you could get everything you need if you tried 
> Zookeeper, using that to acquire locks on the row you need a 
> transaction on. It's supposedly very high performance and supports 
> your use case precisely.
>
> -Bryan

Thanks.
>
> On May 10, 2008, at 11:52 PM, Zhou Wei wrote:
>
>>> Bryan Duxbury wrote:
>>> startUpdate is deprecated in TRUNK. Also, it doesn't do what you are 
>>> thinking it does. Committing a BatchUpdate is atomic across the 
>>> whole row, however. There is currently no way to make a get and a 
>>> commit transactional, though there is an issue open for 
>>> write-if-not-modified-since support. If this is something you need 
>>> we can talk about how it might be supported.
>
>
>
>



Re: Does HBase support single-row transaction?

Posted by Bryan Duxbury <br...@rapleaf.com>.
Currently, it's not on our list of things to do. There are a number  
of reasons why it would be better to use Zookeeper here than to try  
and build it into HBase.

That said, I think you could get everything you need if you tried  
Zookeeper, using that to acquire locks on the row you need a  
transaction on. It's supposedly very high performance and supports  
your use case precisely.

-Bryan

On May 10, 2008, at 11:52 PM, Zhou Wei wrote:

> Bryan Duxbury wrote:
>> startUpdate is deprecated in TRUNK. Also, it doesn't do what you  
>> are thinking it does. Committing a BatchUpdate is atomic across  
>> the whole row, however. There is currently no way to make a get  
>> and a commit transactional, though there is an issue open for  
>> write-if-not-modified-since support. If this is something you need  
>> we can talk about how it might be supported.
> Thanks for answering my questions.
>
> So currently HBase is not suitable for transactional web applications.
> A simple counting transaction cannot work under concurrent access:
> transaction{
> get(x);
> x++;
> write(x);
> }
>
> In my opinion, "write-if-not-modified-since" support may not be the  
> best way to implement single-row transactions.
> Because if the write cannot be performed, the application has to try again  
> and again, or just return an error and leave the user to retry or  
> abort.
> Probably locking, waiting, and scheduling at the region server might be  
> preferable in this case.
> Is the single-row transaction feature currently on the roadmap of  
> HBase?
>
> Zhou
>>
>> -Bryan
>>
>> On May 7, 2008, at 7:48 PM, Zhou Wei wrote:
>>
>>> Hi
>>> Does HBase support single-row transaction as described in  
>>> Bigtable paper?
>>>
>>> "Bigtable supports single-row transactions, which can be
>>> used to perform atomic read-modify-write sequences on
>>> data stored under a single row key." --Bigtable paper
>>>
>>> If so, how can I define a transaction in HBase,
>>> does it look like this:
>>>
>>> lid=startUpdate
>>> get(lid)
>>> ..
>>> put(lid)
>>> ...
>>> commit(lid)
>>>
>>> Are these transactions isolated from each other?
>>> If not, is there a way to achieve that?
>>>
>>> Thanks
>>>
>>> Zhou
>>
>>
>>
>


Re: Does HBase support single-row transaction?

Posted by Zhou Wei <zh...@mails.tsinghua.edu.cn>.
Bryan Duxbury wrote:
> startUpdate is deprecated in TRUNK. Also, it doesn't do what you are 
> thinking it does. Committing a BatchUpdate is atomic across the whole 
> row, however. There is currently no way to make a get and a commit 
> transactional, though there is an issue open for 
> write-if-not-modified-since support. If this is something you need we 
> can talk about how it might be supported.
Thanks for answering my questions.

So currently HBase is not suitable for transactional web applications.
A simple counting transaction cannot work under concurrent access:
transaction{
get(x);
x++;
write(x);
}

In my opinion, "write-if-not-modified-since" support may not be the best 
way to implement single-row transactions.
Because if the write cannot be performed, the application has to try again and 
again, or just return an error and leave the user to retry or abort.
Probably locking, waiting, and scheduling at the region server might be 
preferable in this case.
Is the single-row transaction feature currently on the roadmap of HBase?
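
For what it's worth, the retry loop that a write-if-not-modified-since
primitive forces on the client would look roughly like this (every method
name below is invented for illustration; none of this exists in HBase today):

// Invented API: optimistic increment that re-reads and retries until the
// conditional write goes through.
boolean done = false;
while (!done) {
  VersionedValue current = table.getVersioned(row, column);  // value + timestamp
  long x = decodeLong(current.value());
  done = table.writeIfNotModifiedSince(row, column, current.timestamp(),
                                       encodeLong(x + 1));
}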

Zhou
>
> -Bryan
>
> On May 7, 2008, at 7:48 PM, Zhou Wei wrote:
>
>> Hi
>> Does HBase support single-row transaction as described in Bigtable 
>> paper?
>>
>> "Bigtable supports single-row transactions, which can be
>> used to perform atomic read-modify-write sequences on
>> data stored under a single row key." --Bigtable paper
>>
>> If so, how can I define a transaction in HBase,
>> does it look like this:
>>
>> lid=startUpdate
>> get(lid)
>> ..
>> put(lid)
>> ...
>> commit(lid)
>>
>> Are these transactions isolated from each other?
>> If not, is there a way to achieve that?
>>
>> Thanks
>>
>> Zhou
>
>
>


Re: Does HBase support single-row transaction?

Posted by Bryan Duxbury <br...@rapleaf.com>.
startUpdate is deprecated in TRUNK. Also, it doesn't do what you are  
thinking it does. Committing a BatchUpdate is atomic across the whole  
row, however. There is currently no way to make a get and a commit  
transactional, though there is an issue open for write-if-not- 
modified-since support. If this is something you need we can talk  
about how it might be supported.
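
For reference, the row-atomic TRUNK-style update looks roughly like this
(method names are approximate and worth checking against the current API):

// Both puts are committed together in one BatchUpdate against a single row,
// so the row is updated atomically.
BatchUpdate update = new BatchUpdate("myRow");
update.put("account:balance", "100".getBytes());
update.put("account:modified", Long.toString(System.currentTimeMillis()).getBytes());
table.commit(update);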

-Bryan

On May 7, 2008, at 7:48 PM, Zhou Wei wrote:

> Hi
> Does HBase support single-row transaction as described in Bigtable  
> paper?
>
> "Bigtable supports single-row transactions, which can be
> used to perform atomic read-modify-write sequences on
> data stored under a single row key."  --Bigtable paper
>
> If so, how can I define a transaction in HBase,
> does it look like this:
>
> lid=startUpdate
> get(lid)
> ..
> put(lid)
> ...
> commit(lid)
>
> Are these transactions isolated from each other?
> If not, is there a way to achieve that?
>
> Thanks
>
> Zhou


Does HBase support single-row transaction?

Posted by Zhou Wei <zh...@mails.tsinghua.edu.cn>.
Hi
Does HBase support single-row transaction as described in Bigtable paper?

"Bigtable supports single-row transactions, which can be
used to perform atomic read-modify-write sequences on
data stored under a single row key."  --Bigtable paper

If so, how can I define a transaction in HBase,
does it look like this:

lid=startUpdate
get(lid)
..
put(lid)
...
commit(lid)

Are these transactions isolated from each other?
If not, is there a way to achieve that?

Thanks

Zhou

Re: hbase on ec2 with s3 anyone?

Posted by stack <st...@duboce.net>.
HBase 0.1.2 is pegged against 0.16.3, not hadoop 0.17.0.  I don't think 
the two will work together.

Also, be sure to pick up the new 0.1.2 candidate (trying to put it up 
now).  Many improvements over the first candidate.

St.Ack


Jim R. Wilson wrote:
>>  you would need to build new images anyway since you need HBase installed
>> and started at boot time.
>>     
>
> Right-o.  I'll be setting up a local dev environment to build fresh
> fedora-8 AMI's with hadoop-0.17-pre and hbase-0.1.2-pre starting in
> the morning.
>
> Of course, I'll probably be creating an ec2 contrib dir in hbase with
> hbase-init script etc.  I'm tempted to try and make my image
> multi-versioned so that it has both the latest released hadoop/hbase
> as well as pre-release versions, then leave it up to user-data params
> to decide which will be started.
>
> The idea of course being that running the latest bleeding edge
> hbase/hadoop or the latest stable release could be done with the same image.
>
> -- Jim
>
> On Wed, May 7, 2008 at 5:22 PM, Chris K Wensel <ch...@wensel.net> wrote:
>   
>> sorry if I wasn't clear that the new scripts and old scripts were not
>> compatible. thus their being in 0.17.0, not 0.16.4.
>>
>>  you would need to build new images anyway since you need HBase installed
>> and started at boot time.
>>
>>
>>
>>
>>  On May 7, 2008, at 3:15 PM, Jim R. Wilson wrote:
>>
>>
>>     
>>> I've come to the conclusion that using the contrib/ec2 scripts from
>>> hadoop 0.17 is incompatible with the prebuilt hadoop-0.16.1 image
>>> currently available in the hadoop-ec2-images bucket (ami-461df82f
>>> hadoop-ec2-images/hadoop-0.16.1.manifest.xml to be precise).
>>>
>>> The problem is that the user-data passed in by 0.17 has a different
>>> format than what is expected by the hadoop-init script packaged with
>>> 0.16.1.  Specifically, 0.17's user-data is meant to be a comma
>>> delimited list of bash var settings of the form KEY=VAL, whereas
>>> 0.16.x seems to expect just a comma delimited list of values whose
>>> keys are known by their ordinal placement (that is, the first value is
>>> the number of instances, the second value is the name of the master
>>> node).
>>>
>>> So now I'm back to the idea that I'm going to have to build myself an
>>> ec2 AMI with hadoop 0.17 from "scratch" (using the create-instance
>>> scripts of course).  This isn't /too/ much more work than I'd have to
>>> do anyway.  I plan on running hbase on my cluster as well as python
>>> hadoop-streaming jobs which was going to require other libraries (like
>>> SQLAlchemy and Thrift).  These items were going to necessitate
>>> creating my own images anyway :/
>>>
>>> -- Jim
>>>
>>> On Wed, May 7, 2008 at 3:51 PM, Jim R. Wilson <wi...@gmail.com> wrote:
>>>>> keep the questions coming. will be glad to see HBase running on ec2, maybe
>>>>> we can put your changes back into the tree.
>>>>>
>>>>>           
>>>> Thanks :)
>>>>
>>>> Just prior to this excursion, I found some things needing tweaking to
>>>> work with my hosting provider.  Now that I'm moving to ec2, I'll be on
>>>> the lookout for similar issues.  I'll submit any patches I end up
>>>> making.
>>>>
>>>> -- Jim
>>>>
>>>>
>>>>
>>>> On Wed, May 7, 2008 at 3:39 PM, Chris K Wensel <ch...@wensel.net> wrote:
>>>>
>>>>         
>>>>>           
>>>>>> In the aforementioned EC2 wiki page, it has a configuration section
>>>>>> labeled "(Pre 0.17) Hadoop cluster variables (GROUP, MASTER_HOST,
>>>>>> NO_INSTANCES)".  I had assumed that I would actually need hadoop-0.17
>>>>>> on my ec2 instances in order to forgo those instructions.  It sounds
>>>>>> like you're telling me that all the "pre 0.17" or "0.17" specific
>>>>>> instructions in the wiki page refer only to the ec2 creation scripts,
>>>>>> not the actual running version in the cluster, is that correct?
>>>>>> not the actual running version in the cluster, is that correct?
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>> correct.
>>>>>
>>>>> the only reason these scripts are in the 0.17 branch is because they are
>>>>> not backward compatible with themselves.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>>> Also, how safe is it to run a different version of the ec2 scripts
>>>>>> from the actual running hadoop instance?  I'm guessing it's pretty
>>>>>> safe since you suggested it :)
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>> there are no dependencies between the EC2 scripts and Hadoop core. you can
>>>>> use them with any version, as long as you build EC2 Images for the versions
>>>>> of Hadoop you are after with the 'new' scripts.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>>> Thanks again for all the help - still wrapping my mind around this stuff.
>>>>>>
>>>>>>             
>>>>> keep the questions coming. will be glad to see HBase running on ec2, maybe
>>>>> we can put your changes back into the tree.
>>>>>
>>>>>
>>>>>
>>>>> Chris K Wensel
>>>>> chris@wensel.net
>>>>> http://chris.wensel.net/
>>>>> http://www.cascading.org/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>         
>>  Chris K Wensel
>>  chris@wensel.net
>>  http://chris.wensel.net/
>>  http://www.cascading.org/
>>
>>
>>
>>
>>
>>     


Re: hbase on ec2 with s3 anyone?

Posted by "Jim R. Wilson" <wi...@gmail.com>.
>  you would need to build new images anyway since you need HBase installed
> and started at boot time.

Right-o.  I'll be setting up a local dev environment to build fresh
fedora-8 AMI's with hadoop-0.17-pre and hbase-0.1.2-pre starting in
the morning.

Of course, I'll probably be creating an ec2 contrib dir in hbase with
hbase-init script etc.  I'm tempted to try and make my image
multi-versioned so that it has both the latest released hadoop/hbase
as well as pre-release versions, then leave it up to user-data params
to decide which will be started.

The idea of course being that running the latest bleeding edge
hbase/hadoop or the latest stable release could be done with the same image.

-- Jim

On Wed, May 7, 2008 at 5:22 PM, Chris K Wensel <ch...@wensel.net> wrote:
> sorry if I wasn't clear that the new scripts and old scripts were not
> compatible. thus their being in 0.17.0, not 0.16.4.
>
>  you would need to build new images anyway since you need HBase installed
> and started at boot time.
>
>
>
>
>  On May 7, 2008, at 3:15 PM, Jim R. Wilson wrote:
>
>
> > I've come to the conclusion that using the contrib/ec2 scripts from
> > hadoop 0.17 is incompatible with the prebuilt hadoop-0.16.1 image
> > currently available in the hadoop-ec2-images bucket (ami-461df82f
> > hadoop-ec2-images/hadoop-0.16.1.manifest.xml to be precise).
> >
> > The problem is that the user-data passed in by 0.17 has a different
> > format than what is expected by the hadoop-init script packaged with
> > 0.16.1.  Specifically, 0.17's user-data is meant to be a comma
> > delimited list of bash var settings of the form KEY=VAL, whereas
> > 0.16.x seems to expect just a comma delimited list of values whose
> > keys are known by their ordinal placement (that is, the first value is
> > the number of instances, the second value is the name of the master
> > node).
> >
> > So now I'm back to the idea that I'm going to have to build myself an
> > ec2 AMI with hadoop 0.17 from "scratch" (using the create-instance
> > scripts of course).  This isn't /too/ much more work than I'd have to
> > do anyway.  I plan on running hbase on my cluster as well as python
> > hadoop-streaming jobs which was going to require other libraries (like
> > SQLAlchemy and Thrift).  These items were going to necessitate
> > creating my own images anyway :/
> >
> > -- Jim
> >
> > On Wed, May 7, 2008 at 3:51 PM, Jim R. Wilson <wi...@gmail.com> wrote:
> >
> > >
> > > > keep the questions coming. will be glad to see HBase running on ec2, maybe
> > > > we can put your changes back into the tree.
> > > >
> > >
> > > Thanks :)
> > >
> > > Just prior to this excursion, I found some things needing tweaking to
> > > work with my hosting provider.  Now that I'm moving to ec2, I'll be on
> > > the lookout for similar issues.  I'll submit any patches I end up
> > > making.
> > >
> > > -- Jim
> > >
> > >
> > >
> > > On Wed, May 7, 2008 at 3:39 PM, Chris K Wensel <ch...@wensel.net> wrote:
> > >
> > > >
> > > >
> > > > > In the aforementioned EC2 wiki page, it has a configuration section
> > > > > labeled "(Pre 0.17) Hadoop cluster variables (GROUP, MASTER_HOST,
> > > > > NO_INSTANCES)".  I had assumed that I would actually need hadoop-0.17
> > > > > on my ec2 instances in order to forgo those instructions.  It sounds
> > > > > like you're telling me that all the "pre 0.17" or "0.17" specific
> > > > > instructions in the wiki page refer only to the ec2 creation scripts,
> > > > > not the actual running version in the cluster, is that correct?
> > > > >
> > > > >
> > > > >
> > > >
> > > > correct.
> > > >
> > > > the only reason these scripts are in the 0.17 branch is because they are
> > > > not backward compatible with themselves.
> > > >
> > > >
> > > >
> > > >
> > > > > Also, how safe is it to run a different version of the ec2 scripts
> > > > > from the actual running hadoop instance?  I'm guessing it's pretty
> > > > > safe since you suggested it :)
> > > > >
> > > > >
> > > > >
> > > >
> > > > there are no dependencies between the EC2 scripts and Hadoop core. you can
> > > > use them with any version, as long as you build EC2 Images for the versions
> > > > of Hadoop you are after with the 'new' scripts.
> > > >
> > > >
> > > >
> > > >
> > > > > Thanks again for all the help - still wrapping my mind around this stuff.
> > > > >
> > > > >
> > > > >
> > > >
> > > > keep the questions coming. will be glad to see HBase running on ec2, maybe
> > > > we can put your changes back into the tree.
> > > >
> > > >
> > > >
> > > > Chris K Wensel
> > > > chris@wensel.net
> > > > http://chris.wensel.net/
> > > > http://www.cascading.org/
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
>
>  Chris K Wensel
>  chris@wensel.net
>  http://chris.wensel.net/
>  http://www.cascading.org/
>
>
>
>
>

Re: hbase on ec2 with s3 anyone?

Posted by Chris K Wensel <ch...@wensel.net>.
sorry if I wasn't clear that the new scripts and old scripts were not  
compatible. thus their being in 0.17.0, not 0.16.4.

you would need to build new images anyway since you need HBase  
installed and started at boot time.


On May 7, 2008, at 3:15 PM, Jim R. Wilson wrote:

> I've come to the conclusion that using the contrib/ec2 scripts from
> hadoop 0.17 is incompatible with the prebuilt hadoop-0.16.1 image
> currently available in the hadoop-ec2-images bucket (ami-461df82f
> hadoop-ec2-images/hadoop-0.16.1.manifest.xml to be precise).
>
> The problem is that the user-data passed in by 0.17 has a different
> format than what is expected by the hadoop-init script packaged with
> 0.16.1.  Specifically, 0.17's user-data is meant to be a comma
> delimited list of bash var settings of the form KEY=VAL, whereas
> 0.16.x seems to expect just a comma delimited list of values whose
> keys are known by their ordinal placement (that is, the first value is
> the number of instances, the second value is the name of the master
> node).
>
> So now I'm back to the idea that I'm going to have to build myself an
> ec2 AMI with hadoop 0.17 from "scratch" (using the create-instance
> scripts of course).  This isn't /too/ much more work than I'd have to
> do anyway.  I plan on running hbase on my cluster as well as python
> hadoop-streaming jobs which was going to require other libraries (like
> SQLAlchemy and Thrift).  These items were going to necessitate
> creating my own images anyway :/
>
> -- Jim
>
> On Wed, May 7, 2008 at 3:51 PM, Jim R. Wilson  
> <wi...@gmail.com> wrote:
>>> keep the questions coming. will be glad to see HBase running on  
>>> ec2, maybe
>>> we can put your changes back into the tree.
>>
>> Thanks :)
>>
>> Just prior to this excursion, I found some things needing tweaking to
>> work with my hosting provider.  Now that I'm moving to ec2, I'll be  
>> on
>> the lookout for similar issues.  I'll submit any patches I end up
>> making.
>>
>> -- Jim
>>
>>
>>
>> On Wed, May 7, 2008 at 3:39 PM, Chris K Wensel <ch...@wensel.net>  
>> wrote:
>>>
>>>> In the aforementioned EC2 wiki page, it has a configuration section
>>>> labeled "(Pre 0.17) Hadoop cluster variables (GROUP, MASTER_HOST,
>>>> NO_INSTANCES)".  I had assumed that I would actually need  
>>>> hadoop-0.17
>>>> on my ec2 instances in order to forgo those instructions.  It  
>>>> sounds
>>>> like you're telling me that all the "pre 0.17" or "0.17" specific
>>>> instructions in the wiki page refer only to the ec2 creation  
>>>> scripts,
>>>> not the actual running version in the cluster, is that correct?
>>>>
>>>>
>>>
>>> correct.
>>>
>>> the only reason these scripts are in the 0.17 branch is because  
>>> they are
>>> not backward compatible with themselves.
>>>
>>>
>>>
>>>> Also, how safe is it to run a different version of the ec2 scripts
>>>> from the actual running hadoop instance?  I'm guessing it's pretty
>>>> safe since you suggested it :)
>>>>
>>>>
>>>
>>> there are no dependencies between the EC2 scripts and Hadoop core.  
>>> you can
>>> use them with any version, as long as you build EC2 Images for the  
>>> versions
>>> of Hadoop you are after with the 'new' scripts.
>>>
>>>
>>>
>>>> Thanks again for all the help - still wrapping my mind around  
>>>> this stuff.
>>>>
>>>>
>>>
>>> keep the questions coming. will be glad to see HBase running on  
>>> ec2, maybe
>>> we can put your changes back into the tree.
>>>
>>>
>>>
>>> Chris K Wensel
>>> chris@wensel.net
>>> http://chris.wensel.net/
>>> http://www.cascading.org/
>>>
>>>
>>>
>>>
>>>
>>

Chris K Wensel
chris@wensel.net
http://chris.wensel.net/
http://www.cascading.org/





Re: hbase on ec2 with s3 anyone?

Posted by "Jim R. Wilson" <wi...@gmail.com>.
I've come to the conclusion that using the contrib/ec2 scripts from
hadoop 0.17 is incompatible with the prebuilt hadoop-0.16.1 image
currently available in the hadoop-ec2-images bucket (ami-461df82f
hadoop-ec2-images/hadoop-0.16.1.manifest.xml to be precise).

The problem is that the user-data passed in by 0.17 has a different
format than what is expected by the hadoop-init script packaged with
0.16.1.  Specifically, 0.17's user-data is meant to be a comma
delimited list of bash var settings of the form KEY=VAL, whereas
0.16.x seems to expect just a comma delimited list of values whose
keys are known by their ordinal placement (that is, the first value is
the number of instances, the second value is the name of the master
node).
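
To make the difference concrete (the values below are invented for
illustration only): a 0.17-style user-data string is a comma-delimited list
of KEY=VAL settings, roughly

MASTER_HOST=ec2-12-34-56-78.compute-1.amazonaws.com,NO_INSTANCES=5

while the 0.16.x hadoop-init expects the same information positionally, roughly

5,ec2-12-34-56-78.compute-1.amazonaws.com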

So now I'm back to the idea that I'm going to have to build myself an
ec2 AMI with hadoop 0.17 from "scratch" (using the create-instance
scripts of course).  This isn't /too/ much more work than I'd have to
do anyway.  I plan on running hbase on my cluster as well as python
hadoop-streaming jobs which was going to require other libraries (like
SQLAlchemy and Thrift).  These items were going to necessitate
creating my own images anyway :/

-- Jim

On Wed, May 7, 2008 at 3:51 PM, Jim R. Wilson <wi...@gmail.com> wrote:
> >  keep the questions coming. will be glad to see HBase running on ec2, maybe
>  > we can put your changes back into the tree.
>
>  Thanks :)
>
>  Just prior to this excursion, I found some things needing tweaking to
>  work with my hosting provider.  Now that I'm moving to ec2, I'll be on
>  the lookout for similar issues.  I'll submit any patches I end up
>  making.
>
>  -- Jim
>
>
>
>  On Wed, May 7, 2008 at 3:39 PM, Chris K Wensel <ch...@wensel.net> wrote:
>  >
>  > > In the aforementioned EC2 wiki page, it has a configuration section
>  > > labeled "(Pre 0.17) Hadoop cluster variables (GROUP, MASTER_HOST,
>  > > NO_INSTANCES)".  I had assumed that I would actually need hadoop-0.17
>  > > on my ec2 instances in order to forgo those instructions.  It sounds
>  > > like you're telling me that all the "pre 0.17" or "0.17" specific
>  > > instructions in the wiki page refer only to the ec2 creation scripts,
>  > > not the actual running version in the cluster, is that correct?
>  > >
>  > >
>  >
>  >  correct.
>  >
>  >  the only reason these scripts are in the 0.17 branch is because they are
>  > not backward compatible with themselves.
>  >
>  >
>  >
>  > > Also, how safe is it to run a different version of the ec2 scripts
>  > > from the actual running hadoop instance?  I'm guessing it's pretty
>  > > safe since you suggested it :)
>  > >
>  > >
>  >
>  >  there are no dependencies between the EC2 scripts and Hadoop core. you can
>  > use them with any version, as long as you build EC2 Images for the versions
>  > of Hadoop you are after with the 'new' scripts.
>  >
>  >
>  >
>  > > Thanks again for all the help - still wrapping my mind around this stuff.
>  > >
>  > >
>  >
>  >  keep the questions coming. will be glad to see HBase running on ec2, maybe
>  > we can put your changes back into the tree.
>  >
>  >
>  >
>  >  Chris K Wensel
>  >  chris@wensel.net
>  >  http://chris.wensel.net/
>  >  http://www.cascading.org/
>  >
>  >
>  >
>  >
>  >
>

Re: hbase on ec2 with s3 anyone?

Posted by "Jim R. Wilson" <wi...@gmail.com>.
>  keep the questions coming. will be glad to see HBase running on ec2, maybe
> we can put your changes back into the tree.

Thanks :)

Just prior to this excursion, I found some things needing tweaking to
work with my hosting provider.  Now that I'm moving to ec2, I'll be on
the lookout for similar issues.  I'll submit any patches I end up
making.

-- Jim

On Wed, May 7, 2008 at 3:39 PM, Chris K Wensel <ch...@wensel.net> wrote:
>
> > In the aforementioned EC2 wiki page, it has a configuration section
> > labeled "(Pre 0.17) Hadoop cluster variables (GROUP, MASTER_HOST,
> > NO_INSTANCES)".  I had assumed that I would actually need hadoop-0.17
> > on my ec2 instances in order to forgo those instructions.  It sounds
> > like you're telling me that all the "pre 0.17" or "0.17" specific
> > instructions in the wiki page refer only to the ec2 creation scripts,
> > not the actual running version in the cluster, is that correct?
> >
> >
>
>  correct.
>
>  the only reason these scripts are in the 0.17 branch is because they are
> not backward compatible with themselves.
>
>
>
> > Also, how safe is it to run a different version of the ec2 scripts
> > from the actual running hadoop instance?  I'm guessing it's pretty
> > safe since you suggested it :)
> >
> >
>
>  there are no dependencies between the EC2 scripts and Hadoop core. you can
> use them with any version, as long as you build EC2 Images for the versions
> of Hadoop you are after with the 'new' scripts.
>
>
>
> > Thanks again for all the help - still wrapping my mind around this stuff.
> >
> >
>
>  keep the questions coming. will be glad to see HBase running on ec2, maybe
> we can put your changes back into the tree.
>
>
>
>  Chris K Wensel
>  chris@wensel.net
>  http://chris.wensel.net/
>  http://www.cascading.org/
>
>
>
>
>

Re: hbase on ec2 with s3 anyone?

Posted by Chris K Wensel <ch...@wensel.net>.
> In the aforementioned EC2 wiki page, it has a configuration section
> labeled "(Pre 0.17) Hadoop cluster variables (GROUP, MASTER_HOST,
> NO_INSTANCES)".  I had assumed that I would actually need hadoop-0.17
> on my ec2 instances in order to forgo those instructions.  It sounds
> like you're telling me that all the "pre 0.17" or "0.17" specific
> instructions in the wiki page refer only to the ec2 creation scripts,
> not the actual running version in the cluster, is that correct?
>

correct.

the only reason these scripts are in the 0.17 branch is because they  
are not backward compatible with themselves.

> Also, how safe is it to run a different version of the ec2 scripts
> from the actual running hadoop instance?  I'm guessing it's pretty
> safe since you suggested it :)
>

there are no dependencies between the EC2 scripts and Hadoop core. you  
can use them with any version, as long as you build EC2 Images for the  
versions of Hadoop you are after with the 'new' scripts.

> Thanks again for all the help - still wrapping my mind around this  
> stuff.
>

keep the questions coming. will be glad to see HBase running on ec2,  
maybe we can put your changes back into the tree.

Chris K Wensel
chris@wensel.net
http://chris.wensel.net/
http://www.cascading.org/





Re: hbase on ec2 with s3 anyone?

Posted by "Jim R. Wilson" <wi...@gmail.com>.
I have 0.17 checked out, so I have the updated ec2 scripts, but I'm
confused about the setup.

In the aforementioned EC2 wiki page, it has a configuration section
labeled "(Pre 0.17) Hadoop cluster variables (GROUP, MASTER_HOST,
NO_INSTANCES)".  I had assumed that I would actually need hadoop-0.17
on my ec2 instances in order to forgo those instructions.  It sounds
like you're telling me that all the "pre 0.17" or "0.17" specific
instructions in the wiki page refer only to the ec2 creation scripts,
not the actual running version in the cluster, is that correct?

Also, how safe is it to run a different version of the ec2 scripts
from the actual running hadoop instance?  I'm guessing it's pretty
safe since you suggested it :)

Thanks again for all the help - still wrapping my mind around this stuff.

-- Jim

On Wed, May 7, 2008 at 2:49 PM, Chris K Wensel <ch...@wensel.net> wrote:
> you only need the contrib/ec2 scripts from 0.17. you don't need Hadoop
> 0.17.0.
>
>  just checkout the scripts and use them with whatever version of Hadoop you
> are most comfortable with (the version that works with HBase I expect).
>
>
>
>  On May 7, 2008, at 12:45 PM, Jim R. Wilson wrote:
>
>
> > Cool cool - thanks again Chris.
> >
> > I'm thinking I should use hadoop-0.17 instead of 0.16.3 at this time
> > because it appears 0.17 has better support for ec2 (less
> > configuration, no dyndns necessary etc).
> >
> > Is there a public directory somewhere which houses nightly branch
> > builds? or do I need to build 0.17 myself, then post it somewhere
> > (like s3) and have the script access that?
> >
> > -- Jim
> >
> > On Wed, May 7, 2008 at 2:27 PM, Chris K Wensel <ch...@wensel.net> wrote:
> >
> > > you do need the whole ec2 tree for the scripts to work...
> > >
> > >
> > >
> > > On May 7, 2008, at 12:25 PM, Jim R. Wilson wrote:
> > >
> > >
> > >
> > > > Nevermind, looks like I needed these:
> > > > ./src/contrib/ec2/bin/image/create-hadoop-image-remote
> > > > ./src/contrib/ec2/bin/create-hadoop-image
> > > >
> > > > -- Jim
> > > >
> > > > On Wed, May 7, 2008 at 2:23 PM, Jim R. Wilson <wi...@gmail.com> wrote:
> > > >
> > > >
> > > >
> > > > > Thanks Chris,
> > > > >
> > > > > Where do I get this supposed "image/create-hadoop-remote" script?  I
> > > > > couldn't `find` it anywhere within the hadoop svn tree, and the link
> > > > > in the hadoop wiki is broken :/
> > > > >
> > > > > -- Jim
> > > > >
> > > > >
> > > > >
> > > > > On Wed, May 7, 2008 at 2:04 PM, Chris K Wensel <ch...@wensel.net> wrote:
> > > > >
> > > > > > You don't need 0.17 to use the scripts mentioned in the EC2 wiki page.
> > > > > > Just grab contrib/ec2 from the 0.17.0 branch.
> > > > > >
> > > > > > as for images, you will need to update the image/create-hadoop-remote
> > > > > > bash script to download and install hbase.
> > > > > >
> > > > > > and update hadoop-init to start it with the proper properties.
> > > > > >
> > > > > > once you look at these scripts, it should be fairly obvious what you
> > > > > > need to do.
> > > > > >
> > > > > > then just run 'create-image' command to stuff this new image into one
> > > > > > of your buckets.
> > > > > >
> > > > > > enjoy
> > > > > > ckw
> > > > > >
> > > > > >
> > > > > >
> > > > > > On May 7, 2008, at 11:12 AM, Jim R. Wilson wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I'm about to embark on a mystical journey through hosted web-services
> > > > > > > with my trusted friend hbase.  Here are some questions for my fellow
> > > > > > > travelers:
> > > > > > >
> > > > > > > 1) Has anyone done this before? If so, what lifesaving tips can you
> > > > > > > offer?
> > > > > > > 2) Should I attempt to build an hdfs out of ec2 persistent storage, or
> > > > > > > just use S3?
> > > > > > > 3) How many images will I need? Just one, or master/slave?
> > > > > > > 4) What version of hadoop/hbase should I use?  (The hadoop/ec2
> > > > > > > instructions[1] seem to favor the unreleased 0.17, but there doesn't
> > > > > > > seem to be a public image with 0.17 at the ready)
> > > > > > >
> > > > > > > Thanks in advance for any advice, I'm gearing up for quite a trip :)
> > > > > > >
> > > > > > > [1] http://wiki.apache.org/hadoop/AmazonEC2
> > > > > > >
> > > > > > > -- Jim R. Wilson (jimbojw)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > Chris K Wensel
> > > > > > chris@wensel.net
> > > > > > http://chris.wensel.net/
> > > > > > http://www.cascading.org/
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > > Chris K Wensel
> > > chris@wensel.net
> > > http://chris.wensel.net/
> > > http://www.cascading.org/
> > >
> > >
> > >
> > >
> > >
> > >
> >
>
>  Chris K Wensel
>  chris@wensel.net
>  http://chris.wensel.net/
>  http://www.cascading.org/
>
>
>
>
>

Re: hbase on ec2 with s3 anyone?

Posted by Chris K Wensel <ch...@wensel.net>.
you only need the contrib/ec2 scripts from 0.17; you don't need Hadoop
0.17.0 itself.

just check out the scripts and use them with whatever version of Hadoop
you are most comfortable with (the version that works with HBase, I
expect).
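
for example (untested sketch; the variable name below is from memory,
so check bin/hadoop-ec2-env.sh in whatever you check out):

  # bin/hadoop-ec2-env.sh: point the scripts at the hadoop release you
  # actually plan to run, e.g. 0.16.3
  HADOOP_VERSION=0.16.3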


Chris K Wensel
chris@wensel.net
http://chris.wensel.net/
http://www.cascading.org/





Re: hbase on ec2 with s3 anyone?

Posted by "Jim R. Wilson" <wi...@gmail.com>.
Cool cool - thanks again Chris.

I'm thinking I should use hadoop-0.17 instead of 0.16.3 at this time
because it appears 0.17 has better support for ec2 (less
configuration, no dyndns necessary, etc.).

Is there a public directory somewhere which houses nightly branch
builds? Or do I need to build 0.17 myself, then post it somewhere
(like s3) and have the script access that?

-- Jim


Re: hbase on ec2 with s3 anyone?

Posted by Chris K Wensel <ch...@wensel.net>.
you do need the whole ec2 tree for the scripts to work...


Chris K Wensel
chris@wensel.net
http://chris.wensel.net/
http://www.cascading.org/





Re: hbase on ec2 with s3 anyone?

Posted by "Jim R. Wilson" <wi...@gmail.com>.
Nevermind, looks like I needed these:
./src/contrib/ec2/bin/image/create-hadoop-image-remote
./src/contrib/ec2/bin/create-hadoop-image

-- Jim


Re: hbase on ec2 with s3 anyone?

Posted by Chris K Wensel <ch...@wensel.net>.
here is the base of what you should check out from svn:
http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.17/src/contrib/ec2/

image create scripts live here:
http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.17/src/contrib/ec2/bin/image/
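
(to pull that down in one go, something along these lines should work;
the repos/asf path is just my translation of the viewvc links above)

  svn co http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.17/src/contrib/ec2/ hadoop-ec2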



Chris K Wensel
chris@wensel.net
http://chris.wensel.net/
http://www.cascading.org/





Re: hbase on ec2 with s3 anyone?

Posted by "Jim R. Wilson" <wi...@gmail.com>.
Thanks Chris,

Where do I get this supposed "image/create-hadoop-remote" script?  I
couldn't `find` it anywhere within the hadoop svn tree, and the link
in the hadoop wiki is broken :/

-- Jim


Re: hbase on ec2 with s3 anyone?

Posted by Chris K Wensel <ch...@wensel.net>.
You don't need 0.17 to use the scripts mentioned in the EC2 wiki page.  
Just grab contrib/ec2 from the 0.17.0 branch.

as for images, you will need to update the image/create-hadoop-remote  
bash script to download and install hbase.

and update hadoop-init to start it with the proper properties.

once you look at these scripts, it should be fairly obvious what you  
need to do.

then just run the 'create-image' command to stuff this new image into
one of your buckets.
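
for what it's worth, here is the kind of change I mean (untested
sketch; the hbase version, mirror url and install dir are placeholders,
adjust to whatever you actually use):

  # in image/create-hadoop-image-remote, next to the hadoop download:
  HBASE_VERSION=0.1.2
  cd /usr/local
  wget http://archive.apache.org/dist/hadoop/hbase/hbase-$HBASE_VERSION/hbase-$HBASE_VERSION.tar.gz
  tar xzf hbase-$HBASE_VERSION.tar.gz
  ln -s hbase-$HBASE_VERSION hbase

  # in hadoop-init, after hadoop itself is up on the master
  # (you still have to fill in hbase-site.xml with the master/hdfs
  # addresses, the same way hadoop-init fills in hadoop-site.xml):
  /usr/local/hbase/bin/start-hbase.sh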

enjoy
ckw


Chris K Wensel
chris@wensel.net
http://chris.wensel.net/
http://www.cascading.org/





Re: hbase on ec2 with s3 anyone?

Posted by Clint Morgan <cm...@troove.net>.
I've tried hbase on s3. See a previous post I made on this list.
Basically it's considerably slower than hdfs, especially so for random
reads. Also I think there could be consistency issues when running on
s3: the master creates a file, tells a region server to read it, and
the region server gets a file-not-found. This happened to me a couple
of times running s3 as the main map/reduce filesystem. We've basically
decided to run our own hdfs, and just use s3 for backup.
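
(the backup itself is just a distcp run; sketch only, the namenode
address, bucket and paths below are made up, and the AWS keys can live
either in the s3 uri or in hadoop-site.xml:)

  bin/hadoop distcp hdfs://namenode:9000/hbase s3://ID:SECRET@backup-bucket/hbase-backup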

3) I see no need for different images (we use one image for all nodes).

have fun,
-clint

