Posted to user@hbase.apache.org by Jaeyun Noh <me...@gmail.com> on 2008/10/27 06:33:39 UTC

Loosened transaction isolation level

Hi all, I have a question about the transaction model implemented in
HBase (TransactionalRegion, which extends HRegion).


During the commit phase, an HBase transaction checks for conflicts
between reads and writes using timestamp-based concurrency control (CC).

In that code, I found that any update transaction that includes a scan
appears to be rolled back whenever another update transaction has
written anything. This will seriously reduce the commit success ratio
when scans interleave with updates. In theory, phantom reads can be
prevented by key-range locking, but that is impractical, so I
understand why a conservative model is used: in effect, it locks the
whole table.

In our case, we don't need a strict transaction isolation level. We
may not need to prevent phantom reads or even non-repeatable reads.
We just need a read-committed isolation level, which prevents dirty
reads. So I wonder whether this conflict check is really needed for
our use case.

So what about additionally supporting a looser transaction model? We
can live with a ReadCommitted isolation level. Maybe we could pass an
enum indicating the isolation level to the beginTransaction() method.

FYI, the following code is the conflict check method in
TransactionState, which is used by the TransactionalRegion class.

  private boolean hasConflict(final TransactionState checkAgainst) {
    if (checkAgainst.getStatus().equals(TransactionState.Status.ABORTED)) {
      return false; // Cannot conflict with aborted transactions
    }

    for (BatchUpdate otherUpdate : checkAgainst.getWriteSet()) {
      if (this.hasScan) {
        LOG.info("Transaction" + this.toString()
            + " has a scan read. Meanwile a write occured. "
            + "Conservitivly reporting conflict");
        return true;
      }

      if (this.getReadSet().contains(otherUpdate.getRow())) {
        LOG.trace("Transaction " + this.toString() + " conflicts with "
            + checkAgainst.toString());
        return true;
      }
    }
    return false;
  }
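
To make the enum idea concrete, here is a minimal sketch rather than
actual HBase code. IsolationLevel, SketchTransactionState and its
record* methods are hypothetical names, rows are plain Strings instead
of byte arrays, and the sketch assumes uncommitted writes stay
invisible to other transactions until commit, so that READ_COMMITTED
needs no read-set validation.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical isolation levels; not part of the HBase code quoted above.
enum IsolationLevel { READ_COMMITTED, SERIALIZABLE }

// Simplified stand-in for TransactionState. Rows are plain Strings here.
class SketchTransactionState {
  private final IsolationLevel level;
  private final Set<String> readSet = new HashSet<>();
  private final Set<String> writeSet = new HashSet<>();
  private boolean hasScan;
  private boolean aborted;

  SketchTransactionState(IsolationLevel level) {
    this.level = level;
  }

  void recordRead(String row) { readSet.add(row); }
  void recordWrite(String row) { writeSet.add(row); }
  void recordScan() { hasScan = true; }
  void abort() { aborted = true; }

  boolean hasConflict(SketchTransactionState checkAgainst) {
    if (checkAgainst.aborted) {
      return false; // Cannot conflict with aborted transactions
    }
    if (level == IsolationLevel.READ_COMMITTED) {
      // Assumption: writes are buffered until commit, so every read
      // already saw committed data and nothing needs validating.
      return false;
    }
    for (String otherRow : checkAgainst.writeSet) {
      if (hasScan) {
        return true; // Same conservative rule as the quoted method
      }
      if (readSet.contains(otherRow)) {
        return true; // The other transaction wrote a row we read
      }
    }
    return false;
  }
}

class IsolationLevelDemo {
  public static void main(String[] args) {
    SketchTransactionState writer =
        new SketchTransactionState(IsolationLevel.SERIALIZABLE);
    writer.recordWrite("row1");

    SketchTransactionState strict =
        new SketchTransactionState(IsolationLevel.SERIALIZABLE);
    strict.recordScan();

    SketchTransactionState loose =
        new SketchTransactionState(IsolationLevel.READ_COMMITTED);
    loose.recordScan();

    System.out.println("SERIALIZABLE conflict: " + strict.hasConflict(writer));
    System.out.println("READ_COMMITTED conflict: " + loose.hasConflict(writer));
  }
}
```

The level is fixed when the transaction state is created, which is
where a value passed to beginTransaction() would presumably end up.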

Regards,
Jaeyun Noh.

Re: Loosened transaction isolation level

Posted by Clint Morgan <cl...@gmail.com>.

Yeah, that is correct: there is currently no global transaction log, which
is needed for failure recovery. The intent was to wait on ZooKeeper, but it
would also be possible to build it as a regular HBase table.

-clint

On Mon, Oct 27, 2008 at 10:39 AM, Ski Gh3 <sk...@gmail.com> wrote:

> Is the transactional package fully functional?
> I was under the impression that without ZooKeeper integrated, there is no
> way to recover from a regionserver that crashes in the middle of the commit
> (after everyone has voted yes to commit).
> Please correct me if I am wrong...
> Otherwise, what's the plan? Leave it as is and fix it once we have ZK?
>
> Thx!

Re: Loosened transaction isolation level

Posted by Ski Gh3 <sk...@gmail.com>.

Is the transactional package fully functional?
I was under the impression that without ZooKeeper integrated, there is no
way to recover from a regionserver that crashes in the middle of the commit
(after everyone has voted yes to commit).
Please correct me if I am wrong...
Otherwise, what's the plan? Leave it as is and fix it once we have ZK?

Thx!

On Mon, Oct 27, 2008 at 9:55 AM, Clint Morgan <cm...@troove.net> wrote:

> Yeah, the scanners are definitely overly conservative. I needed this
> because, for example, I am scanning a key range to guarantee that the update
> I'm processing has a unique column value. The current implementation could
> be improved to check more carefully whether the scan really did pass any
> rows that were in the write sets of other transactions (e.g. by keeping the
> start and end points of the scanner).
>
> But it would also make sense to allow less restrictive transaction models,
> and an enum passed to beginTransaction makes sense. Please open a JIRA. (And
> maybe post a patch...)
>
> -clint

Re: Loosened transaction isolation level

Posted by Jaeyun Noh <me...@gmail.com>.

Hi Clint,
I opened a JIRA item (HBASE-962).
Thanks.


On Mon, Oct 27, 2008 at 9:55 AM, Clint Morgan <cm...@troove.net> wrote:
> Yeah, the scanners are definitely overly conservative. I needed this
> because, for example, I am scanning a key range to guarantee that the update
> I'm processing has a unique column value. The current implementation could
> be improved to check more carefully whether the scan really did pass any
> rows that were in the write sets of other transactions (e.g. by keeping the
> start and end points of the scanner).
>
> But it would also make sense to allow less restrictive transaction models,
> and an enum passed to beginTransaction makes sense. Please open a JIRA. (And
> maybe post a patch...)
>
> -clint

Re: Loosened transaction isolation level

Posted by Clint Morgan <cm...@troove.net>.

Yeah, the scanners are definitely overly conservative. I needed this
because, for example, I am scanning a key range to guarantee that the update
I'm processing has a unique column value. The current implementation could
be improved to check more carefully whether the scan really did pass any
rows that were in the write sets of other transactions (e.g. by keeping the
start and end points of the scanner).
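
A rough sketch of that start/end-point idea, under stated assumptions:
ScanRange and ScanConflictCheck are hypothetical names (nothing like
them exists in the current code), rows are plain Strings rather than
byte arrays, and each scan is recorded by its requested range, with the
start inclusive and the end exclusive.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical record of one scan's key range: [startRow, endRow).
// A null endRow means the scan was open-ended.
class ScanRange {
  private final String startRow;
  private final String endRow;

  ScanRange(String startRow, String endRow) {
    this.startRow = startRow;
    this.endRow = endRow;
  }

  boolean contains(String row) {
    return row.compareTo(startRow) >= 0
        && (endRow == null || row.compareTo(endRow) < 0);
  }
}

class ScanConflictCheck {
  // Reports a conflict only if another transaction wrote a row that
  // falls inside one of the ranges this transaction scanned.
  static boolean scanConflicts(List<ScanRange> scans,
      Collection<String> otherWriteRows) {
    for (String row : otherWriteRows) {
      for (ScanRange range : scans) {
        if (range.contains(row)) {
          return true;
        }
      }
    }
    return false;
  }
}
```

Because the check is on the key range rather than on the rows actually
returned, a row inserted into the scanned range by another transaction
is still reported, so the uniqueness use case keeps working while
writes outside the range no longer force a rollback.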

But it would also make sense to allow less restrictive transaction models,
and an enum passed to beginTransaction makes sense. Please open a JIRA. (And
maybe post a patch...)

-clint
