Posted to dev@asterixdb.apache.org by abdullah alamoudi <ba...@gmail.com> on 2016/06/24 17:14:43 UTC

Delete transactions

Hi everyone,
I think we have a problem related to the delete transaction behavior; here
is the problem:

Our delete starts by searching the tree to identify the tuples to delete,
based on the delete statement's conditional clause. It follows that by
inserting delete tuples into the primary index, then updating the secondary
indexes, then committing on the PK.

The problem happens if, after searching the tree and identifying the records
to be deleted, one of those records is updated. This will cause the record
to be deleted from the primary index even though it might no longer meet the
conditional clause. Moreover, the new entries in the secondary indexes will
remain without their record in the primary index.

In order to fix this, we need to do one of the following:
1. lock the records during the search that identifies the records to be
deleted,
OR
2. when performing the delete, double-check that the record we're deleting
is the same record we found during the search.

A better way would be to perform the delete as we do the search, since there
is no need to do the whole search, materialize, and then perform the delete.

There is a chance I got something wrong. Did I? Thoughts?
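
To make the race concrete, here is a minimal sketch of the current two-phase
flow; all names here are illustrative, not our actual operators:

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Predicate;
    import java.util.stream.Collectors;

    class DeletePipelineSketch {
        // Primary index stand-in: PK -> record.
        static final Map<Integer, String> primaryIndex = new ConcurrentHashMap<>();

        static void delete(Predicate<String> condition) {
            // Phase 1: search and materialize the PKs that match the clause.
            List<Integer> toDelete = primaryIndex.entrySet().stream()
                    .filter(e -> condition.test(e.getValue()))
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toList());

            // <-- race window: another transaction can update a matched
            //     record here, so it may no longer satisfy the clause.

            // Phase 2: delete by PK (secondary-index maintenance and the
            // PK commit would follow here in the real pipeline).
            for (Integer pk : toDelete) {
                primaryIndex.remove(pk); // removes whatever is there now
            }
        }
    }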

Re: Delete transactions

Posted by Taewoo Kim <wa...@gmail.com>.
PS: we have decided to try an instant trylock on the PK for index-only plan
queries, since we can't assume the lock manager can handle millions of locks
without any performance degradation. Thus, we currently only support
read-committed for index-only plan queries, too.
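(For reference, by "instant trylock" I mean a trylock that is released
immediately on success; roughly, using a plain JDK lock in place of our lock
manager:)

    import java.util.concurrent.locks.ReentrantReadWriteLock;

    class InstantTryLockSketch {
        // One lock per PK in the real system; a single lock here for brevity.
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        // Returns true if no writer currently holds the PK. The read lock is
        // acquired and released instantly, so the caller never holds it while
        // waiting for anything else (no hold-and-wait).
        boolean instantTryReadLock() {
            if (lock.readLock().tryLock()) {
                lock.readLock().unlock(); // "instant": release right away
                return true;
            }
            return false; // a writer holds it; the caller handles the miss
        }
    }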
On Fri, Jun 24, 2016 at 17:59 Young-Seok Kim <ki...@gmail.com> wrote:

> Non-instant read locks will break the deadlock-free locking protocol that
> we currently achieve, since the necessary condition for deadlock, i.e., the
> hold-and-wait situation, would come back alive.
>
> Best,
> Young-Seok

Re: Delete transactions

Posted by Young-Seok Kim <ki...@gmail.com>.
It seems that the current upsert does not work correctly under the
deadlock-free locking protocol when the query in the upsert statement has
any blocking operator in the job pipeline.
According to the deadlock-free locking protocol, when a trylock() fails for
a record to be upserted during the search phase of the query clause, all
records whose locks have already been acquired should be pushed to the
commit operator before a (blocking) lock() call is made for the
trylock-failed record, in order to avoid the hold-and-wait situation.
However, when there is any blocking operator in the pipeline, the pushed
records may not reach the commit operator. So, upsert with a blocking
operator breaks the deadlock-free locking protocol.
This should be fixed.
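
In pseudocode, the rule the pipeline has to follow looks roughly like this
(a sketch with hypothetical types, not the actual operator code):

    import java.util.ArrayList;
    import java.util.List;

    class UpsertLockProtocolSketch {
        // Hypothetical stand-ins, not AsterixDB classes.
        interface LockManager { boolean tryLock(int pk); void lock(int pk); }
        interface CommitOperator { void push(int pk); } // commit releases locks

        private final List<Integer> lockedPks = new ArrayList<>();

        void onRecord(int pk, LockManager locks, CommitOperator commit) {
            if (locks.tryLock(pk)) {
                lockedPks.add(pk); // holding locks but not waiting: fine
                return;
            }
            // trylock failed: flush everything we hold to the commit operator
            // BEFORE blocking. A blocking operator between this point and the
            // commit operator would buffer these records and break the rule.
            for (int held : lockedPks) {
                commit.push(held);
            }
            lockedPks.clear();
            locks.lock(pk); // now it is safe to block: we hold nothing
            lockedPks.add(pk);
        }
    }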

For the same reason, the delete should not be modified to hold non-instant
locks, in order to maintain the deadlock-free locking protocol.

Best,
Young-Seok

On Sat, Jun 25, 2016 at 11:42 AM, abdullah alamoudi <ba...@gmail.com>
wrote:

> Not difficult at all. We do something similar for upserts.
> Essentially, all we need to do is:
> 1. Exclude the delete case from the introduce-materialize rule.
> 2. Use different callbacks for the search and delete operations. This means
> that the lock taken by the search in this case will be a write lock, and the
> delete will take no lock. The commit will then release the lock.
>
> Should be a couple of hours' work.
> ~Abdullah.

Re: Delete transactions

Posted by abdullah alamoudi <ba...@gmail.com>.
Not difficult at all. We do something similar for upserts.
Essentially, all we need to do is:
1. Exclude the delete case from the introduce-materialize rule.
2. Use different callbacks for the search and delete operations. This means
that the lock taken by the search in this case will be a write lock, and the
delete will take no lock. The commit will then release the lock (sketched
below).
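
Roughly, the callback split in point 2 would look like this; the names are
made up, just to show the intent:

    class DeleteCallbackSketch {
        interface LockManager { void writeLock(int pk); void unlock(int pk); }

        static class SearchCallback {        // used by the primary-index search
            void found(int pk, LockManager locks) {
                locks.writeLock(pk);         // lock up front, while searching
            }
        }

        static class DeleteCallback {        // used by the index delete ops
            void beforeDelete(int pk) {
                // no lock here: the search already holds the write lock
            }
        }

        static class CommitCallback {        // commit operator on the PK
            void committed(int pk, LockManager locks) {
                locks.unlock(pk);            // release only at commit
            }
        }
    }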

Should be a couple of hours' work.
~Abdullah.

On Sat, Jun 25, 2016 at 6:25 PM, Mike Carey <dt...@gmail.com> wrote:

> I wonder how easy this would be to fit into the current S/W structure,
> though?
>
> (Side note:  My wife will be sad; Halloween is her favorite holiday... :-))

Re: Delete transactions

Posted by Mike Carey <dt...@gmail.com>.
I wonder how easy this would be to fit into the current S/W structure, 
though?

(Side note:  My wife will be sad; Halloween is her favorite holiday... :-))


On 6/24/16 7:22 PM, abdullah alamoudi wrote:
> There is a better alternative.
>
> When the search is performed, we acquire a write lock and perform the
> deletion right away without materialization. Currently, the materialization
> is introduced by the IntroduceMaterializeForInsertWithSelfReadRule. The
> rule is meant to prevent a Halloween situation. However, with a delete,
> Halloween is cancelled.
>
> Cheers,
> Abdullah.


Re: Delete transactions

Posted by abdullah alamoudi <ba...@gmail.com>.
There is a better alternative.

When the search is performed, we acquire a write lock and perform the
deletion right away without materialization. Currently, the materialization
is introduced by the IntroduceMaterializeForInsertWithSelfReadRule. The
rule is meant to prevent a Halloween situation. However, with a delete,
Halloween is cancelled.
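
The rule-side change would be tiny; something along these lines (a sketch,
not the actual Algebricks rule code):

    class IntroduceMaterializeRuleSketch {
        enum Kind { INSERT, UPSERT, DELETE } // illustrative operation kinds

        boolean shouldMaterialize(Kind operationKind, boolean readsOwnDataset) {
            // Halloween protection is only needed when a statement can
            // re-encounter its own writes; a delete only removes what it has
            // already matched, so no materialization is required.
            if (operationKind == Kind.DELETE) {
                return false; // do not introduce the materialize operator
            }
            return readsOwnDataset; // insert/upsert with self-read: materialize
        }
    }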

Cheers,
Abdullah.

On Sat, Jun 25, 2016 at 3:59 AM, Young-Seok Kim <ki...@gmail.com> wrote:

> Non-instant read locks will break the deadlock-free locking protocol that
> we currently achieve, since the necessary condition for deadlock, i.e., the
> hold-and-wait situation, would come back alive.
>
> Best,
> Young-Seok

Re: Delete transactions

Posted by Young-Seok Kim <ki...@gmail.com>.
Non-instant read locks will break the deadlock-free locking protocol that
we currently achieve, since the necessary condition for deadlock, i.e., the
hold-and-wait situation, would come back alive.
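
That is, if the read locks were retained, schedules like the following would
become possible again:

    T1: holds S(r1), then waits for X(r2)
    T2: holds X(r2), then waits for S(r1)   <- hold-and-wait on both sides: deadlock

whereas an instant read lock is released before T1 ever waits on anything
else.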

Best,
Young-Seok


On Fri, Jun 24, 2016 at 5:49 PM, Mike Carey <dt...@gmail.com> wrote:

> Got it and agreed.  Would another solution be to retain the (no longer
> instant) lock?  (What can of worms would that open?)

Re: Delete transactions

Posted by Mike Carey <dt...@gmail.com>.
Got it and agreed.  Would another solution be to retain the (no longer 
instant) lock?  (What can of worms would that open?)




Re: Delete transactions

Posted by Young-Seok Kim <ki...@gmail.com>.
Please see inline below.

Best,
Young-Seok

On Fri, Jun 24, 2016 at 4:35 PM, Mike Carey <dt...@gmail.com> wrote:

> So to clarify, record-level consistency (and primary/secondary index
> consistency) is guaranteed and will work "correctly" in all cases if a
> record R is updated (or deleted) by T2 after being targeted (by primary
> key) for deletion by T1.

Yes, agreed.


> The only semantic issue is that there is a (hopefully very, very small)
> window of time between when T1 sees R in a secondary index and when it
> acquires the lock on R's primary key - during which T2 could change R
> in a way that makes it no longer query-compliant.

Here, let me clarify the above sentence: "when it acquires the lock on
R's primary key" means that the lock is acquired and immediately released by
T1, since the lock is an instant shared-mode (read) lock. So T2 can change R
after acquiring an exclusive lock, and consequently R no longer qualifies for
the query predicate.


> (However, at its time of being observed - which happened under
> read-committed - it was a correct candidate for deletion.  So this is kind
> of "expected" but admittedly kind of weird.  It seems like this could maybe
> be fixed in the future via a mechanism similar to the index-only branch's
> way of handling locks?)

This expected but undesired situation can be avoided by introducing a
version number, stored as a hidden field (not exposed to users) in each
entry of the primary index and the secondary indexes, such that the version
number can be used to verify that the record found during the search phase
is the same record being deleted during the delete phase. If the
verification succeeds, the delete is performed; otherwise, it is not.
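
Concretely, the verification at delete time would be something like this (a
rough sketch; the field and class names are made up):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    class VersionedDeleteSketch {
        static class Entry {
            final String record;
            final long version; // hidden field, not exposed to users
            Entry(String record, long version) {
                this.record = record;
                this.version = version;
            }
        }

        final Map<Integer, Entry> primaryIndex = new ConcurrentHashMap<>();

        // The exclusive lock on pk is assumed to be held already.
        // Returns true if the delete happened, false if the record changed
        // (or vanished) between the search phase and the delete phase.
        boolean deleteIfUnchanged(int pk, long versionSeenAtSearch) {
            Entry current = primaryIndex.get(pk);
            if (current == null || current.version != versionSeenAtSearch) {
                return false; // verification failed: skip this delete
            }
            primaryIndex.remove(pk);
            return true; // secondary-index deletes would follow here
        }
    }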



Re: Delete transactions

Posted by Mike Carey <dt...@gmail.com>.
So to clarify, record-level consistency (and primary/secondary index 
consistency) is guaranteed and will work "correctly" in all cases if a 
record R is updated (or deleted) by T2 after being targeted (by primary 
key) for deletion by T1. The only semantic issue is that there is a 
(hopefully very, very small) window of time between when T1 sees R in a 
secondary index and when it acquires the lock on R's primary key -
during which T2 could change R in a way that makes it no longer 
query-compliant.  (However, at its time of being observed - which 
happened under read-committed - it was a correct candidate for 
deletion.  So this is kind of "expected" but admittedly kind of weird.  
It seems like this could maybe be fixed in the future via a mechanism 
similar to the index-only branch's way of handling locks?)





Re: Delete transactions

Posted by Young-Seok Kim <ki...@gmail.com>.
This is a somewhat expected issue given the read-committed isolation level
based on the strict 2PL locking protocol.
Strict 2PL guarantees that all exclusive locks acquired by a transaction are
released only after the transaction commits, but read locks don't follow
this. So, as you described in the email, a record read by a transaction T1
during the search can be modified by another transaction T2 before the
record is deleted by T1. This is a possible situation under the
read-committed isolation level.
However, there is no inconsistency of the kind where the record modified by
T2 is deleted by T1 from the primary index while the corresponding secondary
index entries are not deleted by T1. This is because when T1 starts the
delete process through the job pipeline, an exclusive lock on the record is
acquired first, and only then are the delete operations on the primary and
secondary indexes performed. So either case 1) a record with the identical
primary key exists and is deleted by T1 (since the search delivers the
primary key, not the complete record), or case 2) the record is not deleted
by T1 because no record with that primary key exists.

For case 1), once a record is deleted from the primary index, all the
remaining secondary indexes in the job pipeline correctly find and delete
the corresponding secondary index entries.
For case 2), I need to check whether the job pipeline throws an exception
for trying to delete the non-existing record and stops by aborting the job,
or whether the exception is just swallowed and the job proceeds to the next
record.
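
In other words, the delete phase per record looks roughly like this
(illustrative types only, not our actual classes):

    class DeletePhaseSketch {
        interface LockManager { void exclusiveLock(int pk); }
        interface PrimaryIndex { String delete(int pk); } // old record or null
        interface SecondaryIndex { void delete(String oldRecord, int pk); }

        void deleteOne(int pk, LockManager locks, PrimaryIndex primary,
                       SecondaryIndex[] secondaries) {
            locks.exclusiveLock(pk);   // acquired before any index operation
            String old = primary.delete(pk);
            if (old == null) {
                // case 2: the record is already gone. Whether the pipeline
                // aborts the job here or just moves on is the open question.
                return;
            }
            // case 1: use the old record to locate and delete the
            // corresponding secondary-index entries.
            for (SecondaryIndex s : secondaries) {
                s.delete(old, pk);
            }
        }
    }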

Best,
Young-Seok

