Posted to dev@ignite.apache.org by John Wilson <sa...@gmail.com> on 2018/02/13 01:16:07 UTC

What happens if Primary Node fails during the Commit Phase

Hi,

Assume the Prepare phase has completed and that the primary node has
received a commit message from the coordinator.

Two questions:

   1. A primary node commits a transaction before it forwards a commit
   message to the backup nodes. True?
   2. What happens if a Primary Node fails while it is committing but
   before the commit message is sent to backup nodes? Do the backup nodes
   commit after some timeout or will they send a fail message to the
   coordinator?

The doc below provides a nice description but doesn't exactly answer my
question.

https://www.gridgain.com/resources/blog/apache-ignite-transactions-architecture-failover-and-recovery

Thanks,

Re: What happens if Primary Node fails during the Commit Phase

Posted by John Wilson <sa...@gmail.com>.
I got the answer for #3 here
https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood#IgniteDurableMemory-underthehood-Pages&links.
I will post the remaining questions in a separate thread.

On Mon, Feb 12, 2018 at 8:03 PM, John Wilson <sa...@gmail.com>
wrote:

> You're always helpful Val. Thanks!
>
>
> I have a question regarding Optimistic Locking
>
>
>    1. The documentation here,
>    https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Key-Value+Transactions+Architecture,
>    states that locks, for optimistic locking, are acquired during the
>    "prepare" phase. But the graphic depicted there and the tutorial here,
>    https://www.gridgain.com/resources/blog/apache-ignite-transactions-architecture-concurrency-modes-and-isolation-levels,
>    clearly indicate that locks are acquired during the commit phase, with
>    version information passed along with the key by the coordinator to the
>    primary nodes. Can you please explain the discrepancy?
>
> And three questions regarding pages and page locking:
>
>    1. Does Ignite use a lock-free algorithm for its B+ tree structure?
>    Looking at the source code and the use of a tag field to avoid the ABA
>    problem suggests that.
>    2. Ignite documentation talks about entry-level locks and page locks.
>    When exactly is a page locked and released? Also, when an entry is
>    inserted/modified in a page, is the page locked, forbidding other threads
>    from inserting other entries in the page, or only the entry's offset is
>    locked allowing other threads to insert other entries in the page?
>    3. What is the difference between a directCount and indirectCount
>    for a page?
>
> Thanks
>
> On Mon, Feb 12, 2018 at 7:33 PM, vkulichenko <
> valentin.kulichenko@gmail.com> wrote:
>
>> Hi John,
>>
>> 1. True.
>>
>> 2. The blog actually provides the answer:
>>
>> When the Backup Nodes detect the failure, they will notify the Transaction
>> coordinator that they committed the transaction successfully. In this
>> scenario, there is no data loss because the data are backed up and can still
>> be accessed and used by applications.
>>
>> In other words, if the primary node fails, backups will not wait for a message,
>> but instead will commit right away and send an ack to the coordinator. Once
>> the coordinator gets all required acks, the transaction completes.
>>
>> -Val
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>
>
>

Re: What happens if Primary Node fails during the Commit Phase

Posted by John Wilson <sa...@gmail.com>.
You're always helpful Val. Thanks!


I have a question regarding Optimistic Locking


   1. The documentation here,
   https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Key-Value+Transactions+Architecture,
   states that locks, for optimistic locking, are acquired during the
   "prepare" phase. But the graphic depicted there and the tutorial here,
   https://www.gridgain.com/resources/blog/apache-ignite-transactions-architecture-concurrency-modes-and-isolation-levels,
   clearly indicate that locks are acquired during the commit phase, with
   version information passed along with the key by the coordinator to the
   primary nodes. Can you please explain the discrepancy?
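
For illustration only, the version check mentioned in this question can be
sketched in plain Java. This is not Ignite code, and it does not settle which
phase Ignite acquires locks in. The class VersionedStore and its methods are
hypothetical names. It only shows the general idea: a transaction remembers
the version it read, and its write is applied only if the entry still has
that version at the moment the lock is taken.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of optimistic, version-validated writes (not Ignite's implementation). */
class VersionedStore {
    /** A value together with the version at which it was written. */
    static final class Entry {
        final String value;
        final long version;

        Entry(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Entry> data = new HashMap<>();

    /** Returns the current entry for the key, or null if the key is absent. */
    synchronized Entry read(String key) {
        return data.get(key);
    }

    /**
     * Applies the write only if the stored version still equals the version the
     * transaction read earlier. An absent key counts as version 0. The monitor
     * stands in for the short-lived lock taken while validating and writing.
     */
    synchronized boolean writeIfUnchanged(String key, long readVersion, String newValue) {
        Entry current = data.get(key);
        long currentVersion = (current == null) ? 0L : current.version;
        if (currentVersion != readVersion) {
            return false; // another transaction updated the key first
        }
        data.put(key, new Entry(newValue, currentVersion + 1));
        return true;
    }
}
```

In this sketch, two transactions that both read the same version race to
write: the first call to writeIfUnchanged succeeds and bumps the version, and
the second returns false, so its transaction would have to roll back or retry.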

And three questions regarding pages and page locking:

   1. Does Ignite use a lock-free algorithm for its B+ tree structure?
   Looking at the source code and the use of a tag field to avoid the ABA
   problem suggests that.
   2. Ignite documentation talks about entry-level locks and page locks.
   When exactly is a page locked and released? Also, when an entry is
   inserted/modified in a page, is the page locked, forbidding other threads
   from inserting other entries in the page, or only the entry's offset is
   locked allowing other threads to insert other entries in the page?
   3. What is the difference between a directCount and indirectCount
   for a page?
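
To make the ABA remark in question 1 concrete, here is a small sketch using
the JDK class java.util.concurrent.atomic.AtomicStampedReference, which pairs
a reference with an integer stamp that plays the role of a tag. It is not
taken from Ignite's B+ tree code; the class name AbaTagDemo and the values
are made up, and the two "threads" are simulated sequentially to keep the
example deterministic.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

/** Hypothetical demo of how a tag (stamp) lets a compare-and-set detect an A -> B -> A change. */
class AbaTagDemo {
    /** Returns true if the stale compare-and-set was rejected because the stamp had changed. */
    static boolean staleCasRejected() {
        String a = "A";
        String b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);

        // Simulated thread 1 reads the reference and its stamp, then is delayed.
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);
        int seenStamp = stampHolder[0];

        // Simulated thread 2 changes A -> B -> A, bumping the stamp each time.
        ref.compareAndSet(a, b, 0, 1);
        ref.compareAndSet(b, a, 1, 2);

        // The reference is "A" again, so a CAS on the value alone would succeed.
        // The stamped CAS fails because the stamp moved from 0 to 2.
        boolean succeeded = ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);
        return !succeeded;
    }
}
```

A tag guards a single compare-and-set against ABA; on its own it says nothing
about whether the surrounding structure is lock-free.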

Thanks

On Mon, Feb 12, 2018 at 7:33 PM, vkulichenko <va...@gmail.com>
wrote:

> Hi John,
>
> 1. True.
>
> 2. The blog actually provides the answer:
>
> When the Backup Nodes detect the failure, they will notify the Transaction
> coordinator that they committed the transaction successfully. In this
> scenario, there is no data loss because the data are backed up and can still
> be accessed and used by applications.
>
> In other words, if the primary node fails, backups will not wait for a message,
> but instead will commit right away and send an ack to the coordinator. Once
> the coordinator gets all required acks, the transaction completes.
>
> -Val
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: What happens if Primary Node fails during the Commit Phase

Posted by vkulichenko <va...@gmail.com>.
Hi John,

1. True.

2. The blog actually provides the answer:

When the Backup Nodes detect the failure, they will notify the Transaction
coordinator that they committed the transaction successfully. In this
scenario, there is no data loss because the data are backed up and can still
be accessed and used by applications.

In other words, if the primary node fails, backups will not wait for a message,
but instead will commit right away and send an ack to the coordinator. Once
the coordinator gets all required acks, the transaction completes.

-Val



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
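
The recovery behaviour described in the reply above can be sketched as a toy
model. This is a simplification under stated assumptions, not Ignite's
implementation: the classes Coordinator and BackupNode and the method
onPrimaryFailed are hypothetical, and failure detection, messaging, and
threading are left out. It models a backup that holds writes from the prepare
phase, learns that the primary has failed, commits without waiting for a
commit message, and acks the coordinator. The coordinator treats the
transaction as complete once all required acks have arrived.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy model of backups committing after a primary failure. */
class PrimaryFailureSketch {
    /** Collects acks; the transaction completes once every expected backup has acked. */
    static final class Coordinator {
        private final Set<String> pendingAcks;

        Coordinator(String... expectedBackupIds) {
            this.pendingAcks = new HashSet<>(Arrays.asList(expectedBackupIds));
        }

        void ack(String backupId) {
            pendingAcks.remove(backupId);
        }

        boolean isComplete() {
            return pendingAcks.isEmpty();
        }
    }

    /** A backup holding prepared writes that are not yet visible to readers. */
    static final class BackupNode {
        private final String id;
        private final Map<String, String> prepared = new HashMap<>();
        private final Map<String, String> committed = new HashMap<>();

        BackupNode(String id) {
            this.id = id;
        }

        /** Stores a write received during the prepare phase. */
        void prepare(String key, String value) {
            prepared.put(key, value);
        }

        /** Invoked when the primary is detected as failed: commit locally, then ack. */
        void onPrimaryFailed(Coordinator coordinator) {
            committed.putAll(prepared);
            prepared.clear();
            coordinator.ack(id);
        }

        /** Returns the committed value, or null if nothing is committed for the key. */
        String get(String key) {
            return committed.get(key);
        }
    }
}
```

With two backups, the coordinator in this model reports completion only after
both have called onPrimaryFailed, and the prepared value becomes readable on
each backup from that point.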