Posted to dev@ignite.apache.org by Denis Magda <dm...@gridgain.com> on 2016/04/18 18:17:14 UTC

Fwd: Data lost when using write-behind

Igniters,

Do we queue changes on backup nodes as well and flush them to the store if a primary node leaves?

This is irrelevant for transactional caches since changes are queued and flushed on the transaction initiator's side, right? And flushing from backups makes sense only for atomic caches, correct?

—
Denis

> Begin forwarded message:
> 
> From: Shaomin Zhang <Sh...@tudor.com>
> Subject: RE: Data lost when using write-behind
> Date: April 18, 2016 at 6:35:20 PM GMT+3
> To: "user@ignite.apache.org" <us...@ignite.apache.org>
> Reply-To: user@ignite.apache.org
> 
> Hi Alexei
>  
> Will updates that are lost because of the node failure be retried and persisted to the database later?
>  
> Thanks
>  
> Shaomin
>  
> From: Alexei Scherbakov [mailto:alexey.scherbakoff@gmail.com] 
> Sent: 18 April 2016 15:27
> To: user@ignite.apache.org
> Subject: Re: Data lost when using write-behind
>  
> Hi,
>  
> You should use write-behind mode only if it's acceptable for you to lose some updates to the persistent store on node failures.
> Be wary of a possible desync between the persistent store and the cache after node recovery.
> You can tune write-behind behavior as described here:
> https://apacheignite.readme.io/docs/persistent-store#configuration
>  
>  
>  
> 2016-04-18 5:25 GMT+03:00 wang shuai <wangshuaie@yonyou.com>:
>   When testing the write-behind feature, I found that the data to be
> persisted to the back-end database is put in a queue inside the JVM. That
> means that if the server crashes, the data that has not yet been persisted
> to the database will be lost. Even though that data can still be found in
> another server's memory, it cannot be written to the database automatically.
>   So I want to make clear what the recommended scenario for using
> write-behind is, and how to handle a server crash when using write-behind.
> 
> 
> 
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-lost-when-using-write-behind-tp4265.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
> 
> 
>  
> -- 
> 
> Best regards,
> Alexei Scherbakov
> _________________________________________________________
> 
> This email, its contents, and any attachments transmitted with it are intended only for the addressee(s) and may be confidential and legally privileged. We do not waive any confidentiality by misdelivery. If you have received this email in error, please notify the sender immediately and delete it. You should not copy it, forward it or otherwise use the contents, attachments or information in any way. Any liability for viruses is excluded to the fullest extent permitted by law.
> 
> Tudor Capital Europe LLP (TCE) is authorised and regulated by The Financial Conduct Authority (the FCA). TCE is registered as a limited liability partnership in England and Wales No: OC340673 with its registered office at 10 New Burlington Street, London, W1S 3BE, United Kingdom
>

RE: Data lost when using write-behind

Posted by vkulichenko <va...@gmail.com>.

Hi,

1.6 will be released soon, as far as I know, but I'm not sure this fix will
be included there, unless someone in the community picks it up.

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-lost-when-using-write-behind-tp4265p4382.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

RE: Data lost when using write-behind

Posted by wang shuai <wa...@yonyou.com>.

Thank you, vkulichenko.

The ticket is planned to be fixed in version 1.6. Do you know the target
date of the 1.6 release?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-lost-when-using-write-behind-tp4265p4354.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

RE: Data lost when using write-behind

Posted by vkulichenko <va...@gmail.com>.

Hi Sparkle,

Please properly subscribe to the mailing list so that the community can
receive email notifications for your messages. Here are the instructions:
http://apache-ignite-users.70518.x6.nabble.com/mailing_list/MailingListOptions.jtp?forum=1


sparkle_j wrote
> To address this issue, we are using a cache to store updates from within a
> transaction and remove them when write-behind is complete.
> 
> In case of a node failure event, we just read the local entries from the
> cache and try to force-write those updates to the database. However, this is
> causing blocking, and the grid comes to a halt after some time (we may be
> updating and removing the same key from the cache). Could you please
> recommend a better way to handle this scenario?

This sounds like a workable solution. However, I'm not sure why the grid
hangs in your case. Can you provide more details? I would recommend starting
by collecting logs and thread dumps from all nodes. If you attach them here,
I will be able to take a look.

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-lost-when-using-write-behind-tp4265p6194.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
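
The workaround quoted above (keep pending updates in a cache, remove them once write-behind completes, replay them after a node failure) can be sketched roughly as follows. This is a toy model on plain Java collections, not Ignite code, and it is not a diagnosis of the hang. The names PendingJournal, recordUpdate, onPersisted and recover are invented for illustration. The one assumption it adds is a conditional remove, so that a late "persisted" notification for an old value does not delete a newer pending update for the same key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy stand-in for the "pending updates" cache described above; not Ignite code. */
class PendingJournal {
    private final Map<String, String> pending = new ConcurrentHashMap<>();

    /** Record the update before it is handed over to write-behind. */
    void recordUpdate(String key, String value) {
        pending.put(key, value);
    }

    /** After the store write: clear the entry only if it still holds the value that was written. */
    boolean onPersisted(String key, String persistedValue) {
        return pending.remove(key, persistedValue);
    }

    /** After a node failure: write every still-pending entry straight to the database. */
    int recover(Map<String, String> database) {
        int recovered = 0;
        for (Map.Entry<String, String> e : pending.entrySet()) {
            database.put(e.getKey(), e.getValue());
            // Conditional remove again, in case the entry changed while we were writing it.
            if (pending.remove(e.getKey(), e.getValue()))
                recovered++;
        }
        return recovered;
    }
}
```

Comparing by value is the simplest possible check; it cannot tell apart two updates that carry the same value, so a real implementation would more likely compare a version number.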

RE: Data lost when using write-behind

Posted by vkulichenko <va...@gmail.com>.

Shaomin,

The session ends when the cache update is done or when the transaction is
committed or rolled back. With write-behind, the actual DB update can
happen after the cache store session ends.

There is no synchronization between this process and the discovery event.
Actually, the discovery event is fired on all nodes while the session is
node-local, so I'm not sure what you meant by your question. Please clarify
if anything is still unclear.

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-lost-when-using-write-behind-tp4265p4425.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

RE: Data lost when using write-behind

Posted by Shaomin Zhang <Sh...@tudor.com>.

Val

In the case of a primary node failure, an EVT_NODE_FAILED event is sent out; is the CacheStoreSessionListener.onSessionEnd() method still called? Which one comes first, the event or the onSessionEnd() method?

Thanks again

Shaomin

-----Original Message-----
From: vkulichenko [mailto:valentin.kulichenko@gmail.com]
Sent: 19 April 2016 21:56
To: user@ignite.apache.org
Subject: RE: Data lost when using write-behind

Shaomin,

The EVT_NODE_FAILED event is fired when any node fails and leaves the topology, but you still don't know which entries are lost because you lost the write-behind queue that was on that node.

Currently the only way to fully guarantee consistency between the cache and the DB is to use write-through. After [1] is fixed, this will also be possible with write-behind in ATOMIC caches. But in TRANSACTIONAL caches the write-behind store makes all DB updates separately, losing the transactional semantics at the DB level, so inconsistencies will still be possible.

[1] https://issues.apache.org/jira/browse/IGNITE-1897

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-lost-when-using-write-behind-tp4265p4342.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

RE: Data lost when using write-behind

Posted by vkulichenko <va...@gmail.com>.

Shaomin,

The EVT_NODE_FAILED event is fired when any node fails and leaves the
topology, but you still don't know which entries are lost because you lost
the write-behind queue that was on that node.

Currently the only way to fully guarantee consistency between the cache and
the DB is to use write-through. After [1] is fixed, this will also be
possible with write-behind in ATOMIC caches. But in TRANSACTIONAL caches the
write-behind store makes all DB updates separately, losing the transactional
semantics at the DB level, so inconsistencies will still be possible.

[1] https://issues.apache.org/jira/browse/IGNITE-1897

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-lost-when-using-write-behind-tp4265p4342.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
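
To make the difference between the two modes concrete, here is a minimal sketch, assuming a toy single-node model on plain Java collections rather than Ignite itself. ToyNode, put, flush and crash are hypothetical names. In write-through mode every update reaches the database before put() returns; in write-behind mode it sits in a node-local queue until the next flush, so a crash in between loses it even if a backup still holds the value in memory.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of one cache node in front of a database; not Ignite code. */
class ToyNode {
    private final Map<String, String> database;                 // shared persistent store
    private final boolean writeBehind;                          // false means write-through
    private final Queue<String[]> pending = new ArrayDeque<>(); // node-local write-behind queue

    ToyNode(Map<String, String> database, boolean writeBehind) {
        this.database = database;
        this.writeBehind = writeBehind;
    }

    /** Write-through persists immediately; write-behind only enqueues the update. */
    void put(String key, String value) {
        if (writeBehind)
            pending.add(new String[] {key, value});
        else
            database.put(key, value);
    }

    /** Background flush: drains the queue into the database in arrival order. */
    void flush() {
        String[] e;
        while ((e = pending.poll()) != null)
            database.put(e[0], e[1]);
    }

    /** Simulated crash: returns how many queued updates never reached the database. */
    int crash() {
        int lost = pending.size();
        pending.clear();
        return lost;
    }
}

class WriteBehindLossDemo {
    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        ToyNode node = new ToyNode(db, true);
        node.put("a", "1");
        node.flush();          // "a" is now in the database
        node.put("b", "2");    // still only in the queue
        System.out.println("lost=" + node.crash() + ", db=" + db);
    }
}
```

Nothing in this model tells the surviving nodes which keys were in the lost queue, which is the problem the node-failure event alone does not solve.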

RE: Data lost when using write-behind

Posted by Shaomin Zhang <Sh...@tudor.com>.

When the primary crashes, does Ignite emit any event about that? Is the sessionEnd() method still going to be called?

Thanks

Shaomin

From: Alexey Goncharuk [mailto:alexey.goncharuk@gmail.com]
Sent: 18 April 2016 23:55
To: dev@ignite.apache.org; user@ignite.apache.org
Subject: Re: Data lost when using write-behind

Yes, this is correct. If there is no write-behind, then in a TRANSACTIONAL cache the database write happens from the originating node, and in an ATOMIC cache it happens from the primary nodes.

Re: Data lost when using write-behind

Posted by Alexey Goncharuk <al...@gmail.com>.

Yes, this is correct. If there is no write-behind, then in a TRANSACTIONAL
cache the database write happens from the originating node, and in an ATOMIC
cache it happens from the primary nodes.

RE: Data lost when using write-behind

Posted by vkulichenko <va...@gmail.com>.

Here is the ticket for this:
https://issues.apache.org/jira/browse/IGNITE-1897

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-lost-when-using-write-behind-tp4265p4296.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

RE: Data lost when using write-behind

Posted by Denis Magda <dm...@gridgain.com>.

Alex, 

Thanks for the explanation!

However, in write-through mode there is a difference between transactional and atomic caches. In transactional mode data is committed from the transaction coordinator's side, while in atomic mode it is committed from the primary nodes. Is my understanding correct?

Denis

From: Alexey Goncharuk
Sent: Monday, April 18, 2016 21:23
To: dev@ignite.apache.org; user@ignite.apache.org
Subject: Re: Data lost when using write-behind

Denis,

Updates are always queued on primary nodes when write-behind is enabled, regardless of atomicity mode. This is required because otherwise updates can be written to the database in the wrong order.

We did not queue database updates on backups because we did not have a mechanism that would allow us to track which updates have been written to the database and which have not. Now that we have a partition counter that is already used in continuous queries failover, it can also be reused for write-behind ACKs.

I thought we had a ticket for this. I will re-check if this is true, and will create if it is not there yet.

Re: Data lost when using write-behind

Posted by Alexey Goncharuk <al...@gmail.com>.

Denis,

Updates are always queued on primary nodes when write-behind is enabled,
regardless of atomicity mode. This is required because otherwise updates
can be written to the database in the wrong order.

We did not queue database updates on backups because we did not have a
mechanism that would allow us to track which updates have been written to
the database and which have not. Now that we have a partition counter that
is already used in continuous queries failover, it can also be reused for
write-behind ACKs.

I thought we had a ticket for this. I will re-check if this is true, and
will create if it is not there yet.
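
A rough sketch of how counter-based ACKs could let a backup track which updates have not been flushed yet. This is a toy model on plain Java collections, not Ignite code and not the actual design of the ticket; Update, ToyBackup, onUpdate, onAck and flushAfterPrimaryFailure are invented names. The assumption is that the primary stamps each update with an increasing per-partition counter and periodically tells backups the highest counter it has written to the database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical update record: a per-partition counter plus the key/value to persist. */
final class Update {
    final long counter;
    final String key;
    final String value;

    Update(long counter, String key, String value) {
        this.counter = counter;
        this.key = key;
        this.value = value;
    }
}

/** Toy backup node that mirrors the primary's write-behind queue, ordered by counter. */
class ToyBackup {
    private final TreeMap<Long, Update> unacked = new TreeMap<>();

    /** Called for every update replicated from the primary. */
    void onUpdate(Update u) {
        unacked.put(u.counter, u);
    }

    /** The primary reports that everything up to and including this counter is in the database. */
    void onAck(long flushedUpTo) {
        unacked.headMap(flushedUpTo, true).clear();
    }

    /** On primary failure: write whatever was never acknowledged, in counter order. */
    List<Long> flushAfterPrimaryFailure(Map<String, String> database) {
        List<Long> written = new ArrayList<>();
        for (Update u : unacked.values()) {
            database.put(u.key, u.value);
            written.add(u.counter);
        }
        unacked.clear();
        return written;
    }
}
```

Replaying in counter order keeps the database from ending up with an older value for a key. An update that the primary flushed but did not acknowledge before failing would be written a second time, which is harmless here only because the toy store overwrites by key.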
