Posted to user@ignite.apache.org by rickynauvaldy <ri...@sci.ui.ac.id> on 2017/05/22 09:34:55 UTC

Client Server Persistent Store Fault Tolerance

Hello, I'm trying to set up a client-server environment in which I run 3
server nodes that load data from a database (similar to this [1]) and will be
used for failover [2], and 3 client nodes that write data to the cache in
loops. The data is then written to the database using write-behind, for
performance reasons [3]. In this test, while the clients are writing, I
purposely stop the server nodes one by one to observe the failover. [2] says
that "/jobs are automatically transferred to other available nodes for
re-execution/". That does happen, but when I stopped the second server
node, an exception appeared:

[15:57:23,218][SEVERE][tcp-client-disco-sock-writer-#2%null%][TcpDiscoverySpi]
Failed to send message: TcpDiscoveryClientMetricsUpdateMessage
[super=TcpDiscoveryAbstractMessage [sndNodeId=null,
id=c11df5f2c51-30c473c1-464c-40fa-92f2-c8cbbeb34658, verifierNodeId=null,
topVer=0, pendingIdx=0, failedNodes=null, isClient=true]]
java.net.SocketException: Socket is closed
	at java.net.Socket.getSendBufferSize(Unknown Source)
	at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.socketStream(TcpDiscoverySpi.java:1352)
	at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.writeToSocket(TcpDiscoverySpi.java:1464)
	at org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1216)
	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)

I thought this was expected, and the writing process does continue. The
problem is that after it continues, the data written by the client nodes is
inconsistent. Is this normal?

Thanks,
Ricky

[1] http://apache-ignite-users.70518.x6.nabble.com/How-to-do-write-behind-caching-td12138.html
[2] https://apacheignite.readme.io/docs/fault-tolerance
[3] http://apache-ignite-users.70518.x6.nabble.com/Database-transaction-updating-more-than-1-table-from-CacheStore-td4197.html



-----
-- Ricky
--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Client-Server-Persistent-Store-Fault-Tolerance-tp13054.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Client Server Persistent Store Fault Tolerance

Posted by Evgenii Zhuravlev <e....@gmail.com>.

You have 2998 instead of 3000 because 2 of your transactions were rolled
back. You should handle these exceptions and run the transaction again if it
was rolled back.

It's not inconsistent data.
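
The "handle the exception and run the transaction again" advice can be
sketched as a small retry helper. This is only an illustration in plain Java
and does not use Ignite: RolledBackException, TxRetry and runWithRetry are
hypothetical names, and in real client code the exception to catch would be
the one the failed transaction actually throws (the traces in this thread show
javax.cache.CacheException wrapping ClusterTopologyException).

```java
import java.util.function.Supplier;

final class TxRetry {
    /** Hypothetical stand-in for the exception a rolled-back transaction throws. */
    static final class RolledBackException extends RuntimeException {
        RolledBackException(String message) { super(message); }
    }

    /**
     * Runs txBody and re-runs it each time it is rolled back, for at most
     * maxAttempts attempts. The last failure is rethrown if all attempts fail.
     */
    static <T> T runWithRetry(Supplier<T> txBody, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RolledBackException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return txBody.get();
            } catch (RolledBackException e) {
                last = e; // a rolled-back attempt changed nothing, so re-running is safe
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] row = {0};          // the value being incremented
        int[] failuresLeft = {2}; // simulate two rollbacks, as in the thread

        for (int i = 0; i < 1000; i++) {
            runWithRetry(() -> {
                if (failuresLeft[0] > 0) {
                    failuresLeft[0]--;
                    throw new RolledBackException("primary node left grid");
                }
                return ++row[0];
            }, 3);
        }
        System.out.println(row[0]); // prints 1000: both rolled-back attempts were re-run
    }
}
```

The attempt limit is a design choice for the sketch; without one, a client
could loop forever if the cluster never recovers.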

2017-06-02 6:01 GMT+03:00 rickynauvaldy <ri...@sci.ui.ac.id>:

> Evgenii Zhuravlev wrote
> > What do you mean by "inconsistent data"? Did you lose data within one
> > transaction?
>
> I run 3 server nodes and 3 client nodes. In this example, each client node
> runs 1000 transactions on a single row, adding 1 to it in each loop
> iteration. When I stop 1 server node, all 3 client nodes pause for a moment,
> and then only one of them catches the CacheException in its transaction
> try-catch block, while the others do not. Then one of the surviving server
> nodes performs a 'load data from database', and the process continues.
>
> When the process has finished, the value of the changed row should be 3000,
> right? But it is actually only 2998. This is what I mean by "inconsistent
> data"; I apologize if I got the term wrong.

Re: Client Server Persistent Store Fault Tolerance

Posted by rickynauvaldy <ri...@sci.ui.ac.id>.

Evgenii Zhuravlev wrote
> What do you mean by "inconsistent data"? Did you lose data within one
> transaction?

I run 3 server nodes and 3 client nodes. In this example, each client node
runs 1000 transactions on a single row, adding 1 to it in each loop
iteration. When I stop 1 server node, all 3 client nodes pause for a moment,
and then only one of them catches the CacheException in its transaction
try-catch block, while the others do not. Then one of the surviving server
nodes performs a 'load data from database', and the process continues.

When the process has finished, the value of the changed row should be 3000,
right? But it is actually only 2998. This is what I mean by "inconsistent
data"; I apologize if I got the term wrong.
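
The arithmetic in this experiment (3 clients x 1000 increments, with a final
value of 2998) can be reproduced with a small plain-Java simulation. It is not
the poster's code and does not use Ignite: LostIncrementDemo and run are
hypothetical names, and which two transactions get rolled back (numbers 1500
and 1501 here) is an assumption made purely for illustration.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

final class LostIncrementDemo {
    /**
     * Starts `clients` threads that each run `txPerClient` increment
     * transactions on one shared row. Transactions whose global sequence
     * number is in `rolledBackTx` are rolled back and NOT re-run.
     */
    static int run(int clients, int txPerClient, Set<Integer> rolledBackTx) {
        AtomicInteger row = new AtomicInteger();   // the single row all clients increment
        AtomicInteger txSeq = new AtomicInteger(); // numbers transactions 1..clients*txPerClient

        Thread[] threads = new Thread[clients];
        for (int c = 0; c < clients; c++) {
            threads[c] = new Thread(() -> {
                for (int i = 0; i < txPerClient; i++) {
                    int txNo = txSeq.incrementAndGet();
                    if (rolledBackTx.contains(txNo)) {
                        continue; // rolled back and never re-run: this increment is lost
                    }
                    row.incrementAndGet(); // committed increment
                }
            });
            threads[c].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for clients", e);
            }
        }
        return row.get();
    }

    public static void main(String[] args) {
        System.out.println(run(3, 1000, Set.of(1500, 1501))); // prints 2998
        System.out.println(run(3, 1000, Set.of()));           // prints 3000
    }
}
```

Every committed increment is present in the result; the total is short only by
the number of transactions that were rolled back and not repeated.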






Re: Client Server Persistent Store Fault Tolerance

Posted by Evgenii Zhuravlev <e....@gmail.com>.

> so that the other clients are missing the value that they
> should write (they didn't show any error/exception)

I don't understand: which values are they missing?

When a transaction is rolled back, the nodes re-read the values from the
database, because with write-through the database holds consistent data.

What do you mean by "inconsistent data"? Did you lose data within one
transaction?

Evgenii

2017-05-31 12:16 GMT+03:00 rickynauvaldy <ri...@sci.ui.ac.id>:

> In addition, it is not a problem if I run only 1 client. The problem arises
> when I run more than 1 client.

Re: Client Server Persistent Store Fault Tolerance

Posted by rickynauvaldy <ri...@sci.ui.ac.id>.

In addition, it is not a problem if I run only 1 client. The problem arises
when I run more than 1 client.




Re: Client Server Persistent Store Fault Tolerance

Posted by rickynauvaldy <ri...@sci.ui.ac.id>.

Evgenii Zhuravlev wrote
> You don't have inconsistent data. 

It turns out that the remaining server (the one that I didn't stop)
automatically loads the data from the database (because, according to this
example [1], /this method is called whenever IgniteCache.get() method is
called/), but only one of the clients is rolled back (by catching the
CacheException). The other clients therefore miss the value that they should
have written (they didn't show any error or exception), which makes the data
inconsistent (missing) even though I used write-through. Is there a way to
read the data already stored in the existing cache, instead of the data in
the persistent store, when one of the servers is stopped?

Thanks

-- Ricky

[1] https://dzone.com/articles/apache-ignite-how-to-read-data-from-persistent-sto





Re: Client Server Persistent Store Fault Tolerance

Posted by Evgenii Zhuravlev <e....@gmail.com>.

Hi,

So you stopped a node and the transaction was rolled back. You don't have
inconsistent data. What did you expect?

In this case you should handle the exception and run the transaction again.

Evgenii
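
Before retrying, a client has to recognise that a failure is of the "retry
transaction if possible" kind. In the quoted trace the topology failure
arrives wrapped as the cause of a CacheException, so one way to detect it is
to walk the cause chain. The sketch below is plain Java under stated
assumptions: the two nested exception classes are stand-ins named after
javax.cache.CacheException and
org.apache.ignite.cluster.ClusterTopologyException so that it runs without
Ignite on the classpath, and findCause and isRetryable are hypothetical
helpers.

```java
final class CauseChain {
    /** Stand-in for javax.cache.CacheException (the outer exception in the trace). */
    static final class CacheException extends RuntimeException {
        CacheException(String message, Throwable cause) { super(message, cause); }
    }

    /** Stand-in for org.apache.ignite.cluster.ClusterTopologyException. */
    static final class ClusterTopologyException extends RuntimeException {
        ClusterTopologyException(String message) { super(message); }
    }

    /** Returns the first throwable of the given type in the cause chain, or null. */
    static <T extends Throwable> T findCause(Throwable failure, Class<T> type) {
        Throwable cur = failure;
        // The depth limit guards against pathological cyclic cause chains.
        for (int depth = 0; cur != null && depth < 32; depth++) {
            if (type.isInstance(cur)) {
                return type.cast(cur);
            }
            cur = cur.getCause();
        }
        return null;
    }

    /** Treats a failure as retryable when a topology exception is somewhere in its chain. */
    static boolean isRetryable(Throwable failure) {
        return findCause(failure, ClusterTopologyException.class) != null;
    }

    public static void main(String[] args) {
        Throwable failure = new CacheException(
            "Failed to acquire lock for keys",
            new ClusterTopologyException("primary node left grid, retry transaction if possible"));

        System.out.println(isRetryable(failure));                                // prints true
        System.out.println(isRetryable(new IllegalStateException("unrelated"))); // prints false
    }
}
```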







2017-05-29 7:53 GMT+03:00 rickynauvaldy <ri...@sci.ui.ac.id>:

> So I tried the same case using write-through only, but when I stopped one
> of the servers, an exception appeared:
>
> > Exception in thread "main" javax.cache.CacheException: class
> > org.apache.ignite.cluster.ClusterTopologyException: Failed to acquire lock
> > for keys (primary node left grid, retry transaction if possible)
> > [keys=[UserKeyCacheObjectImpl [part=1, val=1, hasValBytes=true]],
> > node=d46c85bc-bd55-4731-b390-05e86fa68af6]
>
> I saw this post [1], but I still don't understand the /this means it has
> been already acquired/ part. What should I do?
>
> I used transactions with /pessimistic/ concurrency and /serializable/
> isolation.
>
> When I change the isolation to /repeatable_read/, a different exception
> is shown:
>
> > Exception in thread "main" class org.apache.ignite.IgniteException: Failed
> > to commit transaction:
> > GridNearTxLocal[id=ec0e9825c51-00000000-0668-82bc-0000-00000000001d,
> > concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, state=ROLLED_BACK,
> > invalidate=false, rollbackOnly=true,
> > nodeId=bc2cac44-08b4-43e3-9316-c54bb8ca4679, duration=3859]
>
> Any suggestion? Thanks.
>
> -- Ricky
>
> [1] http://apache-ignite-users.70518.x6.nabble.com/Enter-Lock-is-not-working-td8040.html
>

Re: Client Server Persistent Store Fault Tolerance

Posted by rickynauvaldy <ri...@sci.ui.ac.id>.

So I tried the same case using write-through only, but when I stopped one
of the servers, an exception appeared:

> Exception in thread "main" javax.cache.CacheException: class
> org.apache.ignite.cluster.ClusterTopologyException: Failed to acquire lock
> for keys (primary node left grid, retry transaction if possible)
> [keys=[UserKeyCacheObjectImpl [part=1, val=1, hasValBytes=true]],
> node=d46c85bc-bd55-4731-b390-05e86fa68af6]
> 	at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1421)
> 	at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.cacheException(IgniteCacheProxy.java:2641)
> 	at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.get(IgniteCacheProxy.java:1205)
> 	at myexamples.store.TransactionClient1.deposit(TransactionClient1.java:107)
> 	at myexamples.store.TransactionClient1.main(TransactionClient1.java:46)
> Caused by: class org.apache.ignite.cluster.ClusterTopologyException:
> Failed to acquire lock for keys (primary node left grid, retry transaction
> if possible) [keys=[UserKeyCacheObjectImpl [part=1, val=1,
> hasValBytes=true]], node=d46c85bc-bd55-4731-b390-05e86fa68af6]
> 	at org.apache.ignite.internal.util.IgniteUtils$7.apply(IgniteUtils.java:812)
> 	at org.apache.ignite.internal.util.IgniteUtils$7.apply(IgniteUtils.java:810)
> 	... 5 more
> Caused by: class
> org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed
> to acquire lock for keys (primary node left grid, retry transaction if
> possible) [keys=[UserKeyCacheObjectImpl [part=1, val=1,
> hasValBytes=true]], node=d46c85bc-bd55-4731-b390-05e86fa68af6]
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.newTopologyException(GridDhtColocatedLockFuture.java:1319)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.access$1900(GridDhtColocatedLockFuture.java:85)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture$MiniFuture.onResult(GridDhtColocatedLockFuture.java:1469)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.onNodeLeft(GridDhtColocatedLockFuture.java:414)
> 	at org.apache.ignite.internal.processors.cache.GridCacheMvccManager$4.onEvent(GridCacheMvccManager.java:263)
> 	at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager$LocalListenerWrapper.onEvent(GridEventStorageManager.java:1311)
> 	at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:892)
> 	at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record0(GridEventStorageManager.java:340)
> 	at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:307)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:2277)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:2474)
> 	at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:2306)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> 	at java.lang.Thread.run(Unknown Source)
> Caused by: class
> org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed
> to acquire lock for keys (primary node left grid, retry transaction if
> possible) [keys=[UserKeyCacheObjectImpl [part=1, val=1,
> hasValBytes=true]], node=d46c85bc-bd55-4731-b390-05e86fa68af6]
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.newTopologyException(GridDhtColocatedLockFuture.java:1319)
> 	... 11 more

I saw this post [1], but I still don't understand the /this means it has
been already acquired/ part. What should I do?

I used transactions with /pessimistic/ concurrency and /serializable/
isolation.

When I change the isolation to /repeatable_read/, a different exception
is shown:

> Exception in thread "main" class org.apache.ignite.IgniteException: Failed
> to commit transaction:
> GridNearTxLocal[id=ec0e9825c51-00000000-0668-82bc-0000-00000000001d,
> concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, state=ROLLED_BACK,
> invalidate=false, rollbackOnly=true,
> nodeId=bc2cac44-08b4-43e3-9316-c54bb8ca4679, duration=3859]
> 	at org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:949)
> 	at org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.rollback(TransactionProxyImpl.java:314)
> 	at myexamples.store.TransactionClient1.deposit(TransactionClient1.java:122)
> 	at myexamples.store.TransactionClient1.main(TransactionClient1.java:41)
> Caused by: class org.apache.ignite.IgniteCheckedException: Failed to
> commit transaction:
> GridNearTxLocal[id=ec0e9825c51-00000000-0668-82bc-0000-00000000001d,
> concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, state=ROLLED_BACK,
> invalidate=false, rollbackOnly=true,
> nodeId=bc2cac44-08b4-43e3-9316-c54bb8ca4679, duration=3859]
> 	at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:423)
> 	at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3246)
> 	at org.apache.ignite.internal.processors.cache.GridCacheSharedContext.rollbackTxAsync(GridCacheSharedContext.java:855)
> 	at org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.rollback(TransactionProxyImpl.java:306)
> 	... 2 more

Any suggestion? Thanks.

-- Ricky

[1] http://apache-ignite-users.70518.x6.nabble.com/Enter-Lock-is-not-working-td8040.html





Re: Client Server Persistent Store Fault Tolerance

Posted by Evgenii Zhuravlev <e....@gmail.com>.

This still will not help if you lose more nodes than the number of backups
you have (in that case the data in memory is lost as well), but it makes data
loss less likely.
As I said before, only synchronous write-through can guarantee that there
is no data loss.
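
The backup rule mentioned above (in-memory data is gone once more nodes fail
than there are backups) can be illustrated with a toy ownership model. This is
a sketch under stated assumptions, not Ignite's behaviour: BackupDemo and
survives are hypothetical names, and placing the copies on consecutive node
indices is a simplification; in a real cluster Ignite's affinity function
decides which nodes hold the primary and backup copies.

```java
import java.util.Set;

final class BackupDemo {
    /**
     * A key is held in memory by its primary node plus `backups` further nodes.
     * In this toy model the owners are nodes primary, primary+1, ...,
     * primary+backups (modulo nodeCount). The key survives in memory as long
     * as at least one owner has not failed.
     */
    static boolean survives(int nodeCount, int primary, int backups, Set<Integer> failedNodes) {
        if (nodeCount < 1 || primary < 0 || primary >= nodeCount
                || backups < 0 || backups >= nodeCount) {
            throw new IllegalArgumentException("invalid topology parameters");
        }
        for (int i = 0; i <= backups; i++) {
            int owner = (primary + i) % nodeCount;
            if (!failedNodes.contains(owner)) {
                return true; // at least one in-memory copy is left
            }
        }
        return false; // every owner failed: the in-memory copies are gone
    }

    public static void main(String[] args) {
        // 3 server nodes, key owned by node 0 with 1 backup on node 1.
        System.out.println(survives(3, 0, 1, Set.of(0)));    // prints true
        System.out.println(survives(3, 0, 1, Set.of(0, 1))); // prints false
        // With no backups, losing the primary alone loses the in-memory copy.
        System.out.println(survives(3, 0, 0, Set.of(0)));    // prints false
    }
}
```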

2017-05-23 13:01 GMT+03:00 vdpyatkov <vl...@gmail.com>:

> Hi,
>
> You should not use the write-behind feature (use write-through only) if you
> want to solve that.
>
> I don't think this issue will be solved in the near future, but you can
> always fix it yourself and contribute the change [1].
>
> [1]: https://issues.apache.org/jira/browse/IGNITE-1897

Re: Client Server Persistent Store Fault Tolerance

Posted by vdpyatkov <vl...@gmail.com>.

Hi,

You should not use the write-behind feature (use write-through only) if you
want to solve that.

I don't think this issue will be solved in the near future, but you can
always fix it yourself and contribute the change [1].

[1]: https://issues.apache.org/jira/browse/IGNITE-1897




Re: Client Server Persistent Store Fault Tolerance

Posted by rickynauvaldy <ri...@sci.ui.ac.id>.

> This can be improved by adding backup queues

May I know how to implement this in my own program while waiting for the
issue to be resolved?


> It's not implemented yet.

It's been a while since the issue was created. Will it be implemented soon,
in the next version of Ignite?

Thanks,
Ricky




Re: Client Server Persistent Store Fault Tolerance

Posted by Evgenii Zhuravlev <e....@gmail.com>.

Hi,

In the current implementation, the write-behind store can lose writes even if
only one node fails. This can be improved by adding backup queues [1], but
that is not implemented yet.

Also, it's possible to end up in an inconsistent state in this case. Only
synchronous write-through can guarantee that there is no data loss.

[1] https://issues.apache.org/jira/browse/IGNITE-1897
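
Why write-behind can lose writes when a single node fails, while synchronous
write-through cannot, can be shown with a toy model. The sketch is plain Java
and does not reflect Ignite's internals: WriteModeDemo, Node and
rowsInDatabaseAfterCrash are hypothetical names, the "database" is a HashMap,
and the flush point is an arbitrary assumption.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

final class WriteModeDemo {
    /** Toy server node sitting in front of a database. */
    static final class Node {
        private final Map<String, Integer> database;
        private final boolean writeBehind;
        // Updates accepted by the node but not yet written to the database.
        private final Queue<Map.Entry<String, Integer>> pending = new ArrayDeque<>();

        Node(Map<String, Integer> database, boolean writeBehind) {
            this.database = database;
            this.writeBehind = writeBehind;
        }

        void put(String key, int value) {
            if (writeBehind) {
                pending.add(Map.entry(key, value)); // acknowledged before the database sees it
            } else {
                database.put(key, value);           // write-through: stored before put() returns
            }
        }

        /** Writes all queued updates to the database (a no-op for write-through). */
        void flush() {
            Map.Entry<String, Integer> e;
            while ((e = pending.poll()) != null) {
                database.put(e.getKey(), e.getValue());
            }
        }

        /** The queue lives only in this node's memory, so a crash discards it. */
        void crash() {
            pending.clear();
        }
    }

    /** Performs `writes` puts, flushes once after `flushAfter` puts, then crashes the node. */
    static int rowsInDatabaseAfterCrash(boolean writeBehind, int writes, int flushAfter) {
        Map<String, Integer> database = new HashMap<>();
        Node node = new Node(database, writeBehind);
        for (int i = 1; i <= writes; i++) {
            node.put("key" + i, i);
            if (i == flushAfter) {
                node.flush();
            }
        }
        node.crash();
        return database.size();
    }

    public static void main(String[] args) {
        System.out.println(rowsInDatabaseAfterCrash(true, 10, 6));  // prints 6: 4 queued writes lost
        System.out.println(rowsInDatabaseAfterCrash(false, 10, 6)); // prints 10: nothing lost
    }
}
```

The backup queues discussed above would, in terms of this model, keep a second
copy of `pending` on another node so that one crash does not discard it.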

2017-05-22 12:34 GMT+03:00 rickynauvaldy <ri...@sci.ui.ac.id>:

> Hello, I'm trying to set up a client-server environment in which I run 3
> server nodes that load data from a database (similar to this [1]) and will be
> used for failover [2], and 3 client nodes that write data to the cache in
> loops. The data is then written to the database using write-behind, for
> performance reasons [3]. In this test, while the clients are writing, I
> purposely stop the server nodes one by one to observe the failover. [2] says
> that "/jobs are automatically transferred to other available nodes for
> re-execution/". That does happen, but when I stopped the second server
> node, an exception appeared:
>
> [15:57:23,218][SEVERE][tcp-client-disco-sock-writer-#2%null%][TcpDiscoverySpi]
> Failed to send message: TcpDiscoveryClientMetricsUpdateMessage
> [super=TcpDiscoveryAbstractMessage [sndNodeId=null,
> id=c11df5f2c51-30c473c1-464c-40fa-92f2-c8cbbeb34658, verifierNodeId=null,
> topVer=0, pendingIdx=0, failedNodes=null, isClient=true]]
> java.net.SocketException: Socket is closed
>         at java.net.Socket.getSendBufferSize(Unknown Source)
>         at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.socketStream(TcpDiscoverySpi.java:1352)
>         at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.writeToSocket(TcpDiscoverySpi.java:1464)
>         at org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1216)
>         at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>
> I thought this was expected, and the writing process does continue. The
> problem is that after it continues, the data written by the client nodes is
> inconsistent. Is this normal?
>
> Thanks,
> Ricky
>
> [1] http://apache-ignite-users.70518.x6.nabble.com/How-to-do-write-behind-caching-td12138.html
> [2] https://apacheignite.readme.io/docs/fault-tolerance
> [3] http://apache-ignite-users.70518.x6.nabble.com/Database-transaction-updating-more-than-1-table-from-CacheStore-td4197.html