Posted to issues@ignite.apache.org by "Mikhail Petrov (Jira)" <ji...@apache.org> on 2020/05/04 08:24:00 UTC

[jira] [Comment Edited] (IGNITE-12894) Cannot use IgniteAtomicSequence in Ignite services

    [ https://issues.apache.org/jira/browse/IGNITE-12894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093100#comment-17093100 ] 

Mikhail Petrov edited comment on IGNITE-12894 at 5/4/20, 8:23 AM:
------------------------------------------------------------------

Hi, [~daradurvs]. Thank you for the clarification.

After some research, I found that [GridServiceProxy#invokeMethod|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/GridServiceProxy.java#L149] already has a mechanism for obtaining the service with retries.
 However, it does not work correctly when the topology for the requested service is missing.

[GridServiceProxy#invokeMethod|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/GridServiceProxy.java#L169] throws an exception that is not ignored when no node for the service is found, so no repeated attempts are made.
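
To illustrate why the retries never happen, here is a minimal sketch of a retry loop that only swallows one kind of failure. It is not the Ignite implementation: {{RetrySketch}}, {{TransientFailure}}, {{ServiceNotFound}} and {{invokeWithRetries}} are hypothetical names used only for this example. Any exception the loop does not catch leaves it on the first attempt, which is the behavior described above for the "Failed to find deployed service" case.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of a retry loop; NOT the actual GridServiceProxy code. */
final class RetrySketch {
    /** Assumed stand-in for a failure the loop treats as retryable. */
    static final class TransientFailure extends RuntimeException {
        TransientFailure(String msg) { super(msg); }
    }

    /** Assumed stand-in for the "no node found for the service" failure. */
    static final class ServiceNotFound extends RuntimeException {
        ServiceNotFound(String msg) { super(msg); }
    }

    /** Calls {@code call} up to {@code maxAttempts} times, retrying only on TransientFailure. */
    static <T> T invokeWithRetries(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be positive");

        TransientFailure last = null;

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            }
            catch (TransientFailure e) {
                // Swallowed: the loop moves on to the next attempt.
                last = e;
            }
            // ServiceNotFound is not caught here, so it propagates immediately
            // and the remaining attempts are never made.
        }

        // All attempts failed with a retryable error; rethrow the last one.
        throw last;
    }
}
```

In this sketch, making the missing-topology case retryable would mean either catching that exception in the loop or waiting for the topology before the check is made.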

I propose the following plan to solve this issue.

Change the behavior of GridServiceProxy#invokeMethod as follows:

Fail with an exception if ServiceProcessorAdapter#serviceDescriptors doesn't contain the requested service.

If ServiceProcessorAdapter#serviceProxy is used without a timeout argument to obtain the proxy, wait until the requested service topology is set via [IgniteServiceProcessor#serviceTopology|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java#L807]. To do this, change the behavior of [IgniteServiceProcessor#serviceTopology|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java#L807] to wait until the requested service topology is set if the timeout equals 0. Currently, if the timeout equals 0, it returns immediately.

If ServiceProcessorAdapter#serviceProxy is used with a timeout argument to obtain the proxy, the time spent waiting for the service topology will be limited in the same way as in the current implementation.
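
The proposed timeout semantics could look roughly like the sketch below. It is only an illustration under stated assumptions, not a patch: {{TopologyWaitSketch}}, {{setTopology}} and {{awaitTopology}} are hypothetical names, and the topology is modeled as a plain map from a node id string to an instance count. A timeout of 0 blocks until the topology is set, while a positive timeout bounds the wait.

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of the proposed wait semantics; NOT IgniteServiceProcessor code. */
final class TopologyWaitSketch {
    /** Released once the topology has been set. */
    private final CountDownLatch topologySet = new CountDownLatch(1);

    /** Assumed model of a service topology: node id -> number of service instances. */
    private volatile Map<String, Integer> topology;

    /** Publishes the topology and wakes up all waiters. */
    void setTopology(Map<String, Integer> top) {
        topology = top;
        topologySet.countDown();
    }

    /**
     * @param timeoutMs 0 to wait until the topology is set, or a positive bound in milliseconds.
     * @return The topology, or {@code null} if a positive timeout elapsed first.
     */
    Map<String, Integer> awaitTopology(long timeoutMs) {
        if (timeoutMs < 0)
            throw new IllegalArgumentException("timeoutMs must not be negative");

        try {
            if (timeoutMs == 0)
                topologySet.await(); // Proposed behavior: wait instead of returning immediately.
            else if (!topologySet.await(timeoutMs, TimeUnit.MILLISECONDS))
                return null; // Bounded wait elapsed without a topology.
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Preserve the interrupt status for the caller.

            throw new IllegalStateException("Interrupted while waiting for service topology", e);
        }

        return topology;
    }
}
```

An unbounded wait like this needs a way out if the service can never be deployed, which is why the first step of the plan fails fast when the service descriptor is absent.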

 

WDYT?


was (Author: petrovmikhail):
Hi, [~daradurvs]. Thank you for the clarification.

After some research, I found that [GridServiceProxy#invokeMethod|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/GridServiceProxy.java#L149] already has a mechanism for obtaining the service with retries.
 However, it does not work correctly when the topology for the requested service is missing.

[GridServiceProxy#invokeMethod|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/GridServiceProxy.java#L169] throws an exception that is not ignored when no node for the service is found, so no repeated attempts are made.

I propose the following plan to solve this issue.

Change the behavior of GridServiceProxy#invokeMethod as follows:

Fail with an exception if ServiceProcessorAdapter#serviceDescriptors doesn't contain the requested service.

If ServiceProcessorAdapter#serviceProxy is used without a timeout argument to obtain the proxy, wait until the requested service topology is set via [IgniteServiceProcessor#serviceTopology|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java#L807]. To do this, change the behavior of [IgniteServiceProcessor#serviceTopology|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java#L807] to wait until the requested service topology is set if the timeout equals 0. Currently, if the timeout equals 0, it returns immediately. We should also interrupt the wait for the service topology if service initialization fails and the full message with that information has been received. Currently there is no such mechanism.

If ServiceProcessorAdapter#serviceProxy is used with a timeout argument to obtain the proxy, the time spent waiting for the service topology will be limited in the same way as in the current implementation.

 

WDYT?

> Cannot use IgniteAtomicSequence in Ignite services
> --------------------------------------------------
>
>                 Key: IGNITE-12894
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12894
>             Project: Ignite
>          Issue Type: Bug
>          Components: compute
>    Affects Versions: 2.8
>            Reporter: Alexey Kukushkin
>            Assignee: Mikhail Petrov
>            Priority: Major
>              Labels: sbcf
>
> h2. Repro Steps
> Execute the steps below in the default service deployment mode and in the discovery-based service deployment mode.
>  Use {{-DIGNITE_EVENT_DRIVEN_SERVICE_PROCESSOR_ENABLED=true}} JVM option to switch to the discovery-based service deployment mode.
>  * Create a service that initializes an {{IgniteAtomicSequence}} in method {{Service#init()}} and uses the {{IgniteAtomicSequence}} in a business method.
>  * Start an Ignite node with the service specified in the {{IgniteConfiguration}}.
>  * Invoke the service's business method on the Ignite node.
> h3. Actual Result
> h4. In Default Service Deployment Mode
> Deadlock on the business method invocation
> h4. In Discovery-Based Service Deployment Mode
> The method invocation fails with {{IgniteException: Failed to find deployed service: IgniteTestService}}
> h2. Reproducer
> h3. Test.java
> {code:java}
> public interface Test {
>     String sayHello(String name);
> }
> {code}
> h3. IgniteTestService.java
> {code:java}
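> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteAtomicSequence;
> import org.apache.ignite.resources.IgniteInstanceResource;
> import org.apache.ignite.services.Service;
> import org.apache.ignite.services.ServiceContext;
>
> public class IgniteTestService implements Test, Service {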
>     private @IgniteInstanceResource Ignite ignite;
>     private IgniteAtomicSequence seq;
>     @Override public void cancel(ServiceContext ctx) {
>     }
>     @Override public void init(ServiceContext ctx) throws InterruptedException {
>         seq = ignite.atomicSequence("TestSeq", 0, true);
>     }
>     @Override public void execute(ServiceContext ctx) {
>     }
>     @Override public String sayHello(String name) {
>         return "Hello, " + name + "! #" + seq.getAndIncrement();
>     }
> }
> {code}
> h3. Reproducer.java
> {code:java}
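> import java.util.Collections;
> import org.apache.ignite.Ignite;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.configuration.IgniteConfiguration;
> import org.apache.ignite.services.ServiceConfiguration;
> import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
> import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;
>
> public class Reproducer {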
>     public static void main(String[] args) {
>         IgniteConfiguration igniteCfg = new IgniteConfiguration()
>             .setServiceConfiguration(
>                 new ServiceConfiguration()
>                     .setName(IgniteTestService.class.getSimpleName())
>                     .setMaxPerNodeCount(1)
>                     .setTotalCount(0)
>                     .setService(new IgniteTestService())
>             )
>             .setDiscoverySpi(
>                 new TcpDiscoverySpi()
>                     .setIpFinder(new TcpDiscoveryVmIpFinder().setAddresses(Collections.singleton("127.0.0.1:47500")))
>             );
>         try (Ignite ignite = Ignition.start(igniteCfg)) {
>             ignite.services().serviceProxy(IgniteTestService.class.getSimpleName(), Test.class, false)
>                 .sayHello("World");
>         }
>     }
> }
> {code}
> h2. Workaround
> Specifying a service wait timeout solves the problem in the discovery-based service deployment mode (but not in the default deployment mode):
> {code:java}
>             ignite.services().serviceProxy(IgniteTestService.class.getSimpleName(), Test.class, false, 1_000)
>                 .sayHello("World");
> {code}
> This workaround cannot be used in Ignite.NET clients because the .NET {{GetServiceProxy}} API does not support the service wait timeout, which is hard-coded to 0 on the server side.
> h2. Full Exception in Discovery-Based Service Deployment Mode
> {noformat}
> [01:08:54,653][SEVERE][services-deployment-worker-#52][IgniteServiceProcessor] Failed to initialize service (service will not be deployed): IgniteTestService
> class org.apache.ignite.IgniteInterruptedException: Got interrupted while waiting for future to complete.
> 	at org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:888)
> 	at org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:886)
> 	at org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1062)
> 	at org.apache.ignite.internal.IgniteKernal.atomicSequence(IgniteKernal.java:3999)
> 	at org.apache.ignite.internal.IgniteKernal.atomicSequence(IgniteKernal.java:3985)
> 	at Sandbox.Net.IgniteTestService.init(IgniteTestService.java:17)
> 	at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.redeploy(IgniteServiceProcessor.java:1188)
> 	at org.apache.ignite.internal.processors.service.ServiceDeploymentTask.lambda$processDeploymentActions$5(ServiceDeploymentTask.java:318)
> 	at java.base/java.util.HashMap.forEach(HashMap.java:1336)
> 	at org.apache.ignite.internal.processors.service.ServiceDeploymentTask.processDeploymentActions(ServiceDeploymentTask.java:302)
> 	at org.apache.ignite.internal.processors.service.ServiceDeploymentTask.init(ServiceDeploymentTask.java:262)
> 	at org.apache.ignite.internal.processors.service.ServiceDeploymentManager$ServicesDeploymentWorker.body(ServiceDeploymentManager.java:475)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> 	at java.base/java.lang.Thread.run(Thread.java:834)
> [01:08:54,712][SEVERE][exchange-worker-#42][GridDhtPartitionsExchangeFuture] Failed to reinitialize local partitions (rebalancing will be stopped): GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], discoEvt=DiscoveryCustomEvent [customMsg=DynamicCacheChangeBatch [id=17576957171-7ae549c8-423a-40b4-9865-c28a2f4b9dd9, reqs=ArrayList [DynamicCacheChangeRequest [cacheName=ignite-sys-atomic-cache@default-ds-group, hasCfg=true, nodeId=5fe32117-84ee-4f1f-9e19-86b85ef8c987, clientStartOnly=false, stop=false, destroy=false, disabledAfterStartfalse]], exchangeActions=ExchangeActions [startCaches=[ignite-sys-atomic-cache@default-ds-group], stopCaches=null, startGrps=[default-ds-group], stopGrps=[], resetParts=null, stateChangeRequest=null], startCaches=false], affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], super=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1], sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500, /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true, ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1, nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1586815734517]], nodeId=5fe32117, evt=DISCOVERY_CUSTOM_EVT]
> class org.apache.ignite.IgniteException: Failed to validate partitions state
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3886)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3577)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3485)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1610)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:891)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3172)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> 	at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: class org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11189)
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3848)
> 	... 8 more
> Caused by: java.lang.InterruptedException
> 	at java.base/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:418)
> 	at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> 	at org.apache.ignite.internal.util.IgniteUtils$Batch.result(IgniteUtils.java:11313)
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11179)
> 	... 11 more
> [01:08:54,720][SEVERE][exchange-worker-#42][GridCachePartitionExchangeManager] Failed to wait for completion of partition map exchange (preloading will not start): GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryCustomEvent [customMsg=null, affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], super=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1], sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500, /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true, ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1, nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1586815734517]], crd=TcpDiscoveryNode [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1], sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500, /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true, ver=2.8.0#20200226-sha1:341b01df, isClient=false], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], discoEvt=DiscoveryCustomEvent [customMsg=null, affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], super=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1], sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500, /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500, order=1, intOrder=1, 
lastExchangeTime=1586815734079, loc=true, ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1, nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1586815734517]], nodeId=5fe32117, evt=DISCOVERY_CUSTOM_EVT], added=true, exchangeType=ALL, initFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=true, hash=429760908], init=false, lastVer=null, partReleaseFut=PartitionReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]], AtomicUpdateReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]], DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]], LocalTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]], AllTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[RemoteTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]]]]]], exchActions=ExchangeActions [startCaches=[ignite-sys-atomic-cache@default-ds-group], stopCaches=null, startGrps=[default-ds-group], stopGrps=[], resetParts=null, stateChangeRequest=null], affChangeMsg=null, centralizedAff=false, forceAffReassignment=false, exchangeLocE=null, cacheChangeFailureMsgSent=false, done=true, state=CRD, registerCachesFuture=GridFinishedFuture [resFlag=2], partitionsSent=false, partitionsReceived=false, delayedLatestMsg=null, afterLsnrCompleteFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=null, hash=583816633], timeBag=o.a.i.i.util.TimeBag@5ac0d023, startTime=1087079935840199, initTime=1586815734527, rebalanced=false, evtLatch=0, remaining=HashSet [], mergedJoinExchMsgs=null, awaitMergedMsgs=0, super=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=class o.a.i.IgniteException: Failed to validate partitions state, hash=1371010775]]
> class org.apache.ignite.IgniteCheckedException: Failed to validate partitions state
> 	at org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7509)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:260)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:209)
> 	at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:160)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3200)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
> 	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> Caused by: class org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> 	at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: class org.apache.ignite.IgniteException: Failed to validate partitions state
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3886)
> Caused by: java.lang.InterruptedException
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3577)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3485)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1610)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:891)
> 	at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3172)
> 	... 3 more
> Caused by: class org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11189)
> Caused by: class org.apache.ignite.IgniteException: Failed to validate partitions state
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3848)
> 	... 8 more
> Caused by: java.lang.InterruptedException
> 	at java.base/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:418)
> 	at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> 	at org.apache.ignite.internal.util.IgniteUtils$Batch.result(IgniteUtils.java:11313)
> 	at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11179)
> 	... 11 more
> Caused by: class org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> Caused by: java.lang.InterruptedException
> [01:08:54] Ignite node stopped OK [uptime=00:00:00.219]
> Exception in thread "main" class org.apache.ignite.IgniteException: Failed to find deployed service: IgniteTestService
> 	at org.apache.ignite.internal.processors.service.GridServiceProxy.invokeMethod(GridServiceProxy.java:169)
> 	at org.apache.ignite.internal.processors.service.GridServiceProxy$ProxyInvocationHandler.invoke(GridServiceProxy.java:364)
> 	at com.sun.proxy.$Proxy25.sayHello(Unknown Source)
> 	at Sandbox.Net.Reproducer.main(Reproducer.java:29)
> {noformat}


