Posted to user@ignite.apache.org by John Smith <ja...@gmail.com> on 2019/05/30 15:57:21 UTC

Ignite Visor Cache command hangs indefinitely.

Hi, running 2.7.0

- I have a 4 node cluster and it seems to be running ok.
- I have clients connecting and doing what they need to do.
- The clients are set as client = true.
- The clients are also connecting from various parts of the network.

The problem with the Ignite Visor cache command is that if Visor cannot
reach a specific client node, it just seems to hang indefinitely.

Choose node number ('c' to cancel) [0]: c
visor> cache

It just stays like that, no errors printed, nothing...
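
For reference, a minimal sketch of what "client = true" looks like
programmatically (the class name is an illustrative assumption, not from
this thread):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class ClientNodeSketch {
        public static void main(String[] args) {
            IgniteConfiguration cfg = new IgniteConfiguration();

            // Join the topology as a client node: it stores no data and
            // does not take part in cache rebalancing.
            cfg.setClientMode(true);

            try (Ignite ignite = Ignition.start(cfg)) {
                System.out.println("Servers visible: "
                    + ignite.cluster().forServers().nodes().size());
            }
        }
    }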

Re: Ignite Visor Cache command hangs indefinitely.

Posted by Denis Magda <dm...@apache.org>.
John,

This is usually required for the server nodes.

-
Denis



Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Should I do this on the server nodes or the client nodes?


Re: Ignite Visor Cache command hangs indefinitely.

Posted by "Maxim.Pudov" <pu...@gmail.com>.
You could increase failureDetectionTimeout [1] from the default value of
10000 to 60000 or so.
https://apacheignite.readme.io/docs/tcpip-discovery#section-failure-detection-timeout
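
Per Denis's note above, this is usually set on the server nodes. A minimal
sketch, assuming the 60000 ms figure suggested here:

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class ServerTimeoutSketch {
        public static void main(String[] args) {
            IgniteConfiguration cfg = new IgniteConfiguration();

            // Raise the failure detection timeout from the 10000 ms default
            // so slow links are not treated as node failures.
            cfg.setFailureDetectionTimeout(60_000L);

            Ignition.start(cfg);
        }
    }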




Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
How do I turn it off?

Also, I think I know what may have been the Visor issue. I was connecting to
the cluster without specifying ports 47500..47509. But once I added that, it
seems more stable. I can even see the WIFI node and everything.
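
A minimal sketch of spelling the discovery port range out explicitly when
connecting (the IP addresses are placeholders; 47500..47509 is the range
mentioned above):

    import java.util.Arrays;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
    import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

    public class DiscoverySketch {
        public static IgniteConfiguration config() {
            TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();

            // List each server with an explicit discovery port range rather
            // than relying on the default port alone.
            ipFinder.setAddresses(Arrays.asList(
                "10.0.0.1:47500..47509",
                "10.0.0.2:47500..47509"));

            TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
            discoSpi.setIpFinder(ipFinder);

            return new IgniteConfiguration().setDiscoverySpi(discoSpi);
        }
    }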



Re: Ignite Visor Cache command hangs indefinitely.

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

It is recommended to turn off failure detection since its default config is
not very convenient. Maybe it is also fixed in 2.7.5.

This just means some operation took longer than expected and Ignite
panicked.
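
A sketch of one way to relax this, assuming the SYSTEM_WORKER_BLOCKED
handling seen in the log below is what needs taming (the timeout value is
an arbitrary example):

    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.failure.NoOpFailureHandler;

    public class FailureHandlingSketch {
        public static IgniteConfiguration config() {
            IgniteConfiguration cfg = new IgniteConfiguration();

            // Replace the default handler so a blocked system worker is
            // logged but no longer stops or halts the node.
            cfg.setFailureHandler(new NoOpFailureHandler());

            // Alternatively, give system workers more headroom before they
            // are considered blocked.
            cfg.setSystemWorkerBlockedTimeout(60_000L);

            return cfg;
        }
    }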

Regards,


Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Actually this happened when the WIFI node connected. But it never happened
before...

[14:51:46,660][INFO][exchange-worker-#43%xxxxxx%][GridDhtPartitionsExchangeFuture]
Completed partition exchange
[localNode=e9e9f4b9-b249-4a4d-87ee-fc97097ad9ee,
exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion
[topVer=59, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=45516c37-5ee0-4046-a13a-9573607d25aa, addrs=[0:0:0:0:0:0:0:1,
127.0.0.1, MY_WIFI_IP, MY_WIFI_IP], sockAddrs=[/MY_WIFI_IP:0,
/0:0:0:0:0:0:0:1:0, /127.0.0.1:0, /MY_WIFI_IP:0], discPort=0, order=59,
intOrder=32, lastExchangeTime=1561042306599, loc=false,
ver=2.7.0#20181130-sha1:256ae401, isClient=true], done=true],
topVer=AffinityTopologyVersion [topVer=59, minorTopVer=0],
durationFromInit=0]
[14:51:46,660][INFO][exchange-worker-#43%xxxxxx%][time] Finished exchange
init [topVer=AffinityTopologyVersion [topVer=59, minorTopVer=0], crd=true]
[14:51:46,662][INFO][exchange-worker-#43%xxxxxx%][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=59, minorTopVer=0], force=false, evt=NODE_JOINED,
node=45516c37-5ee0-4046-a13a-9573607d25aa]
[14:51:47,123][INFO][grid-nio-worker-tcp-comm-2-#26%xxxxxx%][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/xxx.xxx.xxx.69:47100,
rmtAddr=/MY_WIFI_IP:62249]
[14:51:59,428][INFO][db-checkpoint-thread-#1068%xxxxxx%][GridCacheDatabaseSharedManager]
Checkpoint started [checkpointId=56e2ea25-7273-49ab-81ac-0fdbc5945626,
startPtr=FileWALPointer [idx=137, fileOff=45790479, len=17995],
checkpointLockWait=0ms, checkpointLockHoldTime=12ms,
walCpRecordFsyncDuration=3ms, pages=242, reason='timeout']
[14:51:59,544][INFO][db-checkpoint-thread-#1068%xxxxxx%][GridCacheDatabaseSharedManager]
Checkpoint finished [cpId=56e2ea25-7273-49ab-81ac-0fdbc5945626, pages=242,
markPos=FileWALPointer [idx=137, fileOff=45790479, len=17995],
walSegmentsCleared=0, walSegmentsCovered=[], markDuration=23ms,
pagesWrite=14ms, fsync=101ms, total=138ms]
[14:52:45,827][INFO][tcp-disco-msg-worker-#2%xxxxxx%][TcpDiscoverySpi]
Local node seems to be disconnected from topology (failure detection
timeout is reached) [failureDetectionTimeout=10000, connCheckInterval=500]
[14:52:45,847][SEVERE][ttl-cleanup-worker-#1652%xxxxxx%][G] Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [threadName=tcp-disco-msg-worker, blockedFor=39s]
[14:52:45,859][INFO][tcp-disco-sock-reader-#36%xxxxxx%][TcpDiscoverySpi]
Finished serving remote node connection [rmtAddr=/xxx.xxx.xxx.76:56861,
rmtPort=56861]
[14:52:45,864][WARNING][ttl-cleanup-worker-#1652%xxxxxx%][G] Thread
[name="tcp-disco-msg-worker-#2%xxxxxx%", id=83, state=RUNNABLE, blockCnt=6,
waitCnt=24621465]

[14:52:45,875][SEVERE][ttl-cleanup-worker-#1652%xxxxxx%][] Critical system
error detected. Will be handled accordingly to configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler
[ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext
[type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker
[name=tcp-disco-msg-worker, igniteInstanceName=xxxxxx, finished=false,
heartbeatTs=1561042326687]]]
class org.apache.ignite.IgniteException: GridWorker
[name=tcp-disco-msg-worker, igniteInstanceName=xxxxxx, finished=false,
heartbeatTs=1561042326687]
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
        at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
        at
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
        at
org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
        at
org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:151)
        at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
        at java.lang.Thread.run(Thread.java:748)


[14:52:47,974][WARNING][jvm-pause-detector-worker][IgniteKernal%xxxxxx]
Possible too long JVM pause: 2047 milliseconds.
        [14:52:47,994][INFO][tcp-disco-srvr-#3%xxxxxx%][TcpDiscoverySpi]
TCP discovery accepted incoming connection [rmtAddr=/xxx.xxx.xxx.72,
rmtPort=37607]
        [14:52:47,994][INFO][tcp-disco-srvr-#3%xxxxxx%][TcpDiscoverySpi]
TCP discovery spawning a new thread for connection
[rmtAddr=/xxx.xxx.xxx.72, rmtPort=37607]

[14:52:47,996][INFO][tcp-disco-sock-reader-#37%xxxxxx%][TcpDiscoverySpi]
Started serving remote node connection [rmtAddr=/xxx.xxx.xxx.72:37607,
rmtPort=37607]

[14:52:48,005][WARNING][ttl-cleanup-worker-#1652%xxxxxx%][FailureProcessor]
Thread dump at 2019/06/20 14:52:47 UTC
        Thread [name="sys-#25624%xxxxxx%", id=33109, state=TIMED_WAITING,
blockCnt=0, waitCnt=1]
            Lock
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3a9414a4,
ownerName=null, ownerId=-1]
                at sun.misc.Unsafe.park(Native Method)
                at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
                at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
                at
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
                at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
                at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
                at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                at java.lang.Thread.run(Thread.java:748)

        Thread [name="Thread-6972", id=33108, state=TIMED_WAITING,
blockCnt=0, waitCnt=17]
            Lock
[object=java.util.concurrent.SynchronousQueue$TransferStack@62bdd75c,
ownerName=null, ownerId=-1]
                at sun.misc.Unsafe.park(Native Method)
                at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
                at
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
                at
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
                at
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
                at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
                at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
                at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                at java.lang.Thread.run(Thread.java:748)



Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Ok, where do I look for the Visor logs when it hangs? And it's not a "no
caches" issue; the cluster works great. It's when Visor cannot reach a
specific client node.


Re: Ignite Visor Cache command hangs indefinitely.

Posted by Vasiliy Sisko <vs...@gridgain.com>.
Hello @javadevmtl

I failed to reproduce your problem.
In case of any error in the cache command, Visor CMD shows the message "No
caches found".
Please provide logs of Visor, server and client nodes after the command
hangs.




Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Correct, and this is a purely practical issue. I can even imagine a
scenario where you have a cluster and, for compliance reasons, Visor is
running in a demilitarized zone.

And all I'm saying is that the Visor CACHE command, or any other for that
matter, should not hang waiting to connect to specific clients.

It should maybe time out, indicate so, and show the info it has at least.
Or maybe just give us the server nodes/info if available.

That's where I would like your opinion on it.

On Tue, 18 Jun 2019 at 22:51, Denis Magda <dm...@gridgain.com> wrote:

> John,
>
> Sure, you’re right that Visor is the tool for management and monitoring.
> Not sure that Ilya’s statement makes practical sense.
>
> Looping in our Visor experts. Alexey, Yury, could you please check out the
> issue?
>
> Denis
>
> On Tuesday, June 18, 2019, John Smith <ja...@gmail.com> wrote:
>
>> Ok but visor is used to get info on cache etc... So it just hangs on
>> client's it cannot reach. Maybe it should have a timeout if it can't reach
>> the specific node? Or does it have one but it's super high?
>> Or if it knows it's a client node then to handle it differently?
>>
>>
>> On Tue, 18 Jun 2019 at 10:57, Ilya Kasnacheev <il...@gmail.com>
>> wrote:
>>
>>> Hello!
>>>
>>> Visor is not the tool to debug cluster. control.sh probably is.
>>>
>>> Visor is a node in topology (a daemon node, but still) and as such it
>>> follows the same limitations as any other node.
>>>
>>> Regards,
>>> --
>>> Ilya Kasnacheev
>>>
>>>
>>> Fri, 14 Jun 2019 at 22:41, John Smith <ja...@gmail.com>:
>>>
>>>> Hi, It's 100% that.
>>>>
>>>> I'm just stating that my applications run inside a container network
>>>> and Ignite is installed on its own VMs. The networks see each other
>>>> and this works. Also Visor can connect. No problems.
>>>> It's only when for example we have a dev machine connect from WIFI and
>>>> while a full mesh cluster is created VISOR cannot reach that node.
>>>> Or what if a badly configured client connects and causes this issue.
>>>>
>>>> All I'm saying is, if Ignite Visor is THE TOOL to debug and check
>>>> cluster state etc..., it's a bit odd that it hangs forever if it cannot
>>>> reach a specific client. I think that Visor/the protocol should know
>>>> that it's a CLIENT ONLY node and not try to get stats from it. What do
>>>> you think?
>>>>
>>>>
>>>>
>>>> On Thu, 13 Jun 2019 at 09:52, Ilya Kasnacheev <
>>>> ilya.kasnacheev@gmail.com> wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> Please enable verbose logging and share logs from both visor, client
>>>>> and server nodes, so that we could check that.
>>>>>
>>>>> There should be messages related to connection attempts.
>>>>>
>>>>> Regards,
>>>>> --
>>>>> Ilya Kasnacheev
>>>>>
>>>>>
>>>>> Thu, 13 Jun 2019 at 00:06, John Smith <ja...@gmail.com>:
>>>>>
>>>>>> The clients are in the same low latency network, but they are running
>>>>>> inside container network. While ignite is running on it's own cluster. So
>>>>>> from that stand point they all see each other...
>>>>>>
>>>>>> On Wed, 12 Jun 2019 at 17:04, John Smith <ja...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Ok thanks
>>>>>>>
>>>>>>> On Mon, 10 Jun 2019 at 04:48, Ilya Kasnacheev <
>>>>>>> ilya.kasnacheev@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello!
>>>>>>>>
>>>>>>>> As a rule, a faulty thick client can destabilize a cluster.
>>>>>>>> Ignite's architecture assumes that all clients are collocated, i.e. that
>>>>>>>> the network between any two nodes (including clients) is reliable, fast and
>>>>>>>> low-latency.
>>>>>>>>
>>>>>>>> It is not recommended to connect thick clients from different
>>>>>>>> networks. Use thin clients where possible.
>>>>>>>>
>>>>>>>> You can file a ticket against Apache Ignite JIRA regarding visor
>>>>>>>> behavior if you like.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> --
>>>>>>>> Ilya Kasnacheev
>>>>>>>>
>>>>>>>>
>>>>>>>> Fri, 7 Jun 2019 at 23:15, John Smith <ja...@gmail.com>:
>>>>>>>>
>>>>>>>>> Correct. Should it not at least timeout and at least show what it
>>>>>>>>> has available? Basically we have a central cluster and various clients
>>>>>>>>> connect to it from different networks. As an example: Docker containers.
>>>>>>>>>
>>>>>>>>> We make sure that the clients are client nodes only and we avoid
>>>>>>>>> creating any caches on clients.
>>>>>>>>>
>>>>>>>>> On Fri, 7 Jun 2019 at 10:19, Ilya Kasnacheev <
>>>>>>>>> ilya.kasnacheev@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hello!
>>>>>>>>>>
>>>>>>>>>> I think that Visor will talk to all nodes when trying to run
>>>>>>>>>> caches command, and if it can't reach client nodes the operation will never
>>>>>>>>>> finish.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> --
>>>>>>>>>> Ilya Kasnacheev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Wed, 5 Jun 2019 at 22:34, John Smith <ja...@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> Hi, any thoughts on this?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 31 May 2019 at 10:21, John Smith <ja...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I think it should at least time out and show stats of the nodes
>>>>>>>>>>>> it could reach? I don't see why it's dependent on client nodes.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, 30 May 2019 at 11:58, John Smith <
>>>>>>>>>>>> java.dev.mtl@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry, pressed enter too quickly....
>>>>>>>>>>>>>
>>>>>>>>>>>>> So basically I'm 100% sure if visor cache command cannot reach
>>>>>>>>>>>>> the client node then it just stays there not doing anything.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, 30 May 2019 at 11:57, John Smith <
>>>>>>>>>>>>> java.dev.mtl@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi, running 2.7.0
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - I have a 4 node cluster and it seems to be running ok.
>>>>>>>>>>>>>> - I have clients connecting and doing what they need to do.
>>>>>>>>>>>>>> - The clients are set as client = true.
>>>>>>>>>>>>>> - The clients are also connecting from various parts of the
>>>>>>>>>>>>>> network.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The problem with ignite visor cache command is if visor
>>>>>>>>>>>>>> cannot reach a specific client node it just seems to hang indefinitely.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Choose node number ('c' to cancel) [0]: c
>>>>>>>>>>>>>> visor> cache
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It just stays like that no errors printed nothing...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>
> --
> --
> Denis Magda
>
>
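
A minimal sketch of the thin-client approach Ilya recommends in the quote
above (host and cache name are assumptions; 10800 is the default
thin-client port):

    import org.apache.ignite.Ignition;
    import org.apache.ignite.client.ClientCache;
    import org.apache.ignite.client.IgniteClient;
    import org.apache.ignite.configuration.ClientConfiguration;

    public class ThinClientSketch {
        public static void main(String[] args) {
            ClientConfiguration ccfg = new ClientConfiguration()
                .setAddresses("ignite-server-1:10800");

            // A thin client talks to the cluster over a plain socket and
            // never joins the topology, so it cannot stall the discovery
            // ring or tools like Visor.
            try (IgniteClient client = Ignition.startClient(ccfg)) {
                ClientCache<Integer, String> cache =
                    client.getOrCreateCache("exampleCache");

                cache.put(1, "hello");
                System.out.println(cache.get(1));
            }
        }
    }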

Re: Ignite Visor Cache command hangs indefinitely.

Posted by Denis Magda <dm...@gridgain.com>.
John,

Sure, you’re right that Visor is the tool for management and monitoring.
Not sure that Ilya’s statement makes practical sense.

Looping in our Visor experts. Alexey, Yury, could you please check out the
issue?

Denis


Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Ok, but Visor is used to get info on caches etc... So it just hangs on
clients it cannot reach. Maybe it should have a timeout if it can't reach
the specific node? Or does it have one but it's super high?
Or, if it knows it's a client node, should it handle it differently?



Re: Ignite Visor Cache command hangs indefinitely.

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Visor is not the tool to debug a cluster; control.sh probably is.

Visor is a node in the topology (a daemon node, but still) and as such it
is subject to the same limitations as any other node.

Regards,
-- 
Ilya Kasnacheev



Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Hi, it's 100% that.

I'm just stating that my applications run inside a container network and
Ignite is installed on its own VMs. The networks see each other and this
works. Also, Visor can connect. No problems.
It's only when, for example, we have a dev machine connect from WIFI; while
a full mesh cluster is created, VISOR cannot reach that node.
Or what if a badly configured client connects and causes this issue?

All I'm saying is, if Ignite Visor is THE TOOL to debug and check cluster
state etc..., it's a bit odd that it hangs forever if it cannot reach a
specific client. I think that Visor/the protocol should know that it's a
CLIENT ONLY node and not try to get stats from it. What do you think?




Re: Ignite Visor Cache command hangs indefinitely.

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Please enable verbose logging and share logs from both visor, client and
server nodes, so that we could check that.

There should be messages related to connection attempts.

Regards,
-- 
Ilya Kasnacheev
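
A sketch of one way to get verbose output from a node, assuming the
standard IGNITE_QUIET switch (the same effect as starting with ignite.sh -v):

    import org.apache.ignite.Ignition;
    import org.apache.ignite.IgniteSystemProperties;

    public class VerboseNodeSketch {
        public static void main(String[] args) {
            // Disable quiet mode before the node starts so full INFO-level
            // output, including discovery connection attempts, is printed.
            System.setProperty(IgniteSystemProperties.IGNITE_QUIET, "false");

            Ignition.start();
        }
    }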



Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
The clients are in the same low-latency network, but they are running
inside a container network, while Ignite is running on its own cluster. So
from that standpoint they all see each other...

On Wed, 12 Jun 2019 at 17:04, John Smith <ja...@gmail.com> wrote:

> Ok thanks
>
> On Mon, 10 Jun 2019 at 04:48, Ilya Kasnacheev <il...@gmail.com>
> wrote:
>
>> Hello!
>>
>> As a rule, a faulty thick client can destabilize a cluster. Ignite's
>> architecture assumes that all clients are collocated, i.e. that the network
>> between any two nodes (including clients) is reliable, fast and low-latency.
>>
>> It is not recommended to connect thick clients from different networks.
>> Use thin clients where possible.
>>
>> You can file a ticket against Apache Ignite JIRA regarding visor behavior
>> if you like.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> Fri, 7 Jun 2019 at 23:15, John Smith <ja...@gmail.com>:
>>
>>> Correct. Should it not at least time out and show what it has
>>> available? Basically we have a central cluster and various clients connect
>>> to it from different networks. As an example: Docker containers.
>>>
>>> We make sure that the clients are client nodes only and we avoid
>>> creating any caches on clients.
>>>
>>> On Fri, 7 Jun 2019 at 10:19, Ilya Kasnacheev <il...@gmail.com>
>>> wrote:
>>>
>>>> Hello!
>>>>
>>>> I think that Visor will talk to all nodes when trying to run the cache
>>>> command, and if it can't reach client nodes, the operation will never finish.
>>>>
>>>> Regards,
>>>> --
>>>> Ilya Kasnacheev
>>>>
>>>>
>>>> Wed, 5 Jun 2019 at 22:34, John Smith <ja...@gmail.com>:
>>>>
>>>>> Hi, any thoughts on this?
>>>>>
>>>>> On Fri, 31 May 2019 at 10:21, John Smith <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I think it should at least time out and show stats of the nodes it
>>>>>> could reach? I don't see why it's dependent on client nodes.
>>>>>>
>>>>>> On Thu, 30 May 2019 at 11:58, John Smith <ja...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Sorry, pressed enter too quickly....
>>>>>>>
>>>>>>> So basically I'm 100% sure that if the Visor cache command cannot
>>>>>>> reach the client node, it just stays there not doing anything.
>>>>>>>
>>>>>>> On Thu, 30 May 2019 at 11:57, John Smith <ja...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi, running 2.7.0
>>>>>>>>
>>>>>>>> - I have a 4-node cluster and it seems to be running ok.
>>>>>>>> - I have clients connecting and doing what they need to do.
>>>>>>>> - The clients are set as client = true.
>>>>>>>> - The clients are also connecting from various parts of the network.
>>>>>>>>
>>>>>>>> The problem with the Ignite Visor cache command is that if Visor
>>>>>>>> cannot reach a specific client node, it just seems to hang indefinitely.
>>>>>>>>
>>>>>>>> Choose node number ('c' to cancel) [0]: c
>>>>>>>> visor> cache
>>>>>>>>
>>>>>>>> It just stays like that; no errors printed, nothing...
>>>>>>>>
>>>>>>>

Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Ok thanks

On Mon, 10 Jun 2019 at 04:48, Ilya Kasnacheev <il...@gmail.com>
wrote:

> Hello!
>
> As a rule, a faulty thick client can destabilize a cluster. Ignite's
> architecture assumes that all clients are collocated, i.e. that the network
> between any two nodes (including clients) is reliable, fast and low-latency.
>
> It is not recommended to connect thick clients from different networks.
> Use thin clients where possible.
>
> You can file a ticket against Apache Ignite JIRA regarding visor behavior
> if you like.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Fri, 7 Jun 2019 at 23:15, John Smith <ja...@gmail.com>:
>
>> Correct. Should it not at least time out and show what it has
>> available? Basically we have a central cluster and various clients connect
>> to it from different networks. As an example: Docker containers.
>>
>> We make sure that the clients are client nodes only and we avoid creating
>> any caches on clients.
>>
>> On Fri, 7 Jun 2019 at 10:19, Ilya Kasnacheev <il...@gmail.com>
>> wrote:
>>
>>> Hello!
>>>
>>> I think that Visor will talk to all nodes when trying to run the cache
>>> command, and if it can't reach client nodes, the operation will never finish.
>>>
>>> Regards,
>>> --
>>> Ilya Kasnacheev
>>>
>>>
>>> Wed, 5 Jun 2019 at 22:34, John Smith <ja...@gmail.com>:
>>>
>>>> Hi, any thoughts on this?
>>>>
>>>> On Fri, 31 May 2019 at 10:21, John Smith <ja...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think it should at least time out and show stats of the nodes it
>>>>> could reach? I don't see why it's dependent on client nodes.
>>>>>
>>>>> On Thu, 30 May 2019 at 11:58, John Smith <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Sorry, pressed enter too quickly....
>>>>>>
>>>>>> So basically I'm 100% sure that if the Visor cache command cannot
>>>>>> reach the client node, it just stays there not doing anything.
>>>>>>
>>>>>> On Thu, 30 May 2019 at 11:57, John Smith <ja...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi, running 2.7.0
>>>>>>>
>>>>>>> - I have a 4-node cluster and it seems to be running ok.
>>>>>>> - I have clients connecting and doing what they need to do.
>>>>>>> - The clients are set as client = true.
>>>>>>> - The clients are also connecting from various parts of the network.
>>>>>>>
>>>>>>> The problem with the Ignite Visor cache command is that if Visor
>>>>>>> cannot reach a specific client node, it just seems to hang indefinitely.
>>>>>>>
>>>>>>> Choose node number ('c' to cancel) [0]: c
>>>>>>> visor> cache
>>>>>>>
>>>>>>> It just stays like that; no errors printed, nothing...
>>>>>>>
>>>>>>

Re: Ignite Visor Cache command hangs indefinitely.

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

As a rule, a faulty thick client can destabilize a cluster. Ignite's
architecture assumes that all clients are collocated, i.e. that the network
between any two nodes (including clients) is reliable, fast and low-latency.

It is not recommended to connect thick clients from different networks. Use
thin clients where possible.

You can file a ticket against Apache Ignite JIRA regarding visor behavior
if you like.
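
To illustrate the thin client approach, here is a minimal sketch (the host
names and cache name below are placeholders, not from your setup):

    import org.apache.ignite.Ignition;
    import org.apache.ignite.client.ClientCache;
    import org.apache.ignite.client.IgniteClient;
    import org.apache.ignite.configuration.ClientConfiguration;

    public class ThinClientExample {
        public static void main(String[] args) throws Exception {
            // A thin client opens a plain socket to a server's client
            // connector (default port 10800) and never joins the cluster
            // topology, so an unreachable thin client cannot stall
            // cluster-wide operations such as Visor's cache command.
            ClientConfiguration cfg = new ClientConfiguration()
                .setAddresses("server1:10800", "server2:10800");

            try (IgniteClient client = Ignition.startClient(cfg)) {
                ClientCache<Integer, String> cache =
                    client.getOrCreateCache("myCache");

                cache.put(1, "value");
                System.out.println(cache.get(1));
            }
        }
    }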

Regards,
-- 
Ilya Kasnacheev


Fri, 7 Jun 2019 at 23:15, John Smith <ja...@gmail.com>:

> Correct. Should it not at least time out and show what it has
> available? Basically we have a central cluster and various clients connect
> to it from different networks. As an example: Docker containers.
>
> We make sure that the clients are client nodes only and we avoid creating
> any caches on clients.
>
> On Fri, 7 Jun 2019 at 10:19, Ilya Kasnacheev <il...@gmail.com>
> wrote:
>
>> Hello!
>>
>> I think that Visor will talk to all nodes when trying to run the cache
>> command, and if it can't reach client nodes, the operation will never finish.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> Wed, 5 Jun 2019 at 22:34, John Smith <ja...@gmail.com>:
>>
>>> Hi, any thoughts on this?
>>>
>>> On Fri, 31 May 2019 at 10:21, John Smith <ja...@gmail.com> wrote:
>>>
>>>> I think it should at least time out and show stats of the nodes it
>>>> could reach? I don't see why it's dependent on client nodes.
>>>>
>>>> On Thu, 30 May 2019 at 11:58, John Smith <ja...@gmail.com>
>>>> wrote:
>>>>
>>>>> Sorry, pressed enter too quickly....
>>>>>
>>>>> So basically I'm 100% sure that if the Visor cache command cannot
>>>>> reach the client node, it just stays there not doing anything.
>>>>>
>>>>> On Thu, 30 May 2019 at 11:57, John Smith <ja...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi, running 2.7.0
>>>>>>
>>>>>> - I have a 4-node cluster and it seems to be running ok.
>>>>>> - I have clients connecting and doing what they need to do.
>>>>>> - The clients are set as client = true.
>>>>>> - The clients are also connecting from various parts of the network.
>>>>>>
>>>>>> The problem with the Ignite Visor cache command is that if Visor
>>>>>> cannot reach a specific client node, it just seems to hang indefinitely.
>>>>>>
>>>>>> Choose node number ('c' to cancel) [0]: c
>>>>>> visor> cache
>>>>>>
>>>>>> It just stays like that; no errors printed, nothing...
>>>>>>
>>>>>

Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Correct. Should it not at least time out and show what it has
available? Basically we have a central cluster and various clients connect
to it from different networks. As an example: Docker containers.

We make sure that the clients are client nodes only and we avoid creating
any caches on clients.
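
For context, a minimal sketch of how our clients join (discovery settings
omitted; this is just what "client = true" means in our configs):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class ClientNode {
        public static void main(String[] args) {
            IgniteConfiguration cfg = new IgniteConfiguration();

            // Join the topology as a (thick) client node: it holds no
            // cache partitions, but it still participates in discovery,
            // which is why an unreachable client can block Visor.
            cfg.setClientMode(true);

            Ignite ignite = Ignition.start(cfg);
        }
    }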

On Fri, 7 Jun 2019 at 10:19, Ilya Kasnacheev <il...@gmail.com>
wrote:

> Hello!
>
> I think that Visor will talk to all nodes when trying to run the cache
> command, and if it can't reach client nodes, the operation will never finish.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Wed, 5 Jun 2019 at 22:34, John Smith <ja...@gmail.com>:
>
>> Hi, any thoughts on this?
>>
>> On Fri, 31 May 2019 at 10:21, John Smith <ja...@gmail.com> wrote:
>>
>>> I think it should at least time out and show stats of the nodes it could
>>> reach? I don't see why it's dependent on client nodes.
>>>
>>> On Thu, 30 May 2019 at 11:58, John Smith <ja...@gmail.com> wrote:
>>>
>>>> Sorry, pressed enter too quickly....
>>>>
>>>> So basically I'm 100% sure that if the Visor cache command cannot
>>>> reach the client node, it just stays there not doing anything.
>>>>
>>>> On Thu, 30 May 2019 at 11:57, John Smith <ja...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi, running 2.7.0
>>>>>
>>>>> - I have a 4-node cluster and it seems to be running ok.
>>>>> - I have clients connecting and doing what they need to do.
>>>>> - The clients are set as client = true.
>>>>> - The clients are also connecting from various parts of the network.
>>>>>
>>>>> The problem with the Ignite Visor cache command is that if Visor
>>>>> cannot reach a specific client node, it just seems to hang indefinitely.
>>>>>
>>>>> Choose node number ('c' to cancel) [0]: c
>>>>> visor> cache
>>>>>
>>>>> It just stays like that; no errors printed, nothing...
>>>>>
>>>>

Re: Ignite Visor Cache command hangs indefinitely.

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

I think that Visor will talk to all nodes when trying to run the cache
command, and if it can't reach client nodes, the operation will never finish.
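
If you want to check which client nodes are currently in the topology (i.e.
the nodes Visor would also try to reach), a sketch along these lines would
list them; the no-argument Ignition.start() is just for illustration:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cluster.ClusterNode;

    public class ListClients {
        public static void main(String[] args) {
            // Join as a client ourselves so we don't become a data node.
            Ignition.setClientMode(true);

            try (Ignite ignite = Ignition.start()) {
                // forClients() narrows the cluster group to client nodes
                // only; each entry is a node Visor would have to reach.
                for (ClusterNode n : ignite.cluster().forClients().nodes())
                    System.out.println(n.id() + " -> " + n.addresses());
            }
        }
    }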

Regards,
-- 
Ilya Kasnacheev


Wed, 5 Jun 2019 at 22:34, John Smith <ja...@gmail.com>:

> Hi, any thoughts on this?
>
> On Fri, 31 May 2019 at 10:21, John Smith <ja...@gmail.com> wrote:
>
>> I think it should at least time out and show stats of the nodes it could
>> reach? I don't see why it's dependent on client nodes.
>>
>> On Thu, 30 May 2019 at 11:58, John Smith <ja...@gmail.com> wrote:
>>
>>> Sorry, pressed enter too quickly....
>>>
>>> So basically I'm 100% sure that if the Visor cache command cannot
>>> reach the client node, it just stays there not doing anything.
>>>
>>> On Thu, 30 May 2019 at 11:57, John Smith <ja...@gmail.com> wrote:
>>>
>>>> Hi, running 2.7.0
>>>>
>>>> - I have a 4-node cluster and it seems to be running ok.
>>>> - I have clients connecting and doing what they need to do.
>>>> - The clients are set as client = true.
>>>> - The clients are also connecting from various parts of the network.
>>>>
>>>> The problem with the Ignite Visor cache command is that if Visor
>>>> cannot reach a specific client node, it just seems to hang indefinitely.
>>>>
>>>> Choose node number ('c' to cancel) [0]: c
>>>> visor> cache
>>>>
>>>> It just stays like that; no errors printed, nothing...
>>>>
>>>

Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Hi, any thoughts on this?

On Fri, 31 May 2019 at 10:21, John Smith <ja...@gmail.com> wrote:

> I think it should at least time out and show stats of the nodes it could
> reach? I don't see why it's dependent on client nodes.
>
> On Thu, 30 May 2019 at 11:58, John Smith <ja...@gmail.com> wrote:
>
>> Sorry, pressed enter too quickly....
>>
>> So basically I'm 100% sure that if the Visor cache command cannot reach
>> the client node, it just stays there not doing anything.
>>
>> On Thu, 30 May 2019 at 11:57, John Smith <ja...@gmail.com> wrote:
>>
>>> Hi, running 2.7.0
>>>
>>> - I have a 4-node cluster and it seems to be running ok.
>>> - I have clients connecting and doing what they need to do.
>>> - The clients are set as client = true.
>>> - The clients are also connecting from various parts of the network.
>>>
>>> The problem with the Ignite Visor cache command is that if Visor cannot
>>> reach a specific client node, it just seems to hang indefinitely.
>>>
>>> Choose node number ('c' to cancel) [0]: c
>>> visor> cache
>>>
>>> It just stays like that; no errors printed, nothing...
>>>
>>

Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
I think it should at least time out and show stats of the nodes it could
reach? I don't see why it's dependent on client nodes.

On Thu, 30 May 2019 at 11:58, John Smith <ja...@gmail.com> wrote:

> Sorry, pressed enter too quickly....
>
> So basically I'm 100% sure that if the Visor cache command cannot reach
> the client node, it just stays there not doing anything.
>
> On Thu, 30 May 2019 at 11:57, John Smith <ja...@gmail.com> wrote:
>
>> Hi, running 2.7.0
>>
>> - I have a 4-node cluster and it seems to be running ok.
>> - I have clients connecting and doing what they need to do.
>> - The clients are set as client = true.
>> - The clients are also connecting from various parts of the network.
>>
>> The problem with the Ignite Visor cache command is that if Visor cannot
>> reach a specific client node, it just seems to hang indefinitely.
>>
>> Choose node number ('c' to cancel) [0]: c
>> visor> cache
>>
>> It just stays like that; no errors printed, nothing...
>>
>

Re: Ignite Visor Cache command hangs indefinitely.

Posted by John Smith <ja...@gmail.com>.
Sorry, pressed enter too quickly....

So basically I'm 100% sure that if the Visor cache command cannot reach the
client node, it just stays there not doing anything.

On Thu, 30 May 2019 at 11:57, John Smith <ja...@gmail.com> wrote:

> Hi, running 2.7.0
>
> - I have a 4-node cluster and it seems to be running ok.
> - I have clients connecting and doing what they need to do.
> - The clients are set as client = true.
> - The clients are also connecting from various parts of the network.
>
> The problem with the Ignite Visor cache command is that if Visor cannot
> reach a specific client node, it just seems to hang indefinitely.
>
> Choose node number ('c' to cancel) [0]: c
> visor> cache
>
> It just stays like that; no errors printed, nothing...
>