You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@ignite.apache.org by bbellrose <br...@wabtec.com> on 2020/10/14 13:00:04 UTC

Ignite instance fails. Unsure as to root cause

Oct 12 17:47:39 nalrcsvridbq02 Ignite[2031634]: [17:47:39] Possible failure
suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=nio-acceptor-tcp-comm,
igniteInstanceName=RailConnect Ignite QA Grid, finished=false,
heartbeatTs=1602539219763]]]
Oct 12 17:48:20 nalrcsvridbq02 Ignite[2031634]: [17:48:20] Possible failure
suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-1,
igniteInstanceName=RailConnect Ignite QA Grid, finished=false,
heartbeatTs=1602539260020]]]
Oct 12 17:48:20 nalrcsvridbq02 Ignite[2031634]: [17:48:20] Possible failure
suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=ttl-cleanup-worker,
igniteInstanceName=RailConnect Ignite QA Grid, finished=false,
heartbeatTs=1602539260020]]]
Oct 12 17:48:20 nalrcsvridbq02 Ignite[2031634]: [17:48:20] Possible failure
suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=db-checkpoint-thread,
igniteInstanceName=RailConnect Ignite QA Grid, finished=false,
heartbeatTs=1602539260006]]]
Oct 12 17:49:00 nalrcsvridbq02 chronyd[1216]: Forward time jump detected!
Oct 12 17:49:00 nalrcsvridbq02 chronyd[1216]: Can't synchronise: no
selectable sources
Oct 12 17:49:00 nalrcsvridbq02 process-agent[2039258]: 2020-10-12 17:49:00
EDT | PROCESS | INFO | (collector.go:209 in func1) | Delivery queues:
process[size=0, weight=0], pod[size=0, weight=0]
Oct 12 17:49:00 nalrcsvridbq02 agent[2039257]: 2020-10-12 17:49:00 EDT |
CORE | ERROR | (pkg/forwarder/worker.go:178 in process) | Error while
processing transaction: error while sending transaction, rescheduling it:
Post
https://7-22-1-app.agent.datadoghq.com/api/v1/series?api_key=*************************44602:
net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Oct 12 17:49:00 nalrcsvridbq02 trace-agent[2039259]: 2020-10-12 17:49:00 EDT
| TRACE | INFO | (pkg/trace/info/stats.go:101 in LogStats) | No data
received
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: [17:49:00] Topology snapshot
[ver=21, locNode=2eca41b3, servers=1, clients=0, state=ACTIVE, CPUs=2,
offheap=2.0GB, heap=0.25GB]
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: [17:49:00]   ^-- Baseline
[id=0, size=2, online=1, offline=1]
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: [17:49:00] (err) Failed to
execute compound future reducer: GridNearTxFinishFuture
[futId=7a28b630571-b3eac955-0171-4b45-b048-84653e88427e, tx=GridNearTxLocal
[mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
[entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey [key=KeyCacheObject
[hasValBytes=true], cacheId=-27866919], val=BinaryObject [idHash=1523169004,
hash=1743117496][op=CREATE, val=], prevVal=[op=NOOP, val=null],
oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObject [hasValBytes=true], val=null,
ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=684422756,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=2, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=212970798,
order=1602505621810, nodeOrder=16]]], explicitLock=false, queryUpdate=false,
dhtVer=null, last=false, nearEntries=0, clientFirst=false,
node=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee]], nearLocallyMapped=false,
colocatedLocallyMapped=false, needCheckBackup=false, hasRemoteLocks=false,
trackTimeout=false, systemTime=0, systemStartTime=3310262430790233,
prepareStartTime=3310262430790275, prepareTime=80685139741,
commitOrRollbackStartTime=3310343115930116, commitOrRollbackTime=0,
txDumpsThrottling=o.a.i.i.processors.cache.transactions.IgniteTxManager$TxDumpsThrottling@4b23d4ce,
lb=null, mvccOp=false, qryId=-1, crdVer=0,
thread=client-connector-#129%RailConnect Ignite QA Grid%,
mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
[entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey [key=KeyCacheObject
[hasValBytes=true], cacheId=-27866919], val=BinaryObject [idHash=1523169004,
hash=1743117496][op=CREATE, val=], prevVal=[op=NOOP, val=null],
oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObject [hasValBytes=true], val=null,
ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=684422756,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=2, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=212970798,
order=1602505621810, nodeOrder=16]]], explicitLock=false, queryUpdate=false,
dhtVer=null, last=false, nearEntries=0, clientFirst=false,
node=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee]], super=GridDhtTxLocalAdapter
[nearOnOriginatingNode=false, nearNodes=KeySetView [], dhtNodes=KeySetView
[], explicitLock=false, super=IgniteTxLocalAdapter [completedBase=null,
sndTransformedVals=false, depEnabled=false,
txState=IgniteTxImplicitSingleStateImpl [init=true, recovery=false,
useMvccCaching=false], super=IgniteTxAdapter [xidVer=GridCacheVersion
[topVer=212970798, order=1602505621810, nodeOrder=16], writeVer=null,
implicit=true, loc=true, threadId=195, startTime=1602539259845,
nodeId=2eca41b3-7078-44ce-89d2-8bc6f67334a6, startVer=GridCacheVersion
[topVer=212970798, order=1602505621810, nodeOrder=16], endVer=null,
isolation=READ_COMMITTED, concurrency=OPTIMISTIC, timeout=0,
sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion
[topVer=212970798, order=1602505621810, nodeOrder=16], finalizing=NONE,
invalidParts=null, state=PREPARED, timedOut=false,
topVer=AffinityTopologyVersion [topVer=20, minorTopVer=1],
mvccSnapshot=null, skipCompletedVers=false, parentTx=null, duration=80663ms,
onePhaseCommit=true], size=1]]], commit=true,
mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
[entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey [key=KeyCacheObject
[hasValBytes=true], cacheId=-27866919], val=BinaryObject [idHash=1523169004,
hash=1743117496][op=CREATE, val=], prevVal=[op=NOOP, val=null],
oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=true,
entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObject [hasValBytes=true], val=null,
ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=684422756,
extras=null, flags=0]]], prepared=0, locked=false,
nodeId=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee, locMapped=false,
expiryPlc=null, transferExpiryPlc=false, flags=2, partUpdateCntr=0,
serReadVer=null, xidVer=GridCacheVersion [topVer=212970798,
order=1602505621810, nodeOrder=16]]], explicitLock=false, queryUpdate=false,
dhtVer=null, last=false, nearEntries=0, clientFirst=false,
node=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee]], trackable=true,
finishOnePhaseCalled=false, innerFuts=TransformCollectionView
[CheckBackupFuture[node=2eca41b3-7078-44ce-89d2-8bc6f67334a6, loc=true,
done=true]], super=GridCompoundIdentityFuture [super=GridCompoundFuture
[rdc=AlwaysTrueReducer [], initFlag=0, lsnrCalls=0, done=false,
cancelled=false, err=null, futs=TransformCollectionView [true]]]]class
org.apache.ignite.internal.transactions.IgniteTxRollbackCheckedException:
Failed to commit transaction (transaction has been rolled back on backup
node): GridCacheVersion [topVer=212970798, order=1602505621810,
nodeOrder=16]
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.checkBackup(GridNearTxFinishFuture.java:618)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:378)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$25.apply(GridNearTxLocal.java:4129)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$25.apply(GridNearTxLocal.java:4118)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onComplete(GridNearOptimisticTxPrepareFuture.java:281)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onError(GridNearOptimisticTxPrepareFuture.java:152)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.access$300(GridNearOptimisticTxPrepareFuture.java:76)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture$MiniFuture.onNodeLeft(GridNearOptimisticTxPrepareFuture.java:938)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onNodeLeft(GridNearOptimisticTxPrepareFuture.java:127)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.GridCacheMvccManager$4.onEvent(GridCacheMvccManager.java:266)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager$LocalListenerWrapper.onEvent(GridEventStorageManager.java:1391)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:886)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:871)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record0(GridEventStorageManager.java:346)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:309)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:2762)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:2990)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:2794)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
java.lang.Thread.run(Thread.java:748)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: Caused by: class
org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Primary
node left grid: 3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.checkBackup(GridNearTxFinishFuture.java:612)
Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011... 25 more
Oct 12 17:49:00 nalrcsvridbq02 agent[2039257]: 2020-10-12 17:49:00 EDT |
CORE | ERROR | (pkg/forwarder/worker.go:174 in process) | Too many errors
for endpoint
'https://7-22-1-app.agent.datadoghq.com/api/v1/series?api_key=*************************44602':
retrying later
Oct 12 17:49:40 nalrcsvridbq02 chronyd[1216]: Forward time jump detected!
Oct 12 17:49:41 nalrcsvridbq02 kernel: watchdog: BUG: soft lockup - CPU#0
stuck for 38s! [kworker/u256:0:3703404]
Oct 12 17:49:41 nalrcsvridbq02 kernel: Modules linked in: binfmt_misc
nf_tables nfnetlink vmw_vsock_vmci_transport vsock ext4 mbcache jbd2
intel_rapl_msr intel_rapl_common nfit libnvdimm crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel vmw_balloon intel_rapl_perf joydev pcspkr
vmw_vmci i2c_piix4 auth_rpcgss sunrpc ip_tables xfs libcrc32c sr_mod cdrom
vmwgfx ata_generic drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops sd_mod sg ttm ahci drm libahci crc32c_intel ata_piix serio_raw
vmxnet3 libata vmw_pvscsi dm_mirror dm_region_hash dm_log dm_mod
Oct 12 17:49:41 nalrcsvridbq02 kernel: CPU: 0 PID: 3703404 Comm:
kworker/u256:0 Kdump: loaded Tainted: G             L   --------- -  -
4.18.0-193.14.2.el8_2.x86_64 #1
Oct 12 17:49:41 nalrcsvridbq02 kernel: Hardware name: VMware, Inc. VMware
Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/03/2018
Oct 12 17:49:41 nalrcsvridbq02 kernel: Workqueue: writeback wb_workfn
(flush-253:3)
Oct 12 17:49:41 nalrcsvridbq02 kernel: RIP:
0010:_raw_spin_unlock_irqrestore+0x11/0x20
Oct 12 17:49:41 nalrcsvridbq02 kernel: Code: 85 ff 48 29 e8 4c 39 e0 76 cf
80 0b 08 eb 8c 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 0f 1f
40 00 48 89 f7 57 9d <0f> 1f 44 00 00 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44
00 00 8b 07





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Ignite instance fails. Unsure as to root cause

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Oct 12 17:49:41 nalrcsvridbq02 kernel: watchdog: BUG: soft lockup - CPU#0
stuck for 38s! [kworker/u256:0:3703404]

This is bad. Your system kernel says your CPU#0 was hanging for 38 seconds.

This is enough to trigger failure detection timeout and kill and instance,
or at least for the node to be segmented from cluster by the other nodes.

Regards,
-- 
Ilya Kasnacheev


ср, 14 окт. 2020 г. в 16:00, bbellrose <br...@wabtec.com>:

> Oct 12 17:47:39 nalrcsvridbq02 Ignite[2031634]: [17:47:39] Possible failure
> suppressed accordingly to a configured handler
> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
> o.a.i.IgniteException: GridWorker [name=nio-acceptor-tcp-comm,
> igniteInstanceName=RailConnect Ignite QA Grid, finished=false,
> heartbeatTs=1602539219763]]]
> Oct 12 17:48:20 nalrcsvridbq02 Ignite[2031634]: [17:48:20] Possible failure
> suppressed accordingly to a configured handler
> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
> o.a.i.IgniteException: GridWorker [name=grid-nio-worker-tcp-comm-1,
> igniteInstanceName=RailConnect Ignite QA Grid, finished=false,
> heartbeatTs=1602539260020]]]
> Oct 12 17:48:20 nalrcsvridbq02 Ignite[2031634]: [17:48:20] Possible failure
> suppressed accordingly to a configured handler
> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
> o.a.i.IgniteException: GridWorker [name=ttl-cleanup-worker,
> igniteInstanceName=RailConnect Ignite QA Grid, finished=false,
> heartbeatTs=1602539260020]]]
> Oct 12 17:48:20 nalrcsvridbq02 Ignite[2031634]: [17:48:20] Possible failure
> suppressed accordingly to a configured handler
> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
> o.a.i.IgniteException: GridWorker [name=db-checkpoint-thread,
> igniteInstanceName=RailConnect Ignite QA Grid, finished=false,
> heartbeatTs=1602539260006]]]
> Oct 12 17:49:00 nalrcsvridbq02 chronyd[1216]: Forward time jump detected!
> Oct 12 17:49:00 nalrcsvridbq02 chronyd[1216]: Can't synchronise: no
> selectable sources
> Oct 12 17:49:00 nalrcsvridbq02 process-agent[2039258]: 2020-10-12 17:49:00
> EDT | PROCESS | INFO | (collector.go:209 in func1) | Delivery queues:
> process[size=0, weight=0], pod[size=0, weight=0]
> Oct 12 17:49:00 nalrcsvridbq02 agent[2039257]: 2020-10-12 17:49:00 EDT |
> CORE | ERROR | (pkg/forwarder/worker.go:178 in process) | Error while
> processing transaction: error while sending transaction, rescheduling it:
> Post
>
> https://7-22-1-app.agent.datadoghq.com/api/v1/series?api_key=*************************44602
> :
> net/http: request canceled (Client.Timeout exceeded while awaiting headers)
> Oct 12 17:49:00 nalrcsvridbq02 trace-agent[2039259]: 2020-10-12 17:49:00
> EDT
> | TRACE | INFO | (pkg/trace/info/stats.go:101 in LogStats) | No data
> received
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: [17:49:00] Topology
> snapshot
> [ver=21, locNode=2eca41b3, servers=1, clients=0, state=ACTIVE, CPUs=2,
> offheap=2.0GB, heap=0.25GB]
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: [17:49:00]   ^-- Baseline
> [id=0, size=2, online=1, offline=1]
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: [17:49:00] (err) Failed to
> execute compound future reducer: GridNearTxFinishFuture
> [futId=7a28b630571-b3eac955-0171-4b45-b048-84653e88427e, tx=GridNearTxLocal
> [mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
> [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey
> [key=KeyCacheObject
> [hasValBytes=true], cacheId=-27866919], val=BinaryObject
> [idHash=1523169004,
> hash=1743117496][op=CREATE, val=], prevVal=[op=NOOP, val=null],
> oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
> conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
> filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=true,
> entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
> [super=GridCacheMapEntry [key=KeyCacheObject [hasValBytes=true], val=null,
> ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=684422756,
> extras=null, flags=0]]], prepared=0, locked=false,
> nodeId=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee, locMapped=false,
> expiryPlc=null, transferExpiryPlc=false, flags=2, partUpdateCntr=0,
> serReadVer=null, xidVer=GridCacheVersion [topVer=212970798,
> order=1602505621810, nodeOrder=16]]], explicitLock=false,
> queryUpdate=false,
> dhtVer=null, last=false, nearEntries=0, clientFirst=false,
> node=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee]], nearLocallyMapped=false,
> colocatedLocallyMapped=false, needCheckBackup=false, hasRemoteLocks=false,
> trackTimeout=false, systemTime=0, systemStartTime=3310262430790233,
> prepareStartTime=3310262430790275, prepareTime=80685139741,
> commitOrRollbackStartTime=3310343115930116, commitOrRollbackTime=0,
>
> txDumpsThrottling=o.a.i.i.processors.cache.transactions.IgniteTxManager$TxDumpsThrottling@4b23d4ce
> ,
> lb=null, mvccOp=false, qryId=-1, crdVer=0,
> thread=client-connector-#129%RailConnect Ignite QA Grid%,
> mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
> [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey
> [key=KeyCacheObject
> [hasValBytes=true], cacheId=-27866919], val=BinaryObject
> [idHash=1523169004,
> hash=1743117496][op=CREATE, val=], prevVal=[op=NOOP, val=null],
> oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
> conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
> filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=true,
> entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
> [super=GridCacheMapEntry [key=KeyCacheObject [hasValBytes=true], val=null,
> ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=684422756,
> extras=null, flags=0]]], prepared=0, locked=false,
> nodeId=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee, locMapped=false,
> expiryPlc=null, transferExpiryPlc=false, flags=2, partUpdateCntr=0,
> serReadVer=null, xidVer=GridCacheVersion [topVer=212970798,
> order=1602505621810, nodeOrder=16]]], explicitLock=false,
> queryUpdate=false,
> dhtVer=null, last=false, nearEntries=0, clientFirst=false,
> node=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee]], super=GridDhtTxLocalAdapter
> [nearOnOriginatingNode=false, nearNodes=KeySetView [], dhtNodes=KeySetView
> [], explicitLock=false, super=IgniteTxLocalAdapter [completedBase=null,
> sndTransformedVals=false, depEnabled=false,
> txState=IgniteTxImplicitSingleStateImpl [init=true, recovery=false,
> useMvccCaching=false], super=IgniteTxAdapter [xidVer=GridCacheVersion
> [topVer=212970798, order=1602505621810, nodeOrder=16], writeVer=null,
> implicit=true, loc=true, threadId=195, startTime=1602539259845,
> nodeId=2eca41b3-7078-44ce-89d2-8bc6f67334a6, startVer=GridCacheVersion
> [topVer=212970798, order=1602505621810, nodeOrder=16], endVer=null,
> isolation=READ_COMMITTED, concurrency=OPTIMISTIC, timeout=0,
> sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion
> [topVer=212970798, order=1602505621810, nodeOrder=16], finalizing=NONE,
> invalidParts=null, state=PREPARED, timedOut=false,
> topVer=AffinityTopologyVersion [topVer=20, minorTopVer=1],
> mvccSnapshot=null, skipCompletedVers=false, parentTx=null,
> duration=80663ms,
> onePhaseCommit=true], size=1]]], commit=true,
> mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping
> [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey
> [key=KeyCacheObject
> [hasValBytes=true], cacheId=-27866919], val=BinaryObject
> [idHash=1523169004,
> hash=1743117496][op=CREATE, val=], prevVal=[op=NOOP, val=null],
> oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
> conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
> filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=true,
> entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry
> [super=GridCacheMapEntry [key=KeyCacheObject [hasValBytes=true], val=null,
> ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=684422756,
> extras=null, flags=0]]], prepared=0, locked=false,
> nodeId=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee, locMapped=false,
> expiryPlc=null, transferExpiryPlc=false, flags=2, partUpdateCntr=0,
> serReadVer=null, xidVer=GridCacheVersion [topVer=212970798,
> order=1602505621810, nodeOrder=16]]], explicitLock=false,
> queryUpdate=false,
> dhtVer=null, last=false, nearEntries=0, clientFirst=false,
> node=3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee]], trackable=true,
> finishOnePhaseCalled=false, innerFuts=TransformCollectionView
> [CheckBackupFuture[node=2eca41b3-7078-44ce-89d2-8bc6f67334a6, loc=true,
> done=true]], super=GridCompoundIdentityFuture [super=GridCompoundFuture
> [rdc=AlwaysTrueReducer [], initFlag=0, lsnrCalls=0, done=false,
> cancelled=false, err=null, futs=TransformCollectionView [true]]]]class
> org.apache.ignite.internal.transactions.IgniteTxRollbackCheckedException:
> Failed to commit transaction (transaction has been rolled back on backup
> node): GridCacheVersion [topVer=212970798, order=1602505621810,
> nodeOrder=16]
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.checkBackup(GridNearTxFinishFuture.java:618)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:378)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$25.apply(GridNearTxLocal.java:4129)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$25.apply(GridNearTxLocal.java:4118)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onComplete(GridNearOptimisticTxPrepareFuture.java:281)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onError(GridNearOptimisticTxPrepareFuture.java:152)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.access$300(GridNearOptimisticTxPrepareFuture.java:76)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture$MiniFuture.onNodeLeft(GridNearOptimisticTxPrepareFuture.java:938)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onNodeLeft(GridNearOptimisticTxPrepareFuture.java:127)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.GridCacheMvccManager$4.onEvent(GridCacheMvccManager.java:266)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager$LocalListenerWrapper.onEvent(GridEventStorageManager.java:1391)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:886)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:871)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record0(GridEventStorageManager.java:346)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:309)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:2762)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:2990)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:2794)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
> java.lang.Thread.run(Thread.java:748)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: Caused by: class
> org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Primary
> node left grid: 3f8b7981-ee81-4ca4-9f52-5c6f03cb8cee
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011at
>
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.checkBackup(GridNearTxFinishFuture.java:612)
> Oct 12 17:49:00 nalrcsvridbq02 Ignite[2031634]: #011... 25 more
> Oct 12 17:49:00 nalrcsvridbq02 agent[2039257]: 2020-10-12 17:49:00 EDT |
> CORE | ERROR | (pkg/forwarder/worker.go:174 in process) | Too many errors
> for endpoint
> '
> https://7-22-1-app.agent.datadoghq.com/api/v1/series?api_key=*************************44602
> ':
> retrying later
> Oct 12 17:49:40 nalrcsvridbq02 chronyd[1216]: Forward time jump detected!
> Oct 12 17:49:41 nalrcsvridbq02 kernel: watchdog: BUG: soft lockup - CPU#0
> stuck for 38s! [kworker/u256:0:3703404]
> Oct 12 17:49:41 nalrcsvridbq02 kernel: Modules linked in: binfmt_misc
> nf_tables nfnetlink vmw_vsock_vmci_transport vsock ext4 mbcache jbd2
> intel_rapl_msr intel_rapl_common nfit libnvdimm crct10dif_pclmul
> crc32_pclmul ghash_clmulni_intel vmw_balloon intel_rapl_perf joydev pcspkr
> vmw_vmci i2c_piix4 auth_rpcgss sunrpc ip_tables xfs libcrc32c sr_mod cdrom
> vmwgfx ata_generic drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops sd_mod sg ttm ahci drm libahci crc32c_intel ata_piix serio_raw
> vmxnet3 libata vmw_pvscsi dm_mirror dm_region_hash dm_log dm_mod
> Oct 12 17:49:41 nalrcsvridbq02 kernel: CPU: 0 PID: 3703404 Comm:
> kworker/u256:0 Kdump: loaded Tainted: G             L   --------- -  -
> 4.18.0-193.14.2.el8_2.x86_64 #1
> Oct 12 17:49:41 nalrcsvridbq02 kernel: Hardware name: VMware, Inc. VMware
> Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/03/2018
> Oct 12 17:49:41 nalrcsvridbq02 kernel: Workqueue: writeback wb_workfn
> (flush-253:3)
> Oct 12 17:49:41 nalrcsvridbq02 kernel: RIP:
> 0010:_raw_spin_unlock_irqrestore+0x11/0x20
> Oct 12 17:49:41 nalrcsvridbq02 kernel: Code: 85 ff 48 29 e8 4c 39 e0 76 cf
> 80 0b 08 eb 8c 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 0f 1f
> 40 00 48 89 f7 57 9d <0f> 1f 44 00 00 c3 66 0f 1f 84 00 00 00 00 00 0f 1f
> 44
> 00 00 8b 07
>
>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>