Posted to user@ignite.apache.org by Ilya <iy...@gmail.com> on 2017/05/12 12:57:52 UTC

Understanding the mechanics of peer class loading

Hi all!

The question was originally asked (but not answered) on SO:
http://stackoverflow.com/questions/43803402/how-does-peer-classloading-work-in-apache-ignite

In short, we have "Failed to deploy user message" exceptions under high load
in our project.

Here is an overview of our architecture:
- Distributed cache on three nodes, all nodes run on a single workstation
(in this test);
- Workers on each node;
- Messaging between workers is done using IgniteMessaging (topic has the
type of String and I've tried both byte[] and ByteBuffer as a message
class);
- Client connects to the cluster and triggers some business logic that
causes cross-node messaging, scan queries and MR jobs (using
IgniteCompute::broadcast). All of these may be performed concurrently.

I've tried both the SHARED and CONTINUOUS deployment modes, but the result
remains the same.

I've noticed lots of similar messages in the logs:
2017-05-05 13:31:28 INFO   org.apache.ignite.logger.java.JavaLogger info
Removed undeployed class: GridDeployment [ts=1493980288578,
depMode=CONTINUOUS, clsLdr=WebAppClassLoader=MyApp@38815daa,
clsLdrId=36c3828db51-0d65e7d5-77bf-444d-9b8b-d18bde94ad13, userVer=0,
loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
undeployed=true, usage=0]
...
2017-05-05 13:31:29 INFO   org.apache.ignite.logger.java.JavaLogger info
Removed undeployed class: GridDeployment [ts=1493980289125,
depMode=CONTINUOUS, clsLdr=WebAppClassLoader=MyApp@355f6680,
clsLdrId=1dd3828db51-1b20df7a-a98d-45a3-8ab6-e5d229945830, userVer=0,
loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
undeployed=true, usage=0]
...

This happens when I use ByteBuffer as the message type. With byte[], the
class [B (the JVM name of byte[]) is constantly re-deployed.
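For reference, a minimal plain-JDK sketch (no Ignite API involved; the class
name BootstrapClassNames is made up for illustration) showing where the name
[B comes from, and that byte[], ByteBuffer and String are all defined by the
bootstrap class loader. It only illustrates that the WebAppClassLoader in the
log entries is not the defining loader of these classes; it says nothing
about how Ignite chooses the loader it attaches to a deployment.

```java
import java.nio.ByteBuffer;

public class BootstrapClassNames {
    /** Describes a class as "name @ defining loader". */
    static String describe(Class<?> cls) {
        // getClassLoader() returns null for classes defined by the bootstrap loader.
        ClassLoader ldr = cls.getClassLoader();
        return cls.getName() + " @ " + (ldr == null ? "bootstrap" : ldr.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(describe(byte[].class));     // [B @ bootstrap
        System.out.println(describe(ByteBuffer.class)); // java.nio.ByteBuffer @ bootstrap
        System.out.println(describe(String.class));     // java.lang.String @ bootstrap
    }
}
```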

The ScanQuery predicate and the IgniteCompute caller are also constantly
re-deployed.
If we disable ScanQueries and IgniteCompute broadcasts, everything is fine and
there are no re-deployments.

For further testing I've disabled the MR jobs and kept the ScanQueries. I've
also added some debug output to a fresh snapshot of Ignite 2.1.0. Messages
"Class locally deployed: <my ScanQuery predicate>" usually come from the
following call stack:
org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.recordDeploy(GridDeploymentLocalStore.java:404)
	at org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.deploy(GridDeploymentLocalStore.java:333)
	at org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.getDeployment(GridDeploymentLocalStore.java:201)
	at org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getLocalDeployment(GridDeploymentManager.java:383)
	at org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getDeployment(GridDeploymentManager.java:345)
	at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.injectResources(GridCacheQueryManager.java:918)
	at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.scanIterator(GridCacheQueryManager.java:826)
	at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.executeQuery(GridCacheQueryManager.java:611)
	at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.queryResult(GridCacheQueryManager.java:1593)
	at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.runQuery(GridCacheQueryManager.java:1164)
	at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager.processQueryRequest(GridCacheDistributedQueryManager.java:231)
	at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$2.apply(GridCacheDistributedQueryManager.java:109)
	at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$2.apply(GridCacheDistributedQueryManager.java:107)
	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:863)
	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:386)
	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:308)
	at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:100)
	at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:253)
	at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1257)
	at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:885)
	at org.apache.ignite.internal.managers.communication.GridIoManager.access$2100(GridIoManager.java:114)
	at org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:802)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

Messages like "Removed undeployed class" usually come from the
IgniteMessaging call stack.

I've looked into the Ignite kernel a bit, and I suspect that undeploy is
triggered for all classes in a class loader when at least one class that
resides in that class loader is re-deployed in some other loader.

It happens inside
org.apache.ignite.spi.deployment.local.LocalDeploymentSpi#register:

    First, we get a "Map of new resources added for registered class
loader" using LocalDeploymentSpi#addResource.
    Then we "Remove resources for all class loaders except {@code
ignoreClsLdr}." using LocalDeploymentSpi#removeResources. Inside this
method, it looks like all loaders that contain the old version of the
new resource are added to a "doomed" collection.
    Finally, we iterate over this collection and call onClassLoaderReleased
for each element. This last step is what actually undeploys all the classes
(and produces the "Removed undeployed class" messages).
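
The three steps can be modelled with a toy registry. This is only a sketch of
the behaviour as it is described in this message, not Ignite's code: the class
ToyDeploymentSpi, the use of plain strings as loader ids and the "released"
list are hypothetical and exist purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the register() flow described above; NOT the real LocalDeploymentSpi. */
public class ToyDeploymentSpi {
    /** Class names registered per loader; loaders are plain string ids here. */
    private final Map<String, Set<String>> resourcesByLoader = new LinkedHashMap<>();

    /** Loaders released so far (stands in for the onClassLoaderReleased callback). */
    final List<String> released = new ArrayList<>();

    /** Registers clsName under ldrId and releases every other loader that already holds it. */
    void register(String ldrId, String clsName) {
        // Step 1: add the resource for the registering loader.
        boolean added = resourcesByLoader.computeIfAbsent(ldrId, k -> new HashSet<>()).add(clsName);
        if (!added)
            return; // already registered for this loader, nothing to evict

        // Step 2: collect all other loaders that hold an older copy of the resource.
        List<String> doomed = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : resourcesByLoader.entrySet())
            if (!e.getKey().equals(ldrId) && e.getValue().contains(clsName))
                doomed.add(e.getKey());

        // Step 3: release each doomed loader as a whole, dropping all of its classes.
        for (String ldr : doomed) {
            resourcesByLoader.remove(ldr);
            released.add(ldr);
        }
    }

    /** Returns the class names currently registered for the given loader id. */
    Set<String> classesOf(String ldrId) {
        return resourcesByLoader.getOrDefault(ldrId, Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyDeploymentSpi spi = new ToyDeploymentSpi();
        spi.register("ldrA", "MyPredicate");
        spi.register("ldrA", "MyCallable");
        spi.register("ldrB", "MyPredicate"); // same class name arriving through another loader
        System.out.println(spi.released);          // [ldrA]
        System.out.println(spi.classesOf("ldrA")); // []
    }
}
```

In this model, registering MyPredicate through ldrB also drops MyCallable,
although MyCallable itself was never re-registered, which is the effect being
asked about here.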

I don't understand this concept. Why are there multiple classloaders? Why do
we undeploy the whole classloader in such cases?

I'd be grateful if someone could explain how peer class loading works in
Ignite "under the hood".

P. S. I'm looking at the sources of a fresh snapshot of Ignite 2.1.0, but
the behavior is the same with the standard Ignite 1.9.0.

P. P. S. Unfortunately, I have not managed to reproduce this issue outside
of our project yet.





Re: Understanding the mechanics of peer class loading

Posted by Dave Harvey <dh...@jobcase.com>.

I filed this ticket because we hit a similar problem and I was able to find
some rather suspicious code: https://issues.apache.org/jira/browse/IGNITE-9026






Re: Understanding the mechanics of peer class loading

Posted by Dmitry Pavlov <dp...@gmail.com>.

Hi, Ilya,

Thank you for sharing the logs. Please give me a couple of days to investigate.

Re-deployment of a single class may be caused by different class loaders
being used for that class. A class loaded by a different class loader is
treated as a different class, so it has to be redeployed.
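
A minimal plain-Java sketch of this point (no Ignite involved). The names
LoaderIdentityDemo, Payload and IsolatingLoader are made up for the example,
and the loader is only a rough stand-in for a per-web-app class loader: it
defines Payload itself from the same bytes instead of delegating to its
parent. The sketch assumes the compiled class file is reachable as a resource
through the parent loader.

```java
import java.io.IOException;
import java.io.InputStream;

public class LoaderIdentityDemo {
    /** Stand-in for a user class such as a message payload or a scan-query predicate. */
    public static class Payload {
    }

    /** Defines Payload itself instead of delegating to the parent; all other classes are delegated. */
    static final class IsolatingLoader extends ClassLoader {
        IsolatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(Payload.class.getName()))
                return super.loadClass(name, resolve);

            synchronized (getClassLoadingLock(name)) {
                Class<?> cls = findLoadedClass(name);
                if (cls == null) {
                    byte[] bytes = readClassBytes(name);
                    cls = defineClass(name, bytes, 0, bytes.length);
                }
                return cls;
            }
        }

        /** Reads the compiled class file through the parent loader (readAllBytes needs Java 9+). */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null)
                    throw new ClassNotFoundException(name);
                return in.readAllBytes();
            }
            catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Creates a Payload instance whose class was defined by a fresh IsolatingLoader. */
    static Object newIsolatedPayload() {
        try {
            ClassLoader parent = LoaderIdentityDemo.class.getClassLoader();
            Class<?> cls = new IsolatingLoader(parent).loadClass(Payload.class.getName());
            return cls.getDeclaredConstructor().newInstance();
        }
        catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if casting obj to the application's own Payload class fails. */
    static boolean castFails(Object obj) {
        try {
            Payload p = (Payload) obj;
            return p == null;
        }
        catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object foreign = newIsolatedPayload();
        System.out.println(foreign.getClass().getName().equals(Payload.class.getName())); // true: same name
        System.out.println(foreign.getClass() == Payload.class);                          // false: different class
        System.out.println(castFails(foreign));                                           // true: cast is rejected
    }
}
```

The two classes share a name and bytecode, yet the JVM treats them as
unrelated types because their defining loaders differ.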

Feel free to share any new results, especially if you manage to reproduce
the "Failed to deploy user message" issue.

Best Regards,
Dmitry Pavlov

On Thu, May 18, 2017 at 19:37, Ilya <iy...@gmail.com> wrote:

> Hi Dmitry,
>
> Fair enough about the classloaders, but the stack trace looks like it
> comes from a server node. Why does the client classloader affect
> classloading on the server?
>
> Regarding the logs from our project - here they are (with levels INFO and
> FINEST).
> I see two directions of research:
> 1) "Failed to deploy user message" errors under high load (many parallel
> requests);
> 2) Multiple deployments of a single class (repeated messages like "Class
> locally deployed: class [B,"), which happen even if there is no high load
> (e.g. during sequential processing of 10-40 similar requests).
>
> I've tried to reproduce the second issue in my test project, with no
> success to date.
> What may cause multiple re-deployments of a single class? Or is such
> behavior (seen in files sequential_redeployments*.log) normal?
>
> On Wed, May 17, 2017 at 1:04 PM, Dmitry Pavlov wrote:
>
>> Hi Ilya,
>>
>> In my opinion, this case is not a Peer Class Loading (PCL) problem.
>> Please note that if you disable PCL in the Ignite configuration, the test
>> fails anyway.
>>
>> In this test LocatorBiPredicate is taken by Jetty from the parent class
>> loader (the test launcher class loader), and the cast to IgniteBiPredicate
>> cannot succeed because different class loaders are involved.
>>
>> May I ask you to provide the log and/or stack trace of the original
>> deployment problem in the project? Additionally, you may enable the debug
>> log level for deployment.
>>
>> Best Regards,
>> Dmitry Pavlov
>>
>>
>> On Wed, May 17, 2017 at 11:39, Ilya wrote:
>>
>>> Hi Dmitry!
>>>
>>> Do I understand correctly that GridDeploymentClassLoader is always used
>>> on the server if peer classloading is enabled?
>>>
>>> I have not reproduced the re-deployment issue, but I may have found
>>> another one, which is also related to deployment.
>>> Here is the code:
>>> https://github.com/yakupov/IgniteDeploymentTest
>>>
>>> How to execute:
>>> 1) mvn package in the root;
>>> 2) execute JUnit test IgnitionRunner#testJetty (it's in the ITest
>>> module).
>>>
>>> The result is:
>>> SEVERE: Failed to process message
>>> [senderId=98138e20-8fe4-4750-8281-a92b2067fdcb, messageType=class
>>> o.a.i.i.processors.cache.query.GridCacheQueryRequest]
>>> java.lang.ClassCastException: LocatorBiPredicate cannot be cast to org.apache.ignite.lang.IgniteBiPredicate
>>>     at org.apache.ignite.internal.processors.cache.query.GridCacheQueryRequest.finishUnmarshal(GridCacheQueryRequest.java:324)
>>>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.unmarshall(GridCacheIoManager.java:1298)
>>>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:364)
>>>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:293)
>>>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:95)
>>>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:238)
>>>     at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1222)
>>>     at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:850)
>>>     at org.apache.ignite.internal.managers.communication.GridIoManager.access$2100(GridIoManager.java:108)
>>>     at org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:790)
>>>     at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:428)
>>>     at java.lang.Thread.run(Thread.java:745)
>>>
>>> where
>>> import org.apache.ignite.lang.IgniteBiPredicate;
>>> public class LocatorBiPredicate implements IgniteBiPredicate<String, Locator> {...}
>>>
>>>
>>> On Tue, May 16, 2017 at 12:51 PM, Dmitry Pavlov wrote:
>>> Hi Ilya,
>>>
>>> My assumption is the following: using GridDeploymentClassLoader to load
>>> the message class on the receiver side is a mandatory condition for
>>> reproducing the issue. For now this is only a hypothesis.
>>>
>>> In my test, if all nodes have the message payload class in their
>>> classpath, there are no problems with messaging under high load (around
>>> 10^6 messages in 32 threads). The same is true if the application class
>>> loader is used for loading the message class (no WAR files and no Jetty).
>>>
>>> Could you please check which class loaders are used on the sender side(s)
>>> and on the receiver side? Is GridDeploymentClassLoader used for loading
>>> the message payload class at the receiver?
>>>
>>> Sincerely,
>>> Dmitry Pavlov
>>>
>>>
>>>
>> On Tue, May 16, 2017 at 12:17, Ilya wrote:
>> Hi Dmitry,
>>
>> Unfortunately, I have not yet managed to reproduce this issue outside of
>> our project.
>>
>> What do you mean by "GridDeploymentClassLoader is used for loading class
>> on server"? How can server classloaders be configured? How does a remote
>> node choose which classloader to deploy the received class in?
>>
>> My test configuration is as follows:
>>
>>    - Web application that has a single cache and has two IgniteMessaging
>>    local listeners;
>>    - It's run from JUnit test code under Jetty;
>>    - The three instances form a cluster;
>>    - One client sends messages to one of the topics (to the random node)
>>    using IgniteMessaging.
>>
>> All of this runs in a single JVM. I suspect that Jetty servers might
>> use a dedicated classloader for each web-app. I've tried to launch the
>> client from both test code and another web-app under Jetty, but that did
>> not change anything.
>>
>> In fact, the failing test on our production application is launched in
>> the same manner: two exploded web-apps (client and server), three instances
>> of the server app, and all of this runs under Jetty in a single VM...
>>
>> On Sun, May 14, 2017 at 2:22 PM, Dmitry Pavlov wrote:
>> Hi, Ilya,
>>
>> I've tried to reproduce the deployment problem in a standalone project
>> that calls Ignition.start() in several WAR files, but this test still
>> passes.
>>
>> It is still possible that the deployment problem can be reproduced only
>> when GridDeploymentClassLoader is used for loading the class on the server
>> and several different web app class loaders are in use on the clients.
>>
>> Do you have a standalone reproducer you can share?
>>
>> Best Regards,
>> Dmitry Pavlov
>>
>>
>>
> *hce-ignition-logs.7z* (1M) Download Attachment
> <http://apache-ignite-users.70518.x6.nabble.com/attachment/13007/0/hce-ignition-logs.7z>

Re: Understanding the mechanics of peer class loading

Posted by Ilya <iy...@gmail.com>.

Hi Dmitry,

Fair enough about the classloaders, but the stack trace looks like it comes
from a server node. Why does the client classloader affect classloading on
the server?

Regarding the logs from our project - here they are (with levels INFO and
FINEST).
I see two directions of research:
1) "Failed to deploy user message" errors under high load (many parallel
requests);
2) Multiple deployments of a single class (repeated messages like "Class
locally deployed: class [B,"), which happen even if there is no high load
(e.g. during sequential processing of 10-40 similar requests).

I've tried to reproduce the second issue in my test project, with no
success to date.
What may cause multiple re-deployments of a single class? Or is such
behavior (seen in files sequential_redeployments*.log) normal?

On Wed, May 17, 2017 at 1:04 PM, Dmitry Pavlov [via Apache Ignite Users] <
ml+s70518n12967h13@n6.nabble.com> wrote:

> Hi Ilya,
>
> To my mind in this case it's not a problem of Peer Class Loading (PCL).
> Please note if you disable PCL in Ignite Configuration, test will fail
> anyway.
>
> In this test LocatorBiPredicate is taken by Jetty from the parent class
> loader (test launcher class loader) and cast to IgnitePredicate can't
> complete sucessfull because of different class loaders used.
>
> May I ask to to provide log and/or stacktrace of original deployment
> problem in the project? Additionally you may enable debug log level for
> deployment.
>
> Best Regards,
> Dmitry Pavlov
>
> ср, 17 мая 2017 г. в 11:39, Ilya <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=12967&i=0>>:
>
>> Hi Dmitry!
>>
>> Do I understand correct that GridDeploymentClassLoader is always on
>> server used if peer classloading is enabled?
>>
>> I've did not reproduce the re-deployment issue, but maybe have found
>> another, which is related to deployment.
>> Here is the code:
>> https://github.com/yakupov/IgniteDeploymentTest
>>
>> How to execute:
>> 1) mvn package in the root;
>> 2) execute JUnit test IgnitionRunner#testJetty (it's in the ITest module).
>>
>> The result is:
>> SEVERE: Failed to process message [senderId=98138e20-8fe4-4750-8281-a92b2067fdcb,
>> messageType=class o.a.i.i.processors.cache.query.GridCacheQueryRequest]
>> java.lang.ClassCastException: LocatorBiPredicate cannot be cast to
>> org.apache.ignite.lang.IgniteBiPredicate
>>     at org.apache.ignite.internal.processors.cache.query.
>> GridCacheQueryRequest.finishUnmarshal(GridCacheQueryRequest.java:324)
>>     at org.apache.ignite.internal.processors.cache.
>> GridCacheIoManager.unmarshall(GridCacheIoManager.java:1298)
>>     at org.apache.ignite.internal.processors.cache.
>> GridCacheIoManager.onMessage0(GridCacheIoManager.java:364)
>>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.
>> handleMessage(GridCacheIoManager.java:293)
>>     at org.apache.ignite.internal.processors.cache.
>> GridCacheIoManager.access$000(GridCacheIoManager.java:95)
>>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.
>> onMessage(GridCacheIoManager.java:238)
>>     at org.apache.ignite.internal.managers.communication.
>> GridIoManager.invokeListener(GridIoManager.java:1222)
>>     at org.apache.ignite.internal.managers.communication.GridIoManager.
>> processRegularMessage0(GridIoManager.java:850)
>>     at org.apache.ignite.internal.managers.communication.
>> GridIoManager.access$2100(GridIoManager.java:108)
>>     at org.apache.ignite.internal.managers.communication.
>> GridIoManager$7.run(GridIoManager.java:790)
>>     at org.apache.ignite.internal.util.StripedExecutor$Stripe.
>> run(StripedExecutor.java:428)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> where
>> import org.apache.ignite.lang.IgniteBiPredicate;
>> public class LocatorBiPredicate implements IgniteBiPredicate<String,
>> Locator> {...}
>>
>>
>> On Tue, May 16, 2017 at 12:51 PM, Dmitry Pavlov [via Apache Ignite Users]
>> <[hidden email] <http:///user/SendEmail.jtp?type=node&node=12961&i=0>>
>> wrote:
>> Hi Ilya,
>>
>> I supposed following: GridDeploymentClassLoader usage for loading message
>> class on receiver side is mandatory condition for reproduction. But for now
>> it is only hypothesis.
>>
>> In my test if all nodes have message payload class in its classpath there
>> are no problems with messaging under high load (around 10^6 of messages in
>> 32 threads). Same is true if application class loader is used in for
>> loading message class (no WAR files & no Jetty).
>>
>> Could you please check class loaders used at sender(s) sides and on
>> receiver side. Is GridDeploymentClassLoader used for loading message
>> payload class at receiver?
>>
>> Sincerely,
>> Dmitry Pavlov
>>
>>
>>
>> вт, 16 мая 2017 г. в 12:17, Ilya <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=12881&i=0>>:
> Hi Dmitry,
>
> Unfortunately, I've did not yet manage to reproduce this issue outside of
> our project.
>
> What do you mean by "GridDeploymentClassLoader is used for loading class
> on server"? How can server classloaders be configured? How does a remote
> node choose, in which classloader to deploy the received class?
>
> My test configuration is as follows:
>
>    - Web application that has a single cache and has two IgniteMessaging
>    local listeners;
>    - It's run from JUnit test code under Jetty;
>    - The three instances form a cluster;
>    - One client sends messages to one of the topics (to the random node)
>    using IgniteMessaging.
>
> All of these works on a single JVM. I suggest that Jetty servers might use
> dedicated classloaders for each web-app. I've tried to launch client from
> both test code and another web-app under Jetty, but that did not change
> anything.
>
> In fact, the failing test on our production application is launched in the
> same manner: two exploded web-apps (client and server), three instances of
> server app and all of these is run under Jettys in a single VM...
>
> On Sun, May 14, 2017 at 2:22 PM, Dmitry Pavlov [via Apache Ignite Users] <[hidden
> email] <http:///user/SendEmail.jtp?type=node&node=12879&i=0>> wrote:
> Hi, Ilya,
>
> I've tried to reproduce deployment problem in standalone project involving
> Ingnite.start() in several WAR files. But this test is still passing.
>
> It is still possible deployment problem can be reprdoced only
> when GridDeploymentClassLoader is used for loading class on server, and
> several different Web App class loaders are enabled on clients.
>
> Do you have standalone reproducer you can share?
>
> Best Regards,
> Dmitry Pavlov

hce-ignition-logs.7z (1M) <http://apache-ignite-users.70518.x6.nabble.com/attachment/13007/0/hce-ignition-logs.7z>




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Understanding-the-mechanics-of-peer-class-loading-tp12661p13007.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Understanding the mechanics of peer class loading

Posted by Dmitry Pavlov <dp...@gmail.com>.

Hi Ilya,

In my opinion, this is not a peer class loading (PCL) problem in this case.
Note that if you disable PCL in the Ignite configuration, the test fails
anyway.

In this test, Jetty takes LocatorBiPredicate from the parent class loader
(the test launcher's class loader), and the cast to IgniteBiPredicate cannot
complete successfully because different class loaders are involved.
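
The effect can be shown without Ignite or Jetty. The following is a minimal
sketch, not the reproducer from this thread: Predicate2, LocatorLikePredicate
and IsolatingLoader are hypothetical stand-ins for IgniteBiPredicate,
LocatorBiPredicate and a web application class loader. It defines the same
class bytes in a second class loader and checks whether an instance can still
be cast to the interface known to the first one.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class LoaderIdentityDemo {
    /** Hypothetical stand-in for IgniteBiPredicate (not Ignite API). */
    public interface Predicate2 {
        boolean apply(String key, String val);
    }

    /** Hypothetical stand-in for LocatorBiPredicate. */
    public static class LocatorLikePredicate implements Predicate2 {
        @Override public boolean apply(String key, String val) {
            return true;
        }
    }

    /** Defines the nested demo classes itself instead of delegating to its parent. */
    static class IsolatingLoader extends ClassLoader {
        private static final String PREFIX = LoaderIdentityDemo.class.getName() + "$";

        IsolatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            // Everything except the nested demo classes uses normal parent delegation.
            if (!name.startsWith(PREFIX))
                return super.loadClass(name, resolve);

            synchronized (getClassLoadingLock(name)) {
                Class<?> cls = findLoadedClass(name);
                if (cls == null) {
                    byte[] bytes = readClassBytes(name);
                    cls = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve)
                    resolveClass(cls);
                return cls;
            }
        }

        /** Reads the compiled class file through the parent loader's resources. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null)
                    throw new ClassNotFoundException(name);
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; )
                    out.write(buf, 0, n);
                return out.toByteArray();
            }
            catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Instantiates LocatorLikePredicate as defined by a second class loader. */
    public static Object loadForeignPredicate() throws Exception {
        ClassLoader isolated = new IsolatingLoader(LoaderIdentityDemo.class.getClassLoader());
        Class<?> cls = isolated.loadClass(LocatorLikePredicate.class.getName());
        return cls.getDeclaredConstructor().newInstance();
    }

    /** Returns true only if the object can be cast to this loader's Predicate2. */
    public static boolean castSucceeds(Object obj) {
        try {
            Predicate2 p = (Predicate2) obj;
            return p != null;
        }
        catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Object foreign = loadForeignPredicate();
        System.out.println("same name:  " + foreign.getClass().getName().equals(LocatorLikePredicate.class.getName()));
        System.out.println("same class: " + (foreign.getClass() == LocatorLikePredicate.class));
        System.out.println("cast works: " + castSucceeds(foreign));
    }
}
```

Compiled with javac and run from a class directory, this should report equal
class names but different Class objects and a failing cast, which is the same
shape as the ClassCastException in the quoted stack trace.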

Could you provide the log and/or stack trace of the original deployment
problem in your project? Additionally, you could enable the debug log level
for deployment.
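
A possible way to do that, sketched under assumptions: the log excerpts in
this thread come from org.apache.ignite.logger.java.JavaLogger, so the sketch
assumes java.util.logging is the backend and that debug output corresponds to
Level.FINE there. The package name is taken from the stack traces in this
thread; DeploymentDebugLogging and enable() are hypothetical helper names.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class DeploymentDebugLogging {
    /** Package seen in the stack traces of this thread. */
    public static final String DEPLOYMENT_PKG = "org.apache.ignite.internal.managers.deployment";

    /** Strong reference, because java.util.logging holds loggers only weakly. */
    private static final Logger DEPLOYMENT_LOG = Logger.getLogger(DEPLOYMENT_PKG);

    /** Lowers the level for the deployment package only. Call once, before the node starts. */
    public static Logger enable() {
        DEPLOYMENT_LOG.setLevel(Level.FINE);

        // The default console handler filters at INFO, so attach one that lets FINE through.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        DEPLOYMENT_LOG.addHandler(handler);

        // Avoid printing INFO and above twice (once here, once via the root handlers).
        DEPLOYMENT_LOG.setUseParentHandlers(false);

        return DEPLOYMENT_LOG;
    }

    public static void main(String[] args) {
        Logger log = enable();
        log.fine("deployment debug logging enabled");
    }
}
```

If a different logger implementation is configured, the equivalent setting
would go into that logger's own configuration instead.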

Best Regards,
Dmitry Pavlov

On Wed, May 17, 2017 at 11:39, Ilya <iy...@gmail.com> wrote:

> Hi Dmitry!
>
> Do I understand correctly that GridDeploymentClassLoader is always used on
> the server if peer class loading is enabled?
>
> I have not reproduced the re-deployment issue, but I may have found
> another one that is related to deployment.
> Here is the code:
> https://github.com/yakupov/IgniteDeploymentTest
>
> How to execute:
> 1) mvn package in the root;
> 2) execute JUnit test IgnitionRunner#testJetty (it's in the ITest module).
>
> The result is:
> SEVERE: Failed to process message
> [senderId=98138e20-8fe4-4750-8281-a92b2067fdcb, messageType=class
> o.a.i.i.processors.cache.query.GridCacheQueryRequest]
> java.lang.ClassCastException: LocatorBiPredicate cannot be cast to
> org.apache.ignite.lang.IgniteBiPredicate
>     at
> org.apache.ignite.internal.processors.cache.query.GridCacheQueryRequest.finishUnmarshal(GridCacheQueryRequest.java:324)
>     at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.unmarshall(GridCacheIoManager.java:1298)
>     at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:364)
>     at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:293)
>     at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:95)
>     at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:238)
>     at
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1222)
>     at
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:850)
>     at
> org.apache.ignite.internal.managers.communication.GridIoManager.access$2100(GridIoManager.java:108)
>     at
> org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:790)
>     at
> org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:428)
>     at java.lang.Thread.run(Thread.java:745)
>
> where
> import org.apache.ignite.lang.IgniteBiPredicate;
> public class LocatorBiPredicate implements IgniteBiPredicate<String,
> Locator> {...}

Re: Understanding the mechanics of peer class loading

Posted by Ilya <iy...@gmail.com>.

Hi Dmitry!

Do I understand correctly that GridDeploymentClassLoader is always used on
the server if peer class loading is enabled?
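
One way to check this empirically would be to print the class loader of the
object that arrives on the receiving node. This is only a sketch:
LoaderInspector and describe() are hypothetical helper names, not Ignite API,
and where to call the helper (for example, inside a messaging listener) is
left to the application.

```java
import java.nio.ByteBuffer;

public class LoaderInspector {
    /** Returns "<class name> <- <loader>", using "bootstrap" for a null class loader. */
    public static String describe(Object obj) {
        Class<?> cls = obj.getClass();
        ClassLoader ldr = cls.getClassLoader();

        // JDK classes and arrays of primitives report a null (bootstrap) loader.
        String ldrDesc = ldr == null
            ? "bootstrap"
            : ldr.getClass().getName() + "@" + Integer.toHexString(System.identityHashCode(ldr));

        return cls.getName() + " <- " + ldrDesc;
    }

    public static void main(String[] args) {
        System.out.println(describe(new byte[0]));
        System.out.println(describe(ByteBuffer.allocate(1)));
        System.out.println(describe("topic"));
        System.out.println(describe(new LoaderInspector()));
    }
}
```

For byte[], ByteBuffer and String the helper reports the bootstrap loader,
since these are JDK types. For application classes it prints the concrete
loader class, so a deployment class loader or a web application class loader
would show up by name.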

I have not reproduced the re-deployment issue, but I may have found
another one that is related to deployment.
Here is the code:
https://github.com/yakupov/IgniteDeploymentTest

How to execute:
1) mvn package in the root;
2) execute JUnit test IgnitionRunner#testJetty (it's in the ITest module).

The result is:
SEVERE: Failed to process message
[senderId=98138e20-8fe4-4750-8281-a92b2067fdcb, messageType=class
o.a.i.i.processors.cache.query.GridCacheQueryRequest]
java.lang.ClassCastException: LocatorBiPredicate cannot be cast to
org.apache.ignite.lang.IgniteBiPredicate
    at
org.apache.ignite.internal.processors.cache.query.GridCacheQueryRequest.finishUnmarshal(GridCacheQueryRequest.java:324)
    at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.unmarshall(GridCacheIoManager.java:1298)
    at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:364)
    at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:293)
    at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:95)
    at
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:238)
    at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1222)
    at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:850)
    at
org.apache.ignite.internal.managers.communication.GridIoManager.access$2100(GridIoManager.java:108)
    at
org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:790)
    at
org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:428)
    at java.lang.Thread.run(Thread.java:745)

where
import org.apache.ignite.lang.IgniteBiPredicate;
public class LocatorBiPredicate implements IgniteBiPredicate<String,
Locator> {...}


On Tue, May 16, 2017 at 12:51 PM, Dmitry Pavlov [via Apache Ignite Users] <
ml+s70518n12881h47@n6.nabble.com> wrote:

> Hi Ilya,
>
> My hypothesis is the following: using GridDeploymentClassLoader to load the
> message class on the receiver side is a mandatory condition for
> reproduction. But for now it is only a hypothesis.
>
> In my test, if all nodes have the message payload class in their classpath,
> there are no problems with messaging under high load (around 10^6 messages
> in 32 threads). The same is true if the application class loader is used for
> loading the message class (no WAR files and no Jetty).
>
> Could you please check which class loaders are used on the sender side(s)
> and on the receiver side? Is GridDeploymentClassLoader used for loading the
> message payload class at the receiver?
>
> Sincerely,
> Dmitry Pavlov
>
> On Fri, May 12, 2017 at 15:58, Ilya <[hidden email]> wrote:
> Hi all!
>
> The question was originally asked (but not answered) on SO:
> http://stackoverflow.com/questions/43803402/how-does-peer-classloading-work-in-apache-ignite
>
> In short, we have "Failed to deploy user message" exceptions under high
> load in our project.
>
> Here is an overview of our architecture:
> - Distributed cache on three nodes, all nodes run on a single workstation
> (in this test);
> - Workers on each node;
> - Messaging between workers is done using IgniteMessaging (topic has the
> type of String and I've tried both byte[] and ByteBuffer as a message
> class);
> - Client connects to the cluster and triggers some business logic that
> causes cross-node messaging, scan queries and MR jobs (using
> IgniteCompute::broadcast). All of these may be performed concurrently.
>
> I've tried both SHARED and CONTINUOUS deployment mode, but the result
> remains the same.
>
> I've noticed lots of similar messages in the logs:
> /2017-05-05 13:31:28 INFO   org.apache.ignite.logger.java.JavaLogger info
> Removed undeployed class: GridDeployment [ts=1493980288578,
> depMode=CONTINUOUS, clsLdr=WebAppClassLoader=MyApp@38815daa,
> clsLdrId=36c3828db51-0d65e7d5-77bf-444d-9b8b-d18bde94ad13, userVer=0,
> loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
> undeployed=true, usage=0]
> ...
> 2017-05-05 13:31:29 INFO   org.apache.ignite.logger.java.JavaLogger info
> Removed undeployed class: GridDeployment [ts=1493980289125,
> depMode=CONTINUOUS, clsLdr=WebAppClassLoader=MyApp@355f6680,
> clsLdrId=1dd3828db51-1b20df7a-a98d-45a3-8ab6-e5d229945830, userVer=0,
> loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
> undeployed=true, usage=0]
> .../
>
> This happens when I use ByteBuffer as message type. In case of byte[],
> class
> B[ is being constantly re-deployed.
>
> ScanQuery predicate and IgniteCompute caller are also being constantly
> re-deployed.
> If we disable ScanQueries and IgniteCompute broadcasts - all is fine, there
> are no re-deployments.
>
> For the further testing I've disabled MRs and kept ScanQueries. I've also
> added some debug output to a fresh snapshot of Ignite 2.1.0. Messages
> "Class
> locally deployed: <my ScanQuery predicate>" usually come from the following
> call stack:
> /org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.
> recordDeploy(GridDeploymentLocalStore.java:404)
>         at
> org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.
> deploy(GridDeploymentLocalStore.java:333)
>         at
> org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.
> getDeployment(GridDeploymentLocalStore.java:201)
>         at
> org.apache.ignite.internal.managers.deployment.GridDeploymentManager.
> getLocalDeployment(GridDeploymentManager.java:383)
>         at
> org.apache.ignite.internal.managers.deployment.GridDeploymentManager.
> getDeployment(GridDeploymentManager.java:345)
>         at
> org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.
> injectResources(GridCacheQueryManager.java:918)
>         at
> org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.
> scanIterator(GridCacheQueryManager.java:826)
>         at
> org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.
> executeQuery(GridCacheQueryManager.java:611)
>         at
> org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.
> queryResult(GridCacheQueryManager.java:1593)
>         at
> org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.
> runQuery(GridCacheQueryManager.java:1164)
>         at
> org.apache.ignite.internal.processors.cache.query.
> GridCacheDistributedQueryManager.processQueryRequest(
> GridCacheDistributedQueryManager.java:231)
>         at
> org.apache.ignite.internal.processors.cache.query.
> GridCacheDistributedQueryManager$2.apply(GridCacheDistributedQueryManag
> er.java:109)
>         at
> org.apache.ignite.internal.processors.cache.query.
> GridCacheDistributedQueryManager$2.apply(GridCacheDistributedQueryManag
> er.java:107)
>         at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.
> processMessage(GridCacheIoManager.java:863)
>         at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(
> GridCacheIoManager.java:386)
>         at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.
> handleMessage(GridCacheIoManager.java:308)
>         at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(
> GridCacheIoManager.java:100)
>         at
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.
> onMessage(GridCacheIoManager.java:253)
>         at
> org.apache.ignite.internal.managers.communication.
> GridIoManager.invokeListener(GridIoManager.java:1257)
>         at
> org.apache.ignite.internal.managers.communication.GridIoManager.
> processRegularMessage0(GridIoManager.java:885)
>         at
> org.apache.ignite.internal.managers.communication.
> GridIoManager.access$2100(GridIoManager.java:114)
>         at
> org.apache.ignite.internal.managers.communication.GridIoManager$7.run(
> GridIoManager.java:802)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:617)/
>
> Messages like "removed undeployed class" usually come from the
> IgniteMessaging's call stack.
>
> I've analyzed the Ignite kernel a bit, and got a suspicion that undeploy is
> being triggered for all classes in a classloader, when at least one class
> that resides in that classloader was re-deployed in some other loader.
>
> It happens inside
> org.apache.ignite.spi.deployment.local.LocalDeploymentSpi#register
>
>     At first, we get a "Map of new resources added for registered class
> loader" using LocalDeploymentSpi#addResource.
>     Then we "Remove resources for all class loaders except {@code
> ignoreClsLdr}." using LocalDeploymentSpi#removeResources. Inside this
> method, it looks like we add all loaders that contain the old version of
> the
> new resource to a "doomed" collection.
>     Finally, we iterate this collection and call onClassLoaderReleased for
> each element. The latter action actually causes all the classes to be
> undeployed (finally causing the "Removed undeployed class" messages).
>
> I don't understand this concept. Why are there multiple classloaders? Why
> do
> we undeploy the whole classloader in such cases?
>
> I'd be grateful, if someone could explain, how does peer classloading work
> in Ignite "under the hood".
>
> P. S. I'm looking at the sources of a fresh snapshot of Ignite 2.1.0, but
> the behavior is the same with the standard Ignite 1.9.0.
>
> P. P. S. Unfortunately, I've did not manage to reproduce this issue outside
> of our project yet.
>
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Understanding-the-mechanics-
> of-peer-class-loading-tp12661.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>
>
> ------------------------------
> If you reply to this email, your message will be added to the discussion
> below:
> http://apache-ignite-users.70518.x6.nabble.com/
> Understanding-the-mechanics-of-peer-class-loading-tp12661p12831.html
> To unsubscribe from Understanding the mechanics of peer class loading, click
> here.
> NAML
> <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>
>
> ------------------------------
> View this message in context: Re: Understanding the mechanics of peer
> class loading
> <http://apache-ignite-users.70518.x6.nabble.com/Understanding-the-mechanics-of-peer-class-loading-tp12661p12879.html>
> Sent from the Apache Ignite Users mailing list archive
> <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com.
>
>
> ------------------------------
> If you reply to this email, your message will be added to the discussion
> below:
> http://apache-ignite-users.70518.x6.nabble.com/
> Understanding-the-mechanics-of-peer-class-loading-tp12661p12881.html
> To unsubscribe from Understanding the mechanics of peer class loading, click
> here
> <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=12661&code=aXlha3Vwb3Y5M0BnbWFpbC5jb218MTI2NjF8LTQ4NzEwMDkzOQ==>
> .
> NAML
> <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>




--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Understanding-the-mechanics-of-peer-class-loading-tp12661p12961.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Understanding the mechanics of peer class loading

Posted by Dmitry Pavlov <dp...@gmail.com>.

Hi Ilya,

My assumption is the following: the problem reproduces only when
GridDeploymentClassLoader is used to load the message class on the receiver
side. For now this is only a hypothesis.

In my test, if all nodes have the message payload class on their classpath,
there are no problems with messaging under high load (around 10^6 messages in
32 threads). The same is true if the application class loader is used to
load the message class (no WAR files and no Jetty).

Could you please check which class loaders are used on the sender side(s)
and on the receiver side? Is GridDeploymentClassLoader used to load the
message payload class on the receiver?
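
One way to run such a check is to log the class-loader chain of each payload from inside the message listener. Below is a minimal sketch that uses only the JDK. LoaderProbe and describe are hypothetical names, not part of Ignite, and where to call it from (sender code, listener code) is left to the application.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper for logging which class loader defined an object's class. */
final class LoaderProbe {
    /** Returns "className <- loader -> parent -> ... -> bootstrap" for the given object. */
    public static String describe(Object obj) {
        Class<?> cls = obj.getClass();
        StringBuilder sb = new StringBuilder(cls.getName()).append(" <- ");

        // A null loader means the class was defined by the bootstrap loader,
        // which is the case for JDK types such as byte[] and ByteBuffer.
        for (ClassLoader ldr = cls.getClassLoader(); ldr != null; ldr = ldr.getParent()) {
            sb.append(ldr.getClass().getName())
              .append('@')
              .append(Integer.toHexString(System.identityHashCode(ldr)))
              .append(" -> ");
        }

        return sb.append("bootstrap").toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(new byte[0]));
        System.out.println(describe(ByteBuffer.allocate(1)));
        System.out.println(describe(new LoaderProbe()));
    }
}
```

Since byte[] and ByteBuffer are always defined by the bootstrap loader, it may be more informative to pass the listener object itself (or the ScanQuery predicate) to describe, because those classes come from the application.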

Sincerely,
Dmitry Pavlov




Re: Understanding the mechanics of peer class loading

Posted by Ilya <iy...@gmail.com>.

Hi Dmitry,

Unfortunately, I have not yet managed to reproduce this issue outside of
our project.

What do you mean by "GridDeploymentClassLoader is used for loading class on
server"? How can server classloaders be configured? How does a remote node
choose which classloader to deploy the received class in?

My test configuration is as follows:

   - Web application that has a single cache and has two IgniteMessaging
   local listeners;
   - It's run from JUnit test code under Jetty;
   - The three instances form a cluster;
   - One client sends messages to one of the topics (on a random node)
   using IgniteMessaging.

All of this runs in a single JVM. I suppose that Jetty might use a
dedicated classloader for each web-app. I've tried launching the client from
both test code and another web-app under Jetty, but that did not change
anything.
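
The per-web-app classloader effect can be illustrated without Jetty or Ignite. Below is a minimal JDK-only sketch, assuming Java 9+ and classes compiled to disk. Payload, IsolatingLoader and LoaderIsolationDemo are hypothetical names, and the loader only imitates the isolation a servlet container provides; it is not Jetty's WebAppClassLoader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Loads the same class through two independent loaders and compares the results. */
final class LoaderIsolationDemo {
    /** Returns the two Class objects obtained for Payload from two separate loaders. */
    public static Class<?>[] loadTwice() {
        String name = Payload.class.getName();
        ClassLoader parent = Payload.class.getClassLoader();
        try {
            return new Class<?>[] {
                new IsolatingLoader(parent, name).loadClass(name),
                new IsolatingLoader(parent, name).loadClass(name)
            };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?>[] c = loadTwice();
        System.out.println("same name:  " + c[0].getName().equals(c[1].getName()));
        System.out.println("same class: " + (c[0] == c[1]));
    }
}

/** Stand-in for an application class that is packaged inside each web-app. */
class Payload { }

/** Defines one named class itself instead of delegating to the parent loader. */
final class IsolatingLoader extends ClassLoader {
    private final String isolated;

    IsolatingLoader(ClassLoader parent, String isolated) {
        super(parent);
        this.isolated = isolated;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(isolated))
            return super.loadClass(name, resolve); // normal parent-first delegation

        synchronized (getClassLoadingLock(name)) {
            Class<?> cls = findLoadedClass(name);
            if (cls == null) {
                byte[] bytes = readClassBytes(name);
                cls = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve)
                resolveClass(cls);
            return cls;
        }
    }

    /** Reads the compiled class file through the parent loader's resources. */
    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String res = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(res)) {
            if (in == null)
                throw new ClassNotFoundException(name);
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both Class objects report the same name, yet they are different classes to the JVM because different loaders defined them. If each web-app instance in the test has its own loader, the same would hold for the listener and predicate classes it contains.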

In fact, the failing test on our production application is launched in the
same manner: two exploded web-apps (client and server), three instances of
the server app, and all of these run under Jetty in a single VM...





--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Understanding-the-mechanics-of-peer-class-loading-tp12661p12879.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Understanding the mechanics of peer class loading

Posted by Dmitry Pavlov <dp...@gmail.com>.

Hi, Ilya,

I've tried to reproduce the deployment problem in a standalone project that
calls Ignition.start() in several WAR files, but this test still passes.

It is still possible that the deployment problem can be reproduced only
when GridDeploymentClassLoader is used to load the class on the server and
several different web-app class loaders are in use on the clients.

Do you have a standalone reproducer you can share?

Best Regards,
Dmitry Pavlov



Fri, May 12, 2017 at 15:58, Ilya <iy...@gmail.com>:

> Hi all!
>
> The question was originally asked (but not answered) on SO:
>
> http://stackoverflow.com/questions/43803402/how-does-peer-classloading-work-in-apache-ignite
>
> In short, we have "Failed to deploy user message" exceptions under high load
> in our project.
>
> Here is an overview of our architecture:
> - Distributed cache on three nodes, all nodes run on a single workstation
> (in this test);
> - Workers on each node;
> - Messaging between workers is done using IgniteMessaging (topic has the
> type of String and I've tried both byte[] and ByteBuffer as a message
> class);
> - Client connects to the cluster and triggers some business logic that
> causes cross-node messaging, scan queries and MR jobs (using
> IgniteCompute::broadcast). All of these may be performed concurrently.
>
> I've tried both SHARED and CONTINUOUS deployment mode, but the result
> remains the same.
>
> I've noticed lots of similar messages in the logs:
> /2017-05-05 13:31:28 INFO   org.apache.ignite.logger.java.JavaLogger info
> Removed undeployed class: GridDeployment [ts=1493980288578,
> depMode=CONTINUOUS, clsLdr=WebAppClassLoader=MyApp@38815daa,
> clsLdrId=36c3828db51-0d65e7d5-77bf-444d-9b8b-d18bde94ad13, userVer=0,
> loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
> undeployed=true, usage=0]
> ...
> 2017-05-05 13:31:29 INFO   org.apache.ignite.logger.java.JavaLogger info
> Removed undeployed class: GridDeployment [ts=1493980289125,
> depMode=CONTINUOUS, clsLdr=WebAppClassLoader=MyApp@355f6680,
> clsLdrId=1dd3828db51-1b20df7a-a98d-45a3-8ab6-e5d229945830, userVer=0,
> loc=true, sampleClsName=java.lang.String, pendingUndeploy=false,
> undeployed=true, usage=0]
> ...
>
> This happens when I use ByteBuffer as the message type. In the case of
> byte[], class [B is being constantly re-deployed.
>
> ScanQuery predicate and IgniteCompute caller are also being constantly
> re-deployed.
> If we disable ScanQueries and IgniteCompute broadcasts, all is fine: there
> are no re-deployments.
>
> For further testing I've disabled MRs and kept ScanQueries. I've also
> added some debug output to a fresh snapshot of Ignite 2.1.0. Messages
> "Class locally deployed: <my ScanQuery predicate>" usually come from the
> following call stack:
>
> org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.recordDeploy(GridDeploymentLocalStore.java:404)
>         at org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.deploy(GridDeploymentLocalStore.java:333)
>         at org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore.getDeployment(GridDeploymentLocalStore.java:201)
>         at org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getLocalDeployment(GridDeploymentManager.java:383)
>         at org.apache.ignite.internal.managers.deployment.GridDeploymentManager.getDeployment(GridDeploymentManager.java:345)
>         at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.injectResources(GridCacheQueryManager.java:918)
>         at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.scanIterator(GridCacheQueryManager.java:826)
>         at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.executeQuery(GridCacheQueryManager.java:611)
>         at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.queryResult(GridCacheQueryManager.java:1593)
>         at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.runQuery(GridCacheQueryManager.java:1164)
>         at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager.processQueryRequest(GridCacheDistributedQueryManager.java:231)
>         at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$2.apply(GridCacheDistributedQueryManager.java:109)
>         at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$2.apply(GridCacheDistributedQueryManager.java:107)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:863)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:386)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:308)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:100)
>         at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:253)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1257)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:885)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.access$2100(GridIoManager.java:114)
>         at org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:802)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
> Messages like "Removed undeployed class" usually come from
> IgniteMessaging's call stack.
>
> I've analyzed the Ignite kernel a bit, and I suspect that undeploy is
> triggered for all classes in a classloader when at least one class
> that resides in that classloader is re-deployed in some other loader.
>
> It happens inside
> org.apache.ignite.spi.deployment.local.LocalDeploymentSpi#register
>
>     First, we get a "Map of new resources added for registered class
> loader" using LocalDeploymentSpi#addResource.
>     Then we "Remove resources for all class loaders except {@code
> ignoreClsLdr}." using LocalDeploymentSpi#removeResources. Inside this
> method, it looks like we add all loaders that contain the old version of
> the new resource to a "doomed" collection.
>     Finally, we iterate over this collection and call onClassLoaderReleased
> for each element. The latter action actually causes all the classes to be
> undeployed (finally causing the "Removed undeployed class" messages).
>
> I don't understand this concept. Why are there multiple classloaders? Why
> do we undeploy the whole classloader in such cases?
>
> I'd be grateful if someone could explain how peer classloading works
> in Ignite "under the hood".
>
> P. S. I'm looking at the sources of a fresh snapshot of Ignite 2.1.0, but
> the behavior is the same with the standard Ignite 1.9.0.
>
> P. P. S. Unfortunately, I have not managed to reproduce this issue outside
> of our project yet.
>
>
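
The register sequence described in the quoted message (add the resource,
collect "doomed" loaders, release each of them) can be modelled with a toy
example. This is only a sketch under assumptions, not Ignite's code: the
class ToyDeploymentRegistry and its methods are hypothetical names, and
class loaders are represented by plain string ids. It illustrates how
registering a resource under one loader could doom every other loader that
holds a resource with the same name, and how releasing a doomed loader drops
all of its resources rather than just the conflicting one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical, not Ignite code) of the register/undeploy cascade. */
public class ToyDeploymentRegistry {
    /** Loader id -> names of resources registered under that loader. */
    private final Map<String, Set<String>> resourcesByLoader = new LinkedHashMap<>();

    /** Ids of loaders released so far, in release order. */
    private final List<String> released = new ArrayList<>();

    /** Registers a resource and releases every other loader holding the same name. */
    public void register(String loaderId, String resourceName) {
        // Step 1: add the resource for the registering loader.
        resourcesByLoader
            .computeIfAbsent(loaderId, k -> new LinkedHashSet<>())
            .add(resourceName);

        // Step 2: collect all other loaders that hold a resource with this name.
        List<String> doomed = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : resourcesByLoader.entrySet()) {
            if (!e.getKey().equals(loaderId) && e.getValue().contains(resourceName))
                doomed.add(e.getKey());
        }

        // Step 3: release each doomed loader, dropping ALL of its resources.
        for (String d : doomed) {
            resourcesByLoader.remove(d);
            released.add(d);
        }
    }

    /** Ids of loaders that still hold resources. */
    public Set<String> loaders() {
        return resourcesByLoader.keySet();
    }

    /** Ids of loaders that have been released. */
    public List<String> released() {
        return released;
    }

    public static void main(String[] args) {
        ToyDeploymentRegistry reg = new ToyDeploymentRegistry();
        reg.register("loaderA", "MyPredicate");
        reg.register("loaderA", "java.lang.String");
        // Same resource name under a different loader dooms loaderA entirely.
        reg.register("loaderB", "MyPredicate");
        System.out.println("released=" + reg.released() + " remaining=" + reg.loaders());
    }
}
```

In this model, loaderA also loses its java.lang.String entry as soon as
MyPredicate is registered under loaderB. That would be consistent with the
"Removed undeployed class" messages showing sampleClsName=java.lang.String,
assuming the real implementation behaves the way the quoted analysis
suggests.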