Posted to user@geode.apache.org by Mateusz Rys <ra...@gmail.com> on 2020/07/01 13:29:24 UTC

Re: CQ Event Notification in Multisite (WAN)

Hello!

Anil,
Yes, it was a new server setup without any data (we deleted the directory).
Region.get() works without problems. With no Listeners active, one client
adds an object to its local database and 10 seconds later, the second
client reads it (also from its local database).

We rebuilt the library a couple of times, adding our own debug printouts
and we found that right before the exception there is an attempt to
deserialize fixedID = -135.
In JAVA it is defined as:
short GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT = -135;
Looking at DSFixedID.hpp from CPP Client I don't see -135 defined. Do you
think this might be the problem?

[debug 2020/07/01 12:43:28.527312 CEST geodeA:24402 139972805822208] OWN
TcrMessage:LOCAL_CREATE: REGNAME /multisiteData
[debug 2020/07/01 12:43:28.527358 CEST geodeA:24402 139972805822208] OWN
TcrMessage:LOCAL_CREATE: isDELTA=false
[debug 2020/07/01 12:43:28.527380 CEST geodeA:24402 139972805822208]
SerializationRegistry::deserialize typeId = -1 dsCode =  93
[debug 2020/07/01 12:43:28.527388 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: PDX type will be deserialized
[debug 2020/07/01 12:43:28.527398 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: PdxTypeHandler::deserialize called
[debug 2020/07/01 12:43:28.527415 CEST geodeA:24402 139972805822208] OWN :
PdxTypeRegistry:getPdxType called typeId=1009228956
[debug 2020/07/01 12:43:28.527428 CEST geodeA:24402 139972805822208] OWN:
PdxTypeRegistry:getLocalPdxType called localType=com.example.Order
[debug 2020/07/01 12:43:28.527451 CEST geodeA:24402 139972805822208]
deserializePdx ClassName = com.example.Order, isLocal = 1
[debug 2020/07/01 12:43:28.527462 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: getPdxSerializableType called: com.example.Order
[debug 2020/07/01 12:43:28.527473 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: TheTypeMap::findPdxSerializable called
com.example.Order
[debug 2020/07/01 12:43:28.527493 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: getPdxSerializableType called: type found
[debug 2020/07/01 12:43:28.527509 CEST geodeA:24402 139972805822208] OWN
pdxHelper: pdxObjectptr OK
[debug 2020/07/01 12:43:28.527525 CEST geodeA:24402 139972805822208] OWN:
PdxType:getLocalToRemoteMap called
[debug 2020/07/01 12:43:28.527535 CEST geodeA:24402 139972805822208] OWN:
PdxType:getRemoteToLocalMap called
[debug 2020/07/01 12:43:28.527546 CEST geodeA:24402 139972805822208] OWN:
PdxLocalReader:initialize called
OWN order from data called
OWN order from data called - after reading name_:product x
[debug 2020/07/01 12:43:28.527590 CEST geodeA:24402 139972805822208] OWN
pdxHelper: pdxObjectptr->fromData(plr) OK
[debug 2020/07/01 12:43:28.527600 CEST geodeA:24402 139972805822208] OWN:
PdxLocalReader:moveStream called
[debug 2020/07/01 12:43:28.527611 CEST geodeA:24402 139972805822208] OWN
pdxHelper: plr.moveStream() OK
[debug 2020/07/01 12:43:28.527628 CEST geodeA:24402 139972805822208]
SerializationRegistry::deserialize typeId = -1 dsCode =  2
[debug 2020/07/01 12:43:28.527638 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: FixedID type will be deserialized
[debug 2020/07/01 12:43:28.527648 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: deserializeDataSerializableFixedId called
[debug 2020/07/01 12:43:28.527665 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: FixedIDShort fixedId: -135
[debug 2020/07/01 12:43:28.527674 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: TheTypeMap::findDataSerializableFixedId called -135
[debug 2020/07/01 12:43:28.527735 CEST geodeA:24402 139972805822208] OWN
SerializationRegistry: exception imminent!!!
[error 2020/07/01 12:43:28.528009 CEST geodeA:24402 139972805822208]
Exception while receiving subscription event for endpoint geodeB:40404::
apache::geode::client::IllegalStateException: Unregistered type in
deserialization
[debug 2020/07/01 12:43:28.528073 CEST geodeA:24402 139972805822208]
TcrConnection::readMessage: receiving reply from endpoint geodeB:40404
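
The failure in the log above can be modelled with a small, self-contained sketch. It is not the native client's code: the names `Serializable`, `Factory`, `readShort` and `deserializeFixedId` are hypothetical, and the byte layout (a dsCode byte of 2 followed by a big-endian 16-bit fixed ID) is an assumption inferred from the "dsCode = 2" and "FixedIDShort fixedId: -135" lines.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a deserializable object.
struct Serializable {
  virtual ~Serializable() = default;
};

using Factory = std::function<std::shared_ptr<Serializable>()>;

// Assumed wire code, taken from the "dsCode = 2" debug line.
constexpr std::uint8_t kDsCodeFixedIdShort = 2;

// Read a big-endian signed 16-bit value starting at `pos` (assumed byte order).
std::int16_t readShort(const std::vector<std::uint8_t>& buf, std::size_t pos) {
  const auto raw =
      static_cast<std::uint16_t>((buf.at(pos) << 8) | buf.at(pos + 1));
  return static_cast<std::int16_t>(raw);
}

// Look up a factory by fixed ID; an unknown ID fails the same way the log does.
std::shared_ptr<Serializable> deserializeFixedId(
    const std::vector<std::uint8_t>& buf,
    const std::map<std::int16_t, Factory>& registry) {
  if (buf.at(0) != kDsCodeFixedIdShort) {
    throw std::invalid_argument("not a FixedIDShort payload");
  }
  const std::int16_t fixedId = readShort(buf, 1);
  const auto it = registry.find(fixedId);
  if (it == registry.end()) {
    throw std::runtime_error("Unregistered type in deserialization");
  }
  return it->second();
}
```

With the bytes 0x02 0xFF 0x79 the sketch decodes the fixed ID as -135, and with an empty registry the lookup throws, which matches the order of events in the debug output.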

Jake,
Yesterday, we also implemented a simple JAVA client and it works without
any problems.
We are doing almost exactly the same things we did in CPP client.
The only differences are:
- using ClientXXX instead of XXX in API calls (for example:
ClientCacheFactory instead of CacheFactory),
  but I assume it's just a difference in API naming convention.
- lack of registerPdxType in JAVA Client as there is no such thing

With two JAVA Clients writing and having listeners, local event is
recognized as DomainClass from the get-go and remote event is recognized as
PdxInstance that we have to convert.
Nevertheless, it works perfectly fine as we would expect.


With CPP Clients, we can see in the server logs that adding PdxType from
remote WAN is recognized.
Does the fact that we have it twice mean that the server treats it as two
separate entries?
[info 2020/06/30 14:09:22.044 CEST <ServerConnection on port 1543 Thread 4> tid=0x52] Adding new type: PdxType[dsid=60, typenum=15169619
        name=com.example.Order
        fields=[
        order_id:int:identity:0:idx0(relativeOffset)=0:idx1(vlfOffsetIndex)=0
        name:String:identity:1:idx0(relativeOffset)=4:idx1(vlfOffsetIndex)=-1
        quantity:short:identity:2:idx0(relativeOffset)=-2:idx1(vlfOffsetIndex)=-1]]

[info 2020/06/30 14:09:22.051 CEST <ServerConnection on port 1543 Thread 4> tid=0x52] Adding, from remote WAN: PdxType[dsid=60, typenum=15169619
        name=com.example.Order
        fields=[
        order_id:int:identity:0:idx0(relativeOffset)=0:idx1(vlfOffsetIndex)=0
        name:String:identity:1:idx0(relativeOffset)=4:idx1(vlfOffsetIndex)=-1
        quantity:short:identity:2:idx0(relativeOffset)=-2:idx1(vlfOffsetIndex)=-1]]

I'm not sure if producing an integration test isn't a bit beyond my
knowledge at the moment.


Regards
Mateusz


On Tue, 30 Jun 2020 at 19:19, Anilkumar Gingade <ag...@vmware.com> wrote:

> Mateusz,
>
>
>
> The type registry (pdx) is maintained at server side only. Clients don’t
> maintain any registry.
>
> When you say, you changed ids; was it in a new server setup, without any
> persistent data (persistent region)?
>
> Can you check, with a fresh setup, whether region.get() works, and then add
> a cache listener to it? This is to isolate/debug the issue.
>
>
>
> -Anil.
>
>
>
>
>
>
>
> *From: *Mateusz Rys <ra...@gmail.com>
> *Reply-To: *"user@geode.apache.org" <us...@geode.apache.org>
> *Date: *Tuesday, June 30, 2020 at 2:45 AM
> *To: *"user@geode.apache.org" <us...@geode.apache.org>
> *Subject: *Re: CQ Event Notification in Multisite (WAN)
>
>
>
> Hi Anil,
>
>
>
> Thanks for the response.
> You are of course right, this isn't connected to CQ. Yesterday we
> implemented CacheListener and the behavior is the same.
>
> Unfortunately, we do have unique distributed ids set in gemfire.properties.
>
>
> #on geodeA
> mcast-port=0
> locators=GeodeA[10334]
> distributed-system-id=1
> remote-locators=GeodeB[10334]
>
> #on geodeB
> mcast-port=0
> locators=GeodeB[10334]
> distributed-system-id=2
> remote-locators=GeodeA[10334]
>
>
>
> Same IDs are used when creating gateway senders.
>
>
> We also noticed that on both clients we have "Deserialized distributed
> member Id 1" output, even after we changed IDs in gemfire.properties to 50
> and 60 for better visualization.
> [debug 2020/06/28 16:44:35.904586 CEST geodeA:3628 140700250048512]
> Deserializing distributed member Id
> [debug 2020/06/28 16:44:35.904675 CEST geodeA:3628 140700250048512]
> SerializationRegistry::deserialize typeId = -1 dsCode =  1
> [debug 2020/06/28 16:44:35.904815 CEST geodeA:3628 140700250048512]
> SerializationRegistry::deserialize typeId = -1 dsCode =  87
> [debug 2020/06/28 16:44:35.905000 CEST geodeA:3628 140700250048512]
> SerializationRegistry::deserialize typeId = -1 dsCode =  87
> [debug 2020/06/28 16:44:35.905108 CEST geodeA:3628 140700250048512]
> SerializationRegistry::deserialize typeId = -1 dsCode =  87
> [debug 2020/06/28 16:44:35.905209 CEST geodeA:3628 140700250048512]
> SerializationRegistry::deserialize typeId = -1 dsCode =  87
> [debug 2020/06/28 16:44:35.905308 CEST geodeA:3628 140700250048512]
> ClientProxyMembershipID::readVersion ordinal = 110
> [debug 2020/06/28 16:44:35.905466 CEST geodeA:3628 140700250048512]
> ClientProxyMembershipID::writeVersion ordinal = 45
> [debug 2020/06/28 16:44:35.905573 CEST geodeA:3628 140700250048512]
> GethashKey :192:168:56:201:41001:server::1 client id:
> 192.168.56.201(3371:loner):2::server
> [debug 2020/06/28 16:44:35.905719 CEST geodeA:3628 140700250048512] Adding
> a new member to the member list maintained for version stamps member Ids.
> HashKey: :192:168:56:201:41001:server::1 MemberCounter: 1
> [debug 2020/06/28 16:44:35.905867 CEST geodeA:3628 140700250048512]
> Deserialized distributed member Id 1
>
>
>
> This would suggest that there is another place to provide IDs on the
> client, but I fail to see anything relevant in the API.
>
> Do you think this can be the problem?
>
>
>
>
>
> Regards
>
> Mateusz
>
>
>
> On Mon, 29 Jun 2020 at 21:04, Anilkumar Gingade <ag...@vmware.com> wrote:
>
> Hi Mateusz,
>
>
>
> The issue is not related to CQ functionality; it's related to serialization
> and de-serialization of the PDX type.
>
>
>
> As you can see from documentation, you need to configure distributed Ids
> in each cluster to get the PDX working across the WAN sites.
>
>
>
>
> https://geode.apache.org/docs/guide/110/developing/data_serialization/use_pdx_high_level_steps.html
>
>
>
> Have you configured the cluster sites with unique distributed ids?
>
>
>
> -Anil.
>
>
>
>
>
>
>
> *From: *Mateusz Rys <ra...@gmail.com>
> *Reply-To: *"user@geode.apache.org" <us...@geode.apache.org>
> *Date: *Monday, June 29, 2020 at 2:26 AM
> *To: *"user@geode.apache.org" <us...@geode.apache.org>
> *Subject: *CQ Event Notification in Multisite (WAN)
>
>
>
> Hi Geode Users,
>
> Together with my friends I'm trying to set up a simple Native Client (C++)
> and Server configuration.
> We decided to use Continuous Query (CQ) and override the onEvent() method to
> have an easy way to be informed about the updates in the database.
> So far so good, everything works as expected.
>
> But the main reason we are doing all of this is to have two or more such
> setups - all connected using Geode Multisite (WAN).
> We successfully created Multisite configuration, we can see (with gfsh
> commands) that if something is written to SiteA, it is being replicated to
> SiteB.
> After reading the documentation, it also became obvious that we must
> provide a way to serialize objects before they are sent over the network.
> We decided that the simplest way should be to inherit from PdxSerializable
> class.
>
> Unfortunately, we ran into a problem that we can't get rid of:
> [error 2020/06/25 16:44:36.344952 CEST geodeA:3604 139864301352704]
> Exception while receiving subscription event for endpoint geodeB:40404::
> apache::geode::client::IllegalStateException: Unregistered type in
> deserialization
>
> The exception is visible if the event comes from a remote (SiteB) source.
> If we update the database locally on SiteA we have normal onEvent()
> invocation in ClientA,
> but the exception is present on ClientB connected to SiteB.
>
> We register our type using
> cache.getTypeRegistry().registerPdxType(Order::createDeserializable);
> We tried to have only the first client to register the type, only the
> second client, or both of them, but that didn't seem to change anything.
> We also tried to toy with setPdxReadSerialized(true) but to no avail.
>
> Have you met this exception before?
> Do you have any CQ+Multisite implementation examples?
>
>
> Thank you
> Regards
> Mateusz
>
>

RE: CQ Event Notification in Multisite (WAN)

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Hi,

I sent a PR to the geode native client adding the GatewaySenderEventCallbackArgument class, so the client will be able to deserialize it and the exception will not be thrown.

https://github.com/apache/geode-native/pull/628

I have checked with Mateusz that this change solves the problem he was reporting.

BR/

Alberto B.
________________________________
From: Alberto Bustamante Reyes <al...@est.tech>
Sent: Monday, July 13, 2020 17:15
To: user@geode.apache.org <us...@geode.apache.org>
Subject: RE: CQ Event Notification in Multisite (WAN)

Hi,

In this repo you can find an easy way to reproduce the error: https://github.com/alb3rtobr/geode-wan-cq-deserialization

I tested adding the GatewaySenderEventCallbackArgument class to the C++ client, and with this change the exception is not thrown (draft of the code: https://github.com/apache/geode-native/pull/627 )

Anil, have you had time to check if GatewaySenderEventCallbackArgument is needed in the clients?

BR/

Alberto B.


________________________________
From: Alberto Bustamante Reyes <al...@est.tech>
Sent: Wednesday, July 8, 2020 16:43
To: user@geode.apache.org <us...@geode.apache.org>
Subject: RE: CQ Event Notification in Multisite (WAN)

Hi,

I will try to help with this issue. I have created a ticket to keep track of it: https://issues.apache.org/jira/browse/GEODE-8344

BR/

Alberto B.
________________________________
From: Anilkumar Gingade <ag...@vmware.com>
Sent: Friday, July 3, 2020 0:46
To: user@geode.apache.org <us...@geode.apache.org>
Subject: Re: CQ Event Notification in Multisite (WAN)


Mateusz,



This applies mostly to the serial (circular) and star patterns (where all the clusters are connected to each other), where the receiver needs to check who the originator is and who the recipients are before forwarding the event to another cluster. As I mentioned, I'm not sure why it is getting forwarded to clients.
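
The originator/recipient check described above can be illustrated with a short sketch. This is a toy model, not Geode's implementation: the struct `WanCallbackArgument`, its fields, and the functions `shouldForward` and `markSent` are hypothetical names chosen for illustration, and the integers stand for distributed-system-ids.

```cpp
#include <set>

// Hypothetical model of what a WAN callback argument could carry.
struct WanCallbackArgument {
  int originatingDsId;           // cluster where the event was first applied
  std::set<int> recipientDsIds;  // clusters the event has already been sent to
};

// A receiving cluster forwards the event only to clusters that are neither
// the originator nor already listed as recipients.
bool shouldForward(const WanCallbackArgument& arg, int targetDsId) {
  if (targetDsId == arg.originatingDsId) {
    return false;
  }
  return arg.recipientDsIds.count(targetDsId) == 0;
}

// Record that the event is being sent to `targetDsId`.
void markSent(WanCallbackArgument& arg, int targetDsId) {
  arg.recipientDsIds.insert(targetDsId);
}
```

For an event that originated in cluster 1 and was already sent to cluster 2, `shouldForward` returns false for targets 1 and 2 and true for a third cluster, which is the property that prevents an event from circulating indefinitely.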

-Anil.



From: Mateusz Rys <ra...@gmail.com>
Reply-To: "user@geode.apache.org" <us...@geode.apache.org>
Date: Thursday, July 2, 2020 at 7:44 AM
To: "user@geode.apache.org" <us...@geode.apache.org>
Subject: Re: CQ Event Notification in Multisite (WAN)



Thanks Guys!

When I was younger, my teacher told me that if I think the problem is in the compiler or the library and not in my code then I have to think again.

Finally, I can prove him wrong ;) (once in 10 years)



Anyway, we implemented a stub sort of class that registered itself with FixedID = -135.

It did nothing besides adding itself to TheTypeMap so that it can be found later when deserialization happens.

Strangely enough - it worked. That is, the event was processed successfully and finally our afterCreate/afterUpdate method triggered.

Nobody knows what we broke doing that, but at least we can safely say that we pinpointed the issue.
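
A minimal sketch of that kind of stub, assuming a simple map from fixed ID to factory function. The names `Serializable`, `FixedIdTypeMap`, `StubCallbackArgument` and `registerStub` are hypothetical; the native client's own TheTypeMap and base classes are not reproduced here. Only the value -135 comes from the thread.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Hypothetical minimal interface; the real client's base classes differ.
struct Serializable {
  virtual ~Serializable() = default;
  virtual void fromData() = 0;
};

using Factory = std::function<std::shared_ptr<Serializable>()>;

// Value quoted earlier in the thread from the Java sources.
constexpr std::int16_t kGatewaySenderEventCallbackArgument = -135;

// Stub: exists only so that the lookup succeeds; it reads no fields.
struct StubCallbackArgument : Serializable {
  void fromData() override {}
};

class FixedIdTypeMap {
 public:
  void add(std::int16_t id, Factory factory) {
    factories_[id] = std::move(factory);
  }

  // Returns nullptr when no factory is registered for `id`.
  std::shared_ptr<Serializable> create(std::int16_t id) const {
    const auto it = factories_.find(id);
    if (it == factories_.end()) {
      return nullptr;
    }
    return it->second();
  }

 private:
  std::map<std::int16_t, Factory> factories_;
};

// Register the stub under the fixed ID the server sends.
void registerStub(FixedIdTypeMap& typeMap) {
  typeMap.add(kGatewaySenderEventCallbackArgument,
              [] { return std::make_shared<StubCallbackArgument>(); });
}
```

Before `registerStub` is called, `create(-135)` returns nullptr; afterwards it returns a stub instance. The sketch does not model the wire format, so it says nothing about whether a stub that reads no fields leaves payload bytes unconsumed.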



Anil,
you wrote that the GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT is sent to clients "in case of certain wan connection configuration".

Did you have anything particular in mind?

I scoured the documentation again but I haven't found anything that could make the difference in connection configuration.

For now, we plan to use the standard "Fully Connected Mesh Topology", although it seems to me that it's the server that decides whether the event should be propagated further or not - we haven't seen any sort of "endless event loop" after adding the stub-class.





Regards

Mateusz





On Wed, 1 Jul 2020 at 23:23, Anilkumar Gingade <ag...@vmware.com> wrote:

>> Anil, can you explain why GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT is sent to clients?

Looking at the purpose of this class; this is to avoid resending of an event to a recipient; in case of certain wan connection configuration.



After processing the WAN event, I would expect the server to fetch the data from the WAN event, create a new region entry, then create a new client event from it and send that to the client. My suspicion is that we are preserving this callback event and passing it all the way to the client; I'm not sure if it's needed. This needs additional investigation.



-Anil.





From: Jacob Barrett <ja...@vmware.com>
Reply-To: "user@geode.apache.org" <us...@geode.apache.org>
Date: Wednesday, July 1, 2020 at 11:53 AM
To: "user@geode.apache.org" <us...@geode.apache.org>
Subject: Re: CQ Event Notification in Multisite (WAN)







On Jul 1, 2020, at 6:29 AM, Mateusz Rys <ra...@gmail.com> wrote:

> We rebuilt the library a couple of times, adding our own debug printouts and we found that right before the exception there is an attempt to deserialize fixedID = -135.
> In JAVA it is defined as:
> short GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT = -135;
> Looking at DSFixedID.hpp from CPP Client I don't see -135 defined. Do you think this might be the problem?



This doesn’t come as a complete surprise to me. It's likely this was never implemented in the C++ client. I am not even sure what this is and why it would be sent to the client anyway.



Anil, can you explain why GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT is sent to clients?





> Yesterday, we also implemented a simple JAVA client and it works without any problems.
> We are doing almost exactly the same things we did in CPP client.
> The only differences are:
> - using ClientXXX instead of XXX in API calls (for example: ClientCacheFactory instead of CacheFactory),
>   but I assume it's just a difference in API naming convention.



Yes, there are minor differences in the APIs. Java ClientCacheFactory is fairly new and was derived to split concerns between server and client versions of a Cache. On the C++ side there is only a client cache, but it retains the original name of CacheFactory and Cache. Confusing… yes.



> - lack of registerPdxType in JAVA Client as there is no such thing



The Java side doesn’t need this because of reflection, as described earlier. C++ needs this to map types to a factory function that will allocate the C++ class.



> With two JAVA Clients writing and having listeners, local event is recognized as DomainClass from the get-go and remote event is recognized as PdxInstance that we have to convert.
> Nevertheless, it works perfectly fine as we would expect.
>
> With CPP Clients, we can see in the server logs that adding PdxType from remote WAN is recognized.
> Does the fact that we have it twice mean that the server treats it as two separate entries?
>
> [info 2020/06/30 14:09:22.044 CEST <ServerConnection on port 1543 Thread 4> tid=0x52] Adding new type: PdxType[dsid=60, typenum=15169619
>         name=com.example.Order
>         fields=[
>         order_id:int:identity:0:idx0(relativeOffset)=0:idx1(vlfOffsetIndex)=0
>         name:String:identity:1:idx0(relativeOffset)=4:idx1(vlfOffsetIndex)=-1
>         quantity:short:identity:2:idx0(relativeOffset)=-2:idx1(vlfOffsetIndex)=-1]]
>
> [info 2020/06/30 14:09:22.051 CEST <ServerConnection on port 1543 Thread 4> tid=0x52] Adding, from remote WAN: PdxType[dsid=60, typenum=15169619
>         name=com.example.Order
>         fields=[
>         order_id:int:identity:0:idx0(relativeOffset)=0:idx1(vlfOffsetIndex)=0
>         name:String:identity:1:idx0(relativeOffset)=4:idx1(vlfOffsetIndex)=-1
>         quantity:short:identity:2:idx0(relativeOffset)=-2:idx1(vlfOffsetIndex)=-1]]



The server will in fact treat them differently, but the problem is the clients don’t know what to do with them. As described earlier, I don’t think the C++ client is tracking both the system id and the type name in the registration of the allocation function. It should create a mapping from the class name to the function, so that when the client receives a fully qualified type id (type id and system id) for which there is no mapping, it tries the class name mapping. If a class name mapping is found, it should record that same function in the fully qualified type id mapping.
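
The class-name fallback proposed above could look roughly like the following sketch. It is an illustration of the idea only, not the native client's code: `PdxObject`, `QualifiedTypeId`, `TypeFactoryMap` and its member functions are hypothetical names, and the pair (system id, type number) is an assumed representation of the fully qualified type id.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a PDX-serializable object.
struct PdxObject {
  virtual ~PdxObject() = default;
};

using Factory = std::function<std::shared_ptr<PdxObject>()>;

// Assumed representation: (distributed system id, type number).
using QualifiedTypeId = std::pair<int, int>;

class TypeFactoryMap {
 public:
  // What the application registers: class name -> allocation function.
  void registerByClassName(const std::string& className, Factory factory) {
    byClassName_[className] = std::move(factory);
  }

  // Look up by fully qualified type id first; on a miss, fall back to the
  // class name and record the factory under the type id for later lookups.
  // Returns nullptr when neither mapping exists.
  const Factory* find(const QualifiedTypeId& id, const std::string& className) {
    const auto byId = byTypeId_.find(id);
    if (byId != byTypeId_.end()) {
      return &byId->second;
    }
    const auto byName = byClassName_.find(className);
    if (byName == byClassName_.end()) {
      return nullptr;
    }
    return &byTypeId_.emplace(id, byName->second).first->second;
  }

  bool hasTypeId(const QualifiedTypeId& id) const {
    return byTypeId_.count(id) != 0;
  }

 private:
  std::map<std::string, Factory> byClassName_;
  std::map<QualifiedTypeId, Factory> byTypeId_;
};
```

After registering a factory for "com.example.Order", a first `find` with the id (60, 15169619) from the server log misses the type-id map, succeeds through the class name, and caches the result, so a type arriving from a remote site resolves to the same allocation function.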



> I'm not sure if producing an integration test isn't a bit beyond my knowledge at the moment.



I have faith!



At this point I would just open a JIRA for this issue with all the information you have gathered.



-Jake

RE: CQ Event Notification in Multisite (WAN)

Posted by Alberto Bustamante Reyes <al...@est.tech>.
Hi,

I will try to help with this issue. I have created a ticket to keep track of it: https://issues.apache.org/jira/browse/GEODE-8344

BR/

Alberto B.
________________________________
De: Anilkumar Gingade <ag...@vmware.com>
Enviado: viernes, 3 de julio de 2020 0:46
Para: user@geode.apache.org <us...@geode.apache.org>
Asunto: Re: CQ Event Notification in Multisite (WAN)


Mateusz,



It is mostly serial (circular) and star pattern (where all the clusters are connected to each other) where the receiver needs to check who the originator and who all are the recipients, before forwarding it to other cluster. As I mentioned, not sure why it is getting forwarded to clients.

-Anil.



From: Mateusz Rys <ra...@gmail.com>
Reply-To: "user@geode.apache.org" <us...@geode.apache.org>
Date: Thursday, July 2, 2020 at 7:44 AM
To: "user@geode.apache.org" <us...@geode.apache.org>
Subject: Re: CQ Event Notification in Multisite (WAN)



Thanks Guys!

When I was younger, my teacher told me that if I think the problem is in the compiler or the library and not in my code then I have to think again.

Finally, I can prove him wrong ;) (once for 10 years)



Anyway, we implemented a stub sort of class that registered itself with FixedID = -135.

It did nothing besides adding itself to TheTypeMap so that it can be found later when deserialization happens.

Strangely enough - it worked. That is, the event was processed successfully and finally our afterCreate/afterUpdate method triggered.

Nobody knows what we broke doing that, but at least we can safely say that we pinpointed the issue.



Anil,
you wrote that the GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT is sent to clients "in case of certain wan connection configuration".

Did you have anything particular in mind?

I scoured the documentation again but I haven't found anything that could make the difference in connection configuration.

As for now, we planned on using the standard "Fully Connected Mesh Topology", although it seems to me that it's the server that decides

whether the event should be propagated further or not - we haven't seen any sort of "endless event loop" after adding the stub-class.





Regards

Mateusz





śr., 1 lip 2020 o 23:23 Anilkumar Gingade <ag...@vmware.com>> napisał(a):

>> Anil, do can you explain why GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT is sent to clients?

Looking at the purpose of this class; this is to avoid resending of an event to a recipient; in case of certain wan connection configuration.



After processing the WAN event, my expectation is fetching data from WAN event and creating a new region entry and then a new client event from it and sending it to client. My suspicion is we are preserving this callback event and passing all the way to the client; not sure if its needed. Need additional investigation.



-Anil.






Re: CQ Event Notification in Multisite (WAN)

Posted by Anilkumar Gingade <ag...@vmware.com>.
Mateusz,

It is mostly in the serial (circular) and star patterns (where all the clusters are connected to each other) that the receiver needs to check who the originator is and who all the recipients are before forwarding the event to another cluster. As I mentioned, I am not sure why it is getting forwarded to clients.
-Anil.
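
The check described in this reply could look roughly like the following. This is only an illustrative model: `WanEvent`, its fields, and `clustersToForwardTo` are made-up names, and Geode's actual gateway logic may well differ.

```cpp
#include <set>
#include <vector>

// Hypothetical model of the routing data carried along with a WAN event.
struct WanEvent {
  int originatingDsId;           // cluster where the change was first made
  std::set<int> recipientDsIds;  // clusters the event has already been sent to
};

// Returns the remote clusters that still need the event, skipping the local
// cluster, the originator, and every cluster already listed as a recipient.
std::vector<int> clustersToForwardTo(const WanEvent& event, int localDsId,
                                     const std::set<int>& remoteDsIds) {
  std::vector<int> targets;
  for (int remote : remoteDsIds) {
    if (remote == localDsId) continue;
    if (remote == event.originatingDsId) continue;
    if (event.recipientDsIds.count(remote) != 0) continue;
    targets.push_back(remote);
  }
  return targets;
}
```

In a fully connected mesh the originator has already listed every other cluster as a recipient, so a receiver in this model forwards nothing; in a ring, each receiver forwards only to the clusters not yet listed.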


Re: CQ Event Notification in Multisite (WAN)

Posted by Mateusz Rys <ra...@gmail.com>.
Thanks Guys!

When I was younger, my teacher told me that if I think the problem is in
the compiler or the library and not in my code then I have to think again.
Finally, I can prove him wrong ;) (once in 10 years)

Anyway, we implemented a sort of stub class that registered itself with
FixedID = -135.
It did nothing besides adding itself to TheTypeMap so that it can be found
later when deserialization happens.
Strangely enough - it worked. That is, the event was processed successfully
and finally our afterCreate/afterUpdate method triggered.
Nobody knows what we broke doing that, but at least we can safely say that
we pinpointed the issue.
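
The workaround can be modelled roughly as below. This is a self-contained sketch, not the native client's code: `FixedIdRegistry`, `FixedIdObject`, and `CallbackArgumentStub` are hypothetical stand-ins for the TheTypeMap lookup, and only the value -135 and the error text come from this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an object identified by a fixed id.
struct FixedIdObject {
  virtual ~FixedIdObject() = default;
  virtual std::string name() const = 0;
};

// Value taken from the Java constant quoted in this thread.
constexpr int16_t kGatewaySenderEventCallbackArgument = -135;

// Stub that exists only so that the lookup succeeds. Whether the payload
// must also be read from the stream is not established in this thread.
struct CallbackArgumentStub : FixedIdObject {
  std::string name() const override {
    return "GatewaySenderEventCallbackArgument (stub)";
  }
};

class FixedIdRegistry {
 public:
  using Factory = std::function<std::unique_ptr<FixedIdObject>()>;

  void registerFixedId(int16_t id, Factory factory) {
    assert(factory && "factory must be callable");
    factories_[id] = std::move(factory);
  }

  // Mirrors the failure seen in the log: an unknown id aborts deserialization.
  std::unique_ptr<FixedIdObject> create(int16_t id) const {
    auto it = factories_.find(id);
    if (it == factories_.end()) {
      throw std::runtime_error("Unregistered type in deserialization");
    }
    return it->second();
  }

 private:
  std::map<int16_t, Factory> factories_;
};
```

Before the stub is registered, `create(-135)` throws in this model; after registration it returns the stub, which matches the observation that the event was then processed.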

Anil,
you wrote that the GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT is sent to
clients "in case of certain wan connection configuration".
Did you have anything particular in mind?
I scoured the documentation again but I haven't found anything that could
make the difference in connection configuration.
As for now, we planned on using the standard "Fully Connected Mesh
Topology", although it seems to me that it's the server that decides
whether the event should be propagated further or not - we haven't seen any
sort of "endless event loop" after adding the stub-class.


Regards
Mateusz



Re: CQ Event Notification in Multisite (WAN)

Posted by Anilkumar Gingade <ag...@vmware.com>.
>> Anil, can you explain why GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT is sent to clients?
Looking at the purpose of this class, it is there to avoid resending an event to a recipient in certain WAN connection configurations.

After processing the WAN event, I would expect the server to fetch the data from the WAN event, create a new region entry, build a new client event from it, and send that to the client. My suspicion is that we are preserving this callback argument and passing it all the way to the client; I am not sure it is needed. This needs additional investigation.

-Anil.



Re: CQ Event Notification in Multisite (WAN)

Posted by Jacob Barrett <ja...@vmware.com>.

On Jul 1, 2020, at 6:29 AM, Mateusz Rys <ra...@gmail.com> wrote:

> We rebuilt the library a couple of times, adding our own debug printouts and we found that right before the exception there is an attempt to deserialize fixedID = -135.
> In JAVA it is defined as:
> short GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT = -135;
> Looking at DSFixedID.hpp from CPP Client I don't see -135 defined. Do you think this might be the problem?

This doesn’t come as a complete surprise to me. It’s likely this was never implemented in the C++ client. I am not even sure what this is or why it would be sent to the client anyway.

Anil, can you explain why GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT is sent to clients?


> Yesterday, we also implemented a simple JAVA client and it works without any problems.
> We are doing almost exactly the same things we did in CPP client.
> The only differences are:
> - using ClientXXX instead of XXX in API calls (for example: ClientCacheFactory instead of CacheFactory),
>   but I assume it's just a difference in API naming convention.

Yes, there are minor differences in the APIs. The Java ClientCacheFactory is fairly new and was derived to split concerns between server and client versions of a Cache. On the C++ side there is only a client cache, but it retains the original names CacheFactory and Cache. Confusing… yes.

> - lack of registerPdxType in JAVA Client as there is no such thing

The Java side doesn’t need this because of reflection, as described earlier. C++ needs this to map types to a factory function that will allocate the C++ class.

> With two JAVA Clients writing and having listeners, local event is recognized as DomainClass from the get-go and remote event is recognized as PdxInstance that we have to convert.
> Nevertheless, it works perfectly fine as we would expect.
>
> With CPP Clients, we can see in the server logs that adding PdxType from remote WAN is recognized.
> Does the fact that we have it twice mean that the server treats it as two separate entries?
> [info 2020/06/30 14:09:22.044 CEST <ServerConnection on port 1543 Thread 4> tid=0x52] Adding new type: PdxType[dsid=60, typenum=15169619
>         name=com.example.Order
>         fields=[
>         order_id:int:identity:0:idx0(relativeOffset)=0:idx1(vlfOffsetIndex)=0
>         name:String:identity:1:idx0(relativeOffset)=4:idx1(vlfOffsetIndex)=-1
>         quantity:short:identity:2:idx0(relativeOffset)=-2:idx1(vlfOffsetIndex)=-1]]
>
> [info 2020/06/30 14:09:22.051 CEST <ServerConnection on port 1543 Thread 4> tid=0x52] Adding, from remote WAN: PdxType[dsid=60, typenum=15169619
>         name=com.example.Order
>         fields=[
>         order_id:int:identity:0:idx0(relativeOffset)=0:idx1(vlfOffsetIndex)=0
>         name:String:identity:1:idx0(relativeOffset)=4:idx1(vlfOffsetIndex)=-1
>         quantity:short:identity:2:idx0(relativeOffset)=-2:idx1(vlfOffsetIndex)=-1]]

The server will in fact treat them differently, but the problem is that the clients don’t know what to do with them. As described earlier, I don’t think the C++ client tracks both the system id and the type name when the allocation function is registered. It should create a mapping from the class name to the function, so that when the client receives a fully qualified type id (type id and system id) for which there is no mapping, it tries the class name mapping. If a class name mapping is found, it should record that same function in the fully qualified type id mapping.
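
A minimal sketch of that fallback, assuming a much simplified registry. `FactoryRegistry`, `PdxObject`, `QualifiedTypeId`, and `find` are hypothetical names for illustration, not the C++ client's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a PDX-serializable domain object.
struct PdxObject {
  virtual ~PdxObject() = default;
  virtual std::string className() const = 0;
};

struct Order : PdxObject {
  std::string className() const override { return "com.example.Order"; }
};

using Factory = std::function<std::unique_ptr<PdxObject>()>;
// Fully qualified type id: (distributed system id, type number).
using QualifiedTypeId = std::pair<int32_t, int32_t>;

class FactoryRegistry {
 public:
  // Registers the allocation function under the class name only.
  void registerType(const std::string& className, Factory factory) {
    assert(factory && "factory must be callable");
    byName_[className] = std::move(factory);
  }

  // Looks up by fully qualified id first. On a miss, falls back to the class
  // name and records the same factory under the qualified id for next time.
  const Factory* find(const QualifiedTypeId& id, const std::string& className) {
    auto byId = byId_.find(id);
    if (byId != byId_.end()) return &byId->second;

    auto byName = byName_.find(className);
    if (byName == byName_.end()) return nullptr;  // genuinely unregistered

    return &byId_.emplace(id, byName->second).first->second;
  }

  std::size_t cachedIds() const { return byId_.size(); }

 private:
  std::map<std::string, Factory> byName_;
  std::map<QualifiedTypeId, Factory> byId_;
};
```

With `com.example.Order` registered once by name, a lookup for (dsid=60, typenum=15169619) and a later lookup carrying a different dsid would both resolve to the same factory in this model.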

> I'm not sure if producing an integration test isn't a bit beyond my knowledge at the moment.

I have faith!

At this point I would just open a JIRA for this issue with all the information you have gathered.

-Jake

Re: CQ Event Notification in Multisite (WAN)

Posted by Anilkumar Gingade <ag...@vmware.com>.
Mateusz,

Sorry, my understanding of the native client is minimal. I will let the experts answer this.

-Anil.


From: Mateusz Rys <ra...@gmail.com>
Reply-To: "user@geode.apache.org" <us...@geode.apache.org>
Date: Wednesday, July 1, 2020 at 6:29 AM
To: "user@geode.apache.org" <us...@geode.apache.org>
Subject: Re: CQ Event Notification in Multisite (WAN)

Hello!

Anil,
Yes, it was a new server setup without any data (we deleted the directory).
Region.get() works without problems. With no Listeners active, one client adds an object to his local database and 10 seconds later, the second client reads it (also from his local database).
We rebuilt the library a couple of times, adding our own debug printouts and we found that right before the exception there is an attempt to deserialize fixedID = -135.
In JAVA it is defined as:
short GATEWAY_SENDER_EVENT_CALLBACK_ARGUMENT = -135;
Looking at DSFixedID.hpp from CPP Client I don't see -135 defined. Do you think this might be the problem?

[debug 2020/07/01 12:43:28.527312 CEST geodeA:24402 139972805822208] OWN TcrMessage:LOCAL_CREATE: REGNAME /multisiteData
[debug 2020/07/01 12:43:28.527358 CEST geodeA:24402 139972805822208] OWN TcrMessage:LOCAL_CREATE: isDELTA=false
[debug 2020/07/01 12:43:28.527380 CEST geodeA:24402 139972805822208] SerializationRegistry::deserialize typeId = -1 dsCode =  93
[debug 2020/07/01 12:43:28.527388 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: PDX type will be deserialized
[debug 2020/07/01 12:43:28.527398 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: PdxTypeHandler::deserialize called
[debug 2020/07/01 12:43:28.527415 CEST geodeA:24402 139972805822208] OWN : PdxTypeRegistry:getPdxType called typeId=1009228956
[debug 2020/07/01 12:43:28.527428 CEST geodeA:24402 139972805822208] OWN: PdxTypeRegistry:getLocalPdxType called localType=com.example.Order
[debug 2020/07/01 12:43:28.527451 CEST geodeA:24402 139972805822208] deserializePdx ClassName = com.example.Order, isLocal = 1
[debug 2020/07/01 12:43:28.527462 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: getPdxSerializableType called: com.example.Order
[debug 2020/07/01 12:43:28.527473 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: TheTypeMap::findPdxSerializable called com.example.Order
[debug 2020/07/01 12:43:28.527493 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: getPdxSerializableType called: type found
[debug 2020/07/01 12:43:28.527509 CEST geodeA:24402 139972805822208] OWN pdxHelper: pdxObjectptr OK
[debug 2020/07/01 12:43:28.527525 CEST geodeA:24402 139972805822208] OWN: PdxType:getLocalToRemoteMap called
[debug 2020/07/01 12:43:28.527535 CEST geodeA:24402 139972805822208] OWN: PdxType:getRemoteToLocalMap called
[debug 2020/07/01 12:43:28.527546 CEST geodeA:24402 139972805822208] OWN: PdxLocalReader:initialize called
OWN order from data called
OWN order from data called - after reading name_:product x
[debug 2020/07/01 12:43:28.527590 CEST geodeA:24402 139972805822208] OWN pdxHelper: pdxObjectptr->fromData(plr) OK
[debug 2020/07/01 12:43:28.527600 CEST geodeA:24402 139972805822208] OWN: PdxLocalReader:moveStream called
[debug 2020/07/01 12:43:28.527611 CEST geodeA:24402 139972805822208] OWN pdxHelper: plr.moveStream() OK
[debug 2020/07/01 12:43:28.527628 CEST geodeA:24402 139972805822208] SerializationRegistry::deserialize typeId = -1 dsCode =  2
[debug 2020/07/01 12:43:28.527638 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: FixedID type will be deserialized
[debug 2020/07/01 12:43:28.527648 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: deserializeDataSerializableFixedId called
[debug 2020/07/01 12:43:28.527665 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: FixedIDShort fixedId: -135
[debug 2020/07/01 12:43:28.527674 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: TheTypeMap::findDataSerializableFixedId called -135
[debug 2020/07/01 12:43:28.527735 CEST geodeA:24402 139972805822208] OWN SerializationRegistry: exception imminent!!!
[error 2020/07/01 12:43:28.528009 CEST geodeA:24402 139972805822208] Exception while receiving subscription event for endpoint geodeB:40404:: apache::geode::client::IllegalStateException: Unregistered type in deserialization
[debug 2020/07/01 12:43:28.528073 CEST geodeA:24402 139972805822208] TcrConnection::readMessage: receiving reply from endpoint geodeB:40404

Jake,
Yesterday, we also implemented a simple JAVA client and it works without any problems.
We are doing almost exactly the same things we did in CPP client.
The only differences are:
- using ClientXXX instead of XXX in API calls (for example: ClientCacheFactory instead of CacheFactory),
  but I assume it's just a difference in API naming convention.
- lack of registerPdxType in JAVA Client as there is no such thing
With two JAVA Clients writing and having listeners, local event is recognized as DomainClass from the get-go and remote event is recognized as PdxInstance that we have to convert.
Nevertheless, it works perfectly fine as we would expect.


With CPP Clients, we can see in the server logs that adding PdxType from remote WAN is recognized.
Does the fact that we have it twice mean that the server treats it as two separate entries?
[info 2020/06/30 14:09:22.044 CEST <ServerConnection on port 1543 Thread 4> tid=0x52] Adding new type: PdxType[dsid=60, typenum=15169619
        name=com.example.Order
        fields=[
        order_id:int:identity:0:idx0(relativeOffset)=0:idx1(vlfOffsetIndex)=0
        name:String:identity:1:idx0(relativeOffset)=4:idx1(vlfOffsetIndex)=-1
        quantity:short:identity:2:idx0(relativeOffset)=-2:idx1(vlfOffsetIndex)=-1]]

[info 2020/06/30 14:09:22.051 CEST <ServerConnection on port 1543 Thread 4> tid=0x52] Adding, from remote WAN: PdxType[dsid=60, typenum=15169619
        name=com.example.Order
        fields=[
        order_id:int:identity:0:idx0(relativeOffset)=0:idx1(vlfOffsetIndex)=0
        name:String:identity:1:idx0(relativeOffset)=4:idx1(vlfOffsetIndex)=-1
        quantity:short:identity:2:idx0(relativeOffset)=-2:idx1(vlfOffsetIndex)=-1]]

I'm not sure if producing an integration test isn't a bit beyond my knowledge at the moment.

Regards
Mateusz


On Tue, Jun 30, 2020 at 19:19, Anilkumar Gingade <ag...@vmware.com> wrote:
Mateusz,

The type registry (PDX) is maintained on the server side only. Clients don’t maintain any registry.
When you say you changed the ids, was it in a new server setup, without any persistent data (persistent region)?
Can you check, with a fresh setup, that region.get() works, and then add a cache listener to it? This is to isolate and debug the issue.

-Anil.



From: Mateusz Rys <ra...@gmail.com>
Reply-To: "user@geode.apache.org" <us...@geode.apache.org>
Date: Tuesday, June 30, 2020 at 2:45 AM
To: "user@geode.apache.org" <us...@geode.apache.org>
Subject: Re: CQ Event Notification in Multisite (WAN)

Hi Anil,

Thanks for the response.
You are of course right, this isn't connected to CQ. Yesterday we implemented CacheListener and the behavior is the same.
Unfortunately, we do have unique distributed ids set in gemfire.properties.

#on geodeA
mcast-port=0
locators=GeodeA[10334]
distributed-system-id=1
remote-locators=GeodeB[10334]

#on geodeB
mcast-port=0
locators=GeodeB[10334]
distributed-system-id=2
remote-locators=GeodeA[10334]

Same IDs are used when creating gateway senders.
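
As a rough sanity check of the two property sets above, the sketch below parses them and verifies that the sites use different distributed-system-id values and point at each other's locators. `parseProperties` and `sitesAreCrossConfigured` are hypothetical helpers written for this illustration; they are not part of Geode.

```cpp
#include <map>
#include <sstream>
#include <string>

using Properties = std::map<std::string, std::string>;

// Parses "key=value" lines; empty lines and lines starting with '#' are skipped.
Properties parseProperties(const std::string& text) {
  Properties props;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty() || line[0] == '#') continue;
    const auto eq = line.find('=');
    if (eq == std::string::npos) continue;
    props[line.substr(0, eq)] = line.substr(eq + 1);
  }
  return props;
}

// True when the two sites have different ids and reference each other's
// locators. Throws std::out_of_range if a required key is missing.
bool sitesAreCrossConfigured(const Properties& a, const Properties& b) {
  return a.at("distributed-system-id") != b.at("distributed-system-id") &&
         a.at("remote-locators") == b.at("locators") &&
         b.at("remote-locators") == a.at("locators");
}
```

For the geodeA and geodeB settings quoted above this check passes, which is consistent with the server-side configuration not being the cause of the problem.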

We also noticed that on both clients we have "Deserialized distributed member Id 1" output, even after we changed IDs in gemfire.properties to 50 and 60 for better visualization.
[debug 2020/06/28 16:44:35.904586 CEST geodeA:3628 140700250048512] Deserializing distributed member Id
[debug 2020/06/28 16:44:35.904675 CEST geodeA:3628 140700250048512] SerializationRegistry::deserialize typeId = -1 dsCode =  1
[debug 2020/06/28 16:44:35.904815 CEST geodeA:3628 140700250048512] SerializationRegistry::deserialize typeId = -1 dsCode =  87
[debug 2020/06/28 16:44:35.905000 CEST geodeA:3628 140700250048512] SerializationRegistry::deserialize typeId = -1 dsCode =  87
[debug 2020/06/28 16:44:35.905108 CEST geodeA:3628 140700250048512] SerializationRegistry::deserialize typeId = -1 dsCode =  87
[debug 2020/06/28 16:44:35.905209 CEST geodeA:3628 140700250048512] SerializationRegistry::deserialize typeId = -1 dsCode =  87
[debug 2020/06/28 16:44:35.905308 CEST geodeA:3628 140700250048512] ClientProxyMembershipID::readVersion ordinal = 110
[debug 2020/06/28 16:44:35.905466 CEST geodeA:3628 140700250048512] ClientProxyMembershipID::writeVersion ordinal = 45
[debug 2020/06/28 16:44:35.905573 CEST geodeA:3628 140700250048512] GethashKey :192:168:56:201:41001:server::1 client id: 192.168.56.201(3371:loner):2::server
[debug 2020/06/28 16:44:35.905719 CEST geodeA:3628 140700250048512] Adding a new member to the member list maintained for version stamps member Ids. HashKey: :192:168:56:201:41001:server::1 MemberCounter: 1
[debug 2020/06/28 16:44:35.905867 CEST geodeA:3628 140700250048512] Deserialized distributed member Id 1

This would suggest that there is another place to provide the IDs on the client, but I can't find anything relevant in the API.
Do you think this could be the problem?


Regards
Mateusz

On Mon, 29 Jun 2020 at 21:04, Anilkumar Gingade <ag...@vmware.com> wrote:
Hi Mateusz,

The issue is not related to CQ functionality; it is related to serialization and deserialization of the PDX type.

As you can see from the documentation, you need to configure distributed system IDs in each cluster to get PDX working across WAN sites.

https://geode.apache.org/docs/guide/110/developing/data_serialization/use_pdx_high_level_steps.html

Have you configured the cluster sites with unique distributed system IDs?

-Anil.



From: Mateusz Rys <ra...@gmail.com>
Reply-To: user@geode.apache.org
Date: Monday, June 29, 2020 at 2:26 AM
To: user@geode.apache.org
Subject: CQ Event Notification in Multisite (WAN)

Hi Geode Users,

Together with my friends I'm trying to set up a simple Native Client (C++) and Server configuration.
We decided to use Continuous Query (CQ) and override the onEvent() method to have an easy way to be notified about updates in the database.
So far so good, everything works as expected.

But the main reason we are doing all of this is to have two or more such setups, all connected using Geode Multisite (WAN).
We successfully created a Multisite configuration, and we can see (with gfsh commands) that anything written to SiteA is replicated to SiteB.
After reading the documentation, it also became obvious that we must provide a way to serialize objects before they are sent over the network.
We decided that the simplest way would be to inherit from the PdxSerializable class.

Unfortunately, we ran into a problem that we can't get rid of:
[error 2020/06/25 16:44:36.344952 CEST geodeA:3604 139864301352704] Exception while receiving subscription event for endpoint geodeB:40404::
apache::geode::client::IllegalStateException: Unregistered type in deserialization

The exception appears when the event comes from a remote source (SiteB).
If we update the database locally on SiteA, onEvent() is invoked normally in ClientA,
but the exception is raised on ClientB, which is connected to SiteB.

We register our type using cache.getTypeRegistry().registerPdxType(Order::createDeserializable);
We tried registering the type on only the first client, on only the second client, and on both of them, but that didn't seem to change anything.
We also tried experimenting with setPdxReadSerialized(true), but to no avail.
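
To make the registration step above concrete, here is a toy model of a per-process type registry that maps a class name to a factory and fails on an unknown name. It is only a sketch of the general idea, not the native client's actual implementation: ToySerializable, ToyTypeRegistry, registerType, create and demoTwoClients are hypothetical names, the Order class here is a stand-in for our real one, and the class name string is illustrative.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in for a serializable base class (hypothetical, not the Geode type).
struct ToySerializable {
  virtual ~ToySerializable() = default;
  virtual std::string className() const = 0;
};

// Stand-in for our Order type; the class name string is illustrative.
struct Order : ToySerializable {
  std::string className() const override { return "com.example.Order"; }
  static std::shared_ptr<ToySerializable> createDeserializable() {
    return std::make_shared<Order>();
  }
};

// Each client process owns its own registry, so a registration made in one
// client is not visible to another client.
class ToyTypeRegistry {
 public:
  using Factory = std::function<std::shared_ptr<ToySerializable>()>;

  // Call the factory once to learn the class name, then store it by name.
  void registerType(Factory factory) {
    const std::string name = factory()->className();
    factories_[name] = std::move(factory);
  }

  // Create an instance for a class name, or throw if it was never registered.
  std::shared_ptr<ToySerializable> create(const std::string& name) const {
    const auto it = factories_.find(name);
    if (it == factories_.end()) {
      throw std::runtime_error("Unregistered type in deserialization: " + name);
    }
    return it->second();
  }

 private:
  std::map<std::string, Factory> factories_;
};

// Client A registers Order and can create it; client B never registered it.
void demoTwoClients() {
  ToyTypeRegistry clientA;
  clientA.registerType(Order::createDeserializable);
  assert(clientA.create("com.example.Order")->className() == "com.example.Order");

  ToyTypeRegistry clientB;
  bool threw = false;
  try {
    clientB.create("com.example.Order");
  } catch (const std::runtime_error&) {
    threw = true;
  }
  assert(threw);
}
```

In this model a client that never registered the type fails in the way the error message describes. Since registering on both of our clients did not help, a missing registration of Order alone may not explain what we see.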

Have you seen this exception before?
Do you have any CQ+Multisite implementation examples?


Thank you
Regards
Mateusz