Posted to user@geode.apache.org by Amit Pandey <am...@gmail.com> on 2017/07/02 21:12:52 UTC

Re: Use Java Serialization in Spring Data

Thanks guys. I actually have one case where right now I can't avoid using
circular references. Does using the DataSerializable interface solve this
issue? I really want to avoid using JVM serialization as it seems to take a
lot more space.

On Sat, Jul 1, 2017 at 4:04 AM, Darrel Schneider <ds...@pivotal.io>
wrote:

> You can mix the different serialization frameworks even in a single object
> graph. For example your top level value object could use PdxSerializable
> and then it could have a field that uses DataSerializable and it could have
> a field that uses Serializable. The only restriction is that once an object
> in the graph uses standard java serialization (i.e. Serializable or
> Externalizable) then every object under that object in the graph will also
> only use standard java serialization.
> Another thing to know is that some jdk classes get special treatment by
> geode serialization. For example if you create an ArrayList and serialize
> it then we do not use its standard java serialization but instead have
> geode code that knows how to serialize an ArrayList in a special way that
> is more efficient for geode. But if you create a subclass of ArrayList then
> it is just serialized with the normal rules for domain classes. Note that
> if the ArrayList is reached from a parent object that used standard java
> serialization then this special code that serializes ArrayLists is not used.
> If you look at all the static methods on DataSerializer (the read* and
> write* ones) you will see all the jdk classes that have special built in
> serialization code.
> A really bad anti-pattern is to have one of these special container
> classes contain references to objects that will use standard java
> serialization. It will work but the size of your serialized data will be
> much larger than expected. In this case you are much better off introducing
> an object above the container that uses java serialization so that the
> whole graph under it will all use java serialization. For some of the jdk
> collections you could do this by simply creating a subclass of the jdk
> container class and using it in place of the standard jdk class.
>
>
>
> On Fri, Jun 30, 2017 at 9:49 AM, John Blum <jb...@pivotal.io> wrote:
>
>> Hi Amit-
>>
>> Regarding...
>>
>> *> Can we combine the two serialization Java and PDX? *
>>
>> I was tempted to say NO until I thought well, what would happen if some
>> of my application domain objects implemented java.io.Serializable while
>> others implemented org.apache.geode.pdx.PdxSerializable or
>> org.apache.geode.DataSerializable.
>>
>> I am not sure actually; I have never tried this.  I would not recommend
>> it... i.e. I would stick to 1 serialization strategy.  You could have
>> adverse effects if read-serialized was set to true, or you were running
>> OQL queries on a Region that stored a mix of serialized objects (e.g. PDX +
>> Java Serialized), etc.
>>
>> Remember (as I stated earlier)...
>>
>> "*If either DataSerialization or PDX Serialization is configured, even
>> if your application domain object implements java.io.Serializable, then
>> Geode will prefer its own serialization mechanics over Java
>> Serialization.*"
>>
>>
>> Finally, as for...
>>
>> *> By the way somehow in my case PDX is used although I never supplied
>> any PDX ReflectionBasedAutoSerializer.  Can you let me know the possible
>> reasons for it?*
>>
>> The only way PDX will be used is if you...
>>
>>
>> 1. Set the pdx-serializer-ref attribute on the <gfe:cache> element in
>> the SDG XML namespace to any PdxSerializer implementation, for example...
>>
>> <bean id="mySerializer"
>>       class="org.example.app.geode.serialization.MyPdxSerializer"/>
>>
>> <gfe:cache properties-ref="gemfireProperties"
>>            pdx-serializer-ref="mySerializer" pdx-read-serialized="true"
>>            pdx-persistent="true" pdx-disk-store="pdxStore"/>
>>
>> Or, if you set the pdxSerializer property
>> <http://docs.spring.io/spring-data-gemfire/docs/current/api/org/springframework/data/gemfire/CacheFactoryBean.html#setPdxSerializer-java.lang.Object-> [1]
>> on the [Client]CacheFactoryBean.  It also does NOT matter if
>> ReflectionBasedAutoSerializer is used or not.  In fact, I would not
>> recommend using ReflectionBasedAutoSerializer.  I would rather use SDG's
>> MappingPdxSerializer or custom, dedicated PdxSerializers.
>>
>>
>> 2. Or, your application domain object implements
>> org.apache.geode.pdx.PdxSerializable.
>>
>>
>> If either of these 2 things is true, then PDX is used.
>>
>> Regards,
>> John
>>
>>
>> [1] http://docs.spring.io/spring-data-gemfire/docs/current/api/org/springframework/data/gemfire/CacheFactoryBean.html#setPdxSerializer-java.lang.Object-
>>
>>
>> On Thu, Jun 29, 2017 at 10:50 PM, Amit Pandey <am...@gmail.com>
>> wrote:
>>
>>> Hey John,
>>>
>>> Can we combine the two serialization Java and PDX? I want to use Java
>>> for some domain objects which have circular dependencies until we figure
>>> out a better way to represent them and use PDX for others?
>>>
>>> By the way somehow in my case PDX is used although I never supplied any
>>> PDX ReflectionBasedAutoSerializer.
>>>
>>> Can you let me know the possible reasons for it?
>>>
>>> Regards
>>>
>>> On Fri, Jun 30, 2017 at 8:23 AM, John Blum <jb...@pivotal.io> wrote:
>>>
>>>> Right!  There is no special configuration required to use *Java
>>>> Serialization* with Apache Geode, regardless if SDG is in play or
>>>> not.  Your application domain object just needs to implement
>>>> java.io.Serializable.
>>>>
>>>> However, if you decide to use Geode's *DataSerialization* framework
>>>> <http://geode.apache.org/docs/guide/11/developing/data_serialization/gemfire_data_serialization.html> [1]
>>>> or even PDX
>>>> <http://geode.apache.org/docs/guide/11/developing/data_serialization/gemfire_pdx_serialization.html> [2]
>>>> (and you should consider this), then SDG supports this too. For instance,
>>>> here is an example config
>>>> <https://github.com/spring-projects/spring-data-geode/blob/master/src/test/resources/org/springframework/data/gemfire/config/xml/cache-using-pdx-ns.xml#L18-L21> [3]
>>>> of using SDG to configure PDX.  Here is a slightly more involved
>>>> example
>>>> <https://github.com/spring-projects/spring-data-geode/blob/master/src/test/java/org/springframework/data/gemfire/function/ClientCacheFunctionExecutionWithPdxIntegrationTest.java> [6]
>>>> that uses *Spring* JavaConfig and "custom", "composed" *PdxSerializers*
>>>> for the application domain object types (i.e. Person & Address).
>>>>
>>>> And, if you combine Geode's PDX Serialization framework [2] with *Spring
>>>> Data's* "Mapping" infrastructure
>>>> <http://docs.spring.io/spring-data-gemfire/docs/current/reference/html/#mapping.pdx-serializer> [4],
>>>> there is a special PdxSerializer in SDG called MappingPdxSerializer
>>>> <http://docs.spring.io/spring-data-gemfire/docs/current/api/org/springframework/data/gemfire/mapping/MappingPdxSerializer.html>
>>>>  [5] that uses the SD "*Mapping meta-data*" to serialize your
>>>> application domain object types to PDX.
>>>>
>>>> Of *Java Serialization*, DataSerialization and PDX, it is recommended
>>>> that you use and prefer PDX as it offers the most flexibility and is more
>>>> efficient than *Java Serialization* (though it does not handle cycles;
>>>> so be careful there).  Of the 3, *DataSerialization* is the most
>>>> efficient.
>>>>
>>>> If either DataSerialization or PDX Serialization is configured, even if
>>>> your application domain object implements java.io.Serializable, then
>>>> Geode will prefer its own serialization mechanics over *Java
>>>> Serialization*.
>>>>
>>>> Refer to Geode's documentation
>>>> <http://geode.apache.org/docs/guide/11/developing/data_serialization/data_serialization_options.html> [7]
>>>> on serialization for more details.
>>>>
>>>> Hope this helps.
>>>>
>>>> Regards,
>>>> John
>>>>
>>>>
>>>> [1] http://geode.apache.org/docs/guide/11/developing/data_serialization/gemfire_data_serialization.html
>>>> [2] http://geode.apache.org/docs/guide/11/developing/data_serialization/gemfire_pdx_serialization.html
>>>> [3] https://github.com/spring-projects/spring-data-geode/blob/master/src/test/resources/org/springframework/data/gemfire/config/xml/cache-using-pdx-ns.xml#L18-L21
>>>> [4] http://docs.spring.io/spring-data-gemfire/docs/current/reference/html/#mapping.pdx-serializer
>>>> [5] http://docs.spring.io/spring-data-gemfire/docs/current/api/org/springframework/data/gemfire/mapping/MappingPdxSerializer.html
>>>> [6] https://github.com/spring-projects/spring-data-geode/blob/master/src/test/java/org/springframework/data/gemfire/function/ClientCacheFunctionExecutionWithPdxIntegrationTest.java
>>>> [7] http://geode.apache.org/docs/guide/11/developing/data_serialization/data_serialization_options.html
>>>>
>>>>
>>>> On Thu, Jun 29, 2017 at 4:20 PM, Kirk Lund <kl...@apache.org> wrote:
>>>>
>>>>> Make the classes for your domain objects implement
>>>>> java.io.Serializable and avoid specifying DataSerializable or
>>>>> DataSerializers or PDX. This will result in use of Java serialization when
>>>>> serializing your domain objects. It'll be slower though.
>>>>>
>>>>> On Thu, Jun 29, 2017 at 3:42 PM, Amit Pandey <
>>>>> amit.pandey2103@gmail.com> wrote:
>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> What's the config for using Java Serialization in Spring Data Geode?
>>>>>>
>>>>>> regards
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> -John
>>>> john.blum10101 (skype)
>>>>
>>>
>>>
>>
>>
>> --
>> -John
>> john.blum10101 (skype)
>>
>
>

Re: Use Java Serialization in Spring Data

Posted by John Blum <jb...@pivotal.io>.
No, neither PDX nor *DataSerialization* handles circular references in your
object graph; you must use *Java Serialization*.
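
For illustration, here is a minimal sketch (plain JDK only, no Geode on the
classpath) of a cyclic object graph surviving a *Java Serialization* round
trip.  The Customer and Order types and the roundTrip helper are hypothetical
names made up for this example; they are not from your application or from
Geode.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class CycleRoundTrip {

    // Hypothetical domain types that reference each other (Customer -> Order -> Customer).
    static final class Customer implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Order lastOrder;

        Customer(String name) {
            this.name = name;
        }
    }

    static final class Order implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final Customer customer;

        Order(String id, Customer customer) {
            this.id = id;
            this.customer = customer;
        }
    }

    // Serializes the value to a byte array and reads it back with standard java.io streams.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    // Builds a two-object cycle and checks that the copy points back at itself.
    static boolean cycleSurvivesRoundTrip() {
        Customer customer = new Customer("c-1");
        customer.lastOrder = new Order("o-1", customer); // closes the cycle

        Customer copy = roundTrip(customer);
        return copy != customer && copy.lastOrder.customer == copy;
    }

    public static void main(String[] args) {
        System.out.println("cycle preserved: " + cycleSurvivesRoundTrip());
    }
}
```

The stream records each object it has already written and emits a back
reference the second time it is reached, which is why the cycle neither
recurses forever nor comes back as two separate Customer instances.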

On another note (*@Darrel*, correct me if I am wrong)...

I think that if GemFire's PDX serialization is configured (i.e. by using
the ReflectionBasedAutoSerializer, or any other PdxSerializer
implementation configured for the pdxSerializer property, e.g.
ClientCacheFactory.setPdxSerializer(:PdxSerializer)
<http://gemfire-90-javadocs.docs.pivotal.io/org/apache/geode/cache/client/ClientCacheFactory.html#setPdxSerializer-org.apache.geode.pdx.PdxSerializer->
[1]),
or your application domain object types implement either
o.a.g.pdx.PdxSerializable or o.a.g.DataSerializable, then, even if those
types also implement java.io.Serializable, GemFire will prefer its own
serialization framework(s) / mechanics.  That is, PDX and
*DataSerialization* always *take precedence over* *Java Serialization*.

In other words, the moment you set a PdxSerializer and it handles
serialization for any application domain object type, regardless if it is
java.io.Serializable, it will be serialized using PDX; at least this is
what I was told.

However, if you mix the two, with some application domain object types
implementing PdxSerializable/DataSerializable and others implementing only
java.io.Serializable, then *Java Serialization* would be used for the types
that only implement java.io.Serializable (and that no configured
PdxSerializer handles).

Only *Java Serialization* handles circular references.
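
If you later want to move the cyclic types off *Java Serialization*, one
common approach (my suggestion, not something Geode does for you) is to stop
writing the back-reference and rebuild it on read.  The sketch below uses
only java.io.DataOutput / DataInput, the same stream types that
DataSerializable's toData/fromData receive.  Parent, Child and the
write/read/roundTrip methods are hypothetical names for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class AcyclicParentChild {

    // Hypothetical child type; its back-reference is never written to the stream.
    static final class Child {
        final String name;
        Parent parent; // restored by Parent.add(...) after reading

        Child(String name) {
            this.name = name;
        }
    }

    // Hypothetical parent type that owns the children and the wire format.
    static final class Parent {
        final String name;
        final List<Child> children = new ArrayList<>();

        Parent(String name) {
            this.name = name;
        }

        void add(Child child) {
            child.parent = this; // the only place the back-reference is set
            children.add(child);
        }

        // Writes only the downward edges: parent name, child count, child names.
        void write(DataOutput out) throws IOException {
            out.writeUTF(name);
            out.writeInt(children.size());
            for (Child child : children) {
                out.writeUTF(child.name);
            }
        }

        // Reads the same layout and re-links each child to the new parent.
        static Parent read(DataInput in) throws IOException {
            Parent parent = new Parent(in.readUTF());
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                parent.add(new Child(in.readUTF()));
            }
            return parent;
        }
    }

    // Writes the parent to a byte array and reads a fresh copy back.
    static Parent roundTrip(Parent original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                original.write(out);
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return Parent.read(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Parent parent = new Parent("p-1");
        parent.add(new Child("a"));
        parent.add(new Child("b"));

        Parent copy = roundTrip(parent);
        System.out.println("children: " + copy.children.size()
            + ", back-reference restored: " + (copy.children.get(0).parent == copy));
    }
}
```

Because the serialized form only ever walks from parent to child, there is
no cycle left for the serializer to trip over; the in-memory graph still has
the back-reference once reading completes.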


-John


[1]
http://gemfire-90-javadocs.docs.pivotal.io/org/apache/geode/cache/client/ClientCacheFactory.html#setPdxSerializer-org.apache.geode.pdx.PdxSerializer-




-- 
-John
john.blum10101 (skype)