Posted to dev@geronimo.apache.org by Sachin Patel <sp...@gmail.com> on 2005/09/22 15:07:41 UTC

Java serialization compatibility issues

In case you are not aware, I'm pulling in and packaging several jars from
the lib and lib/endorsed directories into one of the Eclipse plug-ins, so
that the classes can be used and referenced by the rest of the Eclipse
plug-ins.  This is because Eclipse cannot reference classes or jars at
runtime that are not packaged within a plug-in and marked as visible in
either the plugin.xml or the manifest.

The big problem is that the jars I'm packaging must now be exactly the
same jars that reside in the target server I'm deploying to.  This
creates a dependency on a particular server image.  If a user modifies
classes I reference and rebuilds their server, the plug-in breaks, and
during deployment I receive error messages like the following...

Caused by: java.io.InvalidClassException: 
org.apache.geronimo.kernel.config.ConfigurationModuleType; local class 
incompatible: stream classdesc serialVersionUID = 6296527685792707191, 
local class serialVersionUID = -4121586344416418391
   
So looking at that particular class, it looks like the serialVersionUID
is not declared explicitly, so a default is computed from the compiled
class.  This is bad because jars/classes now risk incompatibility with
every build.  We need a solution for this.  The only option I'm aware of
is for these serializable classes to hard-code and explicitly assign a
value.  Of course we must then ensure that we manually maintain backward
compatibility to support the N-2 model for these classes.  This problem
will eventually have to be solved anyway once there is support for
multiple servers and different versions.

I think this is a must-fix for 1.0, as without it we put product
migration, mixed-version interoperability, tooling, and possibly user
applications at risk.
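
For illustration, hard-coding the value might look like the sketch below.
ModuleType is a hypothetical stand-in (not the actual
ConfigurationModuleType source) and the value 1L is an arbitrary choice;
the only point is that a declared serialVersionUID stays fixed across
rebuilds instead of being recomputed.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class ExplicitUidSketch {
    // Hypothetical stand-in for a serializable class such as
    // ConfigurationModuleType; the real class will differ.
    static final class ModuleType implements Serializable {
        // Declared explicitly, so rebuilding the class does not change it.
        private static final long serialVersionUID = 1L;

        private final String name;

        ModuleType(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    // Returns the UID the serialization runtime will use for ModuleType.
    static long uidInUse() {
        return ObjectStreamClass.lookup(ModuleType.class).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("serialVersionUID in use: " + uidInUse());
    }
}
```

Pinning the value only stops accidental breakage; keeping the serialized
form itself compatible between versions remains a manual job.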

Sachin.



Re: Java serialization compatibility issues

Posted by David Blevins <da...@visi.com>.
Ditto, sorry.  (Still not a fan of serialized configs :)

-David

On Sep 23, 2005, at 9:47 AM, Matt Hogstrom wrote:

> My bad Jeremy.  You are correct.  I latched onto serialization and  
> immediately went to configuration.  I realized the error of my ways  
> this morning when you mentioned the plugin.   Doh.
>
> Jeremy Boynes wrote:
>
>> Sachin Patel wrote: "This is because eclipse can not reference  
>> classes or jars at runtime that are not packaged within a plug-in  
>> and marked as visible in either the plugin.xml or manifest. A big  
>> problem resides as now the same jars I'm packaging must be the  
>> same exact jars that reside in the target server I'm deploying.   
>> This causes a dependency on a particular server image."
>> This thread wasn't about configuration files but about  
>> communication between JVMs. Sachin's plugin fails when talking to  
>> the server because there are different versions of the classes in  
>> the Eclipse client and in the running server.
>> RMI is the only transport guaranteed to be available by the JMX  
>> remoting specification. To use it the classes passed on the wire  
>> must be Serializable. We do not control the versions used on  
>> either end and hence need to ensure that skewed versions can  
>> interoperate. This means following the rules for serialization  
>> compatibility (such as adding serialUIDs).
>> -- 
>> Jeremy
>> Bill Stoddard wrote:
>>
>>> Matt Hogstrom wrote:
>>>
>>>
>>>> Not being totally familiar with all the nuances in G WRT  
>>>> serialization my comments should be taken with a grain of salt.   
>>>> From my perspective there are two major problems with serialized  
>>>> data.  One, it's very fragile
>>>>
>>>
>>>
>>> Yes.
>>>
>>>
>>>> and two you can't change it if you need to.  One could argue  
>>>> users shouldn't be changing it but in extreme circumstances it  
>>>> is unavoidable.
>>>>
>>>> I'd vote for the move to XML (ouch, did I say that?)
>>>>
>>>> Matt
>>>>
>>>
>>>
>>>
>>> My inclination as well. I've been scratching my head trying to  
>>> understand 'why serialization?' rather than nice, flat, intuitive  
>>> text files. There may be a very good answer; I just don't know what
>>> it is.
>>>
>>>
>>>>
>>>> Jeremy Boynes wrote:
>>>>
>>>>
>>>>> Sachin's problem is not related to configuration persistence  
>>>>> but to the serialization of classes between plugin and server  
>>>>> when using JMX remoting over RMI.
>>>>>
>>>>> The upshot of it all is unless we are going to ditch all use of  
>>>>> serialization and replace it with XML then we need to exercise  
>>>>> the necessary discipline and version the classes involved.
>>>>>
>>>>> -- 
>>>>> Jeremy


Re: Java serialization compatibility issues

Posted by Matt Hogstrom <ma...@hogstrom.org>.
My bad Jeremy.  You are correct.  I latched onto serialization and 
immediately went to configuration.  I realized the error of my ways this 
morning when you mentioned the plugin.   Doh.

Jeremy Boynes wrote:
> Sachin Patel wrote: "This is because eclipse can not reference classes 
> or jars at runtime that are not packaged within a plug-in and marked as 
> visible in either the plugin.xml or manifest. A big problem resides as 
> now the same jars I'm packaging must be the same exact jars that reside 
> in the target server I'm deploying.  This causes a dependency on a 
> particular server image."
> 
> This thread wasn't about configuration files but about communication 
> between JVMs. Sachin's plugin fails when talking to the server because 
> there are different versions of the classes in the Eclipse client and in 
> the running server.
> 
> RMI is the only transport guaranteed to be available by the JMX remoting 
> specification. To use it the classes passed on the wire must be 
> Serializable. We do not control the versions used on either end and 
> hence need to ensure that skewed versions can interoperate. This means 
> following the rules for serialization compatibility (such as adding 
> serialUIDs).
> 
> -- 
> Jeremy
> 
> Bill Stoddard wrote:
> 
>> Matt Hogstrom wrote:
>>
>>> Not being totally familiar with all the nuances in G WRT 
>>> serialization my comments should be taken with a grain of salt.  From 
>>> my perspective there are two major problems with serialized data.  
>>> One, it's very fragile 
>>
>>
>> Yes.
>>
>>> and two you can't change it if you need to.  One could argue users 
>>> shouldn't be changing it but in extreme circumstances it is unavoidable.
>>>
>>> I'd vote for the move to XML (ouch, did I say that?)
>>>
>>> Matt
>>
>>
>>
>> My inclination as well. I've been scratching my head trying to 
>> understand 'why serialization?' rather than nice, flat, intuitive text 
>> files. There may be a very good answer; I just don't know what it is.
>>
>>>
>>> Jeremy Boynes wrote:
>>>
>>>> Sachin's problem is not related to configuration persistence but to 
>>>> the serialization of classes between plugin and server when using 
>>>> JMX remoting over RMI.
>>>>
>>>> The upshot of it all is unless we are going to ditch all use of 
>>>> serialization and replace it with XML then we need to exercise the 
>>>> necessary discipline and version the classes involved.
>>>>
>>>> -- 
>>>> Jeremy


Re: Java serialization compatibility issues

Posted by Jeremy Boynes <jb...@apache.org>.
Sachin Patel wrote: "This is because eclipse can not reference classes 
or jars at runtime that are not packaged within a plug-in and marked as 
visible in either the plugin.xml or manifest. A big problem resides as 
now the same jars I'm packaging must be the same exact jars that reside 
in the target server I'm deploying.  This causes a dependency on a 
particular server image."

This thread wasn't about configuration files but about communication 
between JVMs. Sachin's plugin fails when talking to the server because 
there are different versions of the classes in the Eclipse client and in 
the running server.

RMI is the only transport guaranteed to be available by the JMX remoting 
specification. To use it the classes passed on the wire must be 
Serializable. We do not control the versions used on either end and 
hence need to ensure that skewed versions can interoperate. This means 
following the rules for serialization compatibility (such as adding 
serialUIDs).
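
As a rough illustration of that failure, the sketch below simulates
version skew inside a single JVM. It serializes an object and then flips
one bit of the serialVersionUID recorded in the stream, so the stream
looks as if it came from a differently versioned class. Payload and the
byte-patching trick are assumptions made for the demo; a real skew comes
from different jars on each side of the connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class VersionSkewSketch {
    // Hypothetical payload class; any Serializable class would do.
    static final class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        String value = "hello";
    }

    // Serializes a Payload, then alters the serialVersionUID stored in the
    // stream so that it no longer matches the local class.
    static byte[] skewedStream() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Payload());
        }
        byte[] data = bytes.toByteArray();
        // Stream layout: 4-byte header, TC_OBJECT, TC_CLASSDESC,
        // 2-byte name length, class name, then the 8-byte serialVersionUID.
        int uidOffset = 4 + 1 + 1 + 2 + Payload.class.getName().length();
        data[uidOffset + 7] ^= 0x01; // flip the lowest bit of the UID
        return data;
    }

    // Returns true if reading the skewed stream is rejected with
    // InvalidClassException, the exception seen in the original report.
    static boolean failsWithInvalidClass() {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(skewedStream()))) {
            in.readObject();
            return false;
        } catch (InvalidClassException expected) {
            return true;
        } catch (IOException | ClassNotFoundException other) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected as incompatible: " + failsWithInvalidClass());
    }
}
```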

--
Jeremy

Bill Stoddard wrote:
> Matt Hogstrom wrote:
> 
>> Not being totally familiar with all the nuances in G WRT 
>> serialization my comments should be taken with a grain of salt.  From 
>> my perspective there are two major problems with serialized data.  One, 
>> it's very fragile 
> 
> Yes.
> 
>> and two you can't change it if you need to.  One could argue users 
>> shouldn't be changing it but in extreme circumstances it is unavoidable.
>>
>> I'd vote for the move to XML (ouch, did I say that?)
>>
>> Matt
> 
> 
> My inclination as well. I've been scratching my head trying to 
> understand 'why serialization?' rather than nice, flat, intuitive text 
> files. There may be a very good answer; I just don't know what it is.
> 
>>
>> Jeremy Boynes wrote:
>>
>>> Sachin's problem is not related to configuration persistence but to 
>>> the serialization of classes between plugin and server when using JMX 
>>> remoting over RMI.
>>>
>>> The upshot of it all is unless we are going to ditch all use of 
>>> serialization and replace it with XML then we need to exercise the 
>>> necessary discipline and version the classes involved.
>>>
>>> -- 
>>> Jeremy


Re: Java serialization compatibility issues

Posted by Bill Stoddard <bi...@wstoddard.com>.
Matt Hogstrom wrote:
> Not being totally familiar with all the nuances in G WRT 
> serialization my comments should be taken with a grain of salt.  From my 
> perspective there are two major problems with serialized data.  One, it's 
> very fragile 
Yes.

> and two you can't change it if you need to.  One could 
> argue users shouldn't be changing it but in extreme circumstances it is 
> unavoidable.
> 
> I'd vote for the move to XML (ouch, did I say that?)
> 
> Matt

My inclination as well. I've been scratching my head trying to understand
'why serialization?' rather than nice, flat, intuitive text files. There
may be a very good answer; I just don't know what it is.

> 
> Jeremy Boynes wrote:
> 
>> Sachin's problem is not related to configuration persistence but to 
>> the serialization of classes between plugin and server when using JMX 
>> remoting over RMI.
>>
>> The upshot of it all is unless we are going to ditch all use of 
>> serialization and replace it with XML then we need to exercise the 
>> necessary discipline and version the classes involved.
>>
>> -- 
>> Jeremy



Re: Java serialization compatibility issues

Posted by Jeremy Boynes <jb...@apache.org>.
Sachin Patel wrote:
> I think it would be nice to be able to publish using either SOAP or RMI, 
> and provide that option to the user in the server configuration panel in 
> the tools.  I think RMI will perform better once remote deployment is 
> supported, but SOAP is more reliable through firewalls, so that's why 
> it would be good to have it as a fallback.
> 

I agree that it is a good option (and I'd go further and say we should 
look at things like WS-DM as well) but I don't think it should be the 
only option. If you are trying to assemble a lightweight server, 
mandating a SOAP/HTTP stack just to manage it is considerable overhead.

Performance is not really a factor, as this is intended for management 
functions rather than transaction processing.

What this boils down to is that at some point we need to exhibit 
discipline and pay attention to version skew - serialUIDs are just part 
of that. The Eclipse plugin is the first application we have that is not 
coupled to the main release cycle so perhaps that time has come.

--
Jeremy


Re: Java serialization compatibility issues

Posted by Sachin Patel <sp...@gmail.com>.
I think it would be nice to be able to publish using either SOAP or RMI, 
and provide that option to the user in the server configuration panel in 
the tools.  I think RMI will perform better once remote deployment is 
supported, but SOAP is more reliable through firewalls, so that's why 
it would be good to have it as a fallback.

Jeremy Boynes wrote:
> Matt Hogstrom wrote:
>> Not being totally familiar with all the nuances in G WRT 
>> serialization my comments should be taken with a grain of salt.  From 
>> my perspective there are two major problems with serialized data.  
>> One, it's very fragile, and two, you can't change it if you need to.  
>> One could argue users shouldn't be changing it but in extreme 
>> circumstances it is unavoidable.
>>
>> I'd vote for the move to XML (ouch, did I say that?)
>>
>
> What protocol should the plugin use to communicate with the server? I 
> think the only XML based protocol supported by MX4J is SOAP over HTTP.
>
> -- 
> Jeremy
>

Re: Java serialization compatibility issues

Posted by Jeremy Boynes <jb...@apache.org>.
Matt Hogstrom wrote:
> Not being totally familiar with all the nuances in G WRT 
> serialization my comments should be taken with a grain of salt.  From my 
> perspective there are two major problems with serialized data.  One, it's 
> very fragile, and two, you can't change it if you need to.  One could 
> argue users shouldn't be changing it but in extreme circumstances it is 
> unavoidable.
> 
> I'd vote for the move to XML (ouch, did I say that?)
> 

What protocol should the plugin use to communicate with the server? I 
think the only XML based protocol supported by MX4J is SOAP over HTTP.

--
Jeremy

Re: Java serialization compatibility issues

Posted by Matt Hogstrom <ma...@hogstrom.org>.
Not being totally familiar with all the nuances in G WRT 
serialization, my comments should be taken with a grain of salt.  From my 
perspective there are two major problems with serialized data.  One, it's 
very fragile, and two, you can't change it if you need to.  One could 
argue users shouldn't be changing it, but in extreme circumstances it is 
unavoidable.

I'd vote for the move to XML (ouch, did I say that?)

Matt

Jeremy Boynes wrote:
> Sachin's problem is not related to configuration persistence but to the 
> serialization of classes between plugin and server when using JMX 
> remoting over RMI.
> 
> The upshot of it all is unless we are going to ditch all use of 
> serialization and replace it with XML then we need to exercise the 
> necessary discipline and version the classes involved.
> 
> -- 
> Jeremy


Re: Java serialization compatibility issues

Posted by Andy Piper <an...@bea.com>.
At 02:43 AM 9/23/2005, Jeremy Boynes wrote:
>Sachin's problem is not related to configuration persistence but to 
>the serialization of classes between plugin and server when using 
>JMX remoting over RMI.
>
>The upshot of it all is unless we are going to ditch all use of 
>serialization and replace it with XML then we need to exercise the 
>necessary discipline and version the classes involved.

IMO most people underestimate the degree of compatibility that can be 
achieved with serialization - but then I have written RMI stacks for 
a living and am naturally biased :)

Given that RMI-based JMX is in Java SE 5 now, it seems a little 
strange to entirely ditch this approach. Simply fixing the svuid for 
all serialized classes will make things considerably better, since 
without this svuids change for the most minor (and inconsequential) 
reasons. This could be done mechanically with appropriate use of sed 
and serialver ...
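
A minimal sketch of the mechanical part, assuming a small in-process
helper rather than the sed and serialver pipeline mentioned above: it
asks the serialization runtime for the UID currently in effect for a
class and prints a declaration that could be pasted into the source. The
class names Unpinned and Pinned are hypothetical.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVerSketch {
    // Hypothetical class with no declared UID: the runtime computes one.
    static class Unpinned implements Serializable {
        int count;
    }

    // Hypothetical class whose UID has already been pinned.
    static class Pinned implements Serializable {
        private static final long serialVersionUID = 42L;
        int count;
    }

    // Builds a declaration carrying the UID currently in effect for the type.
    static String declarationFor(Class<?> type) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        if (desc == null) {
            // lookup returns null for classes that are not Serializable
            throw new IllegalArgumentException(type.getName() + " is not Serializable");
        }
        return "private static final long serialVersionUID = "
                + desc.getSerialVersionUID() + "L;";
    }

    public static void main(String[] args) {
        System.out.println(declarationFor(Unpinned.class));
        System.out.println(declarationFor(Pinned.class));
    }
}
```

Pasting the printed value for an unpinned class freezes the UID that
already-serialized streams carry, so existing data keeps deserializing.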

$0.02

andy 


Re: Java serialization compatibility issues

Posted by Jeremy Boynes <jb...@apache.org>.
Sachin's problem is not related to configuration persistence but to the 
serialization of classes between plugin and server when using JMX 
remoting over RMI.

The upshot of it all is unless we are going to ditch all use of 
serialization and replace it with XML then we need to exercise the 
necessary discipline and version the classes involved.

--
Jeremy

Re: Java serialization compatibility issues

Posted by David Blevins <da...@visi.com>.
On Sep 22, 2005, at 6:07 AM, Sachin Patel wrote:

> So if you are not aware, I'm pulling in and packaging several jars  
> from the lib and lib/endorsed directory into one of the eclipse  
> plug-in, so the classes can be used and referenced by the rest of  
> the eclipse plugins.  This is because eclipse can not reference  
> classes or jars at runtime that are not packaged within a plug-in  
> and marked as visible in either the plugin.xml or manifest.
> A big problem resides as now the same jars I'm packaging must be  
> the same exact jars that reside in the target server I'm  
> deploying.  This causes a dependency on a particular server image.   
> If a user modifies classes I reference and rebuilds their server,  
> the plug-in is broken as during deployment I'll receive error  
> messages like the following...
>
> Caused by: java.io.InvalidClassException:  
> org.apache.geronimo.kernel.config.ConfigurationModuleType; local  
> class incompatible: stream classdesc serialVersionUID =  
> 6296527685792707191, local class serialVersionUID =  
> -4121586344416418391
>
> So looking at that particular class, it looks like the  
> serialVersionUID is not declared explicitly, so a default is computed  
> from the compiled class.  This is bad because jars/classes now risk  
> incompatibility with every build.  We need a solution for this.  The  
> only option I'm aware of is for these serializable classes to  
> hard-code and explicitly assign a value.  Of course we must then  
> ensure that we manually maintain backward compatibility to support  
> the N-2 model for these classes.  This problem will eventually have  
> to be solved anyway once there is support for multiple servers and  
> different versions.
>
> I think this is a must-fix for 1.0, as without it we put product  
> migration, mixed-version interoperability, tooling, and possibly  
> user applications at risk.
>

I agree with you.  We need a solution for this and it does affect  
compatibility even to the extent of breaking it on every build.

There was a one-week battle on this in May that resulted in people  
saying "I don't want to talk about it anymore."  I think this is a  
problem that can't be solved through serialization in an effective  
and reliable way, largely because of how easy it is to break  
serialized data and because broken serialized data is nearly  
unrecoverable.

Here is my perspective:

   With the proper amount of forethought and code, you can get a
   very limited amount of compatibility with serialization between
   class file versions, but it's unrealistic and unreliable at best.

   Unless you can convince everyone to do that, this entire thread
   is moot regardless of what you or I think, and our existing
   system will never serve as a reliable way to store configuration
   data across upgrades or patches.


Here are a few snippets of that argument, completely biased to my  
perspective:

----------------
On May 13, 2005, at 3:54 PM, David Blevins wrote:
> On Fri, May 13, 2005 at 03:42:58PM -0700, Jeremy Boynes wrote:
>>
>> Solving the compatibility problem will require the developers to pay
>> attention to versioning. We will need to require that at some  
>> point though.
>>
>
> Paying attention is an understatement.  In order to serialize data  
> to and from class definitions, those classes cannot change in any  
> significant way, ever.  No refactoring, repackaging, reorganizing,  
> no removing methods, no removing class variables.  This doesn't  
> just apply to public APIs, but even the internal classes of each  
> component.
>
> Considering that at deploy time we are building and serializing  
> Jetty web containers, Tomcat web containers, ActiveMQ queues/ 
> topics, OpenEJB ejb containers, Axis webservices, TranQL connectors  
> and more, is it reasonable to ask or expect these projects  
> basically not to change their code?
>
> Clearly, projects will clean and improve their code, we can't and  
> shouldn't try and stop them.  We want them to optimize and improve  
> things.  As this change is going to occur, people *will* have to  
> redeploy their apps as we support no other way to store  
> configurations.
>
> Serializing and deserializing server state for fast startup is a  
> cool feature, but not something we are in any position to guarantee  
> between releases of Axis, ActiveMQ, OpenEJB, Jetty, Tomcat, et al.

----------------
On May 19, 2005, at 9:35 AM, David Blevins wrote:
> On Wed, May 18, 2005 at 09:04:07PM -0700, Jeremy Boynes wrote:
>
>> David Blevins wrote:
>>
>>>
>>> I know exactly how much room for error there is as well as the
>>> amount of forethought and code required to guarantee even a small
>>> amount of future compatibility.  Using SUIDs alone does not cut it.
>>> I'm not arguing it's impossible, I'm arguing that it's unrealistic
>>> and unreliable at best.
>>>
>>>
>>
>> I think you're looking at it backward - without a crystal ball you
>> don't know what future versions of the class will look like. It is
>> the responsibility of the evolved class to handle upgrading older
>> data.
>>
>
> You won't know what order the fields were written to the stream, so  
> unless you had the forethought to write them yourself from the get- 
> go, you will be extremely limited on what you can do later.
>
> Even renaming a private variable would be impossible without  
> planning ahead.
>
>
>>> The major technical difference between xml data and object
>>> serialization data is that xml data can still be parsed even if the
>>> schema has changed. The same is not true of serialized object data.
>>>
>>
>> Why? You can parse a stream of bytes as easily as you can parse a
>> stream of characters. Whether you can use the result of the parse
>> will depend on the semantic content of the new infoset - you can
>> create incompatible XML schemas as well.
>>
>
> Simple, a well-formed XML document is always parseable whether or  
> not you choose to validate it against the schema.  Changing element  
> names, order, or types can be dealt with later.  This is not the  
> case with object serialization where the class metadata is written  
> to the stream and you have no idea what it looks like.
>
> You cannot see, use, create, or predict the order of the class  
> metadata written to the stream.  I would hardly say it's any  
> comparison to XML Schema.
>
> As I keep saying, with the proper amount of forethought and code,  
> you can get a very limited amount of compatibility with  
> serialization between class file versions, but it's unrealistic and  
> unreliable at best.
>
> Unless you can convince everyone to do that, this entire thread is  
> moot regardless of what you or I think, and our existing system  
> will never serve as a reliable way to store configuration data  
> across upgrades or patches.
>
----------------
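
For reference, the kind of planning ahead debated in the exchange above
(writing the fields yourself so that a private variable can later be
renamed) might look roughly like the sketch below. Endpoint, the stream
field name "host" and the default "localhost" are all hypothetical; the
mechanism shown is the standard serialPersistentFields declaration with
PutField and GetField.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class RenamedFieldSketch {
    static final class Endpoint implements Serializable {
        private static final long serialVersionUID = 1L;

        // The serialized form is declared by hand: one String called "host".
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("host", String.class)
        };

        // The in-memory name is free to differ from the stream name.
        private String hostName;

        Endpoint(String hostName) {
            this.hostName = hostName;
        }

        String getHostName() {
            return hostName;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            ObjectOutputStream.PutField fields = out.putFields();
            fields.put("host", hostName);
            out.writeFields();
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            ObjectInputStream.GetField fields = in.readFields();
            // The second argument is used when the stream has no "host" value.
            hostName = (String) fields.get("host", "localhost");
        }
    }

    // Serializes and deserializes in memory, wrapping checked exceptions.
    static Endpoint roundTrip(Endpoint original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Endpoint) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Endpoint("example.org")).getHostName());
    }
}
```

This has to be in place before the first release of the class, and it
has to be done for every serialized class, which is the cost both sides
of the argument are weighing.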