Posted to common-dev@hadoop.apache.org by Doug Cutting <cu...@apache.org> on 2008/10/03 18:37:10 UTC

RPC versioning

It has been proposed in the discussions defining Hadoop 1.0 that we 
extend our back-compatibility policy.

http://wiki.apache.org/hadoop/Release1.0Requirements

Currently we only attempt to promise that application code will run 
without change against compatible versions of Hadoop.  If one has 
clusters running different yet compatible versions, then one must use a 
different classpath for each cluster to pick up the appropriate version 
of Hadoop's client libraries.

The proposal is that we extend this, so that a client library from one 
version of Hadoop will operate correctly with other compatible Hadoop 
versions, i.e., one need not alter one's classpath to contain the 
identical version, only a compatible version.

Question 1: Do we need to solve this problem soon, for release 1.0, 
i.e., in order to provide a release whose compatibility lifetime is ~1 
year, instead of the ~4 months of 0.x releases?  This is not clear to me. 
Can someone provide cases where using the same classpath when talking 
to multiple clusters is critical?

Assuming it is, to implement this requires RPC-level support for 
versioning.  We could add this by switching to an RPC mechanism with 
built-in, automatic versioning support, like Thrift, Etch or Protocol 
Buffers.  But none of these is a drop-in replacement for Hadoop RPC. 
They will probably not initially meet our performance and scalability 
requirements.  Their adoption will also require considerable and 
destabilizing changes to Hadoop.  Finally, it is not today clear which 
of these would be the best candidate.  If we move too soon, we might 
regret our choice and wish to move again later.

So, if we answer yes to (1) above, wishing to provide RPC 
back-compatibility in 1.0, but do not want to hold up a 1.0 release, is 
there an alternative to switching?  Can we provide incremental 
versioning support to Hadoop's existing RPC mechanism that will suffice 
until a clear replacement is available?

Below I suggest a simple versioning style that Hadoop might use to
permit its RPC protocols to evolve compatibly until an RPC system with
built-in versioning support is selected.  This is not intended to be a
long-term solution, but rather something that would permit us to more
flexibly evolve Hadoop's protocols over the next year or so.

This style assumes a globally increasing Hadoop version number.  For
example, this might be the subversion repository version of trunk when
a change is first introduced.

When an RPC client and server handshake, they exchange version
numbers.  The lower of their two version numbers is selected as the
version for the connection.
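
Concretely, the handshake might look something like the sketch below (the
VersionHandshake class and the exact wire layout are illustrative
assumptions, not existing code):

   import java.io.DataInput;
   import java.io.DataOutput;
   import java.io.IOException;

   // Hypothetical handshake helper: each side announces its own build
   // version, reads the peer's, and uses the lower of the two as the
   // version for the connection.
   public class VersionHandshake {

     // The globally increasing Hadoop version this build was compiled
     // with, e.g. the svn revision of trunk when the last change landed.
     public static final int LOCAL_VERSION = 12345;

     public static int negotiate(DataInput in, DataOutput out)
         throws IOException {
       out.writeInt(LOCAL_VERSION);        // announce our version
       int peerVersion = in.readInt();     // learn the peer's version
       return Math.min(LOCAL_VERSION, peerVersion);
     }
   }

The negotiated value would then be stashed per connection so that the
RPC.getVersion() calls in the examples below can return it.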

Let's walk through an example.  We start with a class that contains
no versioning information and a single field, 'a':

   public class Foo implements Writable {
     int a;

     public void write(DataOutput out) throws IOException {
       out.writeInt(a);
     }

     public void readFields(DataInput in) throws IOException {
       a = in.readInt();
     }
   }

Now, in version 1, we add a second field, 'b' to this:

   public class Foo implements Writable {
     int a;
     float b;                                        // new field

     public void write(DataOutput out) throws IOException {
       int version = RPC.getVersion(out);
       out.writeInt(a);
       if (version >= 1) {                           // peer supports b
         out.writeFloat(b);                          // send it
       }
     }

     public void readFields(DataInput in) throws IOException {
       int version = RPC.getVersion(in);
       a = in.readInt();
       if (version >= 1) {                           // if supports b
         b = in.readFloat();                         // read it
       }
     }
   }

Next, in version 2, we remove the first field, 'a':

   public class Foo implements Writable {
     float b;

     public void write(DataOutput out) throws IOException {
       int version = RPC.getVersion(out);
       if (version < 2) {                            // peer wants a
         out.writeInt(0);                            // send it
       }
       if (version >= 1) {
         out.writeFloat(b);
       }
     }

     public void readFields(DataInput in) throws IOException {
       int version = RPC.getVersion(in);
       if (version < 2) {                            // peer writes a
         in.readInt();                               // ignore it
       }
       if (version >= 1) {
         b = in.readFloat();
       }
     }
   }

Could something like this work?  It would require just some minor 
changes to Hadoop's RPC mechanism, to support the version handshake. 
Beyond that, it could be implemented incrementally as RPC protocols 
evolve.  It would require some vigilance, to make sure that versioning 
logic is added when classes change, but adding automated tests against 
prior versions would identify lapses here.
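
As a rough sketch of such a test -- using a stand-in RPC class that just
records a per-stream version, since the real hook doesn't exist yet -- one
could write a Foo the way a version-0 peer would and read it back with the
newest readFields:

   import java.io.*;
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;

   // Stand-in for the proposed per-connection version lookup.
   class RPC {
     private static final Map<Object, Integer> VERSIONS =
       new ConcurrentHashMap<Object, Integer>();
     static void setVersion(Object stream, int v) { VERSIONS.put(stream, v); }
     static int getVersion(Object stream) { return VERSIONS.get(stream); }
   }

   // The version-2 Foo from above, minus the Writable interface so the
   // sketch compiles stand-alone.
   class Foo {
     float b;
     public void readFields(DataInput in) throws IOException {
       int version = RPC.getVersion(in);
       if (version < 2) { in.readInt(); }          // old peers still send 'a'
       if (version >= 1) { b = in.readFloat(); }   // 'b' absent before version 1
     }
   }

   public class FooCompatibilityTest {
     public static void main(String[] args) throws IOException {
       int connectionVersion = 0;                  // oldest supported peer

       // "Old" side: a version-0 peer only ever wrote the int field 'a'.
       ByteArrayOutputStream bytes = new ByteArrayOutputStream();
       DataOutputStream out = new DataOutputStream(bytes);
       out.writeInt(42);
       out.flush();

       // "New" side: the current Foo reads that stream at version 0.
       DataInputStream in =
         new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
       RPC.setVersion(in, connectionVersion);
       Foo foo = new Foo();
       foo.readFields(in);                         // must not throw or mis-read
       System.out.println("decoded ok, b defaulted to " + foo.b);
     }
   }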

This may appear to add a lot of version-related logic, but with 
automatic versioning, in many cases, some version-related logic is still 
required.  In simple cases, one adds a completely new field with a 
default value and is done, with automatic versioning handling much of 
the work.  But in many other cases an existing field is changed and the 
application must translate old values to new, and vice versa.  These 
cases still require application logic, even with automatic versioning. 
So automatic versioning is certainly less intrusive, but not as much as 
one might first assume.
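
For example (purely illustrative, not taken from any existing Hadoop class),
if a timeout field changed from seconds in one version to milliseconds in
the next, no versioning framework could do the unit conversion for us:

   // Hypothetical: version 0 peers send 'timeout' in seconds, version 1
   // peers send milliseconds. Even if the field is carried across versions
   // automatically, the application still has to translate old to new.
   public class TimeoutConfig {
     long timeoutMillis;

     void setFromWire(int peerVersion, long wireValue) {
       if (peerVersion < 1) {
         timeoutMillis = wireValue * 1000L;   // old peers sent seconds
       } else {
         timeoutMillis = wireValue;           // new peers already send millis
       }
     }
   }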

The fundamental question is how soon we need to address inter-version 
RPC compatibility.  If we wish to do it soon, I think we'd be wise to 
consider a solution that's less invasive and that does not force us into 
a potentially premature decision.

Doug

Re: RPC versioning

Posted by Steve Loughran <st...@apache.org>.
Sanjay Radia wrote:
> 
> On Oct 7, 2008, at 3:48 AM, Alejandro Abdelnur wrote:

>> How about something more simplistic?
>>
>> * The client handles a single version of Hadoop, the one that it belongs to
>> * The server handles its version and a backwards range [vN, vN-M]
>> * All client RPCs always start with the Hadoop version they belong to
>> * On the server side an RpcSwitch reads the version, checks the
>> received RPC is within valid version range, and delegates the call to
>> corresponding handler. The vN-1 to vN-M handlers will be adapters to a
>> vN handler
>>
> You will need to solve the following problem also:
> vN-1, vN-2, ... each changed some field of some parameter class.
> You can switch to the right handler, but that handler in general only has 
> a class definition of the class that matches its version. How does it 
> read the objects with the older class definitions?
> You may be able to address this via class loader tricks or by tagging a 
> version number onto the name of the class and keeping the definitions of 
> each of the older versions (this can be made a little simpler than I am 
> describing through the magic of subclassing ... but something like that 
> will be needed).
> 

That is starting to look very much like Java RMI, which scares me, 
because although we use RMI a lot, it is incredibly brittle, and once 
you add OSGi to the mix it gets even worse (as it is no longer enough to 
include the classname and class/interface ID, you need to include 
classloader info). Once you go down this path, you start looking 
wistfully at XML formats where you can use XPath to manipulate the 
messages. From my experience in SOAP-land, having a wire format that is 
flexible comes to nothing if either end doesn't handle it, or if you 
change the semantics of the interface either explicitly (operations 
behave differently) or implicitly (something that didn't look at the far 
end now does; or other transaction behaviours) [1]. It's those semantic 
changes that really burn your clients, even if you handle all the 
marshalling issues.

If you are going to handle multiple versions on a server -and I don't 
know if that is the right approach to take with any RPC mechanism- 
here's what I'd recommend

-connecting clients include version numbers in the endpoints they hit. 
All URLs include a version.

-IPC comms is negotiated. The first thing a client does is say "I'm v.8 
and I'd like a v.8 connection"; the server gets to say "no" or return 
some reference (URL, Endpointer, etc.) that can be used for all further 
comms.

That's extra work, but so is supporting multiple versions. Here is what 
I think is better: Define a hard split between in-cluster protocol and 
external.

In-cluster is how the boxes talk to each other. It can either assume a 
secured network and stick with NFS-class security (users are who they 
say they are, everything is trusted), or (optionally) have a paranoid 
mode where all messages have to be signed with some defences against 
replay attacks. The current (trust everything) model is a lot easier.

Things like management tools and filesystem front-ends would be expected 
to stay in sync with this stuff; it's not private, so much as 
"internal-unstable", if that were a Java keyword.


External is the public API for loosely coupled things that can submit 
work and work with a (remote) filesystem
  -command line applications
  -web front ends
  -IDE plugins
  -have support for notifications that work through firewalls (atom 
feeds and/or xmpp events)
This front end would have security from the outset, be fairly RESTy and 
use a wire format that is resilient to change (JSON, XML). The focus here 
is robustness, security and long-haul comms over high performance.

The nice thing about this approach is that the current IPC gets declared 
the internal API and can evolve as appropriate; it's the external API 
that we need to design to be robust across versions. We can maybe use 
WebDAV as the main filesystem API, with the Mac OS, Windows and Linux 
WebDAV filesystems talking to it; the job API is something we can do 
ourselves, possibly reviewing past work in this area (including OSGI 
designs), but doing something the REST discuss list would be proud of, 
or at least not unduly critical of.

FWIW I did WebDAV-to-other-filesystem work many years ago (2000); 
it's not too bad except that much of its FS semantics is that of Win9x. 
It has metadata where you could set things like the replication factor 
using PROPPATCH/PROPFIND, while MOVE and COPY operations let you rename 
and copy files/directories without pulling things client side. The big 
limitations are that it contains (invalid) assumptions that files are 
less than a few GB and so can be PUT/GET in one-off operations, rather 
than breaking the upload into pieces. I don't think there's an APPEND 
operation either.




[1] Rethinking the Java SOAP Stack
http://www.hpl.hp.com/techreports/2005/HPL-2005-83.pdf

Re: RPC versioning

Posted by Sanjay Radia <sr...@yahoo-inc.com>.
On Oct 7, 2008, at 3:48 AM, Alejandro Abdelnur wrote:

> > ....
> > Question 1: Do we need to solve this problem soon, for release 1.0,
> > i.e., in order to provide a release whose compatibility lifetime is
> > ~1 year, instead of the ~4 months of 0.x releases?  This is not clear
> > to me.  Can someone provide cases where using the same classpath when
> > talking to multiple clusters is critical?
>
> Yes, we have this requirement to be able to do upgrades of our system
> without downtime.
>
> Our hadoop jobs are managed from an appserver. Our appserver can
> dispatch jobs to different hadoop clusters. To do a hadoop upgrade
> without downtime we would redirect jobs out of one of the hadoop
> clusters (of course you have to get the necessary INPUTs out too),
> upgrade that idle cluster and then redirect traffic back. At this
> point our server will be running vN while the upgraded cluster will be
> running vN+1. Then we can later upgrade our appserver to vN+1 by taking
> instances out of rotation one at a time.
>
> Note that the situation of the appserver interacting with vN and vN+1
> hadoop clusters could be for a significant period of time (weeks if
> not months).
>
> > ....
> > Below I suggest a simple versioning style that Hadoop might use to
> > permit its RPC protocols to evolve compatibly ...
> > ...
> > When an RPC client and server handshake, they exchange version
> > numbers.  The lower of their two version numbers is selected as the
> > version for the connection.
>
> How about something more simplistic?
>
> * The client handles a single version of Hadoop, the one that it belongs to
> * The server handles its version and a backwards range [vN, vN-M]
> * All client RPCs always start with the Hadoop version they belong to
> * On the server side an RpcSwitch reads the version, checks the
> received RPC is within valid version range, and delegates the call to
> corresponding handler. The vN-1 to vN-M handlers will be adapters to a
> vN handler
>
You will need to solve the following problem also:
vN-1, vN-2, ... each changed some field of some parameter class.
You can switch to the right handler, but that handler in general only
has a class definition of the class that matches its version. How does
it read the objects with the older class definitions?
You may be able to address this via class loader tricks or by tagging
a version number onto the name of the class and keeping the definitions
of each of the older versions (this can be made a little simpler than I am
describing through the magic of subclassing ... but something like
that will be needed).

>
>
> A
>


Re: RPC versioning

Posted by Steve Loughran <st...@apache.org>.
Alejandro Abdelnur wrote:
> You bring up a good point: for submitting/managing jobs you could use a
> REST API. But protocol compatibility is still needed for accessing
> HDFS to read/write data.
> 

Long haul? WebDAV

> A
> 
> On Tue, Oct 7, 2008 at 7:00 PM, Steve Loughran <st...@apache.org> wrote:
>> I'm going to make another suggestion, which is : are there long-haul/high
>> stability APIs which we want to offer alongside the unstable/higher
>> performance IPC protocols. So IPC could be used within the cluster, but
>> something else would run for talking to the cluster from the outside
>>  -assumes no trust of the caller and requires rigorous authentication
>>  -less brittle to changes in the serialization formats
>>  -designed with firewalls and timeouts in mind
>> A few years back, people would have said SOAP! or SOA, but something that
>> uses JSON over REST is the other option. Push up the files using WebDAV,
>> then use a RESTy interface to the JobTrackers to submit work
>>
>> There's engineering effort here, but this would be the API used by things
>> like IDE clients, and other applications to get work done
>>


-- 
Steve Loughran                  http://www.1060.org/blogxter/publish/5
Author: Ant in Action           http://antbook.org/

Re: RPC versioning

Posted by Alejandro Abdelnur <tu...@gmail.com>.
You bring up a good point: for submitting/managing jobs you could use a
REST API. But protocol compatibility is still needed for accessing
HDFS to read/write data.

A

On Tue, Oct 7, 2008 at 7:00 PM, Steve Loughran <st...@apache.org> wrote:
>
> I'm going to make another suggestion, which is : are there long-haul/high
> stability APIs which we want to offer alongside the unstable/higher
> performance IPC protocols. So IPC could be used within the cluster, but
> something else would run for talking to the cluster from the outside
>  -assumes no trust of the caller and requires rigorous authentication
>  -less brittle to changes in the serialization formats
>  -designed with firewalls and timeouts in mind
> A few years back, people would have said SOAP! or SOA, but something that
> uses JSON over REST is the other option. Push up the files using WebDAV,
> then use a RESTy interface to the JobTrackers to submit work
>
> There's engineering effort here, but this would be the API used by things
> like IDE clients, and other applications to get work done
>

Re: RPC versioning

Posted by Steve Loughran <st...@apache.org>.
Alejandro Abdelnur wrote:
>> ....
>> Question 1: Do we need to solve this problem soon, for release 1.0, i.e., in
>> order to provide a release whose compatibility lifetime is ~1 year, instead
>> of the ~4 months of 0.x releases?  This is not clear to me.  Can someone
>> provide cases where using the same classpath when talking to multiple
>> clusters is critical?
> 
> Yes, we have this requirement to be able to do upgrades of our system
> without downtime.
> 
> Our hadoop jobs are managed from an appserver. Our appserver can
> dispatch jobs to different hadoop clusters. To do a hadoop upgrade
> without downtime we would redirect jobs out of one of the hadoop
> clusters (of course you have to get the necessary INPUTs out too),
> upgrade that idle cluster and then redirect traffic back. At this
> point our server will be running vN while the upgraded cluster will be
> running vN+1. Then we can later upgrade our appserver to vN+1 by taking
> instances out of rotation one at a time.
> 
> Note that the situation of the appserver interacting with vN and vN+1
> hadoop clusters could be for a significant period of time (weeks if
> not months).
> 
>> ....
>> Below I suggest a simple versioning style that Hadoop might use to
>> permit its RPC protocols to evolve compatibly ...
>> ...
>> When an RPC client and server handshake, they exchange version
>> numbers.  The lower of their two version numbers is selected as the
>> version for the connection.
> 
> How about something more simplistic?
> 
> * The client handles a single version of Hadoop, the one that it belongs to
> * The server handles its version and a backwards range [vN, vN-M]
> * All client RPCs always start with the Hadoop version they belong to
> * On the server side an RpcSwitch reads the version, checks the
> received RPC is within valid version range, and delegates the call to
> corresponding handler. The vN-1 to vN-M handlers will be adapters to a
> vN handler

I'm going to make another suggestion, which is: are there 
long-haul/high stability APIs which we want to offer alongside the 
unstable/higher performance IPC protocols. So IPC could be used within 
the cluster, but something else would run for talking to the cluster 
from the outside
  -assumes no trust of the caller and requires rigorous authentication
  -less brittle to changes in the serialization formats
  -designed with firewalls and timeouts in mind
A few years back, people would have said SOAP! or SOA, but something 
that uses JSON over REST is the other option. Push up the files using 
WebDAV, then use a RESTy interface to the JobTrackers to submit work

There's engineering effort here, but this would be the API used by 
things like IDE clients, and other applications to get work done

Re: RPC versioning

Posted by Alejandro Abdelnur <tu...@gmail.com>.
> ....
> Question 1: Do we need to solve this problem soon, for release 1.0, i.e., in
> order to provide a release whose compatibility lifetime is ~1 year, instead
> of the ~4 months of 0.x releases?  This is not clear to me.  Can someone
> provide cases where using the same classpath when talking to multiple
> clusters is critical?

Yes, we have this requirement to be able to do upgrades of our system
without downtime.

Our hadoop jobs are managed from an appserver. Our appserver can
dispatch jobs to different hadoop clusters. To do a hadoop upgrade
without downtime we would redirect jobs out of one of the hadoop
clusters (of course you have to get the necessary INPUTs out too),
upgrade that idle cluster and then redirect traffic back. At this
point our server will be running vN while the upgraded cluster will be
running vN+1. Then we can later upgrade our appserver to vN+1 by taking
instances out of rotation one at a time.

Note that the situation of the appserver interacting with vN and vN+1
hadoop clusters could be for a significant period of time (weeks if
not months).

> ....
> Below I suggest a simple versioning style that Hadoop might use to
> permit its RPC protocols to evolve compatibly ...
> ...
> When an RPC client and server handshake, they exchange version
> numbers.  The lower of their two version numbers is selected as the
> version for the connection.

How about something more simplistic?

* The client handles a single version of Hadoop, the one that it belongs to
* The server handles its version and a backwards range [vN, vN-M]
* All client RPCs always start with the Hadoop version they belong to
* On the server side an RpcSwitch reads the version, checks the
received RPC is within valid version range, and delegates the call to
corresponding handler. The vN-1 to vN-M handlers will be adapters to a
vN handler
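
A minimal sketch of that server-side switch might look like this (RpcSwitch
and Handler are names assumed here for illustration; nothing like this
exists in Hadoop today):

   import java.io.DataInput;
   import java.io.IOException;

   // Reads the client's version from the head of each RPC, checks it against
   // the supported range [vN-M, vN], and delegates to the handler registered
   // for that version; older handlers would be adapters over the vN handler.
   public class RpcSwitch {
     public interface Handler {
       void handle(int clientVersion, DataInput call) throws IOException;
     }

     private final int oldestVersion;      // vN-M
     private final int newestVersion;      // vN
     private final Handler[] handlers;     // index 0 = oldest

     public RpcSwitch(int oldestVersion, int newestVersion, Handler[] handlers) {
       this.oldestVersion = oldestVersion;
       this.newestVersion = newestVersion;
       this.handlers = handlers;
     }

     public void dispatch(DataInput call) throws IOException {
       int clientVersion = call.readInt(); // every RPC starts with the version
       if (clientVersion < oldestVersion || clientVersion > newestVersion) {
         throw new IOException("Unsupported client version " + clientVersion);
       }
       handlers[clientVersion - oldestVersion].handle(clientVersion, call);
     }
   }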

A

Re: RPC versioning - oops sorry

Posted by Sanjay Radia <sr...@yahoo-inc.com>.
My filter was saving this thread in my "jira bucket" and I had missed it.
I asked a few questions on the hadoop requirements page earlier today
that you have addressed or are addressing in this thread. Sorry.

sanjay



Re: RPC versioning

Posted by Sanjay Radia <sr...@yahoo-inc.com>.
On Oct 3, 2008, at 10:06 PM, Raghu Angadi wrote:

>
> If version handling is required, I think Doug's approach will work well
> for current RPC.
>
> The extra complexity of handling different versions in object serialization
> might easily be overestimated (for a duration of 1 year, say). I would
> think easily more than 90% of objects' serialization has not changed in
> the last 1 to 2 years.
>
> As long as the innocent are protected (i.e. no existing write() method
> needs to change unless the fields change), it will be fine.
>
> Many times the effective serialization changes mainly because of new
> subclasses and not the actual serialization methods themselves.
>
> Do we handle changes to a method's arguments similarly? How are
> subclasses handled?
>
Change to method arguments? -
   One possible solution: create a new method instead -- this will be
good enough if it is for a short time.

Subclassing? - don't; instead add or delete fields.
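
As a tiny illustration of the "new method" route (the interface and method
names below are made up, not real Hadoop protocol methods):

   // Instead of changing an existing method's arguments in place, add a new
   // overload; old clients keep calling the old signature for the short
   // compatibility window while new clients call the new one.
   public interface ExampleProtocol {
     long getQuota(String path) throws java.io.IOException;

     long getQuota(String path, boolean includeSnapshots)
         throws java.io.IOException;
   }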



Re: RPC versioning

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
If version handling is required, I think Doug's approach will work well 
for current RPC.

The extra complexity of handling different versions in object serialization 
might easily be overestimated (for a duration of 1 year, say). I would 
think easily more than 90% of objects' serialization has not changed in 
the last 1 to 2 years.

As long as the innocent are protected (i.e. no existing write() method 
needs to change unless the fields change), it will be fine.

Many times the effective serialization changes mainly because of new 
subclasses and not the actual serialization methods themselves.

Do we handle changes to a method's arguments similarly? How are 
subclasses handled?

Raghu.
