Posted to user@hbase.apache.org by Bryan Beaudreault <bb...@hubspot.com> on 2015/05/05 17:58:22 UTC

Upgrading from 0.94 (CDH4) to 1.0 (CDH5)

Hello,

I'm about to start tackling our upgrade path for 0.94 to 1.0+. We have 6
production hbase clusters, 2 hadoop clusters, and hundreds of
APIs/daemons/crons/etc hitting all of these things.  Many of these clients
hit multiple clusters in the same process.  Daunting to say the least.

We can't take full downtime on any of these, though we can take read-only.
And ideally we could take read-only on each cluster in a staggered fashion.

From a client perspective, all of our code currently assumes an
HTableInterface, which gives me some wiggle room I think.  With that in
mind, here's my current plan:

- Shade CDH5 to something like org.apache.hadoop.cdh5.hbase.
- Create a shim implementation of HTableInterface (a rough sketch follows
this list).  This shim would delegate to either the old CDH4 APIs or the
new shaded CDH5 classes, depending on the cluster being talked to.
- Once the shim is in place across all clients, I will put each cluster
into read-only (a client side config of ours), migrate data to a new CDH5
cluster, then bounce affected services so they look there instead. I will
do this for each cluster in sequence.
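
Roughly, the shim might look like the sketch below. The factory, the
config key, and Cdh5TableAdapter are all hypothetical names for
illustration; the adapter would have to translate Get/Put/Scan/Result
objects between the unshaded and shaded class hierarchies.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.HTableInterface;

// Hypothetical sketch only: hands back an HTableInterface backed by either
// the stock CDH4 client or a shaded-CDH5 adapter, keyed off a per-cluster
// config flag.
public final class ShimTableFactory {
  private ShimTableFactory() {}

  public static HTableInterface create(Configuration conf, String table)
      throws IOException {
    if ("cdh5".equals(conf.get("cluster.hbase.version", "cdh4"))) {
      // Cdh5TableAdapter (hypothetical) implements the old HTableInterface
      // but internally converts Get/Put/Scan/Result objects to and from
      // their shaded org.apache.hadoop.cdh5.hbase equivalents.
      return new Cdh5TableAdapter(conf, table);
    }
    // CDH4 path: the existing 0.94 client, untouched.
    return new HTable(conf, table);
  }
}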

This provides a great rollback strategy, and with our existing in-house
cluster cloning tools we can minimize the read-only window to a few minutes
if all goes well.

There are a couple of gotchas I can think of with the shim, which I'm
hoping some of you might have ideas/opinions on:

1) Since protobufs are used for communication, we will have to avoid
shading those particular classes, as they need to match the
package/classnames on the server side.  I think this should be fine, as
these are net-new, not conflicting with CDH4 artifacts (see the
shade-plugin sketch after question 3).  Any additions/concerns here?

2) I'd really like to be able to tackle HBase separately (and before) from
Hadoop.  With that in mind, *on the client side only*, it'd be great if I
could pull in our shaded CDH5 hbase, but with the CDH4 hadoop libraries.
All interactions with our hbase clusters happen through the hbase RPC.
Should this be fine?

3) If #2 is not possible, I'll need to further shade parts of hadoop.  Any
idea of the minimum parts of hadoop that would need to be pulled in/shaded
for CDH5 hbase to work on the client side?
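
For question 1, here is a rough maven-shade-plugin sketch of what I mean
by relocating hbase while leaving the generated protobuf classes alone.
The patterns are illustrative; the real list would take trial and error.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <relocation>
        <pattern>org.apache.hadoop.hbase</pattern>
        <shadedPattern>org.apache.hadoop.cdh5.hbase</shadedPattern>
        <excludes>
          <!-- Leave the generated PB classes unrelocated so their
               package/classnames still match the server side. -->
          <exclude>org.apache.hadoop.hbase.protobuf.generated.*</exclude>
        </excludes>
      </relocation>
    </relocations>
  </configuration>
</plugin>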

Thanks!  I'll look forward to posting all lessons learned at the end of
this upgrade path for the community, and appreciate any input you may have
on the above before I get started.

Bryan

Re: Upgrading from 0.94 (CDH4) to 1.0 (CDH5)

Posted by Bryan Beaudreault <bb...@hubspot.com>.
Thanks Ted. Looks like there's a reason it hasn't been done yet.  For the
ease of my clients and the migration, I may just add hbase-server to my
shaded hbase and heavily filter away the things I don't need, leaving only
the mapreduce pieces (sketched below).  Will see how that works.
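
Something like this maven-shade filter is what I have in mind; the include
pattern is only a guess at the minimum, and will take some trial and error:

<filters>
  <filter>
    <artifact>org.apache.hbase:hbase-server</artifact>
    <includes>
      <!-- Keep only the mapreduce classes (TableInputFormat and friends)
           and drop the rest of hbase-server. -->
      <include>org/apache/hadoop/hbase/mapreduce/**</include>
    </includes>
  </filter>
</filters>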

On Thu, May 14, 2015 at 2:33 PM, Ted Yu <yu...@gmail.com> wrote:

> Looks like there is a JIRA already:
> HBASE-11843 MapReduce classes shouldn't be in hbase-server
>
> Cheers

Re: Upgrading from 0.94 (CDH4) to 1.0 (CDH5)

Posted by Ted Yu <yu...@gmail.com>.
Looks like there is a JIRA already:
HBASE-11843 MapReduce classes shouldn't be in hbase-server

Cheers


Re: Upgrading from 0.94 (CDH4) to 1.0 (CDH5)

Posted by anil gupta <an...@gmail.com>.
+1 on moving the MR-related code to hbase-client, or we could have a
separate artifact called hbase-mapreduce.
I also have to include hbase-server along with hbase-client in my project
just because of this, and once we pull in hbase-server and build an uber
jar, it drags in a lot of unnecessary stuff.
Note: my project is not related to the migration from 0.94 to 1.0, but I
am supporting the argument for moving the MR code into the client or a
separate artifact.




-- 
Thanks & Regards,
Anil Gupta

Re: Upgrading from 0.94 (CDH4) to 1.0 (CDH5)

Posted by Bryan Beaudreault <bb...@hubspot.com>.
Just an update here.  I've got something working locally that can run
against either a 0.94.17 hbase or a 1.0 hbase transparently.  I implemented
as laid out above, but there were a bunch of gotchas.  It helps that we
maintain our own fork of each version, as I needed to make some
supplemental changes in each version to make things easier.  I will do a
writeup with all of the gotchas later in the process.

Next steps:

- Convert server-side coprocessors
- Apply the same or similar shim logic to our TableInputFormat and other
mapreduce interfaces

A couple notes for the devs:

- I love that 1.0 has a separate hbase-client artifact.  Unfortunately the
TableInputFormat and other mapreduce classes live in hbase-server for some
reason.  So the end result is I basically need to pull the entire hbase
super-artifact into my clients.  I may move these to hbase-client in my
local fork if that is possible.

- There are a few places where you are statically calling
HBaseConfiguration.create().  This makes it hard for people like us who
have a lot of libraries built around HBase.  In our clients we inject
configuration properties from our own configuration servers to supplement
hbase-site/hbase-default.xml. When HBaseConfiguration.create() is called,
it disregards these changes.  In my local fork I hacked in a
LazyConfigurationHolder, which just keeps a static reference to a
Configuration, but has a setter.  This allows me to inject my customized
Configuration object into the hbase stack.

 -- (For reference, the places you do this are, at least, ProtobufUtil and
ConnectionManager)
 -- Hadoop also does something like this in their UserGroupInformation
class, but they do provide a setConfiguration method.  Ideally there are no
static calls to create a Configuration, but this is an ok compromise where
necessary.
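
For the curious, a minimal sketch of the holder. The class name is from my
fork as described above; the body here is a simplified guess, not the
actual code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Static holder with a setter, so an injected Configuration can stand in
// for the result of HBaseConfiguration.create().
public final class LazyConfigurationHolder {
  private static volatile Configuration conf;

  private LazyConfigurationHolder() {}

  /** Inject a customized Configuration before any HBase code runs. */
  public static void set(Configuration c) {
    conf = c;
  }

  /** Falls back to HBaseConfiguration.create() if nothing was injected. */
  public static Configuration get() {
    if (conf == null) {
      synchronized (LazyConfigurationHolder.class) {
        if (conf == null) {
          conf = HBaseConfiguration.create();
        }
      }
    }
    return conf;
  }
}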

I can put JIRAs in for these if it makes sense.




Re: Upgrading from 0.94 (CDH4) to 1.0 (CDH5)

Posted by Bryan Beaudreault <bb...@hubspot.com>.
Thanks for the responses, guys!

You've done a review of HTI in 1.0 vs 0.94 to make sure we've not
> mistakenly dropped anything you need? (I see that stuff has moved around
> but HTI should have everything still from 0.94)


Yea, so far so good for HTI features.

Sounds like you have experience copying tables in background in a manner
> that minimally impinges serving given you have dev'd your own in-house
> cluster cloning tools?
> You will use the time while tables are read-only to 'catch-up' the
> difference between the last table copy and data that has come in since?


Correct, we have some tools left over from our 0.92 to 0.94 upgrade, which
we've used for cluster copies.  It basically does an incremental distcp by
comparing the file length and md5 of each table in the target and source
cluster, then only copies the diffs.  We can get very close to real time
with this, then switch to read-only, do some flushes, and do one final copy
to catch up.  We have done this many times for various cluster moves.
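
The comparison step is roughly the following (a simplified sketch; the
real tool also has to deal with deletes, retries, and flushing HBase
before the final pass):

import java.io.IOException;

import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Simplified sketch of the incremental check: re-copy a file only when
// its length or checksum differs on the target cluster.
public final class IncrementalDiff {
  private IncrementalDiff() {}

  static boolean needsCopy(FileSystem srcFs, FileSystem dstFs,
                           FileStatus src, Path dst) throws IOException {
    if (!dstFs.exists(dst)) {
      return true;                                      // brand-new file
    }
    if (dstFs.getFileStatus(dst).getLen() != src.getLen()) {
      return true;                                      // length differs
    }
    // HDFS returns an MD5-of-CRC checksum; it is only comparable when
    // both clusters use the same block and chunk sizes.
    FileChecksum a = srcFs.getFileChecksum(src.getPath());
    FileChecksum b = dstFs.getFileChecksum(dst);
    return a == null || !a.equals(b);
  }
}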

CDH4 has pb2.4.1 in it as opposed to pb2.5.0 in cdh5?


Good to know, will keep this in mind! We already shade some of the
dependencies of hbase such as guava, apache commons http, and joda.  We
will do the same for protobuf.
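
For protobuf that would just be another relocation in our shaded CDH5 jar,
something like the following (the shaded package name is illustrative):

<relocation>
  <pattern>com.google.protobuf</pattern>
  <shadedPattern>org.apache.hadoop.cdh5.com.google.protobuf</shadedPattern>
</relocation>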

 Can you 'talk out loud' as you try stuff Bryan and if we can't
> help highlevel, perhaps we can help on specifics.


Gladly! I feel like I have a leg up since I've already survived the 0.92 to
0.94 migration, so glad to share my experiences with this migration as
well.  I'll update this thread as I move along.  I also plan to release a
blog post on the ordeal once it's all said and done.

We just created our initial shade of hbase.  I'm leaving tomorrow for
HBaseCon, but plan on tackling and testing all of this next week once I'm
back from SF.  If anyone is facing similar upgrade challenges I'd be happy
to compare notes.

If your clients are interacting with HDFS then you need to go the route of
> shading around PB and its hard, but HBase-wise only HBase 0.98 and 1.0 use
> PBs in the RPC protocol and it shouldn't be any problem as long as you
> don't need security


Thankfully we don't interact directly with the HDFS of hbase.  There is
some interaction with the HDFS of our CDH4 hadoop clusters though.  I'll be
experimenting with these incompatibilities soon and will post here.
Hopefully I'll be able to separate them enough to not cause an issue.
Thankfully we have not moved to secure HBase yet.  That's actually on the
to-do list, but hoping to do it *after* the CDH upgrade.

---

Thanks again guys.  I'm expecting this will be a drawn out process
considering our scope, but will be happy to keep updates here as I proceed.


Re: Upgrading from 0.94 (CDH4) to 1.0 (CDH5)

Posted by Esteban Gutierrez <es...@cloudera.com>.
Just to add a little bit to what StAck said:

--
Cloudera, Inc.



If your clients are interacting with HDFS then you need to go the route of
shading around PB, and that's hard. HBase-wise, though, only HBase 0.98 and
1.0 use PBs in the RPC protocol, and it shouldn't be any problem as long as
you don't need security (this is mostly because the client does a UGI
lookup, and it's easy to patch both 0.94 and 1.0 to avoid the UGI call).
Another option is to move your application to asynchbase, which should be
clever enough to handle both HBase versions.
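
For example, a bare-bones asynchbase read looks something like the sketch
below (asynchbase 1.x API; the quorum, table, and row are placeholders):

import java.util.ArrayList;

import org.hbase.async.GetRequest;
import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;

// Minimal asynchbase usage: the same client code can talk to 0.94 and
// newer clusters, since asynchbase implements the wire protocols itself.
public final class AsyncGetExample {
  public static void main(String[] args) throws Exception {
    HBaseClient client = new HBaseClient("zkhost:2181");       // ZK quorum
    ArrayList<KeyValue> row =
        client.get(new GetRequest("mytable", "row1")).join();  // block on the Deferred
    System.out.println(row);
    client.shutdown().join();                                  // flush and close
  }
}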




cheers,
esteban.

Re: Upgrading from 0.94 (CDH4) to 1.0 (CDH5)

Posted by Stack <st...@duboce.net>.
On Tue, May 5, 2015 at 8:58 AM, Bryan Beaudreault <bb...@hubspot.com>
wrote:

> Hello,
>
> I'm about to start tackling our upgrade path for 0.94 to 1.0+. We have 6
> production hbase clusters, 2 hadoop clusters, and hundreds of
> APIs/daemons/crons/etc hitting all of these things.  Many of these clients
> hit multiple clusters in the same process.  Daunting to say the least.
>
>
Nod.



> We can't take full downtime on any of these, though we can take read-only.
> And ideally we could take read-only on each cluster in a staggered fashion.
>
> From a client perspective, all of our code currently assumes an
> HTableInterface, which gives me some wiggle room I think.  With that in
> mind, here's my current plan:
>

You've done a review of HTI in 1.0 vs 0.94 to make sure we've not
mistakenly dropped anything you need? (I see that stuff has moved around
but HTI should have everything still from 0.94)


>
> - Shade CDH5 to something like org.apache.hadoop.cdh5.hbase.
> - Create a shim implementation of HTableInterface.  This shim would
> delegate to either the old cdh4 APIs or the new shaded CDH5 classes,
> depending on the cluster being talked to.
> - Once the shim is in place across all clients, I will put each cluster
> into read-only (a client side config of ours), migrate data to a new CDH5
> cluster, then bounce affected services so they look there instead. I will
> do this for each cluster in sequence.
>
>
Sounds like you have experience copying tables in the background in a
manner that minimally impinges on serving, given you have dev'd your own
in-house cluster cloning tools?

You will use the time while tables are read-only to 'catch-up' the
difference between the last table copy and data that has come in since?



> This provides a great rollback strategy, and with our existing in-house
> cluster cloning tools we can minimize the read-only window to a few minutes
> if all goes well.
>
> There are a couple gotchas I can think of with the shim, which I'm hoping
> some of you might have ideas/opinions on:
>
> 1) Since protobufs are used for communication, we will have to avoid
> shading those particular classes as they need to match the
> package/classnames on the server side.  I think this should be fine, as
> these are net-new, not conflicting with CDH4 artifacts.  Any
> additions/concerns here?
>
>
CDH4 has pb2.4.1 in it as opposed to pb2.5.0 in cdh5?

I myself have little experience going the shading route so have little to
contribute. Can you 'talk out loud' as you try stuff, Bryan, and if we
can't help at a high level, perhaps we can help on specifics.

St.Ack