Posted to yarn-dev@hadoop.apache.org by Wei-Chiu Chuang <we...@apache.org> on 2020/04/04 19:13:17 UTC

[DISCUSS] Shade guava into hadoop-thirdparty

Hi Hadoop devs,

I spent a good part of the past 7 months working with a dozen colleagues
to update the Guava version in Cloudera's software (that includes Hadoop,
HBase, Spark, Hive, Cloudera Manager ... more than 20 projects).

After 7 months, I finally came to a conclusion: updating to Hadoop 3.3 /
3.2.1 / 3.1.3, even if you are only coming from Hadoop 3.0 / 3.1.0, is
going to be really hard because of Guava. Because of Guava, the amount of
work to certify a minor release update is almost equivalent to a major
release update.

That is because:
(1) Going from Guava 11 to Guava 27 is a big jump. There are several
incompatible API changes in many places. Too bad the Google developers are
not sympathetic to Guava's users.
(2) Guava is used in all Hadoop jars: not just the Hadoop servers but also
the client jars and the Hadoop common libs.
(3) The Hadoop library is used in practically all software at Cloudera.

Here is my proposal:
(1) Shade Guava into hadoop-thirdparty, relocating its packages to
org.hadoop.thirdparty.com.google.common.*
(2) Make a hadoop-thirdparty 1.1.0 release.
(3) Update existing references to Guava to the relocated path. There are
more than 2k imports that need an update.
(4) Release Hadoop 3.3.1 / 3.2.2 containing this change.

In this way, we will be able to update Guava in Hadoop in the future
without disrupting Hadoop applications.
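
Step (3) of the proposal is essentially a mechanical rewrite of import
statements. A minimal sketch of that rewrite follows, assuming the relocated
prefix named in the proposal; the class and method names (ImportRelocator,
relocateLine, relocateAll) are hypothetical and not part of Hadoop or
hadoop-thirdparty, and a real migration would more likely be done with an
IDE refactoring or a script over the whole source tree.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative only: rewrites Guava imports to a relocated package prefix. */
public class ImportRelocator {
    // Guava's own package root.
    static final String ORIGINAL = "com.google.common.";
    // Relocated root as written in the proposal (an assumption of this sketch).
    static final String RELOCATED = "org.hadoop.thirdparty.com.google.common.";

    /** Rewrites one source line if it is a plain or static import of Guava. */
    static String relocateLine(String line) {
        String trimmed = line.trim();
        // Check "import static" first so both import forms are handled.
        for (String keyword : new String[] {"import static ", "import "}) {
            if (trimmed.startsWith(keyword + ORIGINAL)) {
                return line.replace(keyword + ORIGINAL, keyword + RELOCATED);
            }
        }
        // Anything else (including already relocated imports) is left untouched.
        return line;
    }

    /** Applies relocateLine to every line of a source file's contents. */
    static List<String> relocateAll(List<String> lines) {
        return lines.stream()
                .map(ImportRelocator::relocateLine)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "import java.util.List;",
                "import com.google.common.collect.Lists;",
                "import static com.google.common.base.Preconditions.checkNotNull;");
        relocateAll(source).forEach(System.out::println);
    }
}
```

The sketch only touches import lines; fully qualified Guava references inside
method bodies would need separate handling.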

Note: HBase already did this, and this Guava update project would have been
much more difficult if HBase had not done so.

Thoughts? Other options include:
(1) Force downstream applications to migrate to the Hadoop client artifacts
listed here:
https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
But that's nearly impossible.
(2) Migrate from Guava to Java APIs. I suppose this is a big project and I
can't estimate how much work it's going to be.

Weichiu

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Masatake Iwasaki <iw...@oss.nttdata.co.jp>.

+1

Masatake Iwasaki

On 2020/04/06 10:32, Akira Ajisaka wrote:
> +1
>
> Thanks,
> Akira

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Akira Ajisaka <aa...@apache.org>.

+1

Thanks,
Akira

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Wei-Chiu Chuang <we...@cloudera.com.INVALID>.

Great question!

I can run the Java API Compliance Checker to detect any API changes. I
guess that's the only way to find out.
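
For a quick first look at whether a given class exposes a library's types in
its public method signatures, a reflection scan is one option. The sketch
below is not the Java API Compliance Checker and is far less thorough: it
only inspects raw return and parameter types, so generic type arguments and
array element types are missed. The names ApiExposureCheck and DemoClient are
hypothetical, and the demo uses java.util.concurrent. as a stand-in prefix so
that it runs without Guava on the classpath; for the question at hand the
prefix would be com.google.common. and the classes would be Hadoop's client
API classes.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: lists public methods whose signatures mention a package. */
public class ApiExposureCheck {

    /** Returns sorted names of public methods of api that return or accept a type under packagePrefix. */
    static List<String> exposedMethods(Class<?> api, String packagePrefix) {
        List<String> hits = new ArrayList<>();
        for (Method m : api.getDeclaredMethods()) {
            if (!Modifier.isPublic(m.getModifiers())) {
                continue; // non-public methods are not part of the client-facing API
            }
            boolean exposed = m.getReturnType().getName().startsWith(packagePrefix);
            for (Class<?> param : m.getParameterTypes()) {
                exposed |= param.getName().startsWith(packagePrefix);
            }
            if (exposed) {
                hits.add(m.getName());
            }
        }
        Collections.sort(hits); // getDeclaredMethods has no guaranteed order
        return hits;
    }

    /** Hypothetical stand-in for a client API class. */
    public static class DemoClient {
        public java.util.concurrent.Future<String> submit() { return null; }
        public String name() { return "demo"; }
        java.util.concurrent.Executor internal() { return null; } // package-private, ignored
    }

    public static void main(String[] args) {
        // In a real check the prefix would be "com.google.common." (assumption of this sketch).
        System.out.println(exposedMethods(DemoClient.class, "java.util.concurrent."));
    }
}
```

Any method such a scan reports is a place where client code compiled against
the unshaded types would need to be recompiled after relocation.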

On Sat, Apr 4, 2020 at 1:19 PM Igor Dvorzhak <id...@google.com.invalid> wrote:

> How will this proposal impact public APIs? I.e., does Hadoop expose any
> Guava classes in its client APIs, which would require recompiling all
> client applications to use the shaded Guava classes?

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Igor Dvorzhak <id...@google.com.INVALID>.

How will this proposal impact public APIs? I.e., does Hadoop expose any
Guava classes in its client APIs, which would require recompiling all
client applications to use the shaded Guava classes?

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Dinesh Chitlangia <di...@gmail.com>.

+1

Thanks for initiating this, Weichiu.

-Dinesh

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Igor Dvorzhak <id...@google.com.INVALID>.

How this proposal will impact public APIs? I.e does Hadoop expose any Guava
classes in the client APIs that will require recompiling all client
applications because they need to use shaded Guava classes?

On Sat, Apr 4, 2020 at 12:13 PM Wei-Chiu Chuang <we...@apache.org> wrote:

> Hi Hadoop devs,
>
> I spent a good part of the past 7 months working with a dozen of colleagues
> to update the guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
>
> After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
> really hard because of guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to a major release
> update.
>
> That is because:
> (1) Going from guava 11 to guava 27 is a big jump. There are several
> incompatible API changes in many places. Too bad the Google developers are
> not sympathetic about its users.
> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> client jars and Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
>
> Here is my proposal:
> (1) shade guava into hadoop-thirdparty, relocate the classpath to
> org.hadoop.thirdparty.com.google.common.*
> (2) make a hadoop-thirdparty 1.1.0 release.
> (3) update existing references to guava to the relocated path. There are
> more than 2k imports that need an update.
> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
>
> In this way, we will be able to update guava in Hadoop in the future
> without disrupting Hadoop applications.
>
> Note: HBase already did this and this guava update project would have been
> much more difficult if HBase didn't do so.
>
> Thoughts? Other options include
> (1) force downstream applications to migrate to Hadoop client artifacts as
> listed here
>
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but
> that's nearly impossible.
> (2) Migrate Guava to Java APIs. I suppose this is a big project and I can't
> estimate how much work it's going to be.
>
> Weichiu
>

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Akira Ajisaka <aa...@apache.org>.

+1

Thanks,
Akira

On Sun, Apr 5, 2020 at 4:13 AM Wei-Chiu Chuang <we...@apache.org> wrote:

> Hi Hadoop devs,
>
> I spent a good part of the past 7 months working with a dozen of colleagues
> to update the guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
>
> After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
> really hard because of guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to a major release
> update.
>
> That is because:
> (1) Going from guava 11 to guava 27 is a big jump. There are several
> incompatible API changes in many places. Too bad the Google developers are
> not sympathetic about its users.
> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> client jars and Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
>
> Here is my proposal:
> (1) shade guava into hadoop-thirdparty, relocate the classpath to
> org.hadoop.thirdparty.com.google.common.*
> (2) make a hadoop-thirdparty 1.1.0 release.
> (3) update existing references to guava to the relocated path. There are
> more than 2k imports that need an update.
> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
>
> In this way, we will be able to update guava in Hadoop in the future
> without disrupting Hadoop applications.
>
> Note: HBase already did this and this guava update project would have been
> much more difficult if HBase didn't do so.
>
> Thoughts? Other options include
> (1) force downstream applications to migrate to Hadoop client artifacts as
> listed here
>
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but
> that's nearly impossible.
> (2) Migrate Guava to Java APIs. I suppose this is a big project and I can't
> estimate how much work it's going to be.
>
> Weichiu
>

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Igor Dvorzhak <id...@google.com.INVALID>.

How will this proposal impact public APIs? I.e., does Hadoop expose any Guava
classes in its client APIs, such that all client applications would have to be
recompiled against the shaded Guava classes?
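
One way to make the question above concrete is to scan the public method signatures of a class for types from a given package. The following is a rough sketch using plain reflection. The class name and method names are hypothetical, and it inspects only erased return and parameter types, so Guava types that appear solely as generic type arguments, fields, or superclasses would be missed.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: lists public methods whose
// signatures mention types from a given package prefix.
final class ExposedTypeScanner {

    // Returns descriptions of the public methods of cls whose return type or
    // parameter types live under packagePrefix (e.g. "com.google.common.").
    public static List<String> findExposed(Class<?> cls, String packagePrefix) {
        List<String> hits = new ArrayList<>();
        for (Method m : cls.getMethods()) {
            boolean exposed = matches(m.getReturnType(), packagePrefix);
            for (Class<?> param : m.getParameterTypes()) {
                exposed = exposed || matches(param, packagePrefix);
            }
            if (exposed) {
                hits.add(m.toString());
            }
        }
        return hits;
    }

    // Unwraps array types so that, say, Foo[] is treated like Foo.
    private static boolean matches(Class<?> type, String prefix) {
        while (type.isArray()) {
            type = type.getComponentType();
        }
        return type.getName().startsWith(prefix);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ExposedTypeScanner <fully.qualified.ClassName> [packagePrefix]
        Class<?> cls = Class.forName(args.length > 0 ? args[0] : "java.lang.String");
        String prefix = args.length > 1 ? args[1] : "com.google.common.";
        for (String hit : findExposed(cls, prefix)) {
            System.out.println(hit);
        }
    }
}
```

Any method reported for a client-facing class would change its signature under relocation, which is the case where downstream code has to be recompiled.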

On Sat, Apr 4, 2020 at 12:13 PM Wei-Chiu Chuang <we...@apache.org> wrote:

> Hi Hadoop devs,
>
> I spent a good part of the past 7 months working with a dozen of colleagues
> to update the guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
>
> After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
> really hard because of guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to a major release
> update.
>
> That is because:
> (1) Going from guava 11 to guava 27 is a big jump. There are several
> incompatible API changes in many places. Too bad the Google developers are
> not sympathetic about its users.
> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> client jars and Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
>
> Here is my proposal:
> (1) shade guava into hadoop-thirdparty, relocate the classpath to
> org.hadoop.thirdparty.com.google.common.*
> (2) make a hadoop-thirdparty 1.1.0 release.
> (3) update existing references to guava to the relocated path. There are
> more than 2k imports that need an update.
> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
>
> In this way, we will be able to update guava in Hadoop in the future
> without disrupting Hadoop applications.
>
> Note: HBase already did this and this guava update project would have been
> much more difficult if HBase didn't do so.
>
> Thoughts? Other options include
> (1) force downstream applications to migrate to Hadoop client artifacts as
> listed here
>
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but
> that's nearly impossible.
> (2) Migrate Guava to Java APIs. I suppose this is a big project and I can't
> estimate how much work it's going to be.
>
> Weichiu
>

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Dinesh Chitlangia <di...@gmail.com>.

+1

Thanks for initiating this Weichiu.

-Dinesh

On Sat, Apr 4, 2020 at 3:13 PM Wei-Chiu Chuang <we...@apache.org> wrote:

> Hi Hadoop devs,
>
> I spent a good part of the past 7 months working with a dozen of colleagues
> to update the guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
>
> After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
> really hard because of guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to a major release
> update.
>
> That is because:
> (1) Going from guava 11 to guava 27 is a big jump. There are several
> incompatible API changes in many places. Too bad the Google developers are
> not sympathetic about its users.
> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> client jars and Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
>
> Here is my proposal:
> (1) shade guava into hadoop-thirdparty, relocate the classpath to
> org.hadoop.thirdparty.com.google.common.*
> (2) make a hadoop-thirdparty 1.1.0 release.
> (3) update existing references to guava to the relocated path. There are
> more than 2k imports that need an update.
> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
>
> In this way, we will be able to update guava in Hadoop in the future
> without disrupting Hadoop applications.
>
> Note: HBase already did this and this guava update project would have been
> much more difficult if HBase didn't do so.
>
> Thoughts? Other options include
> (1) force downstream applications to migrate to Hadoop client artifacts as
> listed here
>
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but
> that's nearly impossible.
> (2) Migrate Guava to Java APIs. I suppose this is a big project and I can't
> estimate how much work it's going to be.
>
> Weichiu
>

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Mukul Kumar Singh <mk...@gmail.com>.

+1

On 07/04/20 7:05 am, Zhankun Tang wrote:
> Thanks, Wei-Chiu for the proposal. +1.
>
> On Mon, 6 Apr 2020 at 20:17, Ayush Saxena <ay...@gmail.com> wrote:
>
>> +1
>>
>> -Ayush
>>
>>> On 05-Apr-2020, at 12:43 AM, Wei-Chiu Chuang <we...@apache.org> wrote:
>>>
>>> Hi Hadoop devs,
>>>
>>> I spent a good part of the past 7 months working with a dozen of
>> colleagues
>>> to update the guava version in Cloudera's software (that includes Hadoop,
>>> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
>>>
>>> After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
>>> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
>>> really hard because of guava. Because of Guava, the amount of work to
>>> certify a minor release update is almost equivalent to a major release
>>> update.
>>>
>>> That is because:
>>> (1) Going from guava 11 to guava 27 is a big jump. There are several
>>> incompatible API changes in many places. Too bad the Google developers
>> are
>>> not sympathetic about its users.
>>> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
>>> client jars and Hadoop common libs.
>>> (3) The Hadoop library is used in practically all software at Cloudera.
>>>
>>> Here is my proposal:
>>> (1) shade guava into hadoop-thirdparty, relocate the classpath to
>>> org.hadoop.thirdparty.com.google.common.*
>>> (2) make a hadoop-thirdparty 1.1.0 release.
>>> (3) update existing references to guava to the relocated path. There are
>>> more than 2k imports that need an update.
>>> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
>>>
>>> In this way, we will be able to update guava in Hadoop in the future
>>> without disrupting Hadoop applications.
>>>
>>> Note: HBase already did this and this guava update project would have
>> been
>>> much more difficult if HBase didn't do so.
>>>
>>> Thoughts? Other options include
>>> (1) force downstream applications to migrate to Hadoop client artifacts
>> as
>>> listed here
>>>
>> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
>>> but
>>> that's nearly impossible.
>>> (2) Migrate Guava to Java APIs. I suppose this is a big project and I
>> can't
>>> estimate how much work it's going to be.
>>>
>>> Weichiu
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>>
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Zhankun Tang <zt...@apache.org>.

Thanks, Wei-Chiu for the proposal. +1.

On Mon, 6 Apr 2020 at 20:17, Ayush Saxena <ay...@gmail.com> wrote:

> +1
>
> -Ayush
>
> > On 05-Apr-2020, at 12:43 AM, Wei-Chiu Chuang <we...@apache.org> wrote:
> >
> > Hi Hadoop devs,
> >
> > I spent a good part of the past 7 months working with a dozen of
> colleagues
> > to update the guava version in Cloudera's software (that includes Hadoop,
> > HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
> >
> > After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
> > 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
> > really hard because of guava. Because of Guava, the amount of work to
> > certify a minor release update is almost equivalent to a major release
> > update.
> >
> > That is because:
> > (1) Going from guava 11 to guava 27 is a big jump. There are several
> > incompatible API changes in many places. Too bad the Google developers
> are
> > not sympathetic about its users.
> > (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> > client jars and Hadoop common libs.
> > (3) The Hadoop library is used in practically all software at Cloudera.
> >
> > Here is my proposal:
> > (1) shade guava into hadoop-thirdparty, relocate the classpath to
> > org.hadoop.thirdparty.com.google.common.*
> > (2) make a hadoop-thirdparty 1.1.0 release.
> > (3) update existing references to guava to the relocated path. There are
> > more than 2k imports that need an update.
> > (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
> >
> > In this way, we will be able to update guava in Hadoop in the future
> > without disrupting Hadoop applications.
> >
> > Note: HBase already did this and this guava update project would have
> been
> > much more difficult if HBase didn't do so.
> >
> > Thoughts? Other options include
> > (1) force downstream applications to migrate to Hadoop client artifacts
> as
> > listed here
> >
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> > but
> > that's nearly impossible.
> > (2) Migrate Guava to Java APIs. I suppose this is a big project and I
> can't
> > estimate how much work it's going to be.
> >
> > Weichiu
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>
>


Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Ayush Saxena <ay...@gmail.com>.

+1

-Ayush

> On 05-Apr-2020, at 12:43 AM, Wei-Chiu Chuang <we...@apache.org> wrote:
> 
> Hi Hadoop devs,
> 
> I spent a good part of the past 7 months working with a dozen of colleagues
> to update the guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
> 
> After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
> really hard because of guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to a major release
> update.
> 
> That is because:
> (1) Going from guava 11 to guava 27 is a big jump. There are several
> incompatible API changes in many places. Too bad the Google developers are
> not sympathetic about its users.
> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> client jars and Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
> 
> Here is my proposal:
> (1) shade guava into hadoop-thirdparty, relocate the classpath to
> org.hadoop.thirdparty.com.google.common.*
> (2) make a hadoop-thirdparty 1.1.0 release.
> (3) update existing references to guava to the relocated path. There are
> more than 2k imports that need an update.
> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
> 
> In this way, we will be able to update guava in Hadoop in the future
> without disrupting Hadoop applications.
> 
> Note: HBase already did this and this guava update project would have been
> much more difficult if HBase didn't do so.
> 
> Thoughts? Other options include
> (1) force downstream applications to migrate to Hadoop client artifacts as
> listed here
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but
> that's nearly impossible.
> (2) Migrate Guava to Java APIs. I suppose this is a big project and I can't
> estimate how much work it's going to be.
> 
> Weichiu

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Ayush Saxena <ay...@gmail.com>.

+1

-Ayush

> On 05-Apr-2020, at 12:43 AM, Wei-Chiu Chuang <we...@apache.org> wrote:
> 
> Hi Hadoop devs,
> 
> I spent a good part of the past 7 months working with a dozen of colleagues
> to update the guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
> 
> After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
> really hard because of guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to a major release
> update.
> 
> That is because:
> (1) Going from guava 11 to guava 27 is a big jump. There are several
> incompatible API changes in many places. Too bad the Google developers are
> not sympathetic about its users.
> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> client jars and Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
> 
> Here is my proposal:
> (1) shade guava into hadoop-thirdparty, relocate the classpath to
> org.hadoop.thirdparty.com.google.common.*
> (2) make a hadoop-thirdparty 1.1.0 release.
> (3) update existing references to guava to the relocated path. There are
> more than 2k imports that need an update.
> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
> 
> In this way, we will be able to update guava in Hadoop in the future
> without disrupting Hadoop applications.
> 
> Note: HBase already did this and this guava update project would have been
> much more difficult if HBase didn't do so.
> 
> Thoughts? Other options include
> (1) force downstream applications to migrate to Hadoop client artifacts as
> listed here
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but
> that's nearly impossible.
> (2) Migrate Guava to Java APIs. I suppose this is a big project and I can't
> estimate how much work it's going to be.
> 
> Weichiu

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Dinesh Chitlangia <di...@gmail.com>.

+1

Thanks for initiating this, Weichiu.

-Dinesh

On Sat, Apr 4, 2020 at 3:13 PM Wei-Chiu Chuang <we...@apache.org> wrote:

> Hi Hadoop devs,
>
> I spent a good part of the past 7 months working with a dozen of colleagues
> to update the guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
>
> After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
> really hard because of guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to a major release
> update.
>
> That is because:
> (1) Going from guava 11 to guava 27 is a big jump. There are several
> incompatible API changes in many places. Too bad the Google developers are
> not sympathetic about its users.
> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> client jars and Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
>
> Here is my proposal:
> (1) shade guava into hadoop-thirdparty, relocate the classpath to
> org.hadoop.thirdparty.com.google.common.*
> (2) make a hadoop-thirdparty 1.1.0 release.
> (3) update existing references to guava to the relocated path. There are
> more than 2k imports that need an update.
> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
>
> In this way, we will be able to update guava in Hadoop in the future
> without disrupting Hadoop applications.
>
> Note: HBase already did this and this guava update project would have been
> much more difficult if HBase didn't do so.
>
> Thoughts? Other options include
> (1) force downstream applications to migrate to Hadoop client artifacts as
> listed here
>
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but
> that's nearly impossible.
> (2) Migrate Guava to Java APIs. I suppose this is a big project and I can't
> estimate how much work it's going to be.
>
> Weichiu
>

Re: [DISCUSS] Shade guava into hadoop-thirdparty

Posted by Akira Ajisaka <aa...@apache.org>.

+1

Thanks,
Akira

On Sun, Apr 5, 2020 at 4:13 AM Wei-Chiu Chuang <we...@apache.org> wrote:

> Hi Hadoop devs,
>
> I spent a good part of the past 7 months working with a dozen of colleagues
> to update the guava version in Cloudera's software (that includes Hadoop,
> HBase, Spark, Hive, Cloudera Manager ... more than 20+ projects)
>
> After 7 months, I finally came to a conclusion: Update to Hadoop 3.3 /
> 3.2.1 / 3.1.3, even if you just go from Hadoop 3.0/ 3.1.0 is going to be
> really hard because of guava. Because of Guava, the amount of work to
> certify a minor release update is almost equivalent to a major release
> update.
>
> That is because:
> (1) Going from guava 11 to guava 27 is a big jump. There are several
> incompatible API changes in many places. Too bad the Google developers are
> not sympathetic about its users.
> (2) guava is used in all Hadoop jars. Not just Hadoop servers but also
> client jars and Hadoop common libs.
> (3) The Hadoop library is used in practically all software at Cloudera.
>
> Here is my proposal:
> (1) shade guava into hadoop-thirdparty, relocate the classpath to
> org.hadoop.thirdparty.com.google.common.*
> (2) make a hadoop-thirdparty 1.1.0 release.
> (3) update existing references to guava to the relocated path. There are
> more than 2k imports that need an update.
> (4) release Hadoop 3.3.1 / 3.2.2 that contains this change.
>
> In this way, we will be able to update guava in Hadoop in the future
> without disrupting Hadoop applications.
>
> Note: HBase already did this and this guava update project would have been
> much more difficult if HBase didn't do so.
>
> Thoughts? Other options include
> (1) force downstream applications to migrate to Hadoop client artifacts as
> listed here
>
> https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html
> but
> that's nearly impossible.
> (2) Migrate Guava to Java APIs. I suppose this is a big project and I can't
> estimate how much work it's going to be.
>
> Weichiu
>
