Posted to mapreduce-dev@hadoop.apache.org by Allen Wittenauer <aw...@effectivemachines.com> on 2017/04/03 16:04:14 UTC

[DISCUSS] Changing the default class path for clients

	This morning I had a bit of a shower thought:

	With the new shaded hadoop client in 3.0, is there any reason the default classpath should remain the full-blown jar list? E.g., shouldn’t ‘hadoop classpath’ just return configuration, user-supplied bits (e.g., HADOOP_USER_CLASSPATH), HADOOP_OPTIONAL_TOOLS, and hadoop-client-runtime? We’d obviously have to add some plumbing for daemons and the capability for the user to get the full list, but that should be trivial.

	Thoughts?
---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org
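
For illustration only, the slimmed-down client classpath proposed above could be modelled roughly as follows. This is a Python sketch, not the Hadoop shell code: the function name, the argument names, and the ordering of the four groups are all assumptions, since the proposal only lists what should be included.

```python
import os


def client_classpath(conf_dir, user_entries, optional_tool_jars, client_runtime_jar):
    """Join the four groups named in the proposal into one classpath string.

    Hypothetical helper: the ordering (configuration, user-supplied entries,
    optional tools, hadoop-client-runtime) is an assumption.
    """
    ordered = [conf_dir, *user_entries, *optional_tool_jars, client_runtime_jar]
    seen = set()
    result = []
    for entry in ordered:
        # Skip empty entries and keep only the first copy of a duplicate.
        if entry and entry not in seen:
            seen.add(entry)
            result.append(entry)
    return os.pathsep.join(result)
```

A caller would pass whatever its own environment provides for each group; nothing here reads real Hadoop variables or locates real jars.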


Re: [DISCUSS] Changing the default class path for clients

Posted by Andrew Wang <an...@cloudera.com>.
Thanks for digging that up. I agree with your analysis of our public
documentation, though we still need a transition path. Officially, our
classpath is not covered by compatibility, though we know that in reality,
classpath changes are quite impactful to users.

While we were having a related discussion on YARN container classpath
isolation, the plan was to still provide the existing set of JARs by
default, with applications having to explicitly opt in to a clean
classpath. This feels similar.

How do you feel about providing e.g. `hadoop userclasspath` and `hadoop
daemonclasspath`, and having `hadoop classpath` continue to default to
`daemonclasspath` for now? We could then deprecate+remove `hadoop
classpath` in a future release.
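
As a rough sketch of that transition, assuming the subcommand names suggested above: `userclasspath` returns the client list, `daemonclasspath` returns the full list, and plain `classpath` keeps returning the full list but carries a deprecation warning. This is Python for illustration, not the actual shell scripts; the function name, the return shape, and the warning text are hypothetical.

```python
import os


def resolve_classpath(subcommand, client_entries, daemon_entries):
    """Return (classpath, warning) for a subcommand; warning is None if not deprecated.

    Hypothetical model of the proposed transition; none of this is Hadoop code.
    """
    warning = None
    if subcommand == "userclasspath":
        entries = client_entries
    elif subcommand == "daemonclasspath":
        entries = daemon_entries
    elif subcommand == "classpath":
        # Old name: same output as before, plus a notice for script authors.
        entries = daemon_entries
        warning = "classpath is deprecated; use userclasspath or daemonclasspath"
    else:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return os.pathsep.join(entries), warning
```

Existing scripts that call the old name keep getting the same output during the deprecation window, which is the point of making the new behavior opt-in.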

On Mon, Apr 3, 2017 at 11:08 AM, Allen Wittenauer <aw...@effectivemachines.com>
wrote:

>
> 1.0.4:
>
>         "Prints the class path needed to get the Hadoop jar and the
> required libraries."
>
> 2.8.0 and 3.0.0:
>
>         "Prints the class path needed to get the Hadoop jar and the
> required libraries. If called without arguments, then prints the classpath
> set up by the command scripts, which is likely to contain wildcards in the
> classpath entries."
>
>         I would take that to mean “what gives me all the public APIs?”
> Which, by definition, should all be in hadoop-client-runtime (with the
> possible exception of the DistributedFileSystem Quota APIs, since for some
> reason those are marked public.)
>
> Let me ask it a different way:
>
>         Why should ‘yarn jar’, ‘mapred jar’, ‘hadoop distcp’, ‘hadoop fs’,
> etc., have anything but hadoop-client-runtime as the provided jar?
> Yes, some things might break, but given this is 3.0, some changes should be
> expected anyway. Given the definition above ("needed to get the Hadoop jar
> and the required libraries"), switching this over seems correct.
>
>
> > On Apr 3, 2017, at 10:37 AM, Esteban Gutierrez <es...@cloudera.com>
> wrote:
> >
> >
> > I agree with Andrew too. Users have relied for years on `hadoop classpath`
> > in their scripts to launch jobs or other tools, so it is perhaps not the
> > best idea to change the behavior without providing a proper deprecation
> > path.
> >
> > thanks!
> > esteban.
> >
> > --
> > Cloudera, Inc.
> >
> >
> > On Mon, Apr 3, 2017 at 10:26 AM, Andrew Wang <an...@cloudera.com>
> wrote:
> > What's the current contract for `hadoop classpath`? Would it be safer to
> > introduce `hadoop userclasspath` or similar for this behavior?
> >
> > I'm betting that changing `hadoop classpath` will lead to some breakages,
> > so I'd prefer to make this new behavior opt-in.
> >
> > Best,
> > Andrew
> >
> > On Mon, Apr 3, 2017 at 9:04 AM, Allen Wittenauer <
> aw@effectivemachines.com>
> > wrote:
> >
> > >
> > >         This morning I had a bit of a shower thought:
> > >
> > >         With the new shaded hadoop client in 3.0, is there any reason
> the
> > > default classpath should remain the full blown jar list?  e.g.,
> shouldn’t
> > > ‘hadoop classpath’ just return configuration, user supplied bits (e.g.,
> > > HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and
> > > hadoop-client-runtime? We’d obviously have to add some plumbing for
> daemons
> > > and the capability for the user to get the full list, but that should
> be
> > > trivial.
> > >
> > >         Thoughts?
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> > > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> > >
> > >
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
>
>


Re: [DISCUSS] Changing the default class path for clients

Posted by Allen Wittenauer <aw...@effectivemachines.com>.
1.0.4:

	"Prints the class path needed to get the Hadoop jar and the required libraries."

2.8.0 and 3.0.0:

	"Prints the class path needed to get the Hadoop jar and the required libraries. If called without arguments, then prints the classpath set up by the command scripts, which is likely to contain wildcards in the classpath entries."

	I would take that to mean “what gives me all the public APIs?” Which, by definition, should all be in hadoop-client-runtime (with the possible exception of the DistributedFileSystem Quota APIs, since for some reason those are marked public).

Let me ask it a different way:

	Why should ‘yarn jar’, ‘mapred jar’, ‘hadoop distcp’, ‘hadoop fs’, etc., have anything but hadoop-client-runtime as the provided jar? Yes, some things might break, but given this is 3.0, some changes should be expected anyway. Given the definition above ("needed to get the Hadoop jar and the required libraries"), switching this over seems correct.
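
The 2.8.0/3.0.0 text quoted above notes that the output is likely to contain wildcards. As a small illustration of what such an entry means to the JVM: a classpath entry whose last component is `*` stands for the `.jar` and `.JAR` files directly in that directory, without recursing. The Python sketch below mimics that expansion; the function name and the injectable `listdir` parameter are assumptions made for the example, and the sorted order is a choice made here, since the JVM does not promise an ordering.

```python
import os


def expand_wildcards(classpath, listdir=os.listdir):
    """Expand `dir/*` entries to the jar files directly inside `dir`.

    Illustrative only: mirrors the JVM's basename-wildcard rule (files
    ending in .jar or .JAR, non-recursive). Sorting is an assumption.
    """
    expanded = []
    for entry in classpath.split(os.pathsep):
        if os.path.basename(entry) == "*":
            directory = os.path.dirname(entry) or "."
            for name in sorted(listdir(directory)):
                if name.endswith((".jar", ".JAR")):
                    expanded.append(os.path.join(directory, name))
        else:
            # Plain directories and explicit jar paths pass through unchanged.
            expanded.append(entry)
    return expanded
```

Scripts that parse the command output rather than handing it straight to `java -cp` are the ones most likely to notice any change in which entries appear.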


> On Apr 3, 2017, at 10:37 AM, Esteban Gutierrez <es...@cloudera.com> wrote:
> 
> 
> I agree with Andrew too. Users have relied for years on `hadoop classpath` in their scripts to launch jobs or other tools, so it is perhaps not the best idea to change the behavior without providing a proper deprecation path.
> 
> thanks!
> esteban.
> 
> --
> Cloudera, Inc.
> 
> 
> On Mon, Apr 3, 2017 at 10:26 AM, Andrew Wang <an...@cloudera.com> wrote:
> What's the current contract for `hadoop classpath`? Would it be safer to
> introduce `hadoop userclasspath` or similar for this behavior?
> 
> I'm betting that changing `hadoop classpath` will lead to some breakages,
> so I'd prefer to make this new behavior opt-in.
> 
> Best,
> Andrew
> 
> On Mon, Apr 3, 2017 at 9:04 AM, Allen Wittenauer <aw...@effectivemachines.com>
> wrote:
> 
> >
> >         This morning I had a bit of a shower thought:
> >
> >         With the new shaded hadoop client in 3.0, is there any reason the
> > default classpath should remain the full blown jar list?  e.g., shouldn’t
> > ‘hadoop classpath’ just return configuration, user supplied bits (e.g.,
> > HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and
> > hadoop-client-runtime? We’d obviously have to add some plumbing for daemons
> > and the capability for the user to get the full list, but that should be
> > trivial.
> >
> >         Thoughts?
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
> 





Re: [DISCUSS] Changing the default class path for clients

Posted by Esteban Gutierrez <es...@cloudera.com>.
I agree with Andrew too. Users have relied for years on `hadoop classpath`
in their scripts to launch jobs or other tools, so it is perhaps not the
best idea to change the behavior without providing a proper deprecation path.

thanks!
esteban.

--
Cloudera, Inc.


On Mon, Apr 3, 2017 at 10:26 AM, Andrew Wang <an...@cloudera.com>
wrote:

> What's the current contract for `hadoop classpath`? Would it be safer to
> introduce `hadoop userclasspath` or similar for this behavior?
>
> I'm betting that changing `hadoop classpath` will lead to some breakages,
> so I'd prefer to make this new behavior opt-in.
>
> Best,
> Andrew
>
> On Mon, Apr 3, 2017 at 9:04 AM, Allen Wittenauer <aw@effectivemachines.com
> >
> wrote:
>
> >
> >         This morning I had a bit of a shower thought:
> >
> >         With the new shaded hadoop client in 3.0, is there any reason the
> > default classpath should remain the full blown jar list?  e.g., shouldn’t
> > ‘hadoop classpath’ just return configuration, user supplied bits (e.g.,
> > HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and
> > hadoop-client-runtime? We’d obviously have to add some plumbing for
> daemons
> > and the capability for the user to get the full list, but that should be
> > trivial.
> >
> >         Thoughts?
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
>

Re: [DISCUSS] Changing the default class path for clients

Posted by Esteban Gutierrez <es...@cloudera.com>.
I agreed with Andrew too. Users have relied for years on `hadoop classpath`
for their script to launch jobs or other tools, perhaps no the best idea to
change the behavior without providing a proper deprecation path.

thanks!
esteban.

--
Cloudera, Inc.


On Mon, Apr 3, 2017 at 10:26 AM, Andrew Wang <an...@cloudera.com>
wrote:

> What's the current contract for `hadoop classpath`? Would it be safer to
> introduce `hadoop userclasspath` or similar for this behavior?
>
> I'm betting that changing `hadoop classpath` will lead to some breakages,
> so I'd prefer to make this new behavior opt-in.
>
> Best,
> Andrew
>
> On Mon, Apr 3, 2017 at 9:04 AM, Allen Wittenauer <aw@effectivemachines.com
> >
> wrote:
>
> >
> >         This morning I had a bit of a shower thought:
> >
> >         With the new shaded hadoop client in 3.0, is there any reason the
> > default classpath should remain the full blown jar list?  e.g., shouldn’t
> > ‘hadoop classpath’ just return configuration, user supplied bits (e.g.,
> > HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and
> > hadoop-client-runtime? We’d obviously have to add some plumbing for
> daemons
> > and the capability for the user to get the full list, but that should be
> > trivial.
> >
> >         Thoughts?
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
>

Re: [DISCUSS] Changing the default class path for clients

Posted by Andrew Wang <an...@cloudera.com>.
What's the current contract for `hadoop classpath`? Would it be safer to
introduce `hadoop userclasspath` or similar for this behavior?

I'm betting that changing `hadoop classpath` will lead to some breakages,
so I'd prefer to make this new behavior opt-in.

Best,
Andrew

On Mon, Apr 3, 2017 at 9:04 AM, Allen Wittenauer <aw...@effectivemachines.com>
wrote:

>
>         This morning I had a bit of a shower thought:
>
>         With the new shaded hadoop client in 3.0, is there any reason the
> default classpath should remain the full blown jar list?  e.g., shouldn’t
> ‘hadoop classpath’ just return configuration, user supplied bits (e.g.,
> HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and
> hadoop-client-runtime? We’d obviously have to add some plumbing for daemons
> and the capability for the user to get the full list, but that should be
> trivial.
>
>         Thoughts?
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>
>
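
As a rough illustration of the opt-in approach suggested above, the sketch below models two subcommands side by side: `classpath` keeps returning the full jar list, while `userclasspath` returns only configuration, user supplied entries, and the shaded client runtime. This is a hypothetical model, not Hadoop's launcher code. `userclasspath` is only a name proposed in this thread, the function `classpath_for` and every path are made up for the example, and HADOOP_OPTIONAL_TOOLS is left out for brevity.

```python
# Hypothetical model of the opt-in proposal; this is not Hadoop's launcher code.
SEP = ":"  # classpath separator on Unix-like systems


def classpath_for(subcommand, conf_dir, user_entries, full_jars, client_runtime_jar):
    """Return the classpath string a given subcommand would print in this model."""
    if subcommand == "classpath":
        # Existing behavior stays the default: configuration, user entries, every jar.
        entries = [conf_dir, *user_entries, *full_jars]
    elif subcommand == "userclasspath":
        # Proposed opt-in behavior: configuration, user entries, shaded client only.
        entries = [conf_dir, *user_entries, client_runtime_jar]
    else:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return SEP.join(entries)


if __name__ == "__main__":
    # All paths below are made-up placeholders.
    args = dict(
        conf_dir="/etc/hadoop/conf",
        user_entries=["/opt/app/lib/app.jar"],
        full_jars=["/opt/hadoop/share/hadoop/common/*", "/opt/hadoop/share/hadoop/hdfs/*"],
        client_runtime_jar="/opt/hadoop/share/hadoop/client/hadoop-client-runtime.jar",
    )
    print(classpath_for("classpath", **args))
    print(classpath_for("userclasspath", **args))
```

Scripts that already call `classpath` see no change in this model; only callers that ask for `userclasspath` get the trimmed list, which is the compatibility property the opt-in suggestion is after.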
