Posted to dev@accumulo.apache.org by Christopher <ct...@apache.org> on 2013/11/06 23:05:54 UTC

"Provided" dependencies

What's the latest opinion on whether things should be marked "provided" in the pom?
I've changed my mind on this a few times, myself, so I'm curious what
others think.

The provided scope means that it will not propagate as a transitive
dependency. Other than that, it doesn't do much... though we can
control packaging based on provided or not.
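
For illustration, a dependency marked "provided" in a library's POM might
look like the fragment below. This is only a sketch: the hadoop-client
coordinates and the ${hadoop.version} property are placeholders, not a copy
of the actual Accumulo POM.

```xml
<!-- Hypothetical fragment of a library POM; coordinates and version are placeholders -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>${hadoop.version}</version>
  <!-- provided: available on the compile and test classpaths, expected to be
       supplied by the runtime environment, and not passed on transitively
       to projects that depend on this one -->
  <scope>provided</scope>
</dependency>
```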

I'm not sure this gets us much, and it's inconvenient for users. We
can control packaging in other ways (like being more explicit and
carefully considering which dependencies we include in an RPM or
tarball, for instance).

If we drop its declaration, this means that if users want to
build with Accumulo as a dependency, but against a different version
of Hadoop than what we declare in our POM, they'll have to explicitly
<exclude> the hadoop dependencies, and redeclare them, or they will
have to use their <dependencyManagement> section to force a particular
dependency of hadoop.
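
As a rough sketch of the exclude-and-redeclare option, a user's POM could
look something like this. The accumulo-core and hadoop-client coordinates
and both version properties are assumptions chosen for illustration.

```xml
<!-- Hypothetical user POM fragment; artifact names and properties are placeholders -->
<dependencies>
  <dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-core</artifactId>
    <version>${accumulo.version}</version>
    <exclusions>
      <!-- Drop the Hadoop artifact that would otherwise arrive transitively -->
      <exclusion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <!-- Redeclare Hadoop at the version this project actually builds against -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${my.hadoop.version}</version>
  </dependency>
</dependencies>
```

The exclusion is needed mainly when the replacement comes under different
coordinates; when only the version differs, a direct declaration of the
same artifact already takes precedence over the transitive one.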

The advantage to users, though, if we drop this, is that they won't
have to constantly re-declare transitive dependencies to get their
projects to build/test/run.

See http://s.apache.org/maven-dependency-scopes

Thoughts?

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

Re: "Provided" dependencies

Posted by Billie Rinaldi <bi...@gmail.com>.
Christopher,

You're our maven expert, so in general I will defer to your good
judgement.  I am fine with either approach as long as we continue not to
package those dependencies that are currently marked provided.  I've gotten
used to this practice and now like it, and if I recall correctly it was the
reason we marked the dependencies provided in the first place.

Billie



Re: "Provided" dependencies

Posted by Christopher <ct...@apache.org>.
This has nothing to do with packaging. It has to do with developer
workspaces and default dependency resolution using maven.

I'm not suggesting a change to packaging. The declaration of the scope
is independent of packaging.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii



Re: "Provided" dependencies

Posted by John Vines <vi...@apache.org>.
We support different versions of hadoop and we already need the HDFS
classpath for the conf files, so we might as well use the ones there
instead of bundling them up and potentially causing conflicts if something
strange happens in the hadoop client api.



Re: "Provided" dependencies

Posted by Christopher <ct...@apache.org>.
I'm not sure I understand your meaning. Why exactly do you think
specifying the scope as provided makes sense?

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii



Re: "Provided" dependencies

Posted by John Vines <vi...@apache.org>.
The provided scope makes sense for hadoop to pick up dependencies. To a lesser
extent, it makes sense for ZK.

However, as someone who is using accumulo for a project, I would love to
have a client library that is as sparse as possible to avoid having to deal
with resource conflicts.



Re: "Provided" dependencies

Posted by Christopher <ct...@apache.org>.
On Wed, Nov 6, 2013 at 5:43 PM, Michael Berman <mb...@sqrrl.com> wrote:
> I think it would be nice to separate what client API users need from the
> provided dependencies issue.  It seems like whatever module client
> projects depend on should itself only have dependencies on things that it
> actually needs.  If it doesn't need hadoop, then it shouldn't declare it as
> a dependency at all.  The hadoop-dependent server and the
> hadoop-independent client interface both need to share intermediate
> objects, but it seems like those could be defined in another, common
> hadoop-independent module.

What you're talking about is ACCUMULO-1483, which is a separate, but
related, issue (https://issues.apache.org/jira/browse/ACCUMULO-1483)
for creating a minimal API jar (accumulo-client-api), to include at
compile time, so users don't need to include the full dependency tree
when only writing client code. The client code does need Hadoop
right now (some of our client code accepts or returns hadoop Text
objects). I would hope that the implementation of 1483 would eliminate
those cases.

> In/Outputformats are an exception, but I agree they would be best separated
> into their own hadoop-dependent module (which might itself depend on the
> client module).

Also related to ACCUMULO-1483... perhaps as a subtask to create an
accumulo-client-mapreduce module (or leave these in the accumulo-core
module).

> As far as the provided question goes, it seems to me that the only reason
> to mark a dep provided is if we think developers will *usually* want to
> compile against different versions.  Initially I thought it would make
> sense if we thought the runtime versions would vary, but Chris makes a good
> point that the deps we include in the distributed package can be selected
> independently of the maven dep scope.  Since you can build accumulo against
> any version of hadoop and it will still run against any other version of
> hadoop, I think it's better to make things easier on us by having it
> compile scoped.

I think you're right that the real question is whether users *usually*
need to specify a different version. However, even if they do need to
specify a different version, I think it makes more sense for them to
rely on their dependencyManagement section to select a specific
version, or to use excludes and declare the dependency which provides
the required classes, explicitly.

> If someone depends on the accumulo server, then they may have to exclude
> the transitive dependency if our hadoop is polluting theirs, but I think
> that issue can be mitigated by not requiring client apps to depend on the
> entire server.

Right. That will be solved with ACCUMULO-1483, which I'm going to
tackle in the next dev cycle.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii



Re: "Provided" dependencies

Posted by Adam Fuchs <af...@apache.org>.
On Nov 6, 2013 6:04 PM, "Joey Echeverria" <jo...@clouderagovt.com> wrote:
> ...
> If I depend on Accumulo in my maven project, then I shouldn't need to
> depend on Hadoop unless the APIs I'm using leak that dependency or I
> have an explicit dependency on Hadoop elsewhere.

We currently leak the Text object. It would be great if we didn't! (newbie
project?)

> > Since you can build accumulo against
> > any version of hadoop and it will still run against any other version of
> > hadoop, I think it's better to make things easier on us by having it
> > compile scoped.
>
> That's not strictly true. If you build against Hadoop1, I don't think
> you can run against Hadoop2, but I could be wrong. I do know that
> unless you're doing some reflection magic, you have to modify
> [In|Out]putFormats as the APIs moved some classes to interfaces and
> vice versa.

We have done some crazy reflection stuff to make this possible.

Cheers,
Adam

Re: "Provided" dependencies

Posted by Joey Echeverria <jo...@clouderagovt.com>.
I'm a little lost here I think.

On Wed, Nov 6, 2013 at 5:43 PM, Michael Berman <mb...@sqrrl.com> wrote:
> As far as the provided question goes, it seems to me that the only reason
> to mark a dep provided is if we think developers will *usually* want to
> compile against different versions.  Initially I thought it would make
> sense if we thought the runtime versions would vary, but Chris makes a good
> point that the deps we include in the distributed package can be selected
> independently of the maven dep scope.

If I depend on Accumulo in my maven project, then I shouldn't need to
depend on Hadoop unless the APIs I'm using leak that dependency or I
have an explicit dependency on Hadoop elsewhere.

> Since you can build accumulo against
> any version of hadoop and it will still run against any other version of
> hadoop, I think it's better to make things easier on us by having it
> compile scoped.

That's not strictly true. If you build against Hadoop1, I don't think
you can run against Hadoop2, but I could be wrong. I do know that
unless you're doing some reflection magic, you have to modify
[In|Out]putFormats as the APIs moved some classes to interfaces and
vice versa.

> If someone depends on the accumulo server, then they may have to exclude
> the transitive dependency if our hadoop is polluting theirs, but I think
> that issue can be mitigated by not requiring client apps to depend on the
> entire server.

I could see the server artifact having Hadoop scoped compile, but I
can't imagine that most users actually build against it. Or are we
just talking about changing it for -server?

-Joey


Re: "Provided" dependencies

Posted by Michael Berman <mb...@sqrrl.com>.
I think it would be nice to separate what client API users need from the
provided dependencies issue.  It seems like whatever module client
projects depend on should itself only have dependencies on things that it
actually needs.  If it doesn't need hadoop, then it shouldn't declare it as
a dependency at all.  The hadoop-dependent server and the
hadoop-independent client interface both need to share intermediate
objects, but it seems like those could be defined in another, common
hadoop-independent module.

In/Outputformats are an exception, but I agree they would be best separated
into their own hadoop-dependent module (which might itself depend on the
client module).

As far as the provided question goes, it seems to me that the only reason
to mark a dep provided is if we think developers will *usually* want to
compile against different versions.  Initially I thought it would make
sense if we thought the runtime versions would vary, but Chris makes a good
point that the deps we include in the distributed package can be selected
independently of the maven dep scope.  Since you can build accumulo against
any version of hadoop and it will still run against any other version of
hadoop, I think it's better to make things easier on us by having it
compile scoped.

If someone depends on the accumulo server, then they may have to exclude
the transitive dependency if our hadoop is polluting theirs, but I think
that issue can be mitigated by not requiring client apps to depend on the
entire server.



Re: "Provided" dependencies

Posted by Joey Echeverria <jo...@clouderagovt.com>.
Do Accumulo users need Hadoop or its dependencies in order to use the
client APIs?

The only client API that I could see needing it would be the
[In|Out]putFormats, but it'd be cool if that was a separate module and
that module had the appropriate Hadoop dependencies with the compile
scope.

-Joey


Re: "Provided" dependencies

Posted by Josh Elser <jo...@gmail.com>.
+1. 1.5.0 was a big pain in the butt with the addition of the provided
scope when moving from 1.4. Or was it 1.3 to 1.4? Whichever it was, it
was annoying :)

On 11/7/13, 5:05 PM, Keith Turner wrote:
> Dropping provided sounds good.  Seems like it would make users' poms
> simpler.

Re: "Provided" dependencies

Posted by Christopher <ct...@apache.org>.
It eases the common case of including transitive dependencies in user
code that writes against the Accumulo API, allowing them to control
versioning of dependencies with the standard dependencyManagement
section (they won't have to explicitly specify dependencies which are
transitive, which can be non-intuitive).
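
As a rough sketch of that common case (not taken from any actual user
project), a client pom could pin the Hadoop version that arrives
transitively through Accumulo. This assumes hadoop-client is no longer
marked provided in Accumulo's pom; the version numbers are placeholders
chosen for illustration.

```xml
<!-- Hypothetical user pom fragment; versions are illustrative only. -->
<dependencyManagement>
  <dependencies>
    <!-- Forces this version wherever hadoop-client appears transitively. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.2.0</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <!-- Only Accumulo is declared; Hadoop comes in as a transitive dependency. -->
  <dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-core</artifactId>
    <version>1.5.0</version>
  </dependency>
</dependencies>
```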

It makes things slightly more difficult when a user wants a different
transitive dependency from the standard one (e.g. using CDH
instead of Apache Hadoop). In this case, users will continue to
specify their CDH dependencies as they would anyway in their project,
but they'll also have to exclude the Apache Hadoop transitive
dependency from the Accumulo dependency.
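
A minimal sketch of that exclusion, again assuming hadoop-client is a
regular (non-provided) dependency of accumulo-core. The property name
vendor.hadoop.version is hypothetical; the user would set it to their
vendor's Hadoop version and add the vendor's repository as needed.

```xml
<!-- Hypothetical user pom fragment for a vendor Hadoop build. -->
<dependencies>
  <dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-core</artifactId>
    <version>1.5.0</version>
    <exclusions>
      <!-- Drop the Apache Hadoop artifact that Accumulo would pull in. -->
      <exclusion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <!-- Declare the vendor's Hadoop explicitly, as the user would anyway. -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${vendor.hadoop.version}</version>
  </dependency>
</dependencies>
```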

There's a trade-off here, based on what makes it easier for users
writing client code against Accumulo. As far as I can tell, there's no
other pro/con for the declaration, as our packaging scheme is
independent of the scope.

These choices become moot when we can get separate accumulo-client-api
and accumulo-client-runtime dependencies via ACCUMULO-1483.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Nov 7, 2013 at 5:09 PM, Sean Busbey <bu...@clouderagovt.com> wrote:
> Can we please specify what use case we're hoping to ease by changing our
> provided status for e.g. hadoop-client?
>
>
>
>
> On Thu, Nov 7, 2013 at 4:05 PM, Keith Turner <ke...@deenlo.com> wrote:
>
>> Dropping provided sounds good. Seems like it would make users' poms
>> simpler.
>
>
>
> --
> Sean

Re: "Provided" dependencies

Posted by Sean Busbey <bu...@clouderagovt.com>.
Can we please specify what use case we're hoping to ease by changing our
provided status for e.g. hadoop-client?




On Thu, Nov 7, 2013 at 4:05 PM, Keith Turner <ke...@deenlo.com> wrote:

> Dropping provided sounds good. Seems like it would make users' poms
> simpler.



-- 
Sean

Re: "Provided" dependencies

Posted by Keith Turner <ke...@deenlo.com>.
Dropping provided sounds good. Seems like it would make users' poms
simpler.


On Wed, Nov 6, 2013 at 5:05 PM, Christopher <ct...@apache.org> wrote:

> What's the latest opinion whether things should be marked "provided" in
> the pom?
> I've changed my mind on this a few times, myself, so I'm curious what
> others think.
>
> The provided scope means that it will not propagate as a transitive
> dependency. Other than that, it doesn't do much... though we can
> control packaging based on provided or not.
>
> I'm not sure this gets us much, and it's inconvenient for users. We
> can control packaging in other ways (like being more explicit and
> carefully considering which dependencies we include in an RPM or
> tarball, for instance).
>
> If we drop its declaration, what this means, is that if users want to
> build with Accumulo as a dependency, but against a different version
> of Hadoop than what we declare in our POM, they'll have to explicitly
> <exclude> the hadoop dependencies, and redeclare them, or they will
> have to use their <dependencyManagement> section to force a particular
> dependency of hadoop.
>
> The advantage to users, though, if we drop this, is that they won't
> have to constantly re-declare transitive dependencies to get their
> projects to build/test/run.
>
> See http://s.apache.org/maven-dependency-scopes
>
> Thoughts?
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>