
Posted to ivy-user@ant.apache.org by Robert Buck <rb...@m-Qube.com> on 2007/01/26 14:44:43 UTC

machine local repositories and disconnected development

Back in December I asked a series of questions under "newbie: cant get
local and shared repository to work correctly", referring to the
"Configuring default resolver" section of the IVY documentation. The
configuration cited there, shown below, has no practical value because it
does not appear to do what the IVY documentation implies:

<ivyconf>
  <conf defaultResolver="default"/>
  <include url="${ivy.default.conf.dir}/ivyconf-public.xml"/>
  <include url="${ivy.default.conf.dir}/ivyconf-shared.xml"/>
  <include url="${ivy.default.conf.dir}/ivyconf-local.xml"/>
  <include url="${ivy.default.conf.dir}/ivyconf-main-chain.xml"/>
  <include url="${ivy.default.conf.dir}/ivyconf-default-chain.xml"/>
</ivyconf>

Getting back to that only now...

It would seem to me IVY has three practical value areas: (a)
intra-project dependencies, (b) external dependencies, and (c) caching.
Having used IVY for two months now, there are a number of issues that
arise due to how IVY is implemented.

Regarding intra-project dependencies and caching, one ends up with at
least three duplicate copies of the built artifacts: one in the build
output directory, one copy in the local repository as a result of the
publish action, and one copy in the cache resulting from a resolve by a
depending project (not to mention the artifact ultimately published to a
public repository). Having at least three copies of each artifact is
completely unnecessary; for large projects it will result in tens of
gigabytes of wasted disk space and will slow down builds
significantly.

Question: How does one merge a cache and the local repository so that
resolve and publish act on the same file system location?

Regarding external dependencies and caching, what is implied in the
cited documentation is that shared is, well, shared between several
workspaces. What is not documented is how to achieve a shared
repository. Specifically, in the email archives people have already
asked about how to define a machine-local shared repository that is
"shared" by all workspaces on the local machine, that is NOT published
to by default. What has not been answered for those people is the HOW.
Responders have stated that you replicate the public repository to the
shared repository by some undefined process. What I want to know is the
HOW. The value of a shared repository is obvious: once the artifacts are
replicated from the public repository, a developer may work in any
workspace completely disconnected. However, the shared repository MUST
NOT be polluted with any workspace specific artifacts.

Question: What out of the box IVY tooling allows a copy of an artifact
to be duplicated into the "shared" repository?


Rants:

An initial thought (given the current implementation of IVY) was that it
would have been nice to create a graph such as:

Public <- Cache(Shared) <- Local

Where Public is an HTTP based repository, Cache is where the resolved
artifacts from the public repository go, and Local is where the workspace
artifacts are published to and resolved from.

Because the IVY implementation of a cache violates the single
responsibility principle, its use in this sort of model is compromised
on account of the different classes of data stored in it. Rather than
introducing a private meta-data cache (per workspace) that only has the
dependency information necessary for the resolve and publish mechanisms,
IVY presently throws all these artifacts in with the published and
resolved artifacts. The meta data clearly has no practical value outside
of an individual workspace, and this seems to be a pretty large design
flaw.

-Bob

Re: machine local repositories and disconnected development

Posted by Xavier Hanin <xa...@gmail.com>.

On 1/31/07, Robert Buck <rb...@m-qube.com> wrote:
>
>
> > -----Original Message-----
> > From: Xavier Hanin [mailto:xavier.hanin@gmail.com]
> > Sent: Monday, January 29, 2007 9:40 AM
> > To: ivy-user@incubator.apache.org
> > Subject: Re: machine local repositories and disconnected development
> >
> > On 1/26/07, Robert Buck <rb...@m-qube.com> wrote:
> [...]
> >
> > >
> > > Because the IVY implementation of a cache violates the single
> > > responsibility principle, its use in this sort of model is
> > compromised
> > > on account of the different classes of data stored in it.
> > Rather than
> > > introducing a private meta-data cache (per workspace) that only has
> > > the dependency information necessary for the resolve and publish
> > > mechanisms, IVY presently throws all these artifacts in with the
> > > published and resolved artifacts. The meta data clearly has no
> > > practical value outside of an individual workspace, and
> > this seems to
> > > be a pretty large design flaw.
>
> > I'm not sure I understand what you mean here, but the Ivy
> > cache is not really meant to be shared among several users.
> > But I agree that there are several kinds of information in the Ivy
> > cache which should be isolated to make it cleaner and more flexible.
> >
> > Xavier
>
> All I am saying is that because the cache fulfills two roles, that
> of caching meta data private to a workspace (about the local artifacts)
> and that of storing the local artifacts too, it cannot be used in a "shared"
> mode, because this would result in corruption of other workspaces.
> It is for this reason that the inverted resolve chain could
> never work in practice.
OK, I agree, the two concepts are different and should clearly be
distinguished. I'm currently working on some refactorings, and
hopefully this should ease this distinction in the near future. A first
step in the right direction :-)

>
> I like the idea of the decorators that was introduced later in this
> thread. Then it would seem possible to let the local cache be a purely
> workspace concept, containing workspace private information, and the
> resolved artifacts could remain in an HTTP caching resolver, satisfying
> the targeted goal I was wishing for.
>
> I really like the idea of the decorators. It would really add a lot of
> flexibility to IVY, while I might also suggest it might tighten up some
> key concepts and details related to IVY, if not also its internal
> implementation.
I really like it too, but it won't be straightforward to implement due
to current use of the cache. But it's far from impossible, and would
really improve flexibility and design. Vote for the issue to show your
interest:
https://issues.apache.org/jira/browse/IVY-399

- Xavier

>
> Thanks folks,
>
> -Bob
>

RE: machine local repositories and disconnected development

Posted by Robert Buck <rb...@m-Qube.com>.

> -----Original Message-----
> From: Xavier Hanin [mailto:xavier.hanin@gmail.com] 
> Sent: Monday, January 29, 2007 9:40 AM
> To: ivy-user@incubator.apache.org
> Subject: Re: machine local repositories and disconnected development
> 
> On 1/26/07, Robert Buck <rb...@m-qube.com> wrote:
[...]
> 
> >
> > Because the IVY implementation of a cache violates the single 
> > responsibility principle, its use in this sort of model is 
> compromised 
> > on account of the different classes of data stored in it. 
> Rather than 
> > introducing a private meta-data cache (per workspace) that only has 
> > the dependency information necessary for the resolve and publish 
> > mechanisms, IVY presently throws all these artifacts in with the 
> > published and resolved artifacts. The meta data clearly has no 
> > practical value outside of an individual workspace, and 
> this seems to 
> > be a pretty large design flaw.

> I'm not sure I understand what you mean here, but the Ivy
> cache is not really meant to be shared among several users.
> But I agree that there are several kinds of information in the Ivy
> cache which should be isolated to make it cleaner and more flexible.
> 
> Xavier

All I am saying is that because the cache fulfills two roles, that
of caching meta data private to a workspace (about the local artifacts)
and that of storing the local artifacts too, it cannot be used in a "shared"
mode, because this would result in corruption of other workspaces.
It is for this reason that the inverted resolve chain could
never work in practice.

I like the idea of the decorators that was introduced later in this
thread. Then it would seem possible to let the local cache be a purely
workspace concept, containing workspace private information, and the
resolved artifacts could remain in an HTTP caching resolver, satisfying
the targeted goal I was wishing for.

I really like the idea of the decorators. It would really add a lot of
flexibility to IVY, while I might also suggest it might tighten up some
key concepts and details related to IVY, if not also its internal
implementation.

Thanks folks,

-Bob

Re: machine local repositories and disconnected development

Posted by Eric Crahen <er...@gmail.com>.

https://issues.apache.org/jira/browse/IVY-399

I think both caching issues could be handled through the same decorator
approach

-- 

- Eric

Re: machine local repositories and disconnected development

Posted by Xavier Hanin <xa...@gmail.com>.

On 1/29/07, Eric Crahen <er...@gmail.com> wrote:
> On 1/29/07, Xavier Hanin <xa...@gmail.com> wrote:
> >
> > Supporting this kind of graph
> > could be interesting, and what makes it difficult for Ivy is that Ivy
> > heavily relies on its cache mechanism, which makes it impossible to do
> > what you want (i.e. never put anything from your local repository to
> > the cache).
> >
> This would be a very powerful feature to add. In 2.0, is there any reason
> for the cache to have to be so baked into everything? In other words, why not
> implement every resolver and all of the internal management with no caching
> whatsoever baked in anywhere? Instead, all caching would be done in a decorator
> fashion by wrapping a caching resolver around any other resolver. In
> other words, the core of Ivy only knows about resolvers; no concept of cache
> exists in the heart of Ivy.
>
> It seems to me this would be much more flexible, and it would still be very
> possible to provide the syntactic sugar to make it very simple and even
> seamless to configure these wrappers by default. At the same time, people
> who want the flexibility have the power to set up chains that might go
> something like:
>
> (logical chain)
>   localresolver
>   cacheresolver
>     httpresolver url="..."
>   cacheresolver
>     httpresolver url="..."
>
> There is no longer any need to have things like useLocal flags. It's already
> expressed that the local resolver is not cached because it's just not wrapped
> in a caching resolver.
>
> I think this idiom should be applied to both artifact and metadata
> resolution.
>
> One cool thing about this is that, since all caching is simply a type of
> resolver we'd provide, people who don't like the particular method we use to
> perform caching in the resolver we provide are free to provide their own.
> This would address lots of the issues that have been raised about
> caching, consistency, doing anything remotely fancy with local resolvers -
> right now it's very hard to address any of that because caching is not very
> pluggable as it stands.
>
> I think the only drawback is that it seems like it's harder to configure out
> of the box because most people by default would want to wrap every resolver
> with a cacheresolver - but like I said, this is easily solvable by providing
> some simple syntactic sugar. For instance, simplehttpresolver might be
> the name of an undecorated resolver for power users, and the thing named
> httpresolver would simply be an alias for the cacheresolver wrapped around
> the simplehttpresolver (or subclass, whatever is the most sensible choice).

Very interesting idea, indeed. With enough syntactic sugar, it could
even be backward compatible. One thing to keep in
mind is to isolate the two parts of the cache: what is cached from
dependency resolvers (basically module descriptors and artifacts) and
what is cached for reusing the latest resolve or for a deliver (basically
what lies at the root of the cache in the current implementation). This is
worth an issue in JIRA, could you create it?

- Xavier

>
> --
>
> - Eric
>
>

Re: machine local repositories and disconnected development

Posted by Eric Crahen <er...@gmail.com>.

On 1/29/07, Xavier Hanin <xa...@gmail.com> wrote:
>
> Supporting this kind of graph
> could be interesting, and what makes it difficult for Ivy is that Ivy
> heavily relies on its cache mechanism, which makes it impossible to do
> what you want (i.e. never put anything from your local repository to
> the cache).
>
This would be a very powerful feature to add. In 2.0, is there any reason
for the cache to have to be so baked into everything? In other words, why not
implement every resolver and all of the internal management with no caching
whatsoever baked in anywhere? Instead, all caching would be done in a decorator
fashion by wrapping a caching resolver around any other resolver. In
other words, the core of Ivy only knows about resolvers; no concept of cache
exists in the heart of Ivy.

It seems to me this would be much more flexible, and it would still be very
possible to provide the syntactic sugar to make it very simple and even
seamless to configure these wrappers by default. At the same time, people
who want the flexibility have the power to set up chains that might go
something like:

(logical chain)
  localresolver
  cacheresolver
    httpresolver url="..."
  cacheresolver
    httpresolver url="..."

There is no longer any need to have things like useLocal flags. It's already
expressed that the local resolver is not cached because it's just not wrapped
in a caching resolver.

I think this idiom should be applied to both artifact and metadata
resolution.

One cool thing about this is that, since all caching is simply a type of
resolver we'd provide, people who don't like the particular method we use to
perform caching in the resolver we provide are free to provide their own.
This would address lots of the issues that have been raised about
caching, consistency, doing anything remotely fancy with local resolvers -
right now it's very hard to address any of that because caching is not very
pluggable as it stands.

I think the only drawback is that it seems like it's harder to configure out
of the box because most people by default would want to wrap every resolver
with a cacheresolver - but like I said, this is easily solvable by providing
some simple syntactic sugar. For instance, simplehttpresolver might be
the name of an undecorated resolver for power users, and the thing named
httpresolver would simply be an alias for the cacheresolver wrapped around
the simplehttpresolver (or subclass, whatever is the most sensible choice).

-- 

- Eric

Re: machine local repositories and disconnected development

Posted by Xavier Hanin <xa...@gmail.com>.

On 1/26/07, Robert Buck <rb...@m-qube.com> wrote:
> Back in December I asked a series of questions under "newbie: cant get
> local and shared repository to work correctly", referring to the IVY
> documentation for "Configuring default resolver", where I refer to the
> following cited configuration, which has no practical value as it does
> not appear to do what the IVY documentation is implying:
>
> <ivyconf>
>   <conf defaultResolver="default"/>
>   <include url="${ivy.default.conf.dir}/ivyconf-public.xml"/>
>   <include url="${ivy.default.conf.dir}/ivyconf-shared.xml"/>
>   <include url="${ivy.default.conf.dir}/ivyconf-local.xml"/>
>   <include url="${ivy.default.conf.dir}/ivyconf-main-chain.xml"/>
>   <include url="${ivy.default.conf.dir}/ivyconf-default-chain.xml"/>
> </ivyconf>
>
> Getting back to that only now...
>
> It would seem to me IVY has three practical value areas: (a)
> intra-project dependencies, (b) external dependencies, and (c) caching.
> Having used IVY for two months now, there are a number of issues that
> arise due to how IVY is implemented.
>
> Regarding intra-project dependencies and caching, one ends up with at
> least three duplicate copies of the built artifacts: one in the build
> output directory, one copy in the local repository as a result of the
> publish action, and one copy in the cache resulting from a resolve by a
> depending project (not to mention the artifact ultimately published to a
> public repository). Having at least three copies for each artifact is
> completely unnecessary, for large projects will result in 10's of
> Gigabytes of unnecessary disk utilization, will slow down builds
> significantly.
>
> Question: How does one merge a cache and the local repository so that
> resolve and publish act out the same file system location?
You can use useOrigin="true" so that your artifacts are not copied to the cache.
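
A minimal sketch of where that flag might go, assuming Ivy 1.4 or later
driven from Ant, where useOrigin is an attribute of the resolve task. The
project name, target name, and antlib namespace URI are assumptions (the URI
shown is the Ivy 1.x one and differs in later releases), so check them
against the Ivy version in use.

```xml
<project name="useorigin-sketch" default="resolve"
         xmlns:ivy="antlib:fr.jayasoft.ivy.ant">
  <target name="resolve">
    <!-- Assumes the module descriptor is ./ivy.xml and the Ivy jar is on
         Ant's classpath. -->
    <!-- useOrigin="true" asks Ivy to reference artifacts that already live
         on the local file system at their original location instead of
         copying them into the cache. -->
    <ivy:resolve file="ivy.xml" useOrigin="true"/>
  </target>
</project>
```

This should only affect artifacts served by file-system-based resolvers.
Artifacts fetched over HTTP are still stored in the cache, and the copy made
in the local repository by publish is not affected.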
>
> Regarding external dependencies and caching, what is implied in the
> cited documentation is that shared is, well, shared between several
> workspaces. What is not documented is how to achieve a shared
> repository. Specifically, in the email archives people have already
> asked about how to define a machine-local shared repository that is
> "shared" by all workspaces on the local machine, that is NOT published
> to by default. What has not been answered to other people is the HOW.
> Responders have stated that you replicate by some undefined process the
> public repository to the shared repository. What I want to know is the
> HOW. The value of a shared repository is obvious: once the artifacts are
> replicated from the public repository, a developer may work in any
> workspace completely disconnected. However, the shared repository MUST
> NOT be polluted with any workspace specific artifacts.
>
> Question: What out of the box IVY tooling allows a copy of an artifact
> to be duplicated into the "shared" repository?
See Gilles' answer and mine in the other e-mail.
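
The other e-mail is not part of this thread. As one possible answer, and
only as an assumption about what was meant, Ivy ships an install Ant task
that copies a module from one resolver to another. The sketch below assumes
resolvers named public and shared (as in the default ivyconf includes quoted
above), a settings file called ivyconf.xml, and placeholder module
coordinates.

```xml
<project name="install-sketch" default="install-to-shared"
         xmlns:ivy="antlib:fr.jayasoft.ivy.ant">
  <target name="install-to-shared">
    <!-- Load the settings that define the "public" and "shared" resolvers
         (the file name is an assumption). -->
    <ivy:configure file="ivyconf.xml"/>
    <!-- Copy one module from the public resolver into the shared one.
         The organisation, module and revision are placeholders. -->
    <ivy:install organisation="apache" module="commons-lang" revision="2.0"
                 from="public" to="shared"/>
  </target>
</project>
```

Because install is an explicit step, nothing from a workspace reaches the
shared repository unless someone runs it, which fits the requirement that
shared is not published to by default.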

>
>
> Rants:
>
> An initial thought (given the current implementation of IVY) was that it
> would have been nice to create a graph such as:
>
> Public <- Cache(Shared) <- Local
>
> Where Public is an HTTP based repository, Cache is where the resolved
> artifacts from the public repository go, and Local where the workspace
> artifacts are published to and resolved from.
This is one interesting point of view, but Ivy has to be flexible, and
will never hard code this kind of graph. Supporting this kind of graph
could be interesting, and what makes it difficult for Ivy is that Ivy
heavily relies on its cache mechanism, which makes it impossible to do
what you want (i.e. never put anything from your local repository into
the cache).

But the idea is interesting, and it could be worth thinking of a way for
Ivy to deal with a cache in a more flexible way. Maybe this should go
in the section "requirements for a 2.0" on the wiki, since the change
seems to be too large for the 1.x stream.
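
For comparison, what the 1.x settings can express is lookup order, using a
chain resolver. The sketch below is only an illustration: the directory
locations and patterns are made up, and it assumes Java system properties
such as user.home are usable as Ivy variables. It tries local first, then a
machine-wide shared directory, then the public repository. It does not
achieve the graph above, since everything resolved still goes through the
cache.

```xml
<ivyconf>
  <conf defaultResolver="default"/>
  <resolvers>
    <!-- Workspace artifacts, published to and resolved from here
         (hypothetical location next to the settings file). -->
    <filesystem name="local">
      <ivy pattern="${ivy.conf.dir}/repository/[organisation]/[module]/[revision]/ivy.xml"/>
      <artifact pattern="${ivy.conf.dir}/repository/[organisation]/[module]/[revision]/[artifact].[ext]"/>
    </filesystem>
    <!-- Machine-wide repository shared by all workspaces
         (hypothetical location under the user's home directory). -->
    <filesystem name="shared">
      <ivy pattern="${user.home}/ivy-shared/[organisation]/[module]/[revision]/ivy.xml"/>
      <artifact pattern="${user.home}/ivy-shared/[organisation]/[module]/[revision]/[artifact].[ext]"/>
    </filesystem>
    <!-- Public HTTP repository. -->
    <ibiblio name="public"/>
    <!-- Stop at the first resolver that has the module. -->
    <chain name="default" returnFirst="true">
      <resolver ref="local"/>
      <resolver ref="shared"/>
      <resolver ref="public"/>
    </chain>
  </resolvers>
</ivyconf>
```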

>
> Because the IVY implementation of a cache violates the single
> responsibility principle, its use in this sort of model is compromised
> on account of the different classes of data stored in it. Rather than
> introducing a private meta-data cache (per workspace) that only has the
> dependency information necessary for the resolve and publish mechanisms,
> IVY presently throws all these artifacts in with the published and
> resolved artifacts. The meta data clearly has no practical value outside
> of an individual workspace, and this seems to be a pretty large design
> flaw.
I'm not sure I understand what you mean here, but the Ivy cache is
not really meant to be shared among several users. But I agree that
there are several kinds of information in the Ivy cache which should be
isolated to make it cleaner and more flexible.

Xavier
>
> -Bob
>