Posted to ivy-dev@incubator.apache.org by Stefano Mazzocchi <st...@apache.org> on 2006/11/09 19:12:59 UTC

[RT] on package management

NOTE: [RT] stands for 'random thoughts' and it's a tradition of the
Cocoon community as a way to foster innovation and promote
brainstorming. Anything said in an RT can be out of line, blue sky or
even wacko. That's a feature, not a bug.

                        ----------------------

I'm glad to see Ivy join the ASF and I'm glad to see both Ant and Maven
people joining here.

As Steve mentioned already, I'm very much interested in a "gump that
doesn't suck".

In order to do that, I need a project dependency graph... with two types
of nodes: projects and versions

A project is the "concept of a project", a version is an immutable
"instance of a project" in time.

The dependency information links versions, not projects (this is where
Gump got it wrong!). This is something that pretty much all package
managers understand. For Gump, the 'project dependency' information can
be inferred from the 'version dependency' information... but having that
information allows Gump to act both as a nightly build system and as a
continuous integration system.
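
To make the two node types concrete, here is a minimal sketch in Python.
The package names, versions and edges are made-up illustration data, and
project_dependencies is a hypothetical helper, not anything Gump or Ivy
ships. It only shows that project-level edges can be derived from
version-level edges, never the other way around.

```python
from collections import defaultdict

# Hypothetical data: each key is an immutable (project, version) node and
# each value lists the (project, version) nodes it depends on.
VERSION_DEPS = {
    ("cocoon", "2.1.8"): [("xerces-j", "2.6"), ("ant", "1.6.5")],
    ("cocoon", "2.1.7"): [("xerces-j", "2.4")],
    ("xerces-j", "2.6"): [],
    ("xerces-j", "2.4"): [],
    ("ant", "1.6.5"): [],
}


def project_dependencies(version_deps):
    """Collapse version-to-version edges into project-to-project edges."""
    project_deps = defaultdict(set)
    for (project, _version), deps in version_deps.items():
        project_deps[project]  # touch the key so leaf projects appear too
        for dep_project, _dep_version in deps:
            project_deps[dep_project]
            project_deps[project].add(dep_project)
    return {name: sorted(deps) for name, deps in project_deps.items()}


if __name__ == "__main__":
    for name, deps in sorted(project_dependencies(VERSION_DEPS).items()):
        print(name, "->", deps)
```

A nightly build would walk the version edges as they are; a continuous
integration run would use the derived project edges and substitute the
latest version of each project.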

                                - o -

Another thing Gump got wrong (well, not really, we just didn't think it
would be a problem) is that if metadata is not part of the workflow, it
will lag behind.

Maven solved this problem brilliantly and Ivy follows in the same
footsteps. Other systems like apt-get and ports have the same concept.

Maven does dependency management and it does project automation. I
personally like Ant much better for project automation because Ant's
procedural style fits me better (take a good build.xml, copy it over and tweak
it for my needs). Maven works great for standard tasks, but I rarely
have those :-) and if I do, I have an ant template I can use (if ant
implemented target inheritance, you could have predefined 'standard'
targets in ant as well, but that's another story).

But Maven dependency management (and the m2 eclipse plugin that sets up
your build path + source for you!) is sooo compelling that it forced me
to switch over.

In order to keep the project metadata in the workflow, you need tools
that make primary use of that, tools that give value.

                                 - o -

My day job is to deal with incredibly large quantities of metadata.

I wrote an essay on my blog about the problem of the "quality of
metadata" and possible ways to solve it (it's the second link that shows
up if you query "quality of metadata", btw)

http://www.betaversion.org/~stefano/linotype/news/95/

Applying the lessons learned in metadata management and interoperability
here, there are a few things to note:

 1) there are ways to associate quality with metrics, but those metrics
are very hard to define objectively (and they are normally full of
exceptions)

 2) metadata gets 'polished' over time, mistakes are found and
corrected, change and feedback are *fundamental* to the stability of a
system and the convergence of metadata to higher quality standards

 3) decentralization doesn't decrease your quality, it just increases
your entropy

So, following the discussions here:

 1) the people who care about metadata are not necessarily the same
people who produce the software. Tools like Maven and Ivy force them to
care by making it impossible to build the project otherwise. This is a
very ingenious approach, but this is not the only way that metadata can
be introduced or changed. A system that is designed around the concept
of allowing projects and metadata to be edited independently has an
advantage in terms of social scalability over one that doesn't.

 2) if we allow projects and their metadata to be edited independently,
they need to have independent versions.

 3) it is conceivable to enable 'trust metrics' that are more granular
than the repository level. So, for example, you could ask your package
manager to trust metadata that was signed with a key that you trust or
that was part of a chain of trusted keys.
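
A rough sketch of point 3 in Python, under loud assumptions: there is no
real cryptography here, keys are plain strings, and the "vouches" map
(which key has signed which other key) plus the max_depth cutoff are
inventions for illustration. It only shows the shape of a "chain of
trusted keys" check as a bounded graph search.

```python
from collections import deque


def is_trusted(signer, trusted_roots, vouches, max_depth=3):
    """Return True if `signer` is reachable from a key we trust directly.

    `vouches` maps a key to the keys it has signed (hypothetical data).
    The search is breadth-first, so `max_depth` bounds the chain length.
    """
    seen = set(trusted_roots)
    queue = deque((key, 0) for key in trusted_roots)
    while queue:
        key, depth = queue.popleft()
        if key == signer:
            return True
        if depth == max_depth:
            continue  # do not follow chains longer than max_depth
        for other in vouches.get(key, ()):
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return False


if __name__ == "__main__":
    vouches = {"alice": ["bob"], "bob": ["carol"], "carol": ["dave"]}
    print(is_trusted("carol", {"alice"}, vouches))
    print(is_trusted("mallory", {"alice"}, vouches))
```

A package manager could run such a check per metadata record instead of
trusting or distrusting a whole repository.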

                                  - o -

I've also been involved in the design of the Avalon and then Cocoon
blocks system, which is a precursor of OSGi.

There, we identified the need to distinguish between "instance" blocks
and "interface" blocks. (this concept is similar to 'virtual packages'
in apt-get)

So, basically, it is possible for a version of a package to depend on a
version of a package interface.

     cocoon 2.1.8 -(needs)-> jaxp 1.3

Then it is possible for a version of a package to implement one or more
package interfaces.

     xerces-j 2.6 -(implements)-> jaxp 1.3
     xerces-j 2.4 -(implements)-> jaxp 1.3
     JVM 1.5 [stub] -(implements)-> jaxp 1.3

This creates a polymorphic decoupling: package "cocoon 2.1.8" can query
the repository for all packages that implement "jaxp 1.3" and select
which one to use at runtime.
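
The lookup described above can be sketched in a few lines of Python. The
NEEDS and IMPLEMENTS tables simply restate the edges drawn in this mail;
the function names and the in-memory "repository" are assumptions made
for illustration, not an existing API.

```python
# Hypothetical repository metadata restating the edges from the text.
IMPLEMENTS = {
    ("xerces-j", "2.6"): [("jaxp", "1.3")],
    ("xerces-j", "2.4"): [("jaxp", "1.3")],
    ("JVM", "1.5"): [("jaxp", "1.3")],
}
NEEDS = {
    ("cocoon", "2.1.8"): [("jaxp", "1.3")],
}


def providers(interface, implements=IMPLEMENTS):
    """List every package version that implements the given interface."""
    return sorted(pkg for pkg, ifaces in implements.items() if interface in ifaces)


def candidates(package, needs=NEEDS, implements=IMPLEMENTS):
    """Map each interface a package needs to the versions that provide it."""
    return {iface: providers(iface, implements) for iface in needs.get(package, [])}


if __name__ == "__main__":
    for iface, pkgs in candidates(("cocoon", "2.1.8")).items():
        print(iface, "can be satisfied by", pkgs)
```

Which candidate wins is deliberately left out: that is the policy
decision the client gets to make.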

Another useful feature is to be able to express the need for version ranges:

     cocoon 2.1.8 -(needs)-> jaxp 1.x
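
As a sketch of what matching "1.x" could mean, here is a small Python
function. The semantics are an assumption: only a trailing "x" is
treated as a wildcard, it must stand for at least one component, and
anything else is compared literally. Real package managers define much
richer range syntaxes.

```python
def matches(version, pattern):
    """Return True if a dotted version satisfies a pattern such as '1.x'."""
    v_parts = version.split(".")
    p_parts = pattern.split(".")
    if p_parts[-1] == "x":
        # Trailing wildcard: the fixed prefix must match and at least one
        # more component must follow it.
        prefix = p_parts[:-1]
        return len(v_parts) > len(prefix) and v_parts[: len(prefix)] == prefix
    # No wildcard: require an exact match.
    return v_parts == p_parts


def select(versions, pattern):
    """Keep the versions that satisfy the pattern, in their original order."""
    return [v for v in versions if matches(v, pattern)]


if __name__ == "__main__":
    print(select(["1.2", "1.3", "2.0"], "1.x"))
```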

                                  - o -

Another important feature of a package manager is the lack of a central
point of failure. Here Maven didn't really predict its own success and
went the wrong way.

BitTorrent shows how well transparent HTTP-based mirroring systems can
scale. The problem with BitTorrent is that it's very slow to start and
it works best with very big files (because the time taken to download is
on average the time the other peers participate in the swarm).

It is possible to design a system that borrows BitTorrent's idea of
getting 'chunks' of the same file from different repositories, here via
HTTP ranges (which also makes it a lot harder to 'falsify' a binary
package, since you get its pieces from many different repositories), but
that doesn't need to follow a percolative random-graph model of peer
discovery.

Think of it as a bittorrent meets DNS sort of thing: you can query a
repository with a URI that identifies a package and it will return you
the metadata for that package, along with a list of URLs that contain
that package. Then your client can decide the best strategy to download
the package.
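
One possible client strategy, sketched in Python: split the file into
byte ranges and spread them round-robin over the URLs the repository
returned. The mirror URLs, the file size and the plan_chunks helper are
all hypothetical; the only real-world piece is the shape of the HTTP
Range header, whose byte positions are inclusive.

```python
def plan_chunks(size, mirrors, chunk_size):
    """Assign byte ranges of a file to mirrors in round-robin order.

    Returns a list of (url, range_header_value) pairs that a downloader
    could turn into HTTP requests with a 'Range' header.
    """
    if size < 0 or chunk_size <= 0 or not mirrors:
        raise ValueError("need a non-negative size, a positive chunk size and a mirror")
    plan = []
    for index, start in enumerate(range(0, size, chunk_size)):
        end = min(start + chunk_size, size) - 1  # HTTP byte ranges are inclusive
        plan.append((mirrors[index % len(mirrors)], f"bytes={start}-{end}"))
    return plan


if __name__ == "__main__":
    # Hypothetical answer from the metadata repository for one package URI.
    mirrors = ["http://a.example.org/pkg.jar", "http://b.example.org/pkg.jar"]
    for url, byte_range in plan_chunks(2500, mirrors, 1000):
        print(url, byte_range)
```

A smarter client could weight mirrors by measured latency or retry a
failed range against a different URL; the metadata lookup stays the same.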

In this view, the concept of "uploading" a file to a repository is
bogus: the metadata should tell me where to find it (for example, where
in the tar.gz binary distro, the jar I want is found).
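
For instance, if the metadata carried a member path, a client could pull
the jar straight out of the original distribution. A minimal sketch
using Python's standard tarfile module; the archive layout and the
read_member name are made up for illustration.

```python
import io
import tarfile


def read_member(archive_bytes, member_path):
    """Return the bytes of one file stored inside a .tar.gz archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        # extractfile raises KeyError for unknown names and returns None
        # for members that are not regular files (e.g. directories).
        extracted = archive.extractfile(member_path)
        if extracted is None:
            raise ValueError(f"{member_path} is not a regular file")
        return extracted.read()


if __name__ == "__main__":
    # Build a tiny in-memory distro so the sketch runs without downloads.
    payload = b"pretend this is a jar"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo("distro-1.0/lib/example.jar")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    print(read_member(buffer.getvalue(), "distro-1.0/lib/example.jar"))
```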

Here, I follow the 'ports' approach where there is no need for
repackaging, but just a repository of metadata and patches that can be
applied to the *original* distribution.

Thoughts?

-- 
Stefano.

Re: Gump

Posted by Stefano Mazzocchi <st...@apache.org>.

Stefan Bodewig wrote:
> On Thu, 09 Nov 2006, Stefano Mazzocchi <st...@apache.org> wrote:
> 
>> [this doesn't really belong here, but what the hell]
> 
> True, we should continue this on general@gump if you want to.

I will start off a thread there shortly.

> I just note that your perception of the current state of Gump is
> remarkably different from mine.

And I fully respect that and appreciate you telling me so.

But I want to say that I base my perception not on the percentage of
failure that has been steadily increasing since that time we reached
100% years ago (that is probably a maven2 problem and therefore a
technological one).

I'm talking about the fact that gump has been incredibly slow at
absorbing new projects and new people.

I'm talking about the fact that we never get past the "it's probably
gump's fault" stage when something goes wrong.

I'm talking about the fact that people send gump emails to /dev/null
because the signal/noise ratio is unbearable.

and I can go on if you want :-)

-- 
Stefano.

Re: Gump

Posted by Stefan Bodewig <bo...@apache.org>.

On Thu, 09 Nov 2006, Stefano Mazzocchi <st...@apache.org> wrote:

> [this doesn't really belong here, but what the hell]

True, we should continue this on general@gump if you want to.

I just note that your perception of the current state of Gump is
remarkably different from mine.

Stefan

Re: Gump (was Re: [RT] on package management)

Posted by Xavier Hanin <xa...@gmail.com>.

On 11/10/06, Stefano Mazzocchi <st...@apache.org> wrote:
>
> Stefan Bodewig wrote:
> > If a project doesn't want to be integrated against the latest version
> > of another project, that's a different issue - but probably only means
> > it shouldn't really be part of a Gump build anyway because Gump would
> > be the wrong system then.
>
> I rather strongly disagree.
>
> I want a gump that is useful for the project *and* for the project's
> upstream dependencies.
>
> The current gump creates social dynamics that are simply not working.

I'm not very familiar with Gump, but I can see both how interesting the
concept is and, IMO, how limited it is in real life. Trying to build the
latest of everything together (if I understand the Gump philosophy
correctly) is interesting, but you frequently need to tune the dependencies
to know whether your project is in good shape or not. With the current Gump,
the fact that your project does not build can be caused by something you do
not control, and this is a problem.

So I share Stefano's opinion that a tool able to see dependencies in finer
detail, with control over versions, would be really interesting, and cannot
yet be found in the open source space. As I said in another mail, I find
this so interesting that I would be glad to work on the subject (even if
free time is hard to find for now).

Xavier

Re: Gump (was Re: [RT] on package management)

Posted by Stefano Mazzocchi <st...@apache.org>.

Stefan Bodewig wrote:
> On Thu, 09 Nov 2006, Stefano Mazzocchi <st...@apache.org> wrote:
> 
>> NOTE: [RT] stands for 'random thoughts' and it's a tradition of the
>> Cocoon community as a way to foster innovation and promote
>> brainstorming. Anything said in an RT can be out of line, blue sky
>> or even wacko. That's a feature, not a bug.
> 
> and I changed the subject because I'm responding to a side-note and
> not the original RT 8-)
> 
>> The dependency information links versions, not projects (this is
>> where Gump got it wrong!). This is something that pretty much all
>> package managers understand. For Gump, the 'project dependency'
>> information can be inferred from the 'version dependency'
>> information... but having that information allows Gump to act both
>> as a nightly build system and as a continuous integration system.
> 
> Gump really only wants to be a continuous integration system, not a
> nightly build system.  And as such, it must ignore the version
> dependencies.

[this doesn't really belong here, but what the hell]

> If a project doesn't want to be integrated against the latest version
> of another project, that's a different issue - but probably only means
> it shouldn't really be part of a Gump build anyway because Gump would
> be the wrong system then.

I rather strongly disagree.

I want a gump that is useful for the project *and* for the project's
upstream dependencies.

The current gump creates social dynamics that are simply not working.

-- 
Stefano.

Gump (was Re: [RT] on package management)

Posted by Stefan Bodewig <bo...@apache.org>.

On Thu, 09 Nov 2006, Stefano Mazzocchi <st...@apache.org> wrote:

> NOTE: [RT] stands for 'random thoughts' and it's a tradition of the
> Cocoon community as a way to foster innovation and promote
> brainstorming. Anything said in an RT can be out of line, blue sky
> or even wacko. That's a feature, not a bug.

and I changed the subject because I'm responding to a side-note and
not the original RT 8-)

> The dependency information links versions, not projects (this is
> where Gump got it wrong!). This is something that pretty much all
> package managers understand. For Gump, the 'project dependency'
> information can be inferred from the 'version dependency'
> information... but having that information allows Gump to act both
> as a nightly build system and as a continuous integration system.

Gump really only wants to be a continuous integration system, not a
nightly build system.  And as such, it must ignore the version
dependencies.

If a project doesn't want to be integrated against the latest version
of another project, that's a different issue - but probably only means
it shouldn't really be part of a Gump build anyway because Gump would
be the wrong system then.

Stefan

Re: [RT] on package management

Posted by Stefano Mazzocchi <st...@apache.org>.

Steve Loughran wrote:
> Stefano Mazzocchi wrote:
> 
>>  (if ant
>> implemented target inheritance, you could have predefined 'standard'
>> targets in ant as well, but that's another story).
> 
> 
> It's been a long time since you exercised your commit rights

Yes, and hopefully it stays that way for a long time, while I enjoy you
guys doing a much better job at it :-)

>, but if you
> do <import> another build file, you can override the targets, and refer
> to the previous ones by ${project}.target, where ${project} is
> project/@name of the imported file.
> 
> What you cannot do is declare a dependency on the dependencies of
> another project, i.e you cannot go
> 
> <target name="package" depends="dependencies(core.package)" />
> 
> Instead you have to define state targets that represent a
> state/milestone without any side effects in the target itself
> 
> 
> <target name="package" depends="ready-to-package" >
>     ...
> </target>
> 
> just an FYI. 

Very cool. Has nobody thought of using this to provide a basic set of
'default targets' that people could use for a "maven2->ant" back
migration strategy?

> If/when you choose to re-exercise your ant commit rights,
> we will give you a full java-developer test including hard questions
> about classloaders that nobody gets right.

Bring it on.

[FYI, I spent the last 6 months of my life trying to design a
cocoon-block-light system for Longwell and a liveconnect->java
bootstrapping classloader that lets firefox extensions access java code,
for Piggy Bank ;-)]

-- 
Stefano.

Re: [RT] on package management

Posted by Steve Loughran <st...@apache.org>.

Stefano Mazzocchi wrote:

>  (if ant
> implemented target inheritance, you could have predefined 'standard'
> targets in ant as well, but that's another story).

It's been a long time since you exercised your commit rights, but if you 
do <import> another build file, you can override the targets, and refer 
to the previous ones by ${project}.target, where ${project} is 
project/@name of the imported file.

What you cannot do is declare a dependency on the dependencies of 
another project, i.e you cannot go

<target name="package" depends="dependencies(core.package)" />

Instead you have to define state targets that represent a 
state/milestone without any side effects in the target itself

<target name="package" depends="ready-to-package" >
	...
</target>

just an FYI. If/when you choose to re-exercise your ant commit rights, 
we will give you a full java-developer test including hard questions 
about classloaders that nobody gets right.

-steve