Posted to dev@maven.apache.org by Brett Porter <br...@apache.org> on 2007/08/31 07:22:54 UTC

[PROPOSAL] Local Repository Separation

See: http://docs.codehaus.org/display/MAVEN/Local+repository+separation

Text included below for inline comments (which I'll feed back into  
the document as needed).

Context

The current local repository is a single file structure, stored  
typically in an individual user's home directory.

This suffers from the following problems:
* there is no locking, so if multiple Maven instances attempt to run  
on the same machine they can corrupt each other's metadata
* it serves multiple purposes - it is both a cache of remote  
repository artifacts, and a place to locally install artifacts that  
you build. Because of this, it is possible that the local cache does  
not always accurately reflect the state of the remote repository
** downloading a snapshot from a remote repository also writes the
chosen version out as -SNAPSHOT, meaning that it continues to get used
even if the snapshot repository is removed
** downloading an artifact from a remote repository with a fixed  
version does not write metadata, so if that repository is later  
removed the artifact is still used though a clean build would fail.  
This particularly affects testing staged releases
* it can be difficult to isolate differences in the local repository
without deleting the entire cache, requiring time-consuming downloads.
* it isn't possible to have multiple checkouts of the same development
version and build them independently (particularly important for CI
servers).
* it isn't possible to easily clean out a subsection of the repository
* the artifact code is over-complicated to implement the logic for  
sharing the storage

Solution

General Considerations

This solution aims to not change the current behaviour, other than to  
make it easier/possible to correct things considered known bugs as  
documented above. Resolution behaviour should not otherwise be  
affected and any such changes should be in the related proposal.

This proposal simply alters the storage of the artifacts.

Locking

Locking should be implemented at the individual artifact level. This  
can be done with a lockfile in the artifact top level directory  
(rather than the individual version), locking both the metadata and  
artifact.

An artifact operation should be done with files in a temporary  
location, and moved to the final location in one operation, wrapped  
by the creation of the lockfile. This makes the duration of the lock  
relatively short, so that Maven can simply block on the existence of  
a lockfile (both read and write operations), and remove it after a  
short period of time.
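
The write path described above can be sketched as follows. This is a
minimal illustration only: the class name, the {{.lock}} file name, the
polling interval and the timeout are assumptions, not part of the
proposal or of Maven's artifact code. It also fails on timeout rather
than removing a stale lockfile.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper; not part of Maven's artifact code. */
public final class ArtifactLockSketch {

    private ArtifactLockSketch() {}

    /**
     * Moves a fully written temporary file into place while holding a
     * lockfile in the artifact's top-level directory.
     */
    public static void install(Path artifactDir, Path tempFile, Path target,
                               long timeoutMillis)
            throws IOException, InterruptedException {
        Files.createDirectories(artifactDir);
        Path lock = artifactDir.resolve(".lock"); // assumed lockfile name
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            try {
                // createFile fails if the file already exists, so a
                // successful creation acts as acquiring the lock.
                Files.createFile(lock);
                break;
            } catch (FileAlreadyExistsException held) {
                if (System.currentTimeMillis() > deadline) {
                    throw new IOException("Timed out waiting for " + lock);
                }
                Thread.sleep(50); // block briefly, then retry
            }
        }
        try {
            Files.createDirectories(target.getParent());
            // Request an atomic rename; this assumes the temporary file
            // and the target are on the same file system.
            Files.move(tempFile, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(lock);
        }
    }
}
```

Because the file is complete before the lock is taken, the lock is only
held for the duration of the rename.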

Local repository separation

The structure of the local repository should become:

.
|-- cache
|-- remote
|   |-- apache.snapshots
|   |-- central
|   |-- codehaus.snapshots
|   `-- ...
`-- workspace
     |-- default
     |-- workspace1
     `-- ...

The purposes of these directories are as follows:

* cache: immutable artifacts downloaded from a remote repository. No  
metadata is stored in this directory tree.

* remote: contains a directory for each remote repository (by
repository identifier). This contains the metadata and mutable
artifacts from that repository. Metadata files will return to the
format {{maven-metadata.xml}} instead of the current
{{maven-metadata-<id>.xml}} file format. Files in these repositories
will typically be snapshots and metadata for releases, since actual
releases are not mutable and can be stored in the {{cache}} directory.

* workspace: contains a directory for each local workspace, with the  
primary one being {{default}}. This contains the metadata and files  
for any artifact built by maven (both snapshots, and releases).

Under each of these locations, the standard layout remains as it is now:

.
|-- cache
|   |-- com
|   |   `-- example
|   |       `-- ...
|   `-- org
|       |-- apache
|       |   `-- ...
|       `-- codehaus
|           `-- ...
|-- remote
|   |-- apache.snapshots
|   |   `-- org
|   |       `-- apache
|   |           `-- ...
|   |-- central
|   |-- codehaus.snapshots
|   |   `-- org
|   |       `-- codehaus
|   |           `-- ...
|   `-- ...
`-- workspace
     |-- default
     |   `-- com
     |       `-- example
     |           `-- ...
     |-- workspace1
     `-- ...
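
As an illustration of the layout above, the following sketch maps an
artifact coordinate to a file under each of the three areas. The class
and method names are hypothetical, and it assumes a plain jar with no
classifier.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper illustrating the proposed directory layout. */
public final class RepositoryLayoutSketch {

    private RepositoryLayoutSketch() {}

    /** Standard layout: groupId with dots as directories, then artifactId/version/file. */
    static Path artifactPath(String groupId, String artifactId, String version) {
        return Paths.get(groupId.replace('.', '/'), artifactId, version,
                artifactId + "-" + version + ".jar");
    }

    /** Immutable artifacts downloaded from any remote repository. */
    public static Path inCache(Path root, String g, String a, String v) {
        return root.resolve("cache").resolve(artifactPath(g, a, v));
    }

    /** Metadata and mutable artifacts, keyed by repository identifier. */
    public static Path inRemote(Path root, String repositoryId,
                                String g, String a, String v) {
        return root.resolve("remote").resolve(repositoryId)
                .resolve(artifactPath(g, a, v));
    }

    /** Artifacts built locally, keyed by workspace name. */
    public static Path inWorkspace(Path root, String workspace,
                                   String g, String a, String v) {
        return root.resolve("workspace").resolve(workspace)
                .resolve(artifactPath(g, a, v));
    }
}
```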

Search sequence

As current behaviour is to be retained when correct, the solution  
should aim to merge the metadata across the current workspace and  
active remote repositories to decide what artifact to use. The  
artifact can then be utilised from either the workspace, remote, or  
cache repositories.

Existence in the {{cache}} directory is not a decision point for
using an artifact - this must be achieved and the artifact from there
used if possible. This will make it easier to use the remote
repository metadata to track the source of an artifact in the
future, which should resolve some of the problems listed in the
context section of this proposal.
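
A rough sketch of that sequence: the version decision comes from the
merged metadata, and only afterwards is the file looked up in the
storage areas. The names and the lookup order passed in by the caller
are assumptions for illustration, and real version selection would use
Maven's own version comparison rather than a plain union.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper; not Maven's resolution code. */
public final class SearchSequenceSketch {

    private SearchSequenceSketch() {}

    /** Union of the versions each metadata source lists, in first-seen order. */
    public static Set<String> mergeVersions(Collection<List<String>> versionsPerSource) {
        Set<String> merged = new LinkedHashSet<>();
        for (List<String> versions : versionsPerSource) {
            merged.addAll(versions);
        }
        return merged;
    }

    /** Returns the first storage area that actually holds the chosen file. */
    public static Optional<Path> locate(List<Path> storageAreas, Path relativeFile) {
        for (Path area : storageAreas) {
            Path candidate = area.resolve(relativeFile);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Note that {{locate}} is only called once a version has been chosen, so a
file sitting in {{cache}} never influences which version is selected.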

Deployment and Merging

Content will be deployed from the workspace directory, but will not
be merged to other sections - there should be no need to migrate any
data among the repository sections.

Rolling back a reactor build

While this would be a separate feature, and not the default  
behaviour, it would now be possible to use a temporary workspace to  
build artifacts during a reactor build and merge them into another  
workspace on completion, making an entire reactor build "atomic" with  
respect to the local repository.
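
One way the merge step might look, as a sketch only: the class name is
hypothetical, and the copy is file-by-file, so it is "atomic" only in
the loose sense used above (nothing reaches the target workspace unless
the build gets as far as calling {{merge}}).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper; not an existing Maven feature. */
public final class WorkspaceMergeSketch {

    private WorkspaceMergeSketch() {}

    /** Copies every file from the temporary workspace into the target workspace. */
    public static void merge(Path tempWorkspace, Path targetWorkspace)
            throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(tempWorkspace)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        for (Path file : files) {
            // Keep the same relative layout under the target workspace.
            Path dest = targetWorkspace.resolve(tempWorkspace.relativize(file));
            Files.createDirectories(dest.getParent());
            Files.copy(file, dest, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A failed reactor build would simply discard the temporary workspace
instead of calling {{merge}}.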

Co-existence with Maven 2.0.x

A best practice should be to change your Maven 2.0.x configuration to  
use {{~/.m2/repository/cache}} as the local repository, and move the  
existing content, as this will co-exist properly with Maven 2.1.

Upgrade path

There is no need to upgrade existing local repositories - first use  
of Maven 2.1 will only mean users need to rebuild any local software,  
as remote artifacts will be redownloaded (however see above for the  
minimisation of this).

Shared Local Repositories

Though locking will now make this possible, it is still not a  
recommended practice to share the local repository. However, this  
structure will allow people to share the {{cache}} location safely to  
reduce disk usage if desired.

Cheers,
Brett




--
Brett Porter - brett@apache.org
Blog: http://www.devzuz.org/blogs/bporter/

Re: [PROPOSAL] Local Repository Separation

Posted by Andrew Williams <an...@handyande.co.uk>.

It will most likely work in small development environments.
What Jason is saying is that it is less likely to work in corporate
environments with more than one subnet.

Andy

On 1 Sep 2007, at 17:59, Nigel Magnay wrote:

> I guess ymmv, but I've never had zeroconf not work in development
> environments (we use the log4j zeroconf extensions all the time). Some
> services deliberately set hopcounts low if they're providing something
> particularly localized.
>
> Anyway - I wouldn't suggest it as the only mechanism (and it's  
> something you
> could do as in a mojo), just something that becomes easier if you  
> don't have
> inconsistent repository IDs.
>
> On 01/09/07, Jason van Zyl <ja...@maven.org> wrote:
>>
>>
>> On 1 Sep 07, at 5:43 AM 1 Sep 07, Nigel Magnay wrote:
>>
>>> A couple of really neat features, regardless of whether guids or
>>> some other
>>> identifying mechanism is used, would be
>>> 1) ability to use zeroconf (/bonjour) style networking to
>>> automatically
>>> configure your mirror settings
>>
>> In practice from experience I know that this doesn't often work very
>> well as multi-cast is often suppressed on corporate networks, and
>> multi-case DNS doesn't really work well across subnets. What does
>> work well is DNSSD as you just need to make the DNS server browse-
>> able, push SRV records into the system and then any tooling can find
>> your configuration mechanism. This works very well, but so does just
>> checking in a copy of Maven that a team uses and sharing the m2_home/
>> conf/settings.xml.  But the DNSSD is very handy but we had to hack up
>> the existing implementations to make it work probably.
>>
>> For this proposal I think it boils down to the ephemeral versus non.
>> I think there is an easier way to do what is proposed.
>>
>>> 2) for repositories themselves to contain a bit more metadata about
>>> the
>>> repository itself - what it does and doesn't contain for example.
>>> In one of
>>> our large builds, we have quite a lot of repositories, and the
>>> daily 'ask
>>> every repo if it has a new SNAPSHOT' was sped up considerably by
>>> configuring
>>> proximity to say 'no' to most paths.
>>>
>>> On 01/09/07, Stephen Connolly <st...@one-dash.com> wrote:
>>>>
>>>> Jerome Lacoste wrote:
>>>>> On 8/31/07, Brett Porter <br...@apache.org> wrote:
>>>>>
>>>>>> Yeah, I meant to note that - I was thinking that this should be
>>>>>> accompanied by a proposal to take care of the id ambiguity  
>>>>>> problems
>>>>>> which we've discussed a couple of times before.
>>>>>>
>>>>>> I think URLs are still problematic (since you can often have
>>>>>> different ones for the same location), though are a bit more  
>>>>>> robust
>>>>>> than IDs. We could hash them to generate the directory name in  
>>>>>> the
>>>>>> repository.
>>>>>>
>>>>>> What do others think?
>>>>>>
>>>>>
>>>>> How are workspaces identified ? a hash of the current directory ?
>>>>> Something related to the project one is working on ? Something
>>>>> related
>>>>> to the current process ?
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Jerome
>>>>>
>>>>> ------------------------------------------------------------------ 
>>>>> --
>>>>> -
>>>>> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
>>>>> For additional commands, e-mail: dev-help@maven.apache.org
>>>>>
>>>>>
>>>> Would a better option be to have the repositories store a  
>>>> identifying
>>>> GUID in their root URL.  That way mirrors would pick up the same  
>>>> GUID
>>>> and be identified as the same repository.
>>>>
>>>> Of course then there's the question, should an aggregating mirror
>>>> return
>>>> the GUIDs of all the repositories it aggregates or a unique hash.
>>>>
>>>> My feeling is it should return a unique hash, but maybe it could
>>>> return
>>>> information about it aggregating other repositories...
>>>>
>>>> thus:
>>>>
>>>> The repository-id.xml for an aggregate repository could be
>>>> something like
>>>>
>>>> <repository-id>
>>>>   <name>ACME Corp's Aggregated Proxy Repository</name>
>>>>   <id>1234-663a-7766-aabbef45-3244</id>
>>>>   <aggregate-repositories>
>>>>     <id>7688-364a-a7f6-1234567-f56e</id>
>>>>     <id>bcd3-5432-8899-9876543-acbd</id>
>>>>   </aggregate-repositories>
>>>> </repository-id>
>>>>
>>>> While repo1.maven.org could be:
>>>>
>>>> <repository-id>
>>>>   <name>Maven Central Repository</name>
>>>>   <id>7688-364a-a7f6-1234567-f56e</id>
>>>> </repository-id>
>>>>
>>>> And, say java.net2 (http://download.dev.java.net/maven/2) could be
>>>>
>>>> <repository-id>
>>>>   <name>Java.net's Maven2 Repository</name>
>>>>   <id>bcd3-5432-8899-9876543-acbd</id>
>>>> </repository-id>
>>>>
>>>> The advantage I see is that existing clients will not be looking
>>>> for the
>>>> repository-id.xml file, so they won't care.  If we can't find a
>>>> repository-id.xml file for the repository, we generate a GUID on  
>>>> the
>>>> client and store a mapping of URLs to GUIDs in a file in ~/.m2/ so
>>>> that
>>>> a user can control the mapping of these autogenerated GUIDs
>>>>
>>>> ------------------------------------------------------------------- 
>>>> --
>>>> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
>>>> For additional commands, e-mail: dev-help@maven.apache.org
>>>>
>>>>
>>
>> Thanks,
>>
>> Jason
>>
>> ----------------------------------------------------------
>> Jason van Zyl
>> Founder and PMC Chair, Apache Maven
>> jason at sonatype dot com
>> ----------------------------------------------------------
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
>> For additional commands, e-mail: dev-help@maven.apache.org
>>
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org

Re: [PROPOSAL] Local Repository Separation

Posted by Nigel Magnay <ni...@gmail.com>.

I guess ymmv, but I've never had zeroconf not work in development
environments (we use the log4j zeroconf extensions all the time). Some
services deliberately set hopcounts low if they're providing something
particularly localized.

Anyway - I wouldn't suggest it as the only mechanism (and it's something you
could do in a mojo), just something that becomes easier if you don't have
inconsistent repository IDs.

On 01/09/07, Jason van Zyl <ja...@maven.org> wrote:
>
>
> On 1 Sep 07, at 5:43 AM 1 Sep 07, Nigel Magnay wrote:
>
> > A couple of really neat features, regardless of whether guids or
> > some other
> > identifying mechanism is used, would be
> > 1) ability to use zeroconf (/bonjour) style networking to
> > automatically
> > configure your mirror settings
>
> In practice from experience I know that this doesn't often work very
> well as multi-cast is often suppressed on corporate networks, and
> multi-case DNS doesn't really work well across subnets. What does
> work well is DNSSD as you just need to make the DNS server browse-
> able, push SRV records into the system and then any tooling can find
> your configuration mechanism. This works very well, but so does just
> checking in a copy of Maven that a team uses and sharing the m2_home/
> conf/settings.xml.  But the DNSSD is very handy but we had to hack up
> the existing implementations to make it work probably.
>
> For this proposal I think it boils down to the ephemeral versus non.
> I think there is an easier way to do what is proposed.
>
> > 2) for repositories themselves to contain a bit more metadata about
> > the
> > repository itself - what it does and doesn't contain for example.
> > In one of
> > our large builds, we have quite a lot of repositories, and the
> > daily 'ask
> > every repo if it has a new SNAPSHOT' was sped up considerably by
> > configuring
> > proximity to say 'no' to most paths.
> >
> > On 01/09/07, Stephen Connolly <st...@one-dash.com> wrote:
> >>
> >> Jerome Lacoste wrote:
> >>> On 8/31/07, Brett Porter <br...@apache.org> wrote:
> >>>
> >>>> Yeah, I meant to note that - I was thinking that this should be
> >>>> accompanied by a proposal to take care of the id ambiguity problems
> >>>> which we've discussed a couple of times before.
> >>>>
> >>>> I think URLs are still problematic (since you can often have
> >>>> different ones for the same location), though are a bit more robust
> >>>> than IDs. We could hash them to generate the directory name in the
> >>>> repository.
> >>>>
> >>>> What do others think?
> >>>>
> >>>
> >>> How are workspaces identified ? a hash of the current directory ?
> >>> Something related to the project one is working on ? Something
> >>> related
> >>> to the current process ?
> >>>
> >>> Cheers,
> >>>
> >>> Jerome
> >>>
> >>> --------------------------------------------------------------------
> >>> -
> >>> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> >>> For additional commands, e-mail: dev-help@maven.apache.org
> >>>
> >>>
> >> Would a better option be to have the repositories store a identifying
> >> GUID in their root URL.  That way mirrors would pick up the same GUID
> >> and be identified as the same repository.
> >>
> >> Of course then there's the question, should an aggregating mirror
> >> return
> >> the GUIDs of all the repositories it aggregates or a unique hash.
> >>
> >> My feeling is it should return a unique hash, but maybe it could
> >> return
> >> information about it aggregating other repositories...
> >>
> >> thus:
> >>
> >> The repository-id.xml for an aggregate repository could be
> >> something like
> >>
> >> <repository-id>
> >>   <name>ACME Corp's Aggregated Proxy Repository</name>
> >>   <id>1234-663a-7766-aabbef45-3244</id>
> >>   <aggregate-repositories>
> >>     <id>7688-364a-a7f6-1234567-f56e</id>
> >>     <id>bcd3-5432-8899-9876543-acbd</id>
> >>   </aggregate-repositories>
> >> </repository-id>
> >>
> >> While repo1.maven.org could be:
> >>
> >> <repository-id>
> >>   <name>Maven Central Repository</name>
> >>   <id>7688-364a-a7f6-1234567-f56e</id>
> >> </repository-id>
> >>
> >> And, say java.net2 (http://download.dev.java.net/maven/2) could be
> >>
> >> <repository-id>
> >>   <name>Java.net's Maven2 Repository</name>
> >>   <id>bcd3-5432-8899-9876543-acbd</id>
> >> </repository-id>
> >>
> >> The advantage I see is that existing clients will not be looking
> >> for the
> >> repository-id.xml file, so they won't care.  If we can't find a
> >> repository-id.xml file for the repository, we generate a GUID on the
> >> client and store a mapping of URLs to GUIDs in a file in ~/.m2/ so
> >> that
> >> a user can control the mapping of these autogenerated GUIDs
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> >> For additional commands, e-mail: dev-help@maven.apache.org
> >>
> >>
>
> Thanks,
>
> Jason
>
> ----------------------------------------------------------
> Jason van Zyl
> Founder and PMC Chair, Apache Maven
> jason at sonatype dot com
> ----------------------------------------------------------
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>
>

Re: [PROPOSAL] Local Repository Separation

Posted by Jason van Zyl <ja...@maven.org>.

On 2 Sep 07, at 6:35 PM 2 Sep 07, Brett Porter wrote:

>
> On 02/09/2007, at 11:37 PM, Brian E. Fox wrote:
>
>>> I know its another directory, but the following might be more
>>> straightforward:
>>
>>> .
>>> |-- metadata
>>> |   |-- apache.snapshots
>>> |   |-- central
>>> |   |-- codehaus.snapshots
>>> |   `-- ...
>>> |-- release-cache
>>> |-- snapshot-cache
>>> `-- workspace
>>>     |-- default
>>>     |-- workspace1
>>>     `-- ...
>>
>> I'm not sure why the metadata should be separate, but I can see the
>> release-cache, snapshot-cache and workspaces being useful. I like  
>> this
>> suggestion better than the original. The locking would be nice too.
>>
>
> the metadata separation is a bit of a toss up for me - it would  
> have the benefit of being able to interchange a local and remote  
> repository instead of the metadata format being separate. I added  
> it in here as a related change because of that benefit, but it  
> isn't really related to the initial requirements.
>

If we're not using repository ids, how are you going to designate the
source? If you are going to use URLs and someone changes it, how are
you going to guarantee consistency?

I don't think there's any value in separating the metadata. If you're  
going to use transactions now you need two instead of one to lay down  
the files.

> - Brett
>
> --
> Brett Porter - brett@apache.org
> Blog: http://www.devzuz.org/blogs/bporter/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>

Thanks,

Jason

----------------------------------------------------------
Jason van Zyl
Founder and PMC Chair, Apache Maven
jason at sonatype dot com
----------------------------------------------------------





Re: [PROPOSAL] Local Repository Separation

Posted by Brett Porter <br...@apache.org>.

On 02/09/2007, at 11:37 PM, Brian E. Fox wrote:

>> I know its another directory, but the following might be more
>> straightforward:
>
>> .
>> |-- metadata
>> |   |-- apache.snapshots
>> |   |-- central
>> |   |-- codehaus.snapshots
>> |   `-- ...
>> |-- release-cache
>> |-- snapshot-cache
>> `-- workspace
>>     |-- default
>>     |-- workspace1
>>     `-- ...
>
> I'm not sure why the metadata should be separate, but I can see the
> release-cache, snapshot-cache and workspaces being useful. I like this
> suggestion better than the original. The locking would be nice too.
>

the metadata separation is a bit of a toss up for me - it would have  
the benefit of being able to interchange a local and remote  
repository instead of the metadata format being separate. I added it  
in here as a related change because of that benefit, but it isn't  
really related to the initial requirements.

- Brett

--
Brett Porter - brett@apache.org
Blog: http://www.devzuz.org/blogs/bporter/


RE: [PROPOSAL] Local Repository Separation

Posted by "Brian E. Fox" <br...@reply.infinity.nu>.

>I know its another directory, but the following might be more  
>straightforward:

>.
>|-- metadata
>|   |-- apache.snapshots
>|   |-- central
>|   |-- codehaus.snapshots
>|   `-- ...
>|-- release-cache
>|-- snapshot-cache
>`-- workspace
>     |-- default
>     |-- workspace1
>     `-- ...

I'm not sure why the metadata should be separate, but I can see the
release-cache, snapshot-cache and workspaces being useful. I like this
suggestion better than the original. The locking would be nice too. 

--Brian



Re: [PROPOSAL] Local Repository Separation

Posted by Brett Porter <br...@apache.org>.

On 02/09/2007, at 2:44 PM, Jason van Zyl wrote:

>
> On 1 Sep 07, at 7:04 PM 1 Sep 07, Brett Porter wrote:
>
>>
>> On 02/09/2007, at 1:33 AM, Jason van Zyl wrote:
>>
>>> For this proposal I think it boils down to the ephemeral versus  
>>> non. I think there is an easier way to do what is proposed.
>>>
>>
>> Are you talking about my proposal, or the settings zeroconf stuff?
>>
>
> I'm talking about your proposal being too complicated.

I believe this would actually simplify the current artifact handling
code since there are dodgy bits related to dealing with local
metadata. It has no effect on the user in terms of their Maven
usage. The only effect is "finding something" if they go digging
around in the local repository - which you pointed out at the end so
I'll come back to it.

> An option to use a separate local repository goes a long way and  
> with a cached remote using Proximity this is not loathsome.

My local repository is currently 2.5Gb.

For every one you create, you duplicate a significant portion of that  
data on the local disk, even if you avoid the network traffic by  
using a proxy.

> I don't think using a shared local repository is a particularly  
> bright idea.

This wasn't a proposal to facilitate sharing a local repository other
than on the same machine. That need is not unheard of, even in a normal
dev machine setup. I've burned myself trying to build Maven and some
plugin at the same time.

> But creating any layered structure should be reduced for the need  
> at hand. It seems to be people want to just separate between what  
> their projects produce and what is fixed. Trying to break things  
> down into the repository they came from isn't going to help anyone.

Is this the only part you are really objecting to, or the whole  
proposal? I don't want to throw the baby out with the bathwater.

I started out with just the locking and the separation of a single  
local repository, as came up on the list.

I then split the remote repositories out because I saw the  
opportunity to simplify the code, make it more accurately reflect a  
remote repository (they would now be identical and you could just  
sync them straight down), and because it would correct some other  
related bugs I saw.

I then split the local repository into workspaces as I anticipated  
that was easier to deal with than swapping the entire local  
repository path and having to have separate configuration for where  
the cache is to share (to address a spiraling disk space usage if you  
start switching local repos).

Which parts do you like, and which don't you like?

> Something like telling Maven to cache by a groupId is one approach.  
> Could be a project within an organization or an entire organization  
> and this is probably only in the context of a build server. The  
> complexity added for developers being able to shared a single local  
> repository is just not worth it.

Sorry, I couldn't really parse this. What would Maven cache by group id,
and where? Do you view this as complexity in the code (which I don't
think will be the case) or for the user?

> To go from one place for the local cache to N where N is the number  
> of repositories would be overwhelmingly confusing.

This seems to be the only user complexity issue you are highlighting.  
I don't think users digging in their local repository is a  
particularly common case, but I would rule this out as an issue on  
releases downloaded by maven (they are in the cache directory) and  
stuff they install (it's in the workspace they used). I can see why  
it might be an issue for finding snapshots from a remote repository.  
But that change is totally negotiable in the proposal, I just felt  
that under the current set up the ability to use a remote repository  
as a local repository and simplify the handling was worth it.

I know it's another directory, but the following might be more
straightforward:

.
|-- metadata
|   |-- apache.snapshots
|   |-- central
|   |-- codehaus.snapshots
|   `-- ...
|-- release-cache
|-- snapshot-cache
`-- workspace
     |-- default
     |-- workspace1
     `-- ...

So you have one place that caches releases, one that caches snapshots  
(that can be nuked more easily), workspaces for your local  
installation, and optionally the metadata separated out to make it  
possible to rsync full remote repositories (since Maven could honour  
jars in here, but by default just store in the cache instead).
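
As a small illustration of the release/snapshot split in this
alternative layout, the choice of cache directory could be as simple as
the sketch below. The class and method names are hypothetical, and it
assumes the base version string (ending in -SNAPSHOT) is what gets
passed in, not a timestamped snapshot version.

```java
/** Hypothetical helper for the alternative layout; not Maven code. */
public final class CacheSelectionSketch {

    private CacheSelectionSketch() {}

    /** Picks the top-level cache directory from the base version. */
    public static String cacheDirectoryFor(String baseVersion) {
        return baseVersion.endsWith("-SNAPSHOT") ? "snapshot-cache" : "release-cache";
    }
}
```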

- Brett

--
Brett Porter - brett@apache.org
Blog: http://www.devzuz.org/blogs/bporter/


Re: [PROPOSAL] Local Repository Separation

Posted by Jason van Zyl <ja...@maven.org>.

On 1 Sep 07, at 7:04 PM 1 Sep 07, Brett Porter wrote:

>
> On 02/09/2007, at 1:33 AM, Jason van Zyl wrote:
>
>> For this proposal I think it boils down to the ephemeral versus  
>> non. I think there is an easier way to do what is proposed.
>>
>
> Are you talking about my proposal, or the settings zeroconf stuff?
>

I'm talking about your proposal being too complicated. An option to  
use a separate local repository goes a long way and with a cached  
remote using Proximity this is not loathsome. I don't think using a  
shared local repository is a particularly bright idea. But creating  
any layered structure should be reduced for the need at hand. It
seems to me people want to just separate between what their projects
produce and what is fixed. Trying to break things down into the  
repository they came from isn't going to help anyone. Something like  
telling Maven to cache by a groupId is one approach. Could be a  
project within an organization or an entire organization and this is  
probably only in the context of a build server. The complexity added  
for developers being able to share a single local repository is just
not worth it. To go from one place for the local cache to N where N  
is the number of repositories would be overwhelmingly confusing.

> If it's for my proposal... let's hear the easier way, please.
>
> - Brett
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>

Thanks,

Jason

----------------------------------------------------------
Jason van Zyl
Founder and PMC Chair, Apache Maven
jason at sonatype dot com
----------------------------------------------------------


Re: [PROPOSAL] Local Repository Separation

Posted by Brett Porter <br...@apache.org>.

On 02/09/2007, at 1:33 AM, Jason van Zyl wrote:

> For this proposal I think it boils down to the ephemeral versus  
> non. I think there is an easier way to do what is proposed.
>

Are you talking about my proposal, or the settings zeroconf stuff?

If it's for my proposal... let's hear the easier way, please.

- Brett


Re: [PROPOSAL] Local Repository Separation

Posted by Jason van Zyl <ja...@maven.org>.

On 1 Sep 07, at 5:43 AM 1 Sep 07, Nigel Magnay wrote:

> A couple of really neat features, regardless of whether guids or  
> some other
> identifying mechanism is used, would be
> 1) ability to use zeroconf (/bonjour) style networking to  
> automatically
> configure your mirror settings

In practice, I know from experience that this often doesn't work very
well, as multicast is often suppressed on corporate networks and
multicast DNS doesn't really work well across subnets. What does
work well is DNS-SD, as you just need to make the DNS server
browsable, push SRV records into the system, and then any tooling can
find your configuration mechanism. This works very well, but so does
just checking in a copy of Maven that a team uses and sharing the
m2_home/conf/settings.xml. DNS-SD is very handy, but we had to hack up
the existing implementations to make it work properly.

For this proposal I think it boils down to the ephemeral versus non.  
I think there is an easier way to do what is proposed.

> 2) for repositories themselves to contain a bit more metadata about  
> the
> repository itself - what it does and doesn't contain for example.  
> In one of
> our large builds, we have quite a lot of repositories, and the  
> daily 'ask
> every repo if it has a new SNAPSHOT' was sped up considerably by  
> configuring
> proximity to say 'no' to most paths.

Thanks,

Jason

----------------------------------------------------------
Jason van Zyl
Founder and PMC Chair, Apache Maven
jason at sonatype dot com
----------------------------------------------------------




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org

Re: [PROPOSAL] Local Repository Separation

Posted by Nigel Magnay <ni...@gmail.com>.

A couple of really neat features, regardless of whether GUIDs or some other
identifying mechanism is used, would be:
1) the ability to use zeroconf (Bonjour) style networking to automatically
configure your mirror settings
2) for repositories themselves to contain a bit more metadata about the
repository itself - what it does and doesn't contain, for example. In one of
our large builds, we have quite a lot of repositories, and the daily 'ask
every repo if it has a new SNAPSHOT' was sped up considerably by configuring
Proximity to say 'no' to most paths.

Re: [PROPOSAL] Local Repository Separation

Posted by Brett Porter <br...@apache.org>.

On 01/09/2007, at 6:22 PM, Stephen Connolly wrote:

> Would a better option be to have the repositories store an
> identifying GUID in their root URL? That way mirrors would pick up
> the same GUID and be identified as the same repository.

Stephen - did you want to drop this into the user proposals section?  
I agree we need repository GUIDs and repository metadata, but I think  
it's a different thing to what I was proposing (apart from naming the  
directories) so we should track it separately and include all the  
settings and POM references as well.

Thanks,
Brett

--
Brett Porter - brett@apache.org
Blog: http://www.devzuz.org/blogs/bporter/

Re: [PROPOSAL] Local Repository Separation

Posted by Stephen Connolly <st...@one-dash.com>.

Jerome Lacoste wrote:
> On 8/31/07, Brett Porter <br...@apache.org> wrote:
>   
>> Yeah, I meant to note that - I was thinking that this should be
>> accompanied by a proposal to take care of the id ambiguity problems
>> which we've discussed a couple of times before.
>>
>> I think URLs are still problematic (since you can often have
>> different ones for the same location), though are a bit more robust
>> than IDs. We could hash them to generate the directory name in the
>> repository.
>>
>> What do others think?
>>     
>
> How are workspaces identified ? a hash of the current directory ?
> Something related to the project one is working on ? Something related
> to the current process ?
>
> Cheers,
>
> Jerome
>
>   
Would a better option be to have the repositories store an identifying
GUID in their root URL? That way mirrors would pick up the same GUID
and be identified as the same repository.

Of course then there's the question: should an aggregating mirror return
the GUIDs of all the repositories it aggregates, or a unique hash?

My feeling is it should return a unique hash, but maybe it could return 
information about it aggregating other repositories...

thus:

The repository-id.xml for an aggregate repository could be something like

<repository-id>
  <name>ACME Corp's Aggregated Proxy Repository</name>
  <id>1234-663a-7766-aabbef45-3244</id>
  <aggregate-repositories>
    <id>7688-364a-a7f6-1234567-f56e</id>
    <id>bcd3-5432-8899-9876543-acbd</id>
  </aggregate-repositories>
</repository-id>

While repo1.maven.org could be:

<repository-id>
  <name>Maven Central Repository</name>
  <id>7688-364a-a7f6-1234567-f56e</id>
</repository-id>

And, say java.net2 (http://download.dev.java.net/maven/2) could be

<repository-id>
  <name>Java.net's Maven2 Repository</name>
  <id>bcd3-5432-8899-9876543-acbd</id>
</repository-id>

The advantage I see is that existing clients will not be looking for the 
repository-id.xml file, so they won't care.  If we can't find a 
repository-id.xml file for the repository, we generate a GUID on the 
client and store a mapping of URLs to GUIDs in a file in ~/.m2/ so that 
a user can control the mapping of these autogenerated GUIDs.
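
The lookup and fallback described above could be sketched roughly as
follows (Python, for brevity). This is only an illustration under stated
assumptions: the file is assumed to live at <root URL>/repository-id.xml,
and the function names, the fetch callback and the JSON mapping file are
hypothetical, not an existing Maven API.

```python
import json
import uuid
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_repository_id(xml_text):
    """Return (id, aggregated_ids) from a repository-id.xml document."""
    root = ET.fromstring(xml_text)
    repo_id = root.findtext("id")  # direct child only, not the aggregated ids
    aggregated = [e.text for e in root.findall("aggregate-repositories/id")]
    return repo_id, aggregated


def resolve_repository_id(url, fetch, mapping_file):
    """Prefer the id published by the repository itself.

    fetch(url) is a hypothetical callback returning the file's text, or
    None if the repository does not publish one. In that case a GUID is
    generated on the client and remembered in mapping_file (assumed JSON).
    """
    xml_text = fetch(url.rstrip("/") + "/repository-id.xml")
    if xml_text is not None:
        return parse_repository_id(xml_text)[0]

    mapping_file = Path(mapping_file)
    mapping = json.loads(mapping_file.read_text()) if mapping_file.exists() else {}
    if url not in mapping:
        mapping[url] = str(uuid.uuid4())
        mapping_file.write_text(json.dumps(mapping, indent=2))
    return mapping[url]
```

Because the fallback mapping is a plain file, a user could edit it by hand
so that two URLs for the same repository share one GUID.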

Re: [PROPOSAL] Local Repository Separation

Posted by Jerome Lacoste <je...@gmail.com>.

On 8/31/07, Brett Porter <br...@apache.org> wrote:
> Yeah, I meant to note that - I was thinking that this should be
> accompanied by a proposal to take care of the id ambiguity problems
> which we've discussed a couple of times before.
>
> I think URLs are still problematic (since you can often have
> different ones for the same location), though are a bit more robust
> than IDs. We could hash them to generate the directory name in the
> repository.
>
> What do others think?

How are workspaces identified? A hash of the current directory?
Something related to the project one is working on? Something related
to the current process?

Cheers,

Jerome

Re: [PROPOSAL] Local Repository Separation

Posted by Brett Porter <br...@apache.org>.

Yeah, I meant to note that - I was thinking that this should be  
accompanied by a proposal to take care of the id ambiguity problems  
which we've discussed a couple of times before.

I think URLs are still problematic (since you can often have  
different ones for the same location), though are a bit more robust  
than IDs. We could hash them to generate the directory name in the  
repository.
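
As a rough sketch of that hashing idea (Python, for brevity): the function
name, the choice of SHA-1 and the normalisation rules are assumptions made
for illustration only. Normalising case and trailing slashes catches some
trivially different spellings of one URL, but not genuinely different URLs
for the same location, which is the remaining problem noted above.

```python
import hashlib
from urllib.parse import urlsplit


def repository_dir_name(url):
    """Derive a stable directory name from a repository URL."""
    parts = urlsplit(url.strip())
    # Normalise the pieces that commonly differ for the same location:
    # scheme/host case and a trailing slash on the path.
    normalised = "{}://{}{}".format(
        parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/")
    )
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()
```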

What do others think?

Thanks,
Brett

On 01/09/2007, at 2:04 AM, Milos Kleint wrote:

> looks great.
> One comment. "Remote" folder is grouped by repo identifiers.
> Unfortunately these often differ among projects. This results in many
> duplicate files and folder structures. Can we go by URL? Or have some
> means of automatically defining aliases for the same remote repo URL?
>
> Milos

--
Brett Porter - brett@apache.org
Blog: http://www.devzuz.org/blogs/bporter/

Re: [PROPOSAL] Local Repository Separation

Posted by Milos Kleint <mk...@gmail.com>.

looks great.
One comment. "Remote" folder is grouped by repo identifiers.
Unfortunately these often differ among projects. This results in many
duplicate files and folder structures. Can we go by URL? Or have some
means of automatically defining aliases for the same remote repo URL?

Milos

Re: [PROPOSAL] Local Repository Separation

Posted by Brett Porter <br...@apache.org>.

On 01/09/2007, at 3:06 AM, Arnaud HERITIER wrote:

> Which new features can we imagine for corporate proxies like Archiva
> or Proximity? In that case developers often see only one remote
> repository, which is defined as a proxy. How will we know where the
> data comes from?
>
I don't think anything is necessary in that instance - the proxy  
server will deal with how to put everything together. Certainly in  
Archiva's case, it tracks the originating repository for dealing with
it on its end, but there's probably no need to keep anything
separate on the Maven side after that.

Cheers,
Brett


--
Brett Porter - brett@apache.org
Blog: http://www.devzuz.org/blogs/bporter/

Re: [PROPOSAL] Local Repository Separation

Posted by Arnaud HERITIER <ah...@gmail.com>.

Which new features can we imagine for corporate proxies like Archiva
or Proximity? In that case developers often see only one remote
repository, which is defined as a proxy. How will we know where the
data comes from?

Arnaud

On 31/08/2007, Brett Porter <br...@apache.org> wrote:
> See: http://docs.codehaus.org/display/MAVEN/Local+repository+separation
>
> Text included below for inline comments (which I'll feed back into
> the document as needed).
>
> Context
>
> The current local repository is a single file structure, stored
> typically in an individual user's home directory.
>
> This suffers from the following problems:
> * there is no locking, so if multiple Maven instances attempt to run
> on the same machine they can corrupt each other's metadata
> * it serves multiple purposes - it is both a cache of remote
> repository artifacts, and a place to locally install artifacts that
> you build. Because of this, it is possible that the local cache does
> not always accurately reflect the state of the remote repository
> ** downloading a snapshot from a remote repository also writes the
> chosen version out as -SNAPSHOT, meaning that continues to get used
> even if the snapshot repository is removed
> ** downloading an artifact from a remote repository with a fixed
> version does not write metadata, so if that repository is later
> removed the artifact is still used though a clean build would fail.
> This particularly affects testing staged releases
> * it can be difficult to isolate differences in the local repository
> without deleting the entire cache, requiring time consuming downloads.
> * it isn't possible to have multiple checkouts of the same development
> version and build them independently (particularly important for CI
> servers).
> * it isn't possible to easily clean out a subsection of the repository
> * the artifact code is over-complicated to implement the logic for
> sharing the storage
>
> Solution
>
> General Considerations
>
> This solution aims to not change the current behaviour, other than to
> make it easier/possible to correct things considered known bugs as
> documented above. Resolution behaviour should not otherwise be
> affected and any such changes should be in the related proposal.
>
> This proposal simply alters the storage of the artifacts.
>
> Locking
>
> Locking should be implemented at the individual artifact level. This
> can be done with a lockfile in the artifact top level directory
> (rather than the individual version), locking both the metadata and
> artifact.
>
> An artifact operation should be done with files in a temporary
> location, and moved to the final location in one operation, wrapped
> by the creation of the lockfile. This makes the duration of the lock
> relatively short, so that Maven can simply block on the existence of
> a lockfile (both read and write operations), and remove it after a
> short period of time.
>
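
One possible reading of the locking scheme quoted above, sketched in Python
for brevity. The ".lock" file name, the function names and the timeout
values are assumptions made for illustration. Note also that this sketch
gives up with an error when the lock cannot be taken in time, rather than
removing a stale lockfile as the proposal suggests.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def artifact_lock(artifact_dir, timeout=10.0, poll=0.05):
    """Hold <artifact_dir>/.lock for the duration of the with-block."""
    os.makedirs(artifact_dir, exist_ok=True)
    lock_path = os.path.join(artifact_dir, ".lock")
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lockfile already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError("could not lock " + artifact_dir)
            time.sleep(poll)  # block while another process holds the lock
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def install_file(artifact_dir, relative_path, data):
    """Write data to a temporary file, then move it into place under the lock."""
    target = os.path.join(artifact_dir, relative_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    # The temporary file lives next to the target so the move is one rename.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(target))
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    with artifact_lock(artifact_dir):
        os.replace(tmp_path, target)
    return target
```

The slow part (downloading or writing the file) happens outside the lock,
so the lock is only held for the rename itself.
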
> Local repository separation
>
> The structure of the local repository should become:
>
> .
> |-- cache
> |-- remote
> |   |-- apache.snapshots
> |   |-- central
> |   |-- codehaus.snapshots
> |   `-- ...
> `-- workspace
>      |-- default
>      |-- workspace1
>      `-- ...
>
> The purposes of these directories are as follows:
>
> * cache: immutable artifacts downloaded from a remote repository. No
> metadata is stored in this directory tree.
>
> * remote: contains a directory for each remote repository (by
> repository identifier). This contains the metadata and mutable
> artifacts from that repository. Metadata files will return to the
> format {{maven-metadata.xml}} instead of the current
> {{maven-metadata-<id>.xml}} file format. Files in these repositories will
> typically be snapshots and metadata for releases, since actual releases
> are not mutable and can be stored in the {{cache}} directory.
>
> * workspace: contains a directory for each local workspace, with the
> primary one being {{default}}. This contains the metadata and files
> for any artifact built by maven (both snapshots, and releases).
>
> Under each of these locations, the standard layout remains as it is now:
>
> .
> |-- cache
> |   |-- com
> |   |   `-- example
> |   |       `-- ...
> |   `-- org
> |       |-- apache
> |       |   `-- ...
> |       `-- codehaus
> |           `-- ...
> |-- remote
> |   |-- apache.snapshots
> |   |   `-- org
> |   |       `-- apache
> |   |           `-- ...
> |   |-- central
> |   |-- codehaus.snapshots
> |   |   `-- org
> |   |       `-- codehaus
> |   |           `-- ...
> |   `-- ...
> `-- workspace
>      |-- default
>      |   `-- com
>      |       `-- example
>      |           `-- ...
>      |-- workspace1
>      `-- ...
>
> Search sequence
>
> As current behaviour is to be retained when correct, the solution
> should aim to merge the metadata across the current workspace and
> active remote repositories to decide what artifact to use. The
> artifact can then be utilised from either the workspace, remote, or
> cache repositories.
>
> Existence in the {{cache}} directory is not a decision point for
> using an artifact - this must be achieved and the artifact from there
> used if possible. This will help enable better utilising the remote
> repository metadata for tracking the source of an artifact in the
> future to resolve some of the problems listed in the context section
> of this proposal.
>
> Deployment and Merging
>
> Content will be deployed from the workspace directory, but will not
> be merged to other sections - there should be no need to migrate any
> data among the repository sections.
>
> Rolling back a reactor build
>
> While this would be a separate feature, and not the default
> behaviour, it would now be possible to use a temporary workspace to
> build artifacts during a reactor build and merge them into another
> workspace on completion, making an entire reactor build "atomic" with
> respect to the local repository.
>
> Co-existence with Maven 2.0.x
>
> A best practice should be to change your Maven 2.0.x configuration to
> use {{~/.m2/repository/cache}} as the local repository, and move the
> existing content, as this will co-exist properly with Maven 2.1.
>
> Upgrade path
>
> There is no need to upgrade existing local repositories - first use
> of Maven 2.1 will only mean users need to rebuild any local software,
> as remote artifacts will be redownloaded (however see above for the
> minimisation of this).
>
> Shared Local Repositories
>
> Though locking will now make this possible, it is still not a
> recommended practice to share the local repository. However, this
> structure will allow people to share the {{cache}} location safely to
> reduce disk usage if desired.
>
> Cheers,
> Brett
>
>
>
>
> --
> Brett Porter - brett@apache.org
> Blog: http://www.devzuz.org/blogs/bporter/
>
>
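A note on the Locking section quoted above: preparing a file in a
temporary location and moving it into place in one operation, guarded by
a lockfile in the artifact's top level directory, could look roughly like
the sketch below. This is an illustration only, not Maven code. The class
name ArtifactLockSketch, the lockfile name ".lock", the 50 ms polling
interval, and placing the temporary file next to the target (so the move
stays on one filesystem) are all assumptions. Removing a stale lockfile
after a timeout, which the proposal also mentions, is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class ArtifactLockSketch {
    // Hypothetical lockfile name; the proposal does not specify one.
    static final String LOCK_NAME = ".lock";

    /** Installs content as artifactDir/relativeName under a per-artifact lockfile. */
    static Path install(Path artifactDir, String relativeName, byte[] content) {
        try {
            Files.createDirectories(artifactDir);
            // Prepare the file first, so the lock is only held for the final move.
            Path temp = Files.createTempFile(artifactDir, "incoming", ".tmp");
            Files.write(temp, content);

            Path lock = artifactDir.resolve(LOCK_NAME);
            acquire(lock);
            try {
                Path target = artifactDir.resolve(relativeName);
                Files.createDirectories(target.getParent());
                return Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
            } finally {
                Files.deleteIfExists(lock);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Blocks until the lockfile could be created by this caller. */
    static void acquire(Path lock) throws IOException {
        while (true) {
            try {
                // createFile checks for existence and creates in one atomic step.
                Files.createFile(lock);
                return;
            } catch (FileAlreadyExistsException heldByAnotherProcess) {
                try {
                    Thread.sleep(50); // assumed polling interval
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IOException("interrupted while waiting for " + lock, e);
                }
            }
        }
    }
}
```

Readers would block on the same lockfile before opening the artifact or
its metadata, which is what makes the short lock duration important.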


-- 
..........................................................
Arnaud HERITIER
..........................................................
OCTO Technology - aheritier AT octo DOT com
www.octo.com | blog.octo.com
..........................................................
ASF - aheritier AT apache DOT org
www.apache.org | maven.apache.org
...........................................................

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org

Re: [PROPOSAL] Local Repository Separation

Posted by Kenney Westerhof <ke...@apache.org>.

Hi,

Reply is below.

Brett Porter wrote:
> Hi Kenney,
> 
> On 14/09/2007, at 9:15 PM, Kenney Westerhof wrote:
> 
>> Hi,
>>
>> I sent a mail a few days ago but it didn't make it to the list.
>>
>> One very important feature would be the separation of build artifacts
>> (maven plugins and their dependencies), and project artifacts.
>> The separation isn't clear in maven itself - repo's get mixed up,
>> wrong repo's consulted; build artifacts interfering with plugin 
>> artifacts.
>> Having a separate directory containing information on what build 
>> artifacts
>> are used makes it easy to freeze a maven environment, and see what
>> was used runtime.
>>
>> Also see [1], which you, Brett, weren't in favour of back then. Perhaps
>> now this can be taken into consideration with this proposal.
>>
>>
>> [1] http://jira.codehaus.org/browse/MNG-724
> 
> I definitely like the idea of separating the different contexts of the 
> build (plugin/extension) dependencies from the project dependencies. I 
> think this is most important for the metadata and snapshots - I don't 
> think we need duplicate copies of plexus container releases, etc 
> floating around in the local repository.

Definitely not. Though in fact only snapshot plugins and their dependencies
can cause problems with this; that is, their presence and availability
to maven, not whether or not their versions are specified in the pom. If you
leave out the plugin version you get whatever is latest in the local repo,
which can be a snapshot. This makes builds unstable when you're developing
plugins.

> This will help to some extent by already separating some things out (in 
> particular the ability to have a different place for locally built 
> plugins) - but plugins that come from the same place as other artifacts 
> will still be grouped together.

Indeed - we'd group the artifacts by their source (repo or local),
but also their role in the build - required by maven, or required by
the projects you're building. The latter can be accomplished by
the workspaces. Plugins and their dependencies needed by the build
can go in the 'cache' or 'snapshot' locations in the ~/.m2/,
but artifacts generated during build will first be put in a workspace,
and can then be merged to the snapshot location in ~/.m2/ after
the build completes. We still want to build a plugin or one of its dependencies
and be able to use it in another build.

> How would you like to see that worked out in practice - 
> pluginRepositories and build extensions stored separately? Or do you 
> think that with the other plugin lockdown/enforcer uses enough can be 
> achieved already?

Enforcer etc. will certainly help, but if you don't have a rule, or allow
snapshot versions, then you still don't know which one is used. Consider
using a snapshot plugin with a dep on a snapshot artifact. Somewhere during
the build that snapshot artifact is built, and somewhere later in the build
that plugin is used with that new snapshot artifact - it could break the build.
One way to solve this is to pre-resolve all plugins and their dependencies before
the build starts, but that only works if you use timestamped snapshots, otherwise
the artifacts can be copied over during the build. 
Separating the installation location _during the build_ can solve this and make builds
more stable. The 'install' goal would become an aggregator.

During the build, we now already have several sources for artifacts: local repo,
remote repos, and reactor. In the current scheme, reactor artifacts can come
from the local repo source and vice versa. I believe that having this separation,
even just during the build, makes the code easier, and we have more control on
where to look for what artifacts without having to worry about side effects during
the build and contamination of the artifact sources. We can then more easily say
which source gets precedence over another, and most importantly, for what type
of artifact (build or project). I imagine that we would have the precedence order
for project artifacts as [reactor (or workspace), remote repos, local repo],
and the order for build artifacts as [remote repos, local repo], so that during
one build, the artifacts available to maven for plugins and extensions remain
the same.
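
The precedence orders suggested above could be expressed as a small
lookup. This is a hypothetical sketch: Source, Role and resolve are
illustrative names, not existing Maven classes, and artifacts are
reduced to plain string keys.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class SourcePrecedenceSketch {
    enum Source { REACTOR, REMOTE_REPOS, LOCAL_REPO }

    /** PROJECT = artifacts of the projects being built; BUILD = plugins and extensions. */
    enum Role { PROJECT, BUILD }

    /** Search order per role, as proposed in the mail above. */
    static List<Source> searchOrder(Role role) {
        if (role == Role.PROJECT) {
            return Arrays.asList(Source.REACTOR, Source.REMOTE_REPOS, Source.LOCAL_REPO);
        }
        // Build artifacts never come from the reactor, so they stay stable during one build.
        return Arrays.asList(Source.REMOTE_REPOS, Source.LOCAL_REPO);
    }

    /** Returns the first source, in precedence order, that contains the artifact key. */
    static Optional<Source> resolve(Role role, String artifactKey,
                                    Map<Source, Set<String>> contents) {
        for (Source source : searchOrder(role)) {
            Set<String> available = contents.getOrDefault(source, Collections.<String>emptySet());
            if (available.contains(artifactKey)) {
                return Optional.of(source);
            }
        }
        return Optional.empty();
    }
}
```

With the same snapshot present in both the reactor and the local repo, a
project dependency would resolve to the reactor copy, while a plugin
dependency would skip the reactor and resolve to the local repo copy.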

One more thought on the workspace: if we change the install mojo to be an aggregator,
there's really no need to have separate workspaces to accomplish the above,
since we have a 'workspace' artifact source already: the reactor. 
But workspaces are handy because when you're working on several projects at once,
like maven and plexus, you can have 2 workspaces containing the snapshots for both,
if you don't want them to interfere with each other. One of the two could just use
deployed versions (like maven would only use deployed snapshots from plexus, for instance),
and the other (plexus) would use its own snapshots. Making cross-project artifacts
available to other projects would mean, if you use workspaces, that you have to deploy them.
This reduces the risk of build failures (for instance) on maven, when someone updates
plexus and maven to work together, and does not deploy plexus.

I still have a lot of thinking to do on this, but I hope more people will too, since it's
rather complex. I'm not sure what the best approach would be.

-- Kenney

> 
> Cheers,
> Brett
> 
> -- 
> Brett Porter - brett@apache.org
> Blog: http://www.devzuz.org/blogs/bporter/

Re: [PROPOSAL] Local Repository Separation

Posted by Brett Porter <br...@apache.org>.

Hi Kenney,

On 14/09/2007, at 9:15 PM, Kenney Westerhof wrote:

> Hi,
>
> I sent a mail a few days ago but it didn't make it to the list.
>
> One very important feature would be the separation of build artifacts
> (maven plugins and their dependencies), and project artifacts.
> The separation isn't clear in maven itself - repo's get mixed up,
> wrong repo's consulted; build artifacts interfering with plugin  
> artifacts.
> Having a separate directory containing information on what build  
> artifacts
> are used makes it easy to freeze a maven environment, and see what
> was used runtime.
>
> Also see [1], which you, Brett, weren't in favour of back then.  
> Perhaps
> now this can be taken into consideration with this proposal.
>
>
> [1] http://jira.codehaus.org/browse/MNG-724

I definitely like the idea of separating the different contexts of  
the build (plugin/extension) dependencies from the project  
dependencies. I think this is most important for the metadata and  
snapshots - I don't think we need duplicate copies of plexus  
container releases, etc floating around in the local repository.

This will help to some extent by already separating some things out  
(in particular the ability to have a different place for locally  
built plugins) - but plugins that come from the same place as other  
artifacts will still be grouped together.

How would you like to see that worked out in practice -
pluginRepositories and build extensions stored separately? Or do you
think that enough can already be achieved with the other plugin
lockdown/enforcer uses?

Cheers,
Brett

--
Brett Porter - brett@apache.org
Blog: http://www.devzuz.org/blogs/bporter/


Re: [PROPOSAL] Local Repository Separation

Posted by John Casey <jd...@commonjava.org>.

Here's another batch comment...sorry for the bursty communication  
style! :-)

Local Repo Separation Notes:

1. Having a strict (automatic) one-workspace-per-build approach kills  
any idea of having integration-test runs that themselves have  
predictable, isolated environments, and puts us back to using the  
plugin contortions I've been using lately. Two things are key here:  
(a) you don't want to pollute *anything* else with the artifacts  
subjected to integration-testing, and (b) you want to ensure that  
integration tests "resolve" the subject artifact accurately every  
time, regardless of what else is going on in the build. If you want  
to adopt workspaces as a transactional approach to building (which  
IMO implies having something to bulk-transfer the resulting artifacts  
out of the workspace to some common location for reuse elsewhere on  
localhost) that's one thing, but we really shouldn't limit the  
capability for creating other workspaces for different use cases  
within a single build. It'd be nice if we had a delete-on-exit  
mechanism that didn't cause problems with embedded environments;  
then, there wouldn't even be a reason to worry about the accumulation  
of "orphaned" workspaces, I suppose...but planting these sorts of  
things under the target directory would eliminate this concern to a  
large degree too.

2. Kenney brings up a good point about the reactor artifacts, and in  
addition to making the artifact-resolution process clearer it also  
virtually eliminates the need to cache resolution results. In a  
reactor build where there are interdependencies, we simply use those  
interdependencies to sort the build order...then, the local workspace  
for that build is used to resolve them first, then local repo, then  
remote repo. If the sort order is correct, and the build fails  
appropriately if one of the projects fails to build, there should be  
no caching or other magic necessary here. One slight hitch is that  
the install mojo being an aggregator may not be as simple as it  
sounds. IIRC, there are issues with the way we're approaching these  
aggregators in the build now...I think Bad Things tend to happen when  
you bind them into the lifecycle, for example...so that's something  
that would need some proving.

IMO, it would be cleaner to avoid implementing the reactor workspace  
using the existing resolution cache, since plugins don't have easy  
access to cached information or the replace semantics that IIRC  
currently exist in the MavenProject instance when coping with  
interdependent projects. It would be MUCH cleaner to use the
filesystem and a more simplistic artifact resolution mechanism, then  
optimize that iff we need to. In-memory caching can be pretty evil  
when it comes to this stuff; it's been a source of constant headache  
in things like the assembly plugin up to now.
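
As a rough illustration of the filesystem-based resolution described
above: check each repository root in order (the build's workspace first,
then the local repo) and take the first file found, with no in-memory
cache involved. The class and method names are hypothetical; only the
groupId/artifactId/version directory layout is the standard repository
layout. A real resolver would fall through to a remote download at the
end, which is not shown.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

class FilesystemResolverSketch {
    /** Relative path in the standard layout: org/example/demo/1.0/demo-1.0.jar */
    static Path layoutPath(String groupId, String artifactId, String version, String extension) {
        String fileName = artifactId + "-" + version + "." + extension;
        return Paths.get(groupId.replace('.', '/'), artifactId, version, fileName);
    }

    /** Returns the artifact file from the first root (in the given order) that has it. */
    static Optional<Path> resolve(List<Path> rootsInOrder, String groupId, String artifactId,
                                  String version, String extension) {
        Path relative = layoutPath(groupId, artifactId, version, extension);
        for (Path root : rootsInOrder) {
            Path candidate = root.resolve(relative);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Because every lookup goes to disk, an artifact installed into the
workspace by an earlier project in the reactor is visible to the next
project without any replace semantics in memory.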

-john

On Sep 19, 2007, at 8:31 PM, Brett Porter wrote:

>
> On 18/09/2007, at 10:22 PM, Kenney Westerhof wrote:
>
>> Hi,
>>
>>>> 2. Workspaces should be something you have to set consciously,  
>>>> not automatically created. This allows an integration-testing  
>>>> run (for example) to run in isolation by using a different  
>>>> workspace id, and clean up after itself when finished.
>>> Agreed.
>>
>> I think they should always be created, and after the entire build  
>> finished,
>> merged to the main tree. Look at it as build transactions.
>
> I had listed this as a separate feature that could be built on top  
> of this, under "Rolling back a reactor build" - I don't think it  
> impacts the base implementation, though.
>
> In your other mail you indicated likewise that it might need more  
> thought (as it could be avoided altogether by making the install  
> plugin an aggregator).
>
> WDYT? Do I need to make some additional changes to the proposal?
>
> Cheers,
> Brett
>
> --
> Brett Porter - brett@apache.org
> Blog: http://www.devzuz.org/blogs/bporter/
>
>

---
John Casey
Committer and PMC Member, Apache Maven
mail: jdcasey at commonjava dot org
blog: http://www.ejlife.net/blogs/john

Re: [PROPOSAL] Local Repository Separation

Posted by Brett Porter <br...@apache.org>.

On 18/09/2007, at 10:22 PM, Kenney Westerhof wrote:

> Hi,
>
>>> 2. Workspaces should be something you have to set consciously,  
>>> not automatically created. This allows an integration-testing run  
>>> (for example) to run in isolation by using a different workspace  
>>> id, and clean up after itself when finished.
>> Agreed.
>
> I think they should always be created, and after the entire build  
> finished,
> merged to the main tree. Look at it as build transactions.

I had listed this as a separate feature that could be built on top of  
this, under "Rolling back a reactor build" - I don't think it impacts  
the base implementation, though.

In your other mail you indicated likewise that it might need more  
thought (as it could be avoided altogether by making the install  
plugin an aggregator).

WDYT? Do I need to make some additional changes to the proposal?

Cheers,
Brett

--
Brett Porter - brett@apache.org
Blog: http://www.devzuz.org/blogs/bporter/


Re: [PROPOSAL] Local Repository Separation

Posted by Christian Gruber <ch...@gmail.com>.

Fair enough.  Sounds good to me. -cg.

On 18-Sep-07, at 8:47 AM, Kenney Westerhof wrote:

> If a CI only builds 1 project (-N, no reactor), then a per-build
> workspace isn't needed. But a per reactor-build local workspace
> as a default, when there are multiple projects in the reactor, as a  
> default,
> seems useful to me.



Re: [PROPOSAL] Local Repository Separation

Posted by Kenney Westerhof <ke...@apache.org>.

Christian Gruber wrote:
> Hmm.  I'm liking how this is all shaping up, but I'm wondering about the 
> "in the top level pom" bit.  Already some things we do assuming a 
> reactor build are confusing because each project is built separately, 
> and when you start getting into continuum which essentially replaces the 
> reactor with its own per-sub-project build infrastructure to allow only 
> parts of the build that are relevant/changed to be executed, and any 
> assumptions of parent location are off.  I think causing this to go 
> project at a time, and have workspace_repo per subproject would probably 
> be more supportive of the overall rather independent nature of maven sub 
> projects.

Of course. If you build a project standalone in a CI, you already have a special
environment: it is assumed that you only checked out that subproject, and
that other artifacts are present in remote repo's, or in the local repo.
The latter however assumes that if the dependencies are not deployed yet,
someone did a local build before you build this project. This doesn't work
well for humans doing builds, but for a CI taking this type of control
in its own 'hands', the local repo also reflects what has been built.

So yes, having 'global' workspaces is definitely useful for CI systems
building different sets of projects (from different SCM systems perhaps).

If a CI only builds 1 project (-N, no reactor), then a per-build
workspace isn't needed. But a per reactor-build local workspace
as a default, when there are multiple projects in the reactor,
seems useful to me.

-- Kenney

> 
> Christian.
> 
> On 18-Sep-07, at 8:22 AM, Kenney Westerhof wrote:
> 
>>
>> Btw, we don't necessarily require the workspace repo to be present
>> in the ~/.m2/ directory. It could just aswell be target/workspace-repo/
>> in the top level pom.
> 
> 

Re: [PROPOSAL] Local Repository Separation

Posted by Christian Gruber <ch...@gmail.com>.

Hmm.  I'm liking how this is all shaping up, but I'm wondering about
the "in the top level pom" bit.  Already some things we do assuming a
reactor build are confusing because each project is built separately.
And when you start getting into Continuum, which essentially replaces
the reactor with its own per-sub-project build infrastructure to allow
only the parts of the build that are relevant/changed to be executed,
any assumptions of parent location are off.  I think causing this to
go project at a time, and having a workspace_repo per subproject, would
probably be more supportive of the overall rather independent nature
of maven sub projects.

Christian.

On 18-Sep-07, at 8:22 AM, Kenney Westerhof wrote:

>
> Btw, we don't necessarily require the workspace repo to be present
> in the ~/.m2/ directory. It could just as well be
> target/workspace-repo/ in the top level pom.


Re: [PROPOSAL] Local Repository Separation

Posted by Kenney Westerhof <ke...@apache.org>.

Hi,

>> 2. Workspaces should be something you have to set consciously, not 
>> automatically created. This allows an integration-testing run (for 
>> example) to run in isolation by using a different workspace id, and 
>> clean up after itself when finished.
> 
> Agreed.

I think they should always be created, and after the entire build has finished,
merged to the main tree. Look at it as build transactions. When something
fails in mid-build, you can end up with a few artifacts referring to wrong
versions. For instance, take a 2 artifact snapshot build where the 2nd artifact
has a dependency on the first: the first artifact is installed and the 2nd fails.
If there were already snapshots present, then you have one that's newer
(with possible code changes that only work with the newer 2nd artifact).
Other projects depending on the 2nd artifact will get the old code and
the newer first artifact, which will break builds.
When deploying comes into play, this will affect more than just local users.

For CI systems or multi-user builds having these workspaces will reduce
interference with other builds, as was said before.
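
The "build transaction" idea above could be sketched like this: artifacts
are installed into a per-build workspace directory, which is merged into
the main tree only if the whole build succeeded, and discarded otherwise.
All names here are hypothetical, and the merge is a plain file-by-file
copy, so it is not atomic if the process dies in the middle of the merge.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WorkspaceTransactionSketch {
    /** Commits (merges) or rolls back (discards) a per-build workspace. */
    static void finish(Path workspace, Path mainTree, boolean buildSucceeded) {
        try {
            if (buildSucceeded) {
                for (Path source : listTree(workspace)) {
                    if (Files.isRegularFile(source)) {
                        Path target = mainTree.resolve(workspace.relativize(source));
                        Files.createDirectories(target.getParent());
                        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
            // In both cases the workspace is removed, deepest entries first.
            List<Path> entries = listTree(workspace);
            entries.sort(Comparator.reverseOrder());
            for (Path entry : entries) {
                Files.delete(entry);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lists the workspace directory itself plus everything below it. */
    static List<Path> listTree(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.collect(Collectors.toCollection(ArrayList::new));
        }
    }
}
```

In the 2 artifact example, a failure of the 2nd artifact would discard
the already installed 1st artifact too, so the main tree keeps the old,
mutually consistent pair of snapshots.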

Btw, we don't necessarily require the workspace repo to be present
in the ~/.m2/ directory. It could just as well be target/workspace-repo/
in the top level pom.

-- Kenney
