Posted to dev@maven.apache.org by Rafal Krzewski <Ra...@caltha.pl> on 2003/03/06 09:24:15 UTC

distributed repository - DNS SRV vs HTTP + XML

Hello world!

I really like Michael's idea of associating repository information with
internet domain names. On the other hand, I second Brian's opinion that
tweaking the DNS records for their domains may be hard or even
impossible for many interested parties (projects). An alternative to
using DNS would be to treat the <domain> entry proposed by Michael as
part of the URL of an XML file that would be downloaded during the
dependency resolution process. The <domain> entry could contain a host
name, or even a host name with an optional path designating a directory
on the HTTP server. That would make things even easier for organizations
wishing to distribute their projects, in case they are able neither to
put files into the root directory of their domain's website nor to
create virtual HTTP servers.
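
To make the mapping concrete, here is a minimal sketch of how a <domain>
entry could be turned into the URL of the XML file. The function name is
hypothetical, and both the plain "http://" scheme and the file name
repository-info.xml (taken from the example later in this message) are
assumptions rather than a settled convention.

```python
def repository_info_url(domain: str, filename: str = "repository-info.xml") -> str:
    """Build the descriptor URL from a <domain> entry.

    `domain` is a host name, optionally followed by a path,
    for example "www.example.org/projects/foo".
    """
    # Assumption: plain HTTP, with the descriptor stored directly under the
    # host (or under the optional path). Stray slashes are trimmed first.
    return "http://" + domain.strip("/") + "/" + filename


print(repository_info_url("labeo.sourceforge.net"))
print(repository_info_url("www.example.org/projects/foo/"))
```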

Example:

Consider the following dependency declaration:

<dependency>
  <artifactId>labeo-base</artifactId>
  <groupId>labeo.sourceforge.net</groupId>
</dependency>

While processing the dependency, the resolver discovers that no <domain>
is defined, so it is assumed to be equal to <groupId>. It then attempts
to download http://labeo.sourceforge.net/repository-info.xml. Suppose it
finds the following content:

<?xml version="1.0" encoding="ISO-8859-1"?>
<repository-info>
  <repository>
    <priority>0</priority>
    <weight>0</weight>
    <url>http://labeo.sourceforge.net/maven</url>
  </repository>
</repository-info>

This is the simple case: the project is self-hosting and has only one
server. A more complex scenario:

<?xml version="1.0" encoding="ISO-8859-1"?>
<repository-info>
  <repository>
    <priority>0</priority>
    <weight>70</weight>
    <url>http://www1.caltha.pl/labeo/</url>
  </repository>
  <repository>
    <priority>0</priority>
    <weight>30</weight>
    <url>http://www2.caltha.pl/labeo/</url>
  </repository>
  <repository>
    <priority>1</priority>
    <weight>0</weight>
    <url>http://ibiblio.org/maven</url>
  </repository>
</repository-info>

Caltha.pl provides hosting for the project on two servers, but www1 has
better connectivity, and the provider wants it to handle roughly 70%
of the traffic. The project is also backed up on ibiblio, but
submissions to that repository take some time, so new versions might
be missing. Therefore ibiblio should be contacted only when both
Caltha.pl servers are unavailable.
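
As a rough illustration, a descriptor like this could be read with a few
lines of code. This is only a sketch: the Repository record and the
function name are hypothetical, the element is assumed to be spelled
<weight> in both its opening and closing tags, and missing <priority> or
<weight> elements are assumed to default to 0.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Repository:
    priority: int
    weight: int
    url: str


def parse_repository_info(xml_bytes: bytes) -> List[Repository]:
    """Parse a repository-info.xml document into Repository records."""
    root = ET.fromstring(xml_bytes)
    repos = []
    for node in root.findall("repository"):
        repos.append(
            Repository(
                # Assumption: absent elements default to 0 / empty string.
                priority=int(node.findtext("priority", default="0")),
                weight=int(node.findtext("weight", default="0")),
                url=node.findtext("url", default="").strip(),
            )
        )
    return repos


SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<repository-info>
  <repository>
    <priority>0</priority>
    <weight>70</weight>
    <url>http://www1.caltha.pl/labeo/</url>
  </repository>
  <repository>
    <priority>0</priority>
    <weight>30</weight>
    <url>http://www2.caltha.pl/labeo/</url>
  </repository>
  <repository>
    <priority>1</priority>
    <weight>0</weight>
    <url>http://ibiblio.org/maven</url>
  </repository>
</repository-info>
"""

for repo in parse_repository_info(SAMPLE):
    print(repo.priority, repo.weight, repo.url)
```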

Having this information, the dependency resolver chooses a server: it
starts with the lowest priority value, selects a server at random using
the weights, selects another server at random if that one is down, and
advances to the next priority value when all servers with the current
priority are down.
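
The selection procedure could be sketched roughly as follows. The
function name and the is_up callback are hypothetical, and the handling
of zero weights (when every remaining candidate has weight 0, pick
uniformly) is my assumption, since the description above does not say
what a weight of 0 means.

```python
import random
from typing import Callable, Iterable, Optional, Tuple

Repo = Tuple[int, int, str]  # (priority, weight, url)


def pick_repository(
    repos: Iterable[Repo],
    is_up: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return the URL of the first reachable repository, or None."""
    if rng is None:
        rng = random.Random()

    # Group the repositories by priority value.
    by_priority = {}
    for priority, weight, url in repos:
        by_priority.setdefault(priority, []).append((weight, url))

    # Lowest priority value first; move on only when a whole group is down.
    for priority in sorted(by_priority):
        candidates = list(by_priority[priority])
        while candidates:
            weights = [w for w, _ in candidates]
            if sum(weights) == 0:
                # Assumption: all-zero weights mean "no preference".
                choice = rng.choice(candidates)
            else:
                choice = rng.choices(candidates, weights=weights, k=1)[0]
            if is_up(choice[1]):
                return choice[1]
            candidates.remove(choice)  # do not retry a server that is down
    return None
```

With the three repositories from the example, ibiblio is returned only
when is_up reports both Caltha.pl servers as down.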

The resolver might attempt to download a 'repository-layout.xml' file
as a test that the repository is up and running, and use its contents
for the forthcoming lookup of the artifact, or it could proceed right
from there with the current artifact lookup scheme.

If none of the repositories defined in the domain's repository-info.xml
are available, or the repository-info.xml is missing, the resolver would
try to download the requested artifact from the repositories defined
in the maven.repo.remote property. This gives us full backward
compatibility.
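
A small sketch of that fallback rule, under stated assumptions: the
function name and the is_up callback are hypothetical, a missing
repository-info.xml is represented by None, and maven.repo.remote is
assumed to hold a comma-separated list of repository URLs.

```python
from typing import Callable, List, Optional


def candidate_repositories(
    discovered: Optional[List[str]],
    maven_repo_remote: str,
    is_up: Callable[[str], bool],
) -> List[str]:
    """Return the repositories to try, falling back to maven.repo.remote."""
    # `discovered` is None when repository-info.xml could not be downloaded.
    live = [url for url in (discovered or []) if is_up(url)]
    if live:
        return live
    # Assumption: maven.repo.remote is a comma-separated list of URLs.
    return [u.strip() for u in maven_repo_remote.split(",") if u.strip()]
```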

R.

Re: distributed repository - DNS SRV vs HTTP + XML

Posted by Jason van Zyl <ja...@zenplex.com>.

On Fri, 2003-03-07 at 03:53, Rafal Krzewski wrote:
> Stanley,Michael P. wrote:
> > Actually a combination of the two isn't out of the question either.  The
> > HTTP+XML will provide a way to map to a repository.  But in reality, you
> > are just re-writing the DNS SRV protocol.
> 
> Rewrite? I'm just trying to reuse the design, implementing it on a
> different medium: one that fits better with the current Maven
> infrastructure, has a lower entry cost (no need to shop for a wire
> protocol implementation), and makes the feature available to a (far)
> wider audience.
> 
> > It is also possible to set it up for them to coexist.  Just add the
> > HTTP+XML method to the search algorithm.  
> 
> Sure enough. Both implementations would adhere to a single interface
> used by the dependency resolver, and could be plugged in or out as needed.
> 
> > local machine -> remote(intranet) -> DNS SRV discovered -> HTTP+XML
> > discovered
> > 
> > The search algorithm should be easily configurable and extendible.
> > It should allow adding more known hosts (local, intranet, mirror,
> > remote) and decentralized (unknown) hosts.
> 
> We need a formal specification of the Maven dependency resolution scheme.
> An xdoc that defines what a 'dependency' and a 'repository' are, and
> describes the resolution procedure in detail, would be most useful.

Yah, I'm trying to brain dump as fast as possible, but I'm caught up in
the classworlds update, which will end shortly.

> R.
-- 
jvz.

Jason van Zyl
jason@zenplex.com
http://tambora.zenplex.org

In short, man creates for himself a new religion of a rational
and technical order to justify his work and to be justified in it.
  
  -- Jacques Ellul, The Technological Society

Re: distributed repository - DNS SRV vs HTTP + XML

Posted by Rafal Krzewski <Ra...@caltha.pl>.

Stanley,Michael P. wrote:
> Actually a combination of the two isn't out of the question either.  The
> HTTP+XML will provide a way to map to a repository.  But in reality, you
> are just re-writing the DNS SRV protocol.

Rewrite? I'm just trying to reuse the design, implementing it on a
different medium: one that fits better with the current Maven
infrastructure, has a lower entry cost (no need to shop for a wire
protocol implementation), and makes the feature available to a (far)
wider audience.

> It is also possible to set it up for them to coexist.  Just add the
> HTTP+XML method to the search algorithm.  

Sure enough. Both implementations would adhere to a single interface
used by the dependency resolver, and could be plugged in or out as needed.
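
To sketch what such a single interface might look like: each discovery
method is modelled here as a function from a group id to a list of
repository URLs, and the resolver walks a configurable chain of them.
All names below are hypothetical, and the two locators are stand-ins
that return canned answers rather than real DNS SRV or HTTP lookups.

```python
from typing import Callable, List, Sequence

# A locator maps a group id to repository URLs (empty list: nothing found).
Locator = Callable[[str], List[str]]


def locate(group_id: str, chain: Sequence[Locator]) -> List[str]:
    """Ask each locator in order and return the first non-empty answer."""
    for locator in chain:
        urls = locator(group_id)
        if urls:
            return urls
    return []


def dns_srv_locator(group_id: str) -> List[str]:
    # Stand-in: pretend the domain publishes no SRV records.
    return []


def http_xml_locator(group_id: str) -> List[str]:
    # Stand-in: pretend repository-info.xml listed a single repository.
    return ["http://" + group_id + "/maven"]


print(locate("labeo.sourceforge.net", [dns_srv_locator, http_xml_locator]))
```

Reordering or extending the chain is then a matter of configuration
rather than a code change in the resolver.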

> local machine -> remote(intranet) -> DNS SRV discovered -> HTTP+XML
> discovered
> 
> The search algorithm should be easily configurable and extendible.
> It should allow adding more known hosts (local, intranet, mirror,
> remote) and decentralized (unknown) hosts.

We need a formal specification of the Maven dependency resolution scheme.
An xdoc that defines what a 'dependency' and a 'repository' are, and
describes the resolution procedure in detail, would be most useful.

R.

RE: distributed repository - DNS SRV vs HTTP + XML

Posted by "Stanley,Michael P." <ms...@mitre.org>.

Actually a combination of the two isn't out of the question either.  The
HTTP+XML will provide a way to map to a repository.  But in reality, you
are just re-writing the DNS SRV protocol.

It is also possible to set it up for them to coexist.  Just add the
HTTP+XML method to the search algorithm.  

local machine -> remote(intranet) -> DNS SRV discovered -> HTTP+XML
discovered

The search algorithm should be easily configurable and extendible.
It should allow adding more known hosts (local, intranet, mirror,
remote) and decentralized (unknown) hosts.

- Mike
