Posted to infrastructure-dev@apache.org by Jukka Zitting <ju...@gmail.com> on 2008/07/12 14:24:57 UTC

[scm] Making a public git mirror of an Apache project

Hi,

Currently we only enable git-svn use for committers (authenticated
https on svn.eu), so I looked at how to make a git mirror that could
be used by external contributors.

Here's an example using repo.or.cz for hosting and Apache Tika for the
code. Note that this example only covers mirroring of the project
trunk; I haven't yet figured out how to include tags and branches.

0) Create a public git repository using http://repo.or.cz/m/regproj.cgi.

1) Create a git-svn clone of the project you're interested in:

    $ mkdir /path/to/git/repository
    $ cd /path/to/git/repository
    $ git svn init https://svn.eu.apache.org/repos/asf/incubator/tika/trunk
    $ git svn fetch --log-window-size=10000 --authors-file=/path/to/authors.txt

The log-window-size option is used to speed up the initial import and
reduce load on the svn server.

The (optional) authors-file option can be used to include the names
and @apache.org email addresses of the committers in the git-svn
commit messages. See http://people.apache.org/~jukka/authors.txt for
an example authors.txt file I prepared based on information in
http://people.apache.org/~jim/committers.html.
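
For reference, git-svn expects each line of the authors file to map an
svn login to a name and email address in the form
"login = Full Name <email>". The following is only a minimal sketch of
writing and sanity-checking such a file. The two logins are made-up
placeholders (not real committers) and check_authors_file is a
hypothetical helper, not part of git-svn.

```shell
#!/bin/sh
# Write a tiny authors file. The logins and names below are placeholders
# (assumptions for illustration), not real committer ids.
cat > authors.txt <<'EOF'
jdoe = Jane Doe <jdoe@apache.org>
rroe = Richard Roe <rroe@apache.org>
EOF

# Hypothetical helper: print every line that does not look like
# "login = Name <email>" and fail if any such line exists.
check_authors_file() {
    # grep -v exits 0 when it printed at least one non-matching line
    if grep -v -E '^[^ =]+ = .+ <[^<>@ ]+@[^<>@ ]+>$' "$1"; then
        echo "malformed lines found in $1" >&2
        return 1
    fi
    return 0
}

check_authors_file authors.txt && echo "authors.txt looks well-formed"
```

The pattern is deliberately strict about the " = " separator, since a
line that git-svn cannot parse would only surface as an error during
the fetch.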

2) Push the clone to a public repository:

    $ git push git+ssh://repo.or.cz/srv/git/tika

3) Automate periodic updates of the mirror:

    $ crontab -l
    48 5 * * * /path/to/update.sh

    $ cat /path/to/update.sh
    #!/bin/sh
    cd /path/to/git/repository
    git svn fetch --log-window-size=10000 --authors-file=/path/to/authors.txt
    git push git+ssh://repo.or.cz/srv/git/tika
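
One thing the cron setup above does not guard against is a fetch that
runs longer than the cron interval, so that two updates overlap. A
possible sketch of a guard, assuming a lock directory is acceptable:
run_locked and the LOCK_DIR path are hypothetical names chosen for this
example.

```shell
#!/bin/sh
# Sketch: run a command only if no other run holds the lock.
# LOCK_DIR is an assumed location; pick any path writable by the cron user.
LOCK_DIR="${LOCK_DIR:-/tmp/git-mirror-update.lock}"

run_locked() {
    # mkdir is atomic, so only one process can create the directory
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "another update still seems to be running" >&2
        return 1
    fi
    "$@"
    status=$?
    # release the lock and pass on the exit status of the command
    rmdir "$LOCK_DIR"
    return $status
}
```

With the function saved in a wrapper script, the crontab entry would
call the wrapper, which in turn runs "run_locked /path/to/update.sh".
If a run is killed, the lock directory has to be removed by hand.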

4) You're done. The mirror is available at http://repo.or.cz/w/tika.git.

Git users can then track the project for example by cloning it with
"git clone git://repo.or.cz/tika.git". That clone is similar to an
anonymous svn checkout of the trunk.

I haven't yet announced the Tika mirror anywhere (except now on this
list). How do people feel about git mirroring and using generic
hosting services like repo.or.cz for doing that?

BR,

Jukka Zitting

Re: [scm] Making a public git mirror of an Apache project

Posted by Matthieu Riou <ma...@offthelip.org>.
On Mon, Oct 13, 2008 at 2:47 AM, Jukka Zitting <ju...@gmail.com> wrote:

> Hi,
>
> On Fri, Oct 10, 2008 at 4:51 PM, Jukka Zitting <ju...@gmail.com> wrote:
> > On Mon, Sep 22, 2008 at 5:46 AM, Matthieu Riou <ma...@offthelip.org> wrote:
> >> Ugly but effective.
> >
> > Cool! I'll give it a try this weekend.
>
> It worked without problems. The ODE mirror now contains all tags and
> branches.
>

Sweet! Thanks Jukka!


>
> BR,
>
> Jukka Zitting
>

Re: [scm] Making a public git mirror of an Apache project

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Fri, Oct 10, 2008 at 4:51 PM, Jukka Zitting <ju...@gmail.com> wrote:
> On Mon, Sep 22, 2008 at 5:46 AM, Matthieu Riou <ma...@offthelip.org> wrote:
>> Ugly but effective.
>
> Cool! I'll give it a try this weekend.

It worked without problems. The ODE mirror now contains all tags and branches.

BR,

Jukka Zitting

Re: [scm] Making a public git mirror of an Apache project

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Mon, Sep 22, 2008 at 5:46 AM, Matthieu Riou <ma...@offthelip.org> wrote:
> So I poked around the git-svn code which was a good occasion to get back to
> perl. The code is a bit dreadful actually. Turns out git-svn fails when it
> uses the do_switch svn function to follow the parent. If you force it to use
> do_update instead, it works nicely. So right now I've deactivated it just by
> hacking git-svn directly with the following patch:
>
> diff --git a/git-svn.perl b/git-svn.perl
> index af8279a..12246eb 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -4274,7 +4274,8 @@ sub can_do_switch {
>                }
>                $pool->clear;
>        }
> -       $can_do_switch;
> +  #$can_do_switch;
> +  ""
>  }
>
> Ugly but effective.

Cool! I'll give it a try this weekend.

BR,

Jukka Zitting

Re: [scm] Making a public git mirror of an Apache project

Posted by Matthieu Riou <ma...@offthelip.org>.
On Sat, Sep 20, 2008 at 12:57 PM, Jukka Zitting <ju...@gmail.com> wrote:

> Hi,
>
> On Fri, Sep 12, 2008 at 8:55 PM, Matthieu Riou <ma...@offthelip.org> wrote:
> > Just as a follow-up, I've reproduced this (although it took more than 30mn,
> > your server must be fast :) ) and raised the issue to the git mailing list.
>
> It felt like 30 mins, but I was hacking at the time so it could just
> as well have been two hours. :-)
>
> There was a recent thread on the git mailing list about adding tags
> and branches to a git-svn clone that was originally created using
> sources from just the trunk. It seemed easy enough (and doesn't break
> the trunk history), so I now created a git mirror for just the Apache
> ODE trunk. I'll add tags and branches to the mirror once the issue
> with git-svn not understanding the bart branch is solved.
>

Thanks a lot. You might have a chance to check whether that method to add
tags and branches works :)

So I poked around the git-svn code which was a good occasion to get back to
perl. The code is a bit dreadful actually. Turns out git-svn fails when it
uses the do_switch svn function to follow the parent. If you force it to use
do_update instead, it works nicely. So right now I've deactivated it just by
hacking git-svn directly with the following patch:

diff --git a/git-svn.perl b/git-svn.perl
index af8279a..12246eb 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -4274,7 +4274,8 @@ sub can_do_switch {
                }
                $pool->clear;
        }
-       $can_do_switch;
+  #$can_do_switch;
+  ""
 }

Ugly but effective. I'm going to clean it up to add that as a command line
option and propose the patch to the git-svn guys, we'll see how it goes.
With luck, someone more knowledgeable will figure out why exactly do_switch
fails. Anyway if others run into this at least we'll have a workaround.

Btw if you need some help with your repository, feel free to ask.

Thanks again,
Matthieu


>
> BR,
>
> Jukka Zitting
>

Re: [scm] Making a public git mirror of an Apache project

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Fri, Sep 12, 2008 at 8:55 PM, Matthieu Riou <ma...@offthelip.org> wrote:
> Just as a follow-up, I've reproduced this (although it took more than 30mn,
> your server must be fast :) ) and raised the issue to the git mailing list.

It felt like 30 mins, but I was hacking at the time so it could just
as well have been two hours. :-)

There was a recent thread on the git mailing list about adding tags
and branches to a git-svn clone that was originally created using
sources from just the trunk. It seemed easy enough (and doesn't break
the trunk history), so I now created a git mirror for just the Apache
ODE trunk. I'll add tags and branches to the mirror once the issue
with git-svn not understanding the bart branch is solved.

BR,

Jukka Zitting

Re: [scm] Making a public git mirror of an Apache project

Posted by Matthieu Riou <ma...@offthelip.org>.
On Thu, Sep 11, 2008 at 8:49 AM, Jukka Zitting <ju...@gmail.com> wrote:

> Hi,
>
> On Thu, Sep 11, 2008 at 6:28 PM, Matthieu Riou <ma...@offthelip.org> wrote:
> > So did the export end up working or is git-svn still confused with the
> > 563283:563286 changesets? If you give me the error that git-svn reports, I
> > could report a bug to them. I can also try the export if you want and zip it
> > if/when it ends up working, I don't want this to get too time consuming for
> > you.
>
> I didn't keep the exact log, but right after dealing with that set of
> changes it fails with an error about trying to remove a file (I think
> it was jbi.rake) that doesn't exist in the currently checked out
> revision.
>
> You can recreate the error like this:
>
>    export GIT_DIR=ode.git
>    git svn init -s https://svn.eu.apache.org/repos/asf/ode/
>    git svn fetch --log-window-size=10000
>
> The process takes a while (like 30 mins) until it fails.
>

Just as a follow-up, I've reproduced this (although it took more than 30mn,
your server must be fast :) ) and raised the issue to the git mailing list.
No answer so far (I posted yesterday) but let's wait and see. I might end up
patching git-svn to add some sort of ignore when the branch parent detection
ends up being wrong.

Thanks,
Matthieu


>
> BR,
>
> Jukka Zitting
>

Re: [scm] Making a public git mirror of an Apache project

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Sep 11, 2008 at 6:28 PM, Matthieu Riou <ma...@offthelip.org> wrote:
> So did the export end up working or is git-svn still confused with the
> 563283:563286 changesets? If you give me the error that git-svn reports, I
> could report a bug to them. I can also try the export if you want and zip it
> if/when it ends up working, I don't want this to get too time consuming for
> you.

I didn't keep the exact log, but right after dealing with that set of
changes it fails with an error about trying to remove a file (I think
it was jbi.rake) that doesn't exist in the currently checked out
revision.

You can recreate the error like this:

    export GIT_DIR=ode.git
    git svn init -s https://svn.eu.apache.org/repos/asf/ode/
    git svn fetch --log-window-size=10000

The process takes a while (like 30 mins) until it fails.
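
Since the exact error message was not kept, a small wrapper that saves
the complete output of a long-running command may help when reproducing
this. This is only a sketch: run_logged and the log file name in the
usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: run a command, keep its full output in a log file, show the
# last lines, and preserve the command's exit status.
run_logged() {
    log_file=$1
    shift
    # capture both stdout and stderr of the wrapped command
    "$@" > "$log_file" 2>&1
    status=$?
    # the tail usually contains the failure when the command aborts
    tail -n 20 "$log_file"
    return $status
}
```

For the reproduction above this could be called as
"run_logged ode-fetch.log git svn fetch --log-window-size=10000", with
GIT_DIR exported as before.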

BR,

Jukka Zitting

Re: [scm] Making a public git mirror of an Apache project

Posted by Matthieu Riou <ma...@offthelip.org>.
On Thu, Sep 11, 2008 at 3:17 AM, Jukka Zitting <ju...@gmail.com> wrote:

> Hi,
>
> On Thu, Sep 11, 2008 at 1:05 PM, Jukka Zitting <ju...@gmail.com> wrote:
> > or perhaps the strange-looking revision 560672:
> >
> > $ svn log -r 560672 --verbose https://svn.apache.org/repos/asf/
> > ------------------------------------------------------------------------
> > r560672 | (no author) | (no date) | 1 line
> >
> >
> > ------------------------------------------------------------------------
>
> Never mind that, I figured it out. The revision is a normal change to
> the read-protected infrastructure section of the asf repository. The
> revision number showed up in the git-svn log just as the source
> revision number of the TLP move in revision 560673:
>
> $ svn log -r 560673 --verbose https://svn.apache.org/repos/asf/
> ------------------------------------------------------------------------
> r560673 | bayard | 2007-07-29 07:20:14 +0300 (Sun, 29 Jul 2007) | 1 line
> Changed paths:
>   D /incubator/ode
>   A /ode (from /incubator/ode:560672)
>
> Moving to TLP
> ------------------------------------------------------------------------
>

So did the export end up working or is git-svn still confused with the
563283:563286 changesets? If you give me the error that git-svn reports, I
could report a bug to them. I can also try the export if you want and zip it
if/when it ends up working, I don't want this to get too time consuming for
you.

Thanks for looking into it!
Matthieu


>
> BR,
>
> Jukka Zitting
>

Re: [scm] Making a public git mirror of an Apache project

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Sep 11, 2008 at 1:05 PM, Jukka Zitting <ju...@gmail.com> wrote:
> or perhaps the strange-looking revision 560672:
>
> $ svn log -r 560672 --verbose https://svn.apache.org/repos/asf/
> ------------------------------------------------------------------------
> r560672 | (no author) | (no date) | 1 line
>
>
> ------------------------------------------------------------------------

Never mind that, I figured it out. The revision is a normal change to
the read-protected infrastructure section of the asf repository. The
revision number showed up in the git-svn log just as the source
revision number of the TLP move in revision 560673:

$ svn log -r 560673 --verbose https://svn.apache.org/repos/asf/
------------------------------------------------------------------------
r560673 | bayard | 2007-07-29 07:20:14 +0300 (Sun, 29 Jul 2007) | 1 line
Changed paths:
   D /incubator/ode
   A /ode (from /incubator/ode:560672)

Moving to TLP
------------------------------------------------------------------------

BR,

Jukka Zitting

Re: [scm] Making a public git mirror of an Apache project

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Wed, Sep 10, 2008 at 6:54 PM, Matthieu Riou <ma...@offthelip.org> wrote:
> Sorry I'm late to the party. Could you also mirror ODE (
> http://svn.apache.org/repos/asf/ode/)? That would be really appreciated.

I tried adding ODE using my normal full mirror settings (i.e. pulling
all tags and branches), but git-svn seems to get confused (well, I do
too!) by the relationship between the bart branch and the tasks folder
in trunk:

$ svn log -r 563283:563286 --verbose https://svn.apache.org/repos/asf/
------------------------------------------------------------------------
r563283 | mszefler | 2007-08-07 00:19:24 +0300 (Tue, 07 Aug 2007) | 2 lines
Changed paths:
   D /ode/branches/bart

Remove.

------------------------------------------------------------------------
r563284 | mszefler | 2007-08-07 00:19:32 +0300 (Tue, 07 Aug 2007) | 2 lines
Changed paths:
   A /ode/branches/bart (from /ode/trunk/tasks:563283)
   D /ode/trunk/tasks

Moved.

------------------------------------------------------------------------
r563285 | mszefler | 2007-08-07 00:20:11 +0300 (Tue, 07 Aug 2007) | 2 lines
Changed paths:
   A /ode/trunk/tasks (from /ode/branches/bart:563284)

copyied back.

------------------------------------------------------------------------
r563286 | mszefler | 2007-08-07 00:21:57 +0300 (Tue, 07 Aug 2007) | 2 lines
Changed paths:
   A /ode/branches/bart/tasks (from /ode/trunk/tasks:563285)

copied.

------------------------------------------------------------------------

or perhaps the strange-looking revision 560672:

$ svn log -r 560672 --verbose https://svn.apache.org/repos/asf/
------------------------------------------------------------------------
r560672 | (no author) | (no date) | 1 line


------------------------------------------------------------------------

I'm not sure how to proceed. I could do a simplified export with
--no-follow-parent, but that would make ODE a special case and less
feature-rich than the other mirrors I'm managing. I'd rather get this
issue sorted properly, but that probably requires some git-svn hacking
and possibly some more details about what happened in revision 560672.
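
When digging through history like this, it can help to list only the
copy operations, since those are what git-svn follows when it looks for
a branch parent. The filter below is a rough sketch that reads
"svn log --verbose" output on stdin; list_copies is a hypothetical
name, and paths containing spaces are not handled.

```shell
#!/bin/sh
# Sketch: print "revision target <- source:rev" for every path that was
# added as a copy, based on the "A path (from source:rev)" lines.
list_copies() {
    awk '
        # remember the revision of the current log entry
        /^r[0-9]+ \|/ { rev = $1 }
        # an added path with copy-from information
        $1 == "A" && $3 == "(from" {
            src = $4
            sub(/\)$/, "", src)
            print rev, $2, "<-", src
        }
    '
}
```

It would be used as
"svn log -r 563283:563286 --verbose https://svn.apache.org/repos/asf/ | list_copies",
which for r563284 prints
"r563284 /ode/branches/bart <- /ode/trunk/tasks:563283".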

BR,

Jukka Zitting

Re: [scm] Making a public git mirror of an Apache project

Posted by Matthieu Riou <ma...@offthelip.org>.
Hi Jukka,

On Sun, Jul 20, 2008 at 8:57 AM, Jukka Zitting <ju...@gmail.com> wrote:
> Ping me if you're interested in seeing other Apache codebases mirrored.
>
>
Sorry I'm late to the party. Could you also mirror ODE (
http://svn.apache.org/repos/asf/ode/)? That would be really appreciated.

FWIW, recently I've been using a personal mirror on github that gets
synchronized frequently with ODE trunk. I've only exported a partial history
to avoid putting stress on Apache SVN. The fun part is a Ruby script that
has been written by a Buildr contributor (they use the same scheme) that
does all the heavy lifting. To get started, you just do:

ruby -e 'require "open-uri";eval(open("http://github.com/matthieu/apache-ode/tree/master%2Fode-git.rb?raw=true").read)'

This gets the script from the repo and executes (I know it's a potential
security disaster but I do read scripts before I execute them). This
automatically checks out the mirror for you and sets up a few Git aliases
like 'apache-pull' or 'apache-push' in your new repo. The most interesting
one I guess is 'synchronize':

[alias]
  apache-fetch = !git-svn fetch apache/trunk
  apache-merge = !git merge apache/trunk
  apache-pull = !git apache-fetch && git apache-merge
  apache-push = !git-svn dcommit --username mriou
  get = !git apache-fetch && git fetch origin
  mrg = !git apache-merge && git merge origin
  rbs = !git rebase --onto apache/trunk origin/master master
  put = !git apache-push && git push origin
  synchronize = !git get && git rbs && git put

At the end of the checkout it outputs a few guidelines on how to use these
commands and details the setup a bit. It's pretty damn brilliant (I can
insist, I'm not the author).
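
For anyone who prefers not to evaluate a downloaded script, aliases
like these can also be defined by hand with "git config". The sketch
below only covers the three read-only aliases from the list above;
install_apache_aliases is a hypothetical name, and the alias bodies are
copied as posted (on installations where the dashed "git-svn" command
is not on the PATH, "git svn" would be needed instead).

```shell
#!/bin/sh
# Sketch: define the fetch/merge/pull aliases in the current repository.
install_apache_aliases() {
    # a leading "!" makes git run the alias as a shell command
    git config alias.apache-fetch '!git-svn fetch apache/trunk'
    git config alias.apache-merge '!git merge apache/trunk'
    git config alias.apache-pull '!git apache-fetch && git apache-merge'
}
```

Run inside a clone, this writes the entries to the repository's own
.git/config, so other repositories are not affected.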

Anyway I'm just mentioning it in case there's something you (or someone
else) could reuse in the script. It seems that your setup is working pretty
well though so you probably don't need much of it.

Thanks for your work!
Matthieu

Re: [scm] Making a public git mirror of an Apache project

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Sun, Jul 20, 2008 at 8:57 AM, Jukka Zitting <ju...@gmail.com> wrote:
> Ping me if you're interested in seeing other Apache codebases mirrored.

I've added a number of mirrors based on demand.

The latest one was Wicket whose version history took a whopping 5+
days to pull. It seems like git-svn is re-reading the entire version
histories of all tags and branches. It can still somehow piece
together the correct branch points, but this mechanism is obviously
not very efficient. I'll check with the git team if there's a way to
make git-svn only read the changes _after_ a branch point.

BR,

Jukka Zitting

Re: [scm] Making a public git mirror of an Apache project

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Sun, Jul 20, 2008 at 7:31 PM, Grzegorz Kossakowski
<gk...@apache.org> wrote:
> I would like to know if you are fine with me pointing a few Cocoon users to
> your server as they are now completely stuck when it comes to tracking
> Cocoon using Git.

No problem, as long as they are aware of the experimental status of
this service. I'll add a warning on the main gitweb page.

> The next thing I would like to know is what your plans are and how much
> load your server can take.

My main reasons for creating the mirrors were to find out how to
provide an alternative to the lengthy (and resource-intensive) git-svn
clone operations and how to make Apache codebases accessible through
git also for non-committers (i.e. people without access to
https://svn.eu). I think I've now figured out the answers to those
questions, and as a second step I'm interested in whether and how
people find the mirrors useful. So any feedback from experimentally
minded users would be welcome.

My server (a Fujitsu-Siemens Primergy RX100) is currently rather
underutilized and I have more than 150GB of unused traffic capacity
per month, so I should be able to serve at least hundreds of git
users with no trouble.

> Personally, I would prefer to see a VM instance established for Git hosting
> purposes and other Git-related experiments.

I'd be happy to move the mirrors over to an ASF server, but I
understand well why infrastructure feels hesitant to add new services.
For now it's probably best to gather feedback using my server.

> Then it would make sense to find out how a typical contributor to an Apache
> project could benefit from using a Git clone of an Apache repository for
> preparing a patch. If we find any benefits then it would probably make
> sense to document what such a process should look like.

The use cases discussed on this list already describe a number of
cases where a tool like git has benefits over svn. And of course a git
mirror will be convenient to people who already know git better or
prefer it over svn. But of course Subversion should still be the
preferred tool if there is no reason why a user would be better served
by git.

Personally I find the history browsing, patch/branch management and
offline commit capabilities of git to be so valuable that I rather use
a git clone than an svn checkout of the projects where I'm not a
committer.

> So the basic question is: what do we want to do next?

If there's interest, I'm happy to keep serving the git mirrors and add
new ones as requested. It would be nice to gather some examples of how
people are using the mirrors. Also, we could try to find out how much
demand there is for a service like this.

BR,

Jukka Zitting

Re: [scm] Making a public git mirror of an Apache project

Posted by Grzegorz Kossakowski <gk...@apache.org>.
Jukka Zitting wrote:
> Hi,
> 
> On Sat, Jul 12, 2008 at 3:24 PM, Jukka Zitting <ju...@gmail.com> wrote:
>> Currently we only enable git-svn use for committers (authenticated
>> https on svn.eu), so I looked at how to make a git mirror that could
>> be used by external contributors.
> 
> After some more experimentation I've come up with a much improved
> setup that includes full version histories for not only the project
> trunk but also all tags and branches. I also figured a way to make
> copies of the remote tracking branches created by git-svn available to
> downstream git clones based on the git-svn clone. With this setup it's
> possible to create a git mirror of a project that allows git users to
> access the entire project history without the long (often hours)
> initial git-svn fetch. For example, making a git clone of the entire
> Ant history takes just a few minutes.
> 
> Here's the sequence of commands I use to prepare a git-svn clone of an
> Apache project. Note that I'm making the git-svn clone available to
> external clients both through gitweb and git-daemon.
> 
>     SVN_URL=... # https://svn.eu... URL of a project (below trunk,tags,branches)
>     GIT_DIR=... # Path to the bare git directory for the git-svn clone
>     GIT_URL=... # The URL at which the git repository is served by git-daemon
>     DESCRIPTION=... # Full name of the project
> 
>     export GIT_DIR
>     git svn init --stdlayout "$SVN_URL"
>     git config gitweb.owner "The Apache Software Foundation"
>     git config gitweb.url "$GIT_URL"
>     echo "$DESCRIPTION" > "$GIT_DIR/description"
>     touch "$GIT_DIR/git-daemon-export-ok"
> 
> Once the git-svn clone has been prepared, I use the following commands
> to update it:
> 
>     GIT_DIR=... # Path to the bare git directory for the git-svn clone
>     AUTHORS=... # Path to the authors file
>     export GIT_DIR
> 
>     git svn fetch --authors-file "$AUTHORS" --log-window-size=10000
> 
>     # Make the git master branch always track the svn trunk
>     git update-ref refs/heads/master refs/remotes/trunk
> 
>     # Copy all other remote branches (svn branches and tags) to normal git branches
>     git for-each-ref refs/remotes | cut -d / -f 3- | grep -v -x trunk | grep -v @ |
>         while read ref
>     do
>         git update-ref "refs/heads/$ref" "refs/remotes/$ref"
>     done
> 
>     # Prune branches or tags that have been removed in svn
>     git for-each-ref refs/heads | cut -d / -f 3- | grep -v -x master |
>         while read ref
>     do
>         git rev-parse "refs/remotes/$ref" > /dev/null 2>&1 ||
>             git update-ref -d "refs/heads/$ref" "refs/heads/$ref"
>     done
> 
>     git gc --auto
> 
> With this setup (and standard configurations for gitweb and
> git-daemon) I've configured git-svn clones of a few representative
> Apache codebases and made them available at
> http://jukka.zitting.name/git/ (gitweb) and
> git://jukka.zitting.name/<project>.git (git-daemon). These mirrors are
> automatically updated daily. If you're interested, you can browse the
> project histories on the gitweb interface or clone a project using
> "git clone git://jukka.zitting.name/<project>.git" for local
> inspection. Note that these git mirrors are highly experimental and
> may be discontinued or recreated at any time.
> 
> The script to make the mirrored clones has run for a few days now, and
> it'll probably take another day before the remaining big codebases
> (Cocoon and httpd) are done. After the initial setup it'll be easy to
> keep the mirrors up to date. Ping me if you're interested in seeing
> other Apache codebases mirrored.

Hi Jukka!

At the beginning I have to say it: sweet job, really!

I'm very happy that you have invested your time in polishing all details and providing us a nice Git 
mirror of Apache resources.

I guess this is more of a proof of concept for now, but still a successful one. I've managed to clone 
the Cocoon repository from your hosting in twenty minutes, which compared to about eight hours when 
cloned directly from svn is a great result.

Since I've got a couple of questions about tracking Apache repositories in the past (the last one is 
from yesterday), I think it would make sense to announce your hosting somewhere as soon as we are sure 
that we are allowed to do that.
I mean here: I would like to avoid any impression that these Git efforts try to bypass any Apache rules 
or procedures when it comes to collaboration and development.

I would like to know if you are fine with me pointing a few Cocoon users to your server as they are 
now completely stuck when it comes to tracking Cocoon using Git.

The next thing I would like to know is what your plans are and how much load your server can take.


                                              -- o0o --


Personally, I would prefer to see a VM instance established for Git hosting purposes and other 
Git-related experiments. Then it would make sense to find out how a typical contributor to an Apache 
project could benefit from using a Git clone of an Apache repository for preparing a patch.
If we find any benefits then it would probably make sense to document what such a process should look like.

In the more distant future it would probably make sense to find out if committers could digest 
contributions by using Git directly. However, this looks like a very distant goal.


So the basic question is: what do we want to do next?

-- 
Best regards,
Grzegorz Kossakowski (who is interested in helping on such efforts)

Re: [scm] Making a public git mirror of an Apache project

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Sat, Jul 12, 2008 at 3:24 PM, Jukka Zitting <ju...@gmail.com> wrote:
> Currently we only enable git-svn use for committers (authenticated
> https on svn.eu), so I looked at how to make a git mirror that could
> be used by external contributors.

After some more experimentation I've come up with a much improved
setup that includes full version histories for not only the project
trunk but also all tags and branches. I also figured a way to make
copies of the remote tracking branches created by git-svn available to
downstream git clones based on the git-svn clone. With this setup it's
possible to create a git mirror of a project that allows git users to
access the entire project history without the long (often hours)
initial git-svn fetch. For example, making a git clone of the entire
Ant history takes just a few minutes.

Here's the sequence of commands I use to prepare a git-svn clone of an
Apache project. Note that I'm making the git-svn clone available to
external clients both through gitweb and git-daemon.

    SVN_URL=... # https://svn.eu... URL of a project (below trunk,tags,branches)
    GIT_DIR=... # Path to the bare git directory for the git-svn clone
    GIT_URL=... # The URL at which the git repository is served by git-daemon
    DESCRIPTION=... # Full name of the project

    export GIT_DIR
    git svn init --stdlayout "$SVN_URL"
    git config gitweb.owner "The Apache Software Foundation"
    git config gitweb.url "$GIT_URL"
    echo "$DESCRIPTION" > "$GIT_DIR/description"
    touch "$GIT_DIR/git-daemon-export-ok"
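
Before pointing users at a new mirror it may be worth checking that the
two marker files written above are really in place: git-daemon refuses
to serve a repository without git-daemon-export-ok (unless it is
started with --export-all), and gitweb shows the contents of the
description file. A minimal sketch, with check_export_ready as a
hypothetical helper name:

```shell
#!/bin/sh
# Sketch: verify that a bare git directory is ready to be published.
check_export_ready() {
    dir=$1
    # git-daemon only exports repositories that contain this file
    if [ ! -f "$dir/git-daemon-export-ok" ]; then
        echo "$dir: missing git-daemon-export-ok" >&2
        return 1
    fi
    # gitweb displays this file; -s also rejects an empty one
    if [ ! -s "$dir/description" ]; then
        echo "$dir: description is missing or empty" >&2
        return 1
    fi
    return 0
}
```

It would be called as check_export_ready "$GIT_DIR" right after the
setup commands.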

Once the git-svn clone has been prepared, I use the following commands
to update it:

    GIT_DIR=... # Path to the bare git directory for the git-svn clone
    AUTHORS=... # Path to the authors file
    export GIT_DIR

    git svn fetch --authors-file "$AUTHORS" --log-window-size=10000

    # Make the git master branch always track the svn trunk
    git update-ref refs/heads/master refs/remotes/trunk

    # Copy all other remote branches (svn branches and tags) to normal git branches
    git for-each-ref refs/remotes | cut -d / -f 3- | grep -v -x trunk | grep -v @ |
        while read ref
    do
        git update-ref "refs/heads/$ref" "refs/remotes/$ref"
    done

    # Prune branches or tags that have been removed in svn
    git for-each-ref refs/heads | cut -d / -f 3- | grep -v -x master |
        while read ref
    do
        git rev-parse "refs/remotes/$ref" > /dev/null 2>&1 ||
            git update-ref -d "refs/heads/$ref" "refs/heads/$ref"
    done

    git gc --auto
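
The branch-copying step above depends on the default output layout of
for-each-ref, which is why it needs cut. A possible variation, offered
only as a sketch, asks for the ref names directly with
--format='%(refname)'. The function name copy_remote_refs is
hypothetical; the filtering rules (skip trunk, skip the "@" refs that
git-svn creates) are the same as in the commands above.

```shell
#!/bin/sh
# Sketch: copy git-svn remote refs to normal branches.
# Expects GIT_DIR (or the current directory) to point at the git-svn clone.
copy_remote_refs() {
    git for-each-ref --format='%(refname)' refs/remotes |
        sed 's|^refs/remotes/||' |   # keep only the part after refs/remotes/
        grep -v -x trunk |           # trunk is handled via master
        grep -v @ |                  # skip git-svn's name@revision refs
        while read -r ref
        do
            git update-ref "refs/heads/$ref" "refs/remotes/$ref"
        done
}
```

Calling copy_remote_refs after the fetch would replace the first
for-each-ref loop; the pruning loop stays as it is.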

With this setup (and standard configurations for gitweb and
git-daemon) I've configured git-svn clones of a few representative
Apache codebases and made them available at
http://jukka.zitting.name/git/ (gitweb) and
git://jukka.zitting.name/<project>.git (git-daemon). These mirrors are
automatically updated daily. If you're interested, you can browse the
project histories on the gitweb interface or clone a project using
"git clone git://jukka.zitting.name/<project>.git" for local
inspection. Note that these git mirrors are highly experimental and
may be discontinued or recreated at any time.

The script to make the mirrored clones has run for a few days now, and
it'll probably take another day before the remaining big codebases
(Cocoon and httpd) are done. After the initial setup it'll be easy to
keep the mirrors up to date. Ping me if you're interested in seeing
other Apache codebases mirrored.

BR,

Jukka Zitting