Posted to dev@openoffice.apache.org by Jens-Heiner Rechtien <jh...@web.de> on 2011/07/28 20:42:06 UTC

Re: Converting the repo using mercurial's convert extension

On 07/28/2011 04:32 PM, Pedro F. Giffuni wrote:
>
> --- On Thu, 7/28/11, Christian Lohmaier wrote:
> ...
>>
>> [1] Note that with the map, it would also be possible to
>> reuse the old OOo-Subversion repo for the linear commits,
>> after all the hg repo was a conversion from the svn server.
>> This would save quite a bit of time.
>>
>
> I like this idea ... if the old SVN server is still available
> and we can do a progressive conversion of the rest of the Hg
> stuff we will save a lot of metadata that had previously been
> lost (plus we save conversion time).

It's still available: http://svn.services.openoffice.org/ooo or
svn://svn.services.openoffice.org/ooo

You can use svnsync to create a local copy of the repository. Will take a while :-)
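
As a rough illustration of the svnsync route, here is a minimal sketch that only prints the commands one would typically run to mirror the repository. The local path /srv/ooo-mirror is hypothetical, and the hook step assumes a Unix-like shell. The script does not execute anything.

```python
import shlex

SOURCE_URL = "svn://svn.services.openoffice.org/ooo"  # URL from this message
MIRROR_DIR = "/srv/ooo-mirror"  # hypothetical local path


def mirror_commands(source_url, mirror_dir):
    """Return the commands for a one-off svnsync mirror, in order."""
    mirror_url = "file://" + mirror_dir
    hook = mirror_dir + "/hooks/pre-revprop-change"
    return [
        ["svnadmin", "create", mirror_dir],
        # svnsync keeps its bookkeeping in revision properties, so the
        # mirror needs a pre-revprop-change hook that exits successfully.
        ["sh", "-c", "printf '#!/bin/sh\\nexit 0\\n' > " + shlex.quote(hook)],
        ["chmod", "+x", hook],
        ["svnsync", "init", mirror_url, source_url],
        ["svnsync", "sync", mirror_url],
    ]


# Print the commands instead of running them.
for cmd in mirror_commands(SOURCE_URL, MIRROR_DIR):
    print(shlex.join(cmd))
```

If the sync is interrupted, running the last command again should normally pick up where it left off.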

>
> I suspect the original CVS stuff that was lost in SVN conversion
> is gone for good by now.

Oh, I'm pretty sure that it's still available somewhere. There have been 
some CVSup mirrors of the stuff.

But, frankly, I can't see the need of having the CVS stuff at hand. It's 
very hard to make sense of this historical data anyway, at least if you 
haven't got a decade of OOo developer knowledge under the belt.

It's true that the conversion was lossy, but that was intentional! You 
wouldn't believe how much cruft can accumulate in a decade of happy 
coding. A full conversion of our old CVS repository into SVN resulted in
an SVN repository of about 90 GiB in size.

>
> Also, having the complete repository (with branches) in google
> code sounds like a good approach.
>
> Pedro.
>
>

Heiner


-- 
Jens-Heiner Rechtien

Re: Converting the repo using mercurial's convert extension

Posted by Christian Lohmaier <cl...@openoffice.org>.
On Tue, Aug 2, 2011 at 9:12 PM, Christian Lohmaier <cl...@openoffice.org> wrote:
> [...]
>> can you list the steps you are following in more detail (even a dump of your
>> term history) and upload the scripts to svn?
>
> I won't upload to svn, as I'm not a committer (and have no intentions
> to sign the iCLA)
>
> But I pasted it here
> http://pastie.org/2310454

Oh, I forgot: While I won't upload it to svn myself, I don't claim any
copyright on this bit of info, so feel free to take it as
Apache-Licensed, LGPL, MPL in whatever present or future versions, as
Public Domain - you get the idea.

ciao
Christian

Re: Converting the repo using mercurial's convert extension

Posted by Christian Lohmaier <cl...@openoffice.org>.
Hi Andrew, *,

On Tue, Aug 2, 2011 at 7:27 PM, Andrew Rist <an...@oracle.com> wrote:
> On 8/2/2011 7:25 AM, Christian Lohmaier wrote:
>> On Thu, Jul 28, 2011 at 8:42 PM, Jens-Heiner Rechtien <jh...@web.de> wrote:
>>> On 07/28/2011 04:32 PM, Pedro F. Giffuni wrote:
>>>> --- On Thu, 7/28/11, Christian Lohmaier wrote:
>>
>> The script to do the mapping is attached [...]
>
> unfortunately attachments don't make it through the listserver.

Sorry, it at least lets patches and attached signatures pass, so I was just
hoping a text attachment would make it as well...

> can you list the steps you are following in more detail (even a dump of your
> term history) and upload the scripts to svn?

I won't upload to svn, as I'm not a committer (and have no intentions
to sign the iCLA)

But I pasted it here
http://pastie.org/2310454

> I'll try to set up a conversion and dumps closer to the source to see how we
> can speed up this process.
> Andrew

So next step here is to create a svn dump, pass it through
svndumpfilter to only keep trunk and populate a new repo with that,
then attempt the hg conversion.
If that fails because svn is "ahead" of mercurial, do the same but
only dump up to the last matched version. That way you still will have
the time-benefit but not all svn's history.
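
The dump, filter and load steps described above could look roughly like the following sketch. It only builds and prints shell command lines. The repository paths are hypothetical, and the last_rev cut-off corresponds to the "only dump up to the last matched version" fallback. One caveat to keep in mind: svndumpfilter is known to stop with an error when a kept path was copied from a path that is being filtered out, so a plain "include trunk" may need extra care on a repository with many branches.

```python
import shlex


def trunk_only_commands(src_repo, dst_repo, last_rev=None):
    """Return shell command lines that load only trunk into a fresh repo."""
    dump = ["svnadmin", "dump", "--quiet", src_repo]
    if last_rev is not None:
        # Optional cut-off: dump only revisions 0 through last_rev.
        dump += ["-r", "0:%d" % last_rev]
    keep_trunk = ["svndumpfilter", "include", "trunk"]
    load = ["svnadmin", "load", "--quiet", dst_repo]
    pipeline = " | ".join(shlex.join(cmd) for cmd in (dump, keep_trunk, load))
    return [shlex.join(["svnadmin", "create", dst_repo]), pipeline]


# Hypothetical paths; pass last_rev only if svn turns out to be "ahead" of hg.
for line in trunk_only_commands("/srv/ooo-mirror", "/srv/ooo-trunk"):
    print(line)
```

The hg conversion would then be attempted against the new, trunk-only repository.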

ciao
Christian

Re: Converting the repo using mercurial's convert extension

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Andrew Rist wrote on Tue, Aug 02, 2011 at 10:27:40 -0700:
> On 8/2/2011 7:25 AM, Christian Lohmaier wrote:
> >The script to do the mapping is attached
> unfortunately attachments don't make it through the listserver.

IIRC, the list only blocks some kinds of attachments.

And that can be disabled by a request to infra --- we did that in
dev@subversion.  See INFRA-3724

Re: Converting the repo using mercurial's convert extension

Posted by Andrew Rist <an...@oracle.com>.


On 8/2/2011 7:25 AM, Christian Lohmaier wrote:
> Hi Heiner, *,
>
> On Thu, Jul 28, 2011 at 8:42 PM, Jens-Heiner Rechtien<jh...@web.de>  wrote:
>> On 07/28/2011 04:32 PM, Pedro F. Giffuni wrote:
>>> --- On Thu, 7/28/11, Christian Lohmaier wrote:
>>> ...
>>>> [1] Note that with the map, it would also be possible to
>>>> reuse the old OOo-Subversion repo for the linear commits,
>>>> after all the hg repo was a conversion from the svn server.
>>>> This would save quite a bit of time.
>>> I like this idea ... if the old SVN server is still available
>>> and we can do a progressive conversion of the rest of the Hg
>>> stuff we will save a lot of metadata that had previously been
>>> lost (plus we save conversion time).
>> It's still available: http://svn.services.openoffice.org/ooo resp.
>> svn://svn.services.openoffice.org/ooo
>>
>> You can use svnsync to create a local copy of the rep. Will take a while :-)
> Yes, takes almost two days - would have been easier if there was a dump :-)
>
> Now good thing: mapping the revisions works, and is completed on a
> slow machine in 10 minutes.
> Bad thing: I couldn't test the import with the plain repo, as hg
> convert wants a *full* checkout of the repo, not just trunk, and that
> doesn't fit in the 100GB I have available[1], so I'm now creating a
> dump to run through svndumpfilter to only preserve trunk and retry
> with that shrunk repo. (
>
> The script to do the mapping is attached

unfortunately attachments don't make it through the listserver.
can you list the steps you are following in more detail (even a dump of
your term history) and upload the scripts to svn?
I'll try to set up a conversion and dumps closer to the source to see
how we can speed up this process.
Andrew

> , it will create the mapping
> that can be used as hg-shamap to kickstart the conversing skipping the
> first 263205 revision, thus saving way more than 20 days of conversion
> time :-)
>
> So while the process won't work with the pristine copy of the pristine
> svn copy, it can still be used as basis for the conversion when
> filtered to only include trunk, as all the linear revisions are
> matched in trunk.
>
> [1] the svn repo contains 984 cws - and when you assume just 1 GB per
> cws for simplicity, you would need 1TB of storage to just do the
> conversion. and svn needs loads of RAM to perform the checkout - the
> 1GB of real RAM and 1GB of RAM was all occupied by svn, and after
> filling the 100GB a mere 12MB were free, so even with enough storage
> capacity, the amount of RAM probably would not have been sufficient
> for a full checkout...
>
> ciao
> Christian

Re: Converting the repo using mercurial's convert extension

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.
Hello guys;

I looked a bit at the Mercurial conversion tool and maybe
I'm reading something wrong but there is a --filemap
option where one can exclude directories from the conversion.
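
For reference, a filemap is a small text file of include, exclude and rename directives. The sketch below generates one with exclude lines and prints the matching hg convert call. The directory names and paths are hypothetical, and nothing in this thread establishes whether a filemap reduces the space hg convert needs for an svn source.

```python
import shlex


def build_filemap(excluded_dirs):
    """Return filemap text with one 'exclude' directive per directory."""
    return "".join("exclude %s\n" % path for path in excluded_dirs)


def convert_command(source, dest, filemap_path):
    """Return the hg convert invocation that applies the filemap."""
    return ["hg", "convert", "--filemap", filemap_path, source, dest]


# Hypothetical directory names and paths, for illustration only.
print(build_filemap(["some/big/dir", "another/dir"]), end="")
print(shlex.join(convert_command("/srv/ooo-trunk", "ooo-hg", "filemap.txt")))
```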

The tool does seem to support branches though so perhaps
it is worth it to spend 1TB of space and wait.

cheers,

Pedro.

--- On Tue, 8/2/11, Christian Lohmaier <cl...@openoffice.org> wrote:
...
> Hi Heiner, *,
>
> On Thu, Jul 28, 2011 at 8:42 PM, Jens-Heiner Rechtien <jh...@web.de> wrote:
> > On 07/28/2011 04:32 PM, Pedro F. Giffuni wrote:
> >> --- On Thu, 7/28/11, Christian Lohmaier wrote:
> >> ...
> >>>
> >>> [1] Note that with the map, it would also be possible to
> >>> reuse the old OOo-Subversion repo for the linear commits,
> >>> after all the hg repo was a conversion from the svn server.
> >>> This would save quite a bit of time.
> >>
> >> I like this idea ... if the old SVN server is still available
> >> and we can do a progressive conversion of the rest of the Hg
> >> stuff we will save a lot of metadata that had previously been
> >> lost (plus we save conversion time).
> >
> > It's still available: http://svn.services.openoffice.org/ooo resp.
> > svn://svn.services.openoffice.org/ooo
> >
> > You can use svnsync to create a local copy of the rep. Will take a while :-)
>
> Yes, takes almost two days - would have been easier if there was a dump :-)
>
> Now good thing: mapping the revisions works, and is completed on a
> slow machine in 10 minutes.
> Bad thing: I couldn't test the import with the plain repo, as hg
> convert wants a *full* checkout of the repo, not just trunk, and that
> doesn't fit in the 100GB I have available[1], so I'm now creating a
> dump to run through svndumpfilter to only preserve trunk and retry
> with that shrunk repo. (
>
> The script to do the mapping is attached, it will create the mapping
> that can be used as hg-shamap to kickstart the conversing skipping the
> first 263205 revision, thus saving way more than 20 days of conversion
> time :-)
>
> So while the process won't work with the pristine copy of the pristine
> svn copy, it can still be used as basis for the conversion when
> filtered to only include trunk, as all the linear revisions are
> matched in trunk.
>
> [1] the svn repo contains 984 cws - and when you assume just 1 GB per
> cws for simplicity, you would need 1TB of storage to just do the
> conversion. and svn needs loads of RAM to perform the checkout - the
> 1GB of real RAM and 1GB of RAM was all occupied by svn, and after
> filling the 100GB a mere 12MB were free, so even with enough storage
> capacity, the amount of RAM probably would not have been sufficient
> for a full checkout...
>
> ciao
> Christian
>

Re: Converting the repo using mercurial's convert extension

Posted by Christian Lohmaier <cl...@openoffice.org>.
Hi Heiner, *,

On Thu, Jul 28, 2011 at 8:42 PM, Jens-Heiner Rechtien <jh...@web.de> wrote:
> On 07/28/2011 04:32 PM, Pedro F. Giffuni wrote:
>> --- On Thu, 7/28/11, Christian Lohmaier wrote:
>> ...
>>>
>>> [1] Note that with the map, it would also be possible to
>>> reuse the old OOo-Subversion repo for the linear commits,
>>> after all the hg repo was a conversion from the svn server.
>>> This would save quite a bit of time.
>>
>> I like this idea ... if the old SVN server is still available
>> and we can do a progressive conversion of the rest of the Hg
>> stuff we will save a lot of metadata that had previously been
>> lost (plus we save conversion time).
>
> It's still available: http://svn.services.openoffice.org/ooo resp.
> svn://svn.services.openoffice.org/ooo
>
> You can use svnsync to create a local copy of the rep. Will take a while :-)

Yes, takes almost two days - would have been easier if there was a dump :-)

Now good thing: mapping the revisions works, and is completed on a
slow machine in 10 minutes.
Bad thing: I couldn't test the import with the plain repo, as hg
convert wants a *full* checkout of the repo, not just trunk, and that
doesn't fit in the 100GB I have available[1], so I'm now creating a
dump to run through svndumpfilter to only preserve trunk and retry
with that shrunk repo.

The script to do the mapping is attached; it will create the mapping
that can be used as hg-shamap to kickstart the conversion, skipping the
first 263205 revisions, thus saving way more than 20 days of conversion
time :-)
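
To make the idea of a pre-seeded map concrete: the convert extension's revision map is a plain text file with one "<source ID> <destination ID>" pair per line. The sketch below is not the attached script. It assumes that Subversion source IDs take the form svn:<uuid><module>@<rev>, and it takes the svn revision to hg changeset pairs as given. The UUID and hashes shown are made up.

```python
def shamap_lines(repo_uuid, module, rev_to_node):
    """Return '<source ID> <destination ID>' lines, ordered by svn revision."""
    return [
        # Assumed source ID form for svn sources: svn:<uuid><module>@<rev>
        "svn:%s%s@%d %s" % (repo_uuid, module, rev, rev_to_node[rev])
        for rev in sorted(rev_to_node)
    ]


# Made-up UUID and changeset hashes, for illustration only.
pairs = {2: "b" * 40, 1: "a" * 40}
for line in shamap_lines("00000000-0000-0000-0000-000000000000", "/trunk", pairs):
    print(line)
```

With such a file in place as the destination's map, hg convert would be expected to treat the listed revisions as already converted and continue from the first unmapped one.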

So while the process won't work with a pristine copy of the full
svn repo, it can still be used as basis for the conversion when
filtered to only include trunk, as all the linear revisions are
matched in trunk.

[1] the svn repo contains 984 cws - and when you assume just 1 GB per
cws for simplicity, you would need 1TB of storage to just do the
conversion. and svn needs loads of RAM to perform the checkout - the
1GB of real RAM and 1GB of RAM was all occupied by svn, and after
filling the 100GB a mere 12MB were free, so even with enough storage
capacity, the amount of RAM probably would not have been sufficient
for a full checkout...

ciao
Christian

Re: Converting the repo using mercurial's convert extension

Posted by Herbert Duerr <hd...@alice.de>.
On 07/28/2011 08:42 PM, Jens-Heiner Rechtien wrote:
> But, frankly, I can't see the need of having the CVS stuff at hand. It's
> very hard to make sense of this historical data anyway, at least if you
> haven't got a decade of OOo developer knowledge under the belt.
>
> It's true that the conversion was lossy, but that was intentional! You
> wouldn't believe how much cruft can accumulate in a decade of happy
> coding. A full conversion of our old CVS repository into SVN resulted in
> a SVN repository of about 90 GiB in size.

I know many developers only care about tip and maybe the head of 
branches. It is what matters most and so people were able to develop 
code long before VCS were available.

I disagree you need to have a decade of OOo developer knowledge to make 
use of it. Quite the opposite indeed! If someone new wants to work on 
some piece of code and isn't sure why it was coded that way then it is 
extremely helpful to look at its commit comments and especially the 
issue numbers mentioned there. The info in the issue and the attached 
documents often show corner use cases that better be handled properly 
even when the code is to be refactored.

Have a look at e.g.
http://hg.services.openoffice.org/DEV300/shortlog/19d852424fb4
or
http://hg.services.openoffice.org/DEV300/shortlog/a70e5539c48b

I don't believe for a second that someone who is interested in the use
cases and history of some piece of code will be happy when he finds
a "CWS-TOOLING: integrate CWS vcl92" and no way to find out what the
original commit comments were.

Herbert

Re: Converting the repo using mercurial's convert extension

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.

--- On Thu, 7/28/11, Jens-Heiner Rechtien wrote:

> On 07/28/2011 04:32 PM, Pedro F. Giffuni wrote:
> >
> > --- On Thu, 7/28/11, Christian Lohmaier wrote:
> > ...
> >>
> >> [1] Note that with the map, it would also be
> >> possible to reuse the old OOo-Subversion repo
> >> for the linear commits, after all the hg repo
> >> was a conversion from the svn server.
> >> This would save quite a bit of time.
> >>
> >
> > I like this idea ... if the old SVN server is
> > still available and we can do a progressive
> > conversion of the rest of the Hg
> > stuff we will save a lot of metadata that had
> > previously been lost (plus we save conversion
> > time).
> 
> It's still available: http://svn.services.openoffice.org/ooo 
> resp. svn://svn.services.openoffice.org/ooo
>

Very, very nice... It even has some old branches; perhaps those
can be updated progressively too at a later time! FWIW, it's
running Subversion 1.5.4.

There are also some extra tools: "The CWS tooling has been
reworked to adapt to SVN.".


> You can use svnsync to create a local copy of the rep. Will
> take a while :-)
>

Ugh... I don't have the space locally to do even this :(,
but it certainly looks like we can save a lot of stuff
(and time) from there.
 
> >
> > I suspect the original CVS stuff that was lost in SVN
> > conversion is gone for good by now.
> 
> Oh, I'm pretty sure that it's still available somewhere.
> There have been some CVSup mirrors of the stuff.
> 
> But, frankly, I can't see the need of having the CVS stuff
> at hand. It's very hard to make sense of this historical
> data anyway, at least if you haven't got a decade of OOo
> developer knowledge under the belt.
>

Yes, I think the SVN conversion already has what could be
recovered easily, I doubt a newer conversion tool can add
much to it.

Thanks for the link. I hope we end up going this way!

Pedro.