Posted to dev@openoffice.apache.org by Greg Stein <gs...@gmail.com> on 2011/07/09 07:57:28 UTC

single repository status

Hi all,

I borrowed some space on some ASF equipment and used the
'fetch-all-cws.sh' script to pull down all of the CWS repositories (in
addition to the OOO340 and master_l10n/OOO340 repositories). This
consumed 77 GB of disk space.

The fetch-all-cws.sh script has been updated with all my fixes to make
this happen, and the cws-list.txt file has been updated to reflect all
CWSs that have actual content beyond OOO340.

I have updated single-hg.sh to do a basic combination of the
repositories. This produces a 2.7 GB repository.

However! The single-hg script does not apply any tags/bookmarks for
the heads introduced by the pulls from the CWS repositories.
Originally, I thought to apply bookmarks, but ... meh. We can just
apply tags. It isn't like we are going to push to the repository and
need the bookmark to float with the changes. So the next step is to
update single-hg to apply the appropriate tags. At that point, I will
publish a bundle so that most people can skip all the above steps and
start playing with the repository that will feed into the next step
(convert to svn).

Once the tags are properly marked, then we can start testing the
conversion to Subversion. Please note, however: the hg conversion
script does *not* process tags. We will have to write that. We will
also have to somehow manage construction of branches. We may also have
to update the conversion tool to properly handle the "merge"
changesets that Hg records.

In the single repository that I constructed, there are 102 heads. I
believe these will become branches.
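
A minimal sketch of what "head" means here, assuming the history is
available as a hypothetical mapping from changeset id to its parent ids.
This is only an illustration, not how single-hg.sh works; Mercurial itself
reports heads with `hg heads`.

```python
def find_heads(parents):
    """Return the changesets that are not a parent of any other changeset."""
    all_nodes = set(parents)
    referenced = {p for ps in parents.values() for p in ps}
    return sorted(all_nodes - referenced)


# Hypothetical toy history: 'm' merges 'b' and 'c'; 'd' is an unmerged CWS head.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "m": ["b", "c"],
    "d": ["b"],
}

heads = find_heads(parents)
print(heads)
```

Each head in such a listing is a candidate for becoming a branch in the
converted repository.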

We will need a script to take the single repository as input, and run
the conversion process. I think there will be multiple inputs to that
process, so there will be additional input files to coordinate and
drive the conversion. All of this should be placed into the tools/dev/
directory. Help is wanted!

Cheers,
-g

Re: single repository status

Posted by Michael Stahl <ms...@openoffice.org>.
On 20.07.2011 12:05, Eike Rathke wrote:
> Hi Michael,
>
> On Tuesday, 2011-07-19 23:26:48 +0200, Michael Stahl wrote:
>
>> unfortunately it seems none of the tools that convert from HG or
>> git to SVN can create SVN branches with SVN mergeinfo (necessary in
>> order to be able to merge the branches back into the trunk).
>>
>> there are some tools to convert from git that can create SVN
>> branches, but they leave out the SVN mergeinfo; apparently the
>> intent is to maintain a read-only mirror...
>
> I didn't dig deeper into this, but conversion from hg to git should
> be pretty straightforward and then there's git-svn, would that be
> viable to import branches as well?

well, i've already got a git repo with 104 open CWS branches on my
laptop for 2 weeks now...
(which, by the way, i've already made good use of, because git diff has
a very useful option --name-status, making discovery of added files on
CWS branches much easier)
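
A minimal sketch of how such output could be filtered for added files,
assuming it was captured from something like
`git diff --name-status <trunk> <cws-branch>` (the branch names and the
sample paths below are hypothetical). Each output line is a status letter,
a tab, and a path; "A" marks an added file.

```python
def added_files(name_status_output):
    """Collect paths with status 'A' (added) from `git diff --name-status` output."""
    added = []
    for line in name_status_output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if fields[0] == "A" and len(fields) >= 2:
            added.append(fields[1])
    return added


# Hypothetical sample output; real paths will differ.
sample = (
    "M\tsw/source/core/doc/doc.cxx\n"
    "A\tsw/source/core/doc/newfile.cxx\n"
    "D\tsc/source/old.cxx\n"
)

print(added_files(sample))
```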

but it seems that git-svn isn't going to help for the branching/merging:

http://www.kernel.org/pub/software/scm/git/docs/git-svn.html

> MERGE TRACKING
>
> While git svn can track copy history (including branches and tags)
> for repositories adopting a standard layout, it cannot yet represent
> merge history that happened inside git back upstream to SVN users.
> Therefore it is advised that users keep history as linear as
> possible inside git to ease compatibility with SVN (see the CAVEATS
> section below).

also, funnily, it has a --mergeinfo parameter which allows the caller to
specify SVN mergeinfo, but it can't compute it itself...

what i guess we could do is to use git-svn to merge and/or rebase the
open CWSes and push the result (as a linear history) to SVN.

there is also a HgSubversion extension that has similar functionality
and restrictions:

http://mercurial.selenic.com/wiki/HgSubversion

> The important point to note is that hgsubversion cannot push merge
> changesets to a svn repository.

i think the problem is that Hg/git/otherDSCMs and SVN have fundamentally 
different data models for representing branching/merging; the fact that 
so far nobody has come up with a conversion tool that handles merges 
well suggests to me that it's not an easy problem to solve.
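
A minimal sketch of why merges are the hard part, assuming a hypothetical
`parents` mapping from changeset id to parent ids. History on a single SVN
path is a line, so a merge changeset with two parents has to be flattened
somehow, for example by following only the first parent. Whatever is
reachable only through second parents then drops out of the converted line.

```python
def first_parent_chain(parents, tip):
    """Walk from tip to the root, following only the first parent each time."""
    chain = []
    node = tip
    while node is not None:
        chain.append(node)
        ps = parents.get(node, [])
        node = ps[0] if ps else None
    chain.reverse()
    return chain


# Hypothetical toy history: 'm' merges 'b' (first parent) and 'c' (second parent).
parents = {"a": [], "b": ["a"], "c": ["a"], "m": ["b", "c"]}

trunk = first_parent_chain(parents, "m")
dropped = sorted(set(parents) - set(trunk))
print(trunk, dropped)
```

In this toy case changeset 'c' is not on the linear trunk; a real converter
would have to fold its changes into the merge revision or recreate it on an
SVN branch with mergeinfo.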

regards,
  michael


Re: single repository status

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.
My 0.02$ here is:

- Git was already discussed and I think it's a dead end.
Apache.org is not running git ATM, and if we want to mirror
everything temporarily we should do it in Hg, as it's not clear
that Git will preserve all the history either. It's not clear
that Oracle is in a hurry to kill everything OOo until the
code has been transferred to its new home though, so maybe
there's no need to push the panic button and we can do the
merging in an orderly manner.

- I really think it's time to import something... so someone
please move the Hg trunk to SVN and get started. We can look
for options for merging the branches later on.

- No report yet on bugzilla/JIRA? That's important too.

cheers,

Pedro.

--- On Mon, 7/25/11, Rob Weir wrote:
...
>
> They currently support a read-only git mirror of a project's SVN repository
>
> http://git.apache.org/
>
> So what we are asking for is slightly different.  It is read-only.
> But it is not a live mirror of the SVN repository.  It is an archived
> mirror of Sun/Oracle's Hg repository.  So from the Infrastructure
> perspective, once the migration of Hg to git is done (and that should
> be easy) hosting it read-only is nothing new.  If there are any
> objections, it would be from the IP perspective.  This is more than
> getting an updated SGA from Oracle. It is also about incompatible
> licensed code.  We can only carry that in our SVN repository
> temporarily, until we graduate.  So having it exist long-term in a
> read-only git archive is something we'd want to understand better.

Re: single repository status

Posted by Rob Weir <ap...@robweir.com>.
On Mon, Jul 25, 2011 at 2:16 PM, florent andré
<fl...@4sengines.com> wrote:
> Rob,
>
> And others, because we have to choose an easy as possible and fast solution
> for begin to play with code.
>
> On 07/25/2011 01:33 AM, Rob Weir wrote:
>>
>> On Sat, Jul 23, 2011 at 8:22 AM, florent andré
>> <fl...@4sengines.com>  wrote:
>>>
>>> Hi,
>>>
>>> I also think that we need codebase in svn soon.
>>>
>>> Following your all pretty good comments, import a "perfect" hg history
>>> into
>>> svn seems not to be quite easy... and will require works and effort.
>>>
>>
>> I not yet convinced that this conversion is impossible.  But I am
>> convinced that it is difficult.
>>
>>> As Michael Stahl says "Hg/git/otherDSCMs and SVN have fundamentally
>>> different data models for representing branching/merging".
>>> So hg -->  svn is complicated but hg -->  git seems to works pretty well.
>>>
>>> The main point is to have a way to lurk the history, and to host this
>>> history into Apache infra in order to be Oracle's server shut down
>>> tolerant.
>>>
>>> So, what about this proposal ? :
>>>
>>> - ask infra to set up a special git ooo-history
>>> - import only the main hg branch into svn
>>> - if someone interested in a specific hg branch :
>>> ** git checkout theBranch (from ooo-history)
>>> ** svn create branch from trunk ( Btrunk)
>>> ** diff Btrunk / theBranch for creating patch
>>> ** apply patch on Btrunk
>>> ** commit Btrunk
>>>
>>> Sure we will lost a lot of history in svn... but we still have it in git
>>> ooo-history...
>>
>>
>> In the above approach we would still have the history of the trunk in
>> SVN, right? Or would we need to go to git in order to get that history
>> as well?
>
> I'm not an hg to svn expert so I don't really now... seems yes but have to
> be confirm.
>
>>
>> Would this work as as a general approach:
>>
>> 1) Move forward with the trunk migration into SVN now.  This allows
>> work on the trunk to go forward.  We have people waiting to start on
>> this work.
>
> +1 even if we can't restore all history (because we can have workarounds
> with git and hg).
>
>>
>> 2) The CWS migration planning and experimentation can continue.  If
>> someone is able to create tooling to make this possible, or finds
>> another elegant solution, then we can migrate the CWS's over at a
>> later point.
>
> +1
>
>>
>> 3) In the mean time, if anyone needs to do work on a CWS, they can
>> extract from Hg and work on it that way.   But if they want to merge
>> it into SVN, then they will have to do some careful and manual work.
>> This may be painful for some large CWS's, like the one Armin has.
>
> Sure it will need manual work... but with the procedure I propose I think
> that it's lower the bar :
>>> - if someone interested in a specific hg branch :
>>> ** git/hg checkout gitBranch (from ooo-history/needed cws)
>>> ** svn create branch from apache svn trunk ( svnBranch)
>>> ** diff svnBranch / gitBranch for creating patch
>>> ** apply patch on svnBranch
>>> ** commit svnBranch
>
> And some work will still be needed for merging just created svnBranch with
> trunk
>
>
>>
>> 4) If, in 6 months from now, or whenever Oracle wants to turn off
>> their OpenOffice.org servers, then we need to have a solution for the
>> CWS.  But we can wait until closer to that date.  Maybe by then Apache
>> will support git, in which case the problem has solved itself.
>
> I'm personally not really fan of the "wait and see" procedure... because may
> Oracle will not ping us when press the "stop button".
>
> Furthermore have a git (read only) ooo-history will make all already git
> users happy.
>
> There is someone from the infra here that can say if an ooo-history git repo
> is feasible ?
>

They currently support a read-only git mirror of a project's SVN repository

http://git.apache.org/

So what we are asking for is slightly different.  It is read-only.
But it is not a live mirror of the SVN repository.  It is an archived
mirror of Sun/Oracle's Hg repository.  So from the Infrastructure
perspective, once the migration of Hg to git is done (and that should
be easy) hosting it read-only is nothing new.  If there are any
objections, it would be from the IP perspective.  This is more than
getting an updated SGA from Oracle. It is also about incompatibly
licensed code.  We can only carry that in our SVN repository
temporarily, until we graduate.  So having it exist long-term in a
read-only git archive is something we'd want to understand better.


> what others think about that ?
>
> ++
>
>>
>>
>> As you may notice, none of the above solves the core problem.  It
>> merely tries to push that problem to the side, so the trunk can move
>> forward, and allow the CWS migration problem to be worked on in
>> parallel.
>>
>> Is this a plausible approach?
>>
>> -Rob
>>
>>> ... And a new (hi)story begin in Apache svn ! :)
>>>
>>> This will require infra to set up a special ooo-history git repos... but
>>> if
>>> we are kind they may accept :).
>>>
>>> What do you think ?
>>>
>>> ++
>>>
>>>
>>> On 07/21/2011 03:16 PM, Jens-Heiner Rechtien wrote:
>>>>
>>>> On 07/20/2011 12:05 PM, Eike Rathke wrote:
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> On Tuesday, 2011-07-19 23:26:48 +0200, Michael Stahl wrote:
>>>>>
>>>>>> unfortunately it seems none of the tools that convert from HG or git
>>>>>> to SVN can create SVN branches with SVN mergeinfo (necessary in
>>>>>> order to be able to merge the branches back into the trunk).
>>>>>>
>>>>>> there are some tools to convert from git that can create SVN
>>>>>> branches, but they leave out the SVN mergeinfo; apparently the
>>>>>> intent is to maintain a read-only mirror...
>>>>>
>>>>> I didn't dig deeper into this, but conversion from hg to git should be
>>>>> pretty straightforward and then there's git-svn, would that be viable
>>>>> to import branches as well?
>>>>>
>>>>>
>>>>>> basically we have these options for converting to SVN:
>>>>>>
>>>>>> 1. convert full history
>>>>>> requires writing tool to create SVN branches and mergeinfo
>>>>>>
>>>>>> 2. convert trunk only, using follow-first-parent heuristic
>>>>>> with hacks where we want to follow second parent instead
>>>>>>
>>>>>> 3. no history in SVN, just check in OOO340 tip
>>>>>
>>>>> I'd prefer #3 and have a read-only hg/git repository for cases where
>>>>> one
>>>>> really wants to lookup history. AOOo needs to get its code base going.
>>>>
>>>> +1 for #3. We need the repository ASAP to get going. If we have to write
>>>> the conversion tools first we'll lose way too much time which could be
>>>> spent better on getting AOOo 3.4 (or whatever the first AOOo release
>>>> will be called) out of the door. A pity that Apache git support is not
>>>> ready for prime time ... would make things so much easier.
>>>>
>>>> Heiner
>>>>
>>>>
>>>
>

Re: single repository status

Posted by florent andré <fl...@4sengines.com>.
Rob,

And others, because we have to choose a solution that is as easy and
fast as possible so that we can begin to play with the code.

On 07/25/2011 01:33 AM, Rob Weir wrote:
> On Sat, Jul 23, 2011 at 8:22 AM, florent andré
> <fl...@4sengines.com>  wrote:
>> Hi,
>>
>> I also think that we need codebase in svn soon.
>>
>> Following your all pretty good comments, import a "perfect" hg history into
>> svn seems not to be quite easy... and will require works and effort.
>>
>
> I not yet convinced that this conversion is impossible.  But I am
> convinced that it is difficult.
>
>> As Michael Stahl says "Hg/git/otherDSCMs and SVN have fundamentally
>> different data models for representing branching/merging".
>> So hg -->  svn is complicated but hg -->  git seems to works pretty well.
>>
>> The main point is to have a way to lurk the history, and to host this
>> history into Apache infra in order to be Oracle's server shut down tolerant.
>>
>> So, what about this proposal ? :
>>
>> - ask infra to set up a special git ooo-history
>> - import only the main hg branch into svn
>> - if someone interested in a specific hg branch :
>> ** git checkout theBranch (from ooo-history)
>> ** svn create branch from trunk ( Btrunk)
>> ** diff Btrunk / theBranch for creating patch
>> ** apply patch on Btrunk
>> ** commit Btrunk
>>
>> Sure we will lost a lot of history in svn... but we still have it in git
>> ooo-history...
>
>
> In the above approach we would still have the history of the trunk in
> SVN, right? Or would we need to go to git in order to get that history
> as well?

I'm not an hg-to-svn expert so I don't really know... it seems so, but
that has to be confirmed.

>
> Would this work as as a general approach:
>
> 1) Move forward with the trunk migration into SVN now.  This allows
> work on the trunk to go forward.  We have people waiting to start on
> this work.

+1 even if we can't restore all history (because we can have workarounds 
with git and hg).

>
> 2) The CWS migration planning and experimentation can continue.  If
> someone is able to create tooling to make this possible, or finds
> another elegant solution, then we can migrate the CWS's over at a
> later point.

+1

>
> 3) In the mean time, if anyone needs to do work on a CWS, they can
> extract from Hg and work on it that way.   But if they want to merge
> it into SVN, then they will have to do some careful and manual work.
> This may be painful for some large CWS's, like the one Armin has.

Sure, it will need manual work... but I think the procedure I propose
lowers the bar:
 >> - if someone interested in a specific hg branch :
 >> ** git/hg checkout gitBranch (from ooo-history/needed cws)
 >> ** svn create branch from apache svn trunk ( svnBranch)
 >> ** diff svnBranch / gitBranch for creating patch
 >> ** apply patch on svnBranch
 >> ** commit svnBranch

And some work will still be needed for merging the just-created
svnBranch with trunk.
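
A minimal sketch of the "diff svnBranch / gitBranch" step, assuming both
branches are checked out as plain directories on disk (the function and
directory names are hypothetical). It only reports which files differ; it
does not create or apply the patch.

```python
import os

# Version-control metadata directories that must not take part in the comparison.
SKIP_DIRS = {".git", ".hg", ".svn"}


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def read_bytes(path):
    with open(path, "rb") as handle:
        return handle.read()


def tree_delta(svn_dir, cws_dir):
    """Return (added, changed, removed) paths going from svn_dir to cws_dir."""
    svn_files = list_files(svn_dir)
    cws_files = list_files(cws_dir)
    added = sorted(cws_files - svn_files)
    removed = sorted(svn_files - cws_files)
    changed = sorted(
        rel
        for rel in svn_files & cws_files
        if read_bytes(os.path.join(svn_dir, rel)) != read_bytes(os.path.join(cws_dir, rel))
    )
    return added, changed, removed
```

In such a workflow the added files would additionally need `svn add`, and
the removed ones `svn delete`, before the svnBranch commit.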


>
> 4) If, in 6 months from now, or whenever Oracle wants to turn off
> their OpenOffice.org servers, then we need to have a solution for the
> CWS.  But we can wait until closer to that date.  Maybe by then Apache
> will support git, in which case the problem has solved itself.

I'm personally not really a fan of the "wait and see" approach... because
Oracle may not ping us when they press the "stop button".

Furthermore, having a read-only git ooo-history will make everyone who
already uses git happy.

Is there someone from infra here who can say whether an ooo-history git
repo is feasible?

What do others think about that?

++

>
>
> As you may notice, none of the above solves the core problem.  It
> merely tries to push that problem to the side, so the trunk can move
> forward, and allow the CWS migration problem to be worked on in
> parallel.
>
> Is this a plausible approach?
>
> -Rob
>
>> ... And a new (hi)story begin in Apache svn ! :)
>>
>> This will require infra to set up a special ooo-history git repos... but if
>> we are kind they may accept :).
>>
>> What do you think ?
>>
>> ++

Re: single repository status

Posted by Rob Weir <ap...@robweir.com>.
On Sat, Jul 23, 2011 at 8:22 AM, florent andré
<fl...@4sengines.com> wrote:
> Hi,
>
> I also think that we need codebase in svn soon.
>
> Following your all pretty good comments, import a "perfect" hg history into
> svn seems not to be quite easy... and will require works and effort.
>

I am not yet convinced that this conversion is impossible.  But I am
convinced that it is difficult.

> As Michael Stahl says "Hg/git/otherDSCMs and SVN have fundamentally
> different data models for representing branching/merging".
> So hg --> svn is complicated but hg --> git seems to works pretty well.
>
> The main point is to have a way to lurk the history, and to host this
> history into Apache infra in order to be Oracle's server shut down tolerant.
>
> So, what about this proposal ? :
>
> - ask infra to set up a special git ooo-history
> - import only the main hg branch into svn
> - if someone interested in a specific hg branch :
> ** git checkout theBranch (from ooo-history)
> ** svn create branch from trunk ( Btrunk)
> ** diff Btrunk / theBranch for creating patch
> ** apply patch on Btrunk
> ** commit Btrunk
>
> Sure we will lost a lot of history in svn... but we still have it in git
> ooo-history...


In the above approach we would still have the history of the trunk in
SVN, right? Or would we need to go to git in order to get that history
as well?

Would this work as a general approach:

1) Move forward with the trunk migration into SVN now.  This allows
work on the trunk to go forward.  We have people waiting to start on
this work.

2) The CWS migration planning and experimentation can continue.  If
someone is able to create tooling to make this possible, or finds
another elegant solution, then we can migrate the CWS's over at a
later point.

3) In the meantime, if anyone needs to do work on a CWS, they can
extract from Hg and work on it that way.   But if they want to merge
it into SVN, then they will have to do some careful and manual work.
This may be painful for some large CWS's, like the one Armin has.

4) In 6 months from now, or whenever Oracle wants to turn off
their OpenOffice.org servers, we will need to have a solution for the
CWS.  But we can wait until closer to that date.  Maybe by then Apache
will support git, in which case the problem will have solved itself.


As you may notice, none of the above solves the core problem.  It
merely tries to push that problem to the side, so the trunk can move
forward, and allow the CWS migration problem to be worked on in
parallel.

Is this a plausible approach?

-Rob

> ... And a new (hi)story begin in Apache svn ! :)
>
> This will require infra to set up a special ooo-history git repos... but if
> we are kind they may accept :).
>
> What do you think ?
>
> ++

Re: single repository status

Posted by Mathias Bauer <Ma...@gmx.net>.
On 23.07.2011 15:46, Rob Weir wrote:

> One concern would be on the IP perspective.  Before we can have a
> release or graduate from Podling, we need to review our source code
> and remove all incompatible 3rd party components.  We're also working
> to ensure that the Oracle SGA is updated to contain all the files we
> need for AOOo.  If we have two repositories, a "clean" one in SVN and
> an "unreviewed" version in git that we occasionally dip into for
> unintegrated branches, then IP compliance becomes more difficult.
> Maybe not impossible, but certainly more complicated.

With help from Michael Stahl I created a list of all new files that
are in all cws repositories, but not in the OOO340 trunk, and forwarded
it to Andrew Rist.

I also had a look at the files, and to me it seems that they are less
problematic from an IP perspective than many files we already have in
the trunk repo.

So from a technical perspective the cws shouldn't be a problem. OTOH I'm
afraid that postponing the move of the sources into the svn repo might
create trouble for those with a legal POV.

Regards,
Mathias

Re: single repository status

Posted by Florent André <fl...@apache.org>.

On 07/23/2011 03:46 PM, Rob Weir wrote:
> On Sat, Jul 23, 2011 at 8:22 AM, florent andré
> <fl...@4sengines.com>  wrote:
>> Hi,
>>
>> I also think that we need codebase in svn soon.
>>
>
> Yes.
>
>> Following your all pretty good comments, import a "perfect" hg history into
>> svn seems not to be quite easy... and will require works and effort.
>>
>> As Michael Stahl says "Hg/git/otherDSCMs and SVN have fundamentally
>> different data models for representing branching/merging".
>> So hg -->  svn is complicated but hg -->  git seems to works pretty well.
>>
>> The main point is to have a way to lurk the history, and to host this
>> history into Apache infra in order to be Oracle's server shut down tolerant.
>>
>> So, what about this proposal ? :
>>
>> - ask infra to set up a special git ooo-history
>
> I'm assuming that ooo-history would be read-only?

yep, like all existing git repos.
The only "special" thing is that this git repo isn't linked with an
existing svn repo (current apache git repos are read-only mirrors of svn).

>
>> - import only the main hg branch into svn
>> - if someone interested in a specific hg branch :
>> ** git checkout theBranch (from ooo-history)
>> ** svn create branch from trunk ( Btrunk)
>> ** diff Btrunk / theBranch for creating patch
>> ** apply patch on Btrunk
>> ** commit Btrunk
>>
>> Sure we will lost a lot of history in svn... but we still have it in git
>> ooo-history...
>> ... And a new (hi)story begin in Apache svn ! :)
>>
>> This will require infra to set up a special ooo-history git repos... but if
>> we are kind they may accept :).
>>
>> What do you think ?
>>
>
> I like the "lazy" approach.  Defer the branch conversion work until it
> is actually needed.  It puts the human judgement in the loop at the
> time when it is needed.  If we believed that 100% (or even 80% or 90%)
> of the CWS would be needed immediately for integration into our first
> release, then it might make sense to do all that work up front.  But
> if we believe that only a few are needed, then it doesn't make sense
> to hold up the entire project for this.

The hg repos seem huge, and if we import everything in one shot there is
- IMO - a risk of getting lost in this ocean of code.

Beginning with the most "almost ready" hg branches and then importing
code after human judgement could be a more step-by-step and manageable
process.

>
> One concern would be on the IP perspective.  Before we can have a
> release or graduate from Podling, we need to review our source code
> and remove all incompatible 3rd party components.  We're also working
> to ensure that the Oracle SGA is updated to contain all the files we
> need for AOOo.  If we have two repositories, a "clean" one in SVN and
> an "unreviewed" version in git that we occasionally dip into for
> unintegrated branches, then IP compliance becomes more difficult.
> Maybe not impossible, but certainly more complicated.

The Oracle SGA is for sure something to fix first.

Considering the two repositories and release/graduation:
- we have to *check it carefully*, but from my POV the "official" repo
will be the svn one; ooo-history is just a backup, so it might not be
considered for release/graduation.

- the IP / incompatible 3rd-party issue could be more easily manageable,
I think, because:
* we first work on one hg branch/trunk and not on the whole ocean of
code. So it will be easier to get rid of IP and 3rd-party problems,
because there are fewer places for them to hide.
* integration of a new svn branch (from ooo-history) will be a patch to
the clean svn trunk. This patch / new branch will be checked first by
the "human behind the commit" and afterwards by other ooo committers...
The code added will be smaller and more than double-checked, so
IP/3rd-party problems will have fewer chances to slip in through the
back door.

++


>
> -Rob
>
>
>> ++

Re: single repository status

Posted by Rob Weir <ap...@robweir.com>.
On Sat, Jul 23, 2011 at 8:22 AM, florent andré
<fl...@4sengines.com> wrote:
> Hi,
>
> I also think that we need codebase in svn soon.
>

Yes.

> Following your all pretty good comments, import a "perfect" hg history into
> svn seems not to be quite easy... and will require works and effort.
>
> As Michael Stahl says "Hg/git/otherDSCMs and SVN have fundamentally
> different data models for representing branching/merging".
> So hg --> svn is complicated but hg --> git seems to works pretty well.
>
> The main point is to have a way to lurk the history, and to host this
> history into Apache infra in order to be Oracle's server shut down tolerant.
>
> So, what about this proposal ? :
>
> - ask infra to set up a special git ooo-history

I'm assuming that ooo-history would be read-only?

> - import only the main hg branch into svn
> - if someone interested in a specific hg branch :
> ** git checkout theBranch (from ooo-history)
> ** svn create branch from trunk ( Btrunk)
> ** diff Btrunk / theBranch for creating patch
> ** apply patch on Btrunk
> ** commit Btrunk
>
> Sure we will lost a lot of history in svn... but we still have it in git
> ooo-history...
> ... And a new (hi)story begin in Apache svn ! :)
>
> This will require infra to set up a special ooo-history git repos... but if
> we are kind they may accept :).
>
> What do you think ?
>

I like the "lazy" approach.  Defer the branch conversion work until it
is actually needed.  It puts the human judgement in the loop at the
time when it is needed.  If we believed that 100% (or even 80% or 90%)
of the CWS would be needed immediately for integration into our first
release, then it might make sense to do all that work up front.  But
if we believe that only a few are needed, then it doesn't make sense
to hold up the entire project for this.

One concern would be on the IP perspective.  Before we can have a
release or graduate from Podling, we need to review our source code
and remove all incompatible 3rd party components.  We're also working
to ensure that the Oracle SGA is updated to contain all the files we
need for AOOo.  If we have two repositories, a "clean" one in SVN and
an "unreviewed" version in git that we occasionally dip into for
unintegrated branches, then IP compliance becomes more difficult.
Maybe not impossible, but certainly more complicated.

-Rob


> ++
>
>
> On 07/21/2011 03:16 PM, Jens-Heiner Rechtien wrote:
>>
>> On 07/20/2011 12:05 PM, Eike Rathke wrote:
>>>
>>> Hi Michael,
>>>
>>> On Tuesday, 2011-07-19 23:26:48 +0200, Michael Stahl wrote:
>>>
>>>> unfortunately it seems none of the tools that convert from HG or git
>>>> to SVN can create SVN branches with SVN mergeinfo (necessary in
>>>> order to be able to merge the branches back into the trunk).
>>>>
>>>> there are some tools to convert from git that can create SVN
>>>> branches, but they leave out the SVN mergeinfo; apparently the
>>>> intent is to maintain a read-only mirror...
>>>
>>> I didn't dug deeper into this, but conversion from hg to git should be
>>> pretty straight forward and then there's git-svn, would that be viable
>>> to import branches as well?
>>>
>>>
>>>> basically we have these options for converting to SVN:
>>>>
>>>> 1. convert full history
>>>> requires writing tool to create SVN branches and mergeinfo
>>>>
>>>> 2. convert trunk only, using follow-first-parent heuristic
>>>> with hacks where we want to follow second parent instead
>>>>
>>>> 3. no history in SVN, just check in OOO340 tip
>>>
>>> I'd prefer #3 and have a read-only hg/git repository for cases where one
>>> really wants to lookup history. AOOo needs to get its code base going.
>>
>> +1 for #3. We need the repository ASAP to get going. If we have to write
>> the conversion tools first we'll loose way to much time which could be
>> spent better on getting AOOo 3.4 (or whatever the first AOOo release
>> will be called) out of the door. A pity that Apache git support is not
>> ready for prime time ... would make things so much easier.
>>
>> Heiner
>>
>>
>

Re: single repository status

Posted by florent andré <fl...@4sengines.com>.
Hi,

I also think that we need codebase in svn soon.

Following all your good comments, importing a "perfect" hg history
into svn seems not to be easy... and will require work and effort.

As Michael Stahl says "Hg/git/otherDSCMs and SVN have fundamentally
different data models for representing branching/merging".
So hg --> svn is complicated but hg --> git seems to work pretty well.

The main point is to have a way to browse the history, and to host this
history on Apache infra so that we can tolerate a shutdown of Oracle's servers.

So, what about this proposal ? :

- ask infra to set up a special git ooo-history
- import only the main hg branch into svn
- if someone is interested in a specific hg branch :
** git checkout theBranch (from ooo-history)
** svn create branch from trunk ( Btrunk)
** diff Btrunk / theBranch for creating patch
** apply patch on Btrunk
** commit Btrunk
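
As an illustration only, the per-branch steps above could be written down
as a command plan. This is a sketch under assumptions: the branch name
"cws_example", the base tag "OOO340", the "^/trunk" and "^/branches"
layout and the patch file name are all hypothetical, and the plan is only
printed, not executed.

```python
# Hypothetical sketch: build (but do not run) the commands for porting one
# branch from a read-only git "ooo-history" clone into an svn branch.
# All names below (branch, base tag, svn layout) are assumptions.

def port_branch_plan(branch, base="OOO340", svn_branch=None):
    """Return the shell commands, in order, for porting one branch."""
    svn_branch = svn_branch or branch
    patch_file = f"{branch}.patch"
    return [
        # in the ooo-history clone: select the branch of interest
        f"git checkout {branch}",
        # in the svn working copy: create the branch from trunk
        f'svn copy ^/trunk ^/branches/{svn_branch} -m "Create branch {svn_branch}"',
        # in the ooo-history clone: diff between the base and the branch
        f"git diff {base} {branch} > {patch_file}",
        # in a working copy of the new svn branch: apply and commit
        f"patch -p1 -i {patch_file}",
        f'svn commit -m "Apply {branch} from ooo-history"',
    ]


plan = port_branch_plan("cws_example")
for command in plan:
    print(command)
```

Keeping the plan as data makes it easy for the "human behind the commit"
to review each step before anything touches the svn repository.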

Sure, we will lose a lot of history in svn... but we will still have it in the
git ooo-history...
... And a new (hi)story begins in Apache svn ! :)

This will require infra to set up a special ooo-history git repository... but
if we are kind they may accept :).

What do you think ?

++


On 07/21/2011 03:16 PM, Jens-Heiner Rechtien wrote:
> On 07/20/2011 12:05 PM, Eike Rathke wrote:
>> Hi Michael,
>>
>> On Tuesday, 2011-07-19 23:26:48 +0200, Michael Stahl wrote:
>>
>>> unfortunately it seems none of the tools that convert from HG or git
>>> to SVN can create SVN branches with SVN mergeinfo (necessary in
>>> order to be able to merge the branches back into the trunk).
>>>
>>> there are some tools to convert from git that can create SVN
>>> branches, but they leave out the SVN mergeinfo; apparently the
>>> intent is to maintain a read-only mirror...
>>
>> I didn't dug deeper into this, but conversion from hg to git should be
>> pretty straight forward and then there's git-svn, would that be viable
>> to import branches as well?
>>
>>
>>> basically we have these options for converting to SVN:
>>>
>>> 1. convert full history
>>> requires writing tool to create SVN branches and mergeinfo
>>>
>>> 2. convert trunk only, using follow-first-parent heuristic
>>> with hacks where we want to follow second parent instead
>>>
>>> 3. no history in SVN, just check in OOO340 tip
>>
>> I'd prefer #3 and have a read-only hg/git repository for cases where one
>> really wants to lookup history. AOOo needs to get its code base going.
>
> +1 for #3. We need the repository ASAP to get going. If we have to write
> the conversion tools first we'll loose way to much time which could be
> spent better on getting AOOo 3.4 (or whatever the first AOOo release
> will be called) out of the door. A pity that Apache git support is not
> ready for prime time ... would make things so much easier.
>
> Heiner
>
>

Re: single repository status

Posted by Jens-Heiner Rechtien <jh...@web.de>.
On 07/20/2011 12:05 PM, Eike Rathke wrote:
> Hi Michael,
>
> On Tuesday, 2011-07-19 23:26:48 +0200, Michael Stahl wrote:
>
>> unfortunately it seems none of the tools that convert from HG or git
>> to SVN can create SVN branches with SVN mergeinfo (necessary in
>> order to be able to merge the branches back into the trunk).
>>
>> there are some tools to convert from git that can create SVN
>> branches, but they leave out the SVN mergeinfo; apparently the
>> intent is to maintain a read-only mirror...
>
> I didn't dug deeper into this, but conversion from hg to git should be
> pretty straight forward and then there's git-svn, would that be viable
> to import branches as well?
>
>
>> basically we have these options for converting to SVN:
>>
>> 1. convert full history
>>     requires writing tool to create SVN branches and mergeinfo
>>
>> 2. convert trunk only, using follow-first-parent heuristic
>>     with hacks where we want to follow second parent instead
>>
>> 3. no history in SVN, just check in OOO340 tip
>
> I'd prefer #3 and have a read-only hg/git repository for cases where one
> really wants to lookup history. AOOo needs to get its code base going.

+1 for #3. We need the repository ASAP to get going. If we have to write 
the conversion tools first we'll lose way too much time which could be 
spent better on getting AOOo 3.4 (or whatever the first AOOo release 
will be called) out of the door. A pity that Apache git support is not 
ready for prime time ... would make things so much easier.

Heiner


-- 
Jens-Heiner Rechtien

Re: single repository status

Posted by Eike Rathke <oo...@erack.de>.
Hi Michael,

On Tuesday, 2011-07-19 23:26:48 +0200, Michael Stahl wrote:

> unfortunately it seems none of the tools that convert from HG or git
> to SVN can create SVN branches with SVN mergeinfo (necessary in
> order to be able to merge the branches back into the trunk).
> 
> there are some tools to convert from git that can create SVN
> branches, but they leave out the SVN mergeinfo; apparently the
> intent is to maintain a read-only mirror...

I didn't dig deeper into this, but conversion from hg to git should be
pretty straightforward, and then there's git-svn; would that be viable
for importing branches as well?


> basically we have these options for converting to SVN:
> 
> 1. convert full history
>    requires writing tool to create SVN branches and mergeinfo
> 
> 2. convert trunk only, using follow-first-parent heuristic
>    with hacks where we want to follow second parent instead
> 
> 3. no history in SVN, just check in OOO340 tip

I'd prefer #3 and have a read-only hg/git repository for cases where one
really wants to look up history. AOOo needs to get its code base going.

  Eike

-- 
 PGP/OpenPGP/GnuPG encrypted mail preferred in all private communication.
 Key ID: 0x293C05FD - 997A 4C60 CE41 0149 0DB3  9E96 2F1A D073 293C 05FD

Re: single repository status

Posted by BRM <bm...@yahoo.com>.
----- Original Message ----

> From: Michael Stahl <ms...@openoffice.org>
> On 14.07.2011 18:03, Michael Stahl wrote:
> > On 09.07.2011 07:57, Greg  Stein wrote:
> >> Once the tags are properly marked, then we can start  testing the
> >> conversion to Subversion. Please note, however: the hg  conversion
> >> script does *not* process tags. We will have to write  that. We will
> >> also have to somehow manage construction of branches.  We may also have
> >> to update the conversion tool to properly handle  the "merge"
> >> changesets that Hg records.
> > 
> > after  looking at the HG convert svn_sink a little, this looks rather
> > difficult  to me.
> > 
> > well, tags should be easy.
> > 
> > but how to  reconstruct the branching is kind of unclear to me.
> > the first ~260k  changesets are linear and were initially taken from SVN,
> > so that  shouldn't be a problem.
> > but then there are lots of HG merge  changesets.
> 
> unfortunately it seems none of the tools that convert from HG or git to SVN
> can create SVN branches with SVN mergeinfo (necessary in order to be able to
> merge the branches back into the trunk).
> 
> there are some tools to convert from git that can create SVN branches, but
> they leave out the SVN mergeinfo; apparently the intent is to maintain a
> read-only mirror...
> 
> > in the HG repo there is sort of a "master" history that is  basically a
> > sequence of CWS integration merge changesets, with the  occasional
> > masterfix thrown in.
> > but this is not explicit: these  are all just merges with 2 parents, and
> > it's not guaranteed which of  these 2 is the CWS and which the master.
> > basically the requirement is to  convert one of the thousands of paths
> > through this DAG from the common  ancestor revision ~260k to revision
> > c904c1944462 (OOO340 head) into a  SVN trunk.
> 
> to be more specific, the last revision from OOo SVN was the tag
> "last_svn_milestone" (263206  d0058b5891eb)
> (which is not actually a common ancestor, as some CWSes are based on older
> milestones)
> 
> the following merge commits are those on the DEV300 master/"trunk" where i've
> noticed the "follow the first parent" heuristic would fail, and the second
> parent (after the arrow) should be taken instead.
> 
> 90552a19cdc4 ->  dae1ffc5c15d
> 9572400cd241 -> ce1b12199f72
> c33f611c4373 ->  5d0c069f2570
> 13b3d2dae916 -> 62bf02dff30b
> 37dc1e423a1e ->  664c5f3a9291
> 
> 
> basically we have these options for converting to  SVN:
> 
> 1. convert full history
>    requires writing tool to create  SVN branches and mergeinfo
> 
> 2. convert trunk only, using  follow-first-parent heuristic
>    with hacks where we want to follow  second parent instead
> 
> 3. no history in SVN, just check in OOO340  tip
> 
> option 1. seems to be too much effort to me, and would have to be implemented
> by a real SVN wizard.
> 
> option 2. requires some effort, but should be doable; we still lose all CWS
> internal history though (e.g. CWS undoapi becomes a single 57,788 line
> changeset against 562 files).
> 
> after thinking about it for a while, a trunk-only history in SVN doesn't seem
> to be all that useful to me (you need to go over the network to access
> something incomplete...).
> 
> far more useful would be to have a read-only HG/git repository available that
> contains the _full_ history and all open CWS branches, which can be cloned and
> examined off-line.
> 
> opinions?
> 

I at least lurk on the SVN and TSVN lists where similar kinds of questions come 
up on occasion as people convert to/from SVN.

#1 is probably simpler than one may think as it could be a basic shell script 
that simply iterates over every revision in HG and makes the equivalent commit 
to SVN.
People have used that approach to convert CVS repositories to SVN where the 
conversion was not very simple. This method does maintain all the merge info for 
branching.
For comparison's sake, cvs2svn extracts the data from the CVS/RCS repository and 
builds an SVN Dump File with the equivalent data, which I doubt contains the 
merge info, to be loaded into SVN.
Both processes are rather time-consuming.
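
One detail such a replay script would have to get right is ordering: every
changeset must be committed after all of its parents. A minimal sketch of
that ordering step, assuming the parent links have already been extracted
from HG into a dictionary (the changeset names below are invented, and the
actual commit step is left out):

```python
# Hypothetical sketch: order changesets so parents come before children.
# Requires Python 3.9+ for graphlib. The ids are made up for illustration.
from graphlib import TopologicalSorter

# changeset -> tuple of parent changesets
toy_parents = {
    "r1": (),
    "r2": ("r1",),
    "cws1": ("r1",),
    "r3": ("r2", "cws1"),   # merge changeset with two parents
}

# TopologicalSorter takes a mapping of node -> predecessors, which is
# exactly the shape of a parent map.
replay_order = list(TopologicalSorter(toy_parents).static_order())

position = {rev: i for i, rev in enumerate(replay_order)}
for rev in replay_order:
    # a real script would make the equivalent SVN commit here
    print("replay", rev, "parents:", toy_parents[rev])
```

The order is not unique when branches run in parallel; any order that
respects the parent links would do for a replay.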

If this was desired, then I would suggest whoever spearheads this talk with the 
SVN and HG developers. I don't know what the API/interface for HG is, but SVN 
has a nice library that can be utilized if a tool was being built.
Further, the cvs2svn project may be of interest as it also supports cvs2hg, so it 
may provide at least some insight into an hg2svn tool.

$0.02

Ben


Re: single repository status

Posted by BRM <bm...@yahoo.com>.
----- Original Message ----

> From: Pedro F. Giffuni <gi...@tutopia.com>
> To: ooo-dev@incubator.apache.org
> Sent: Wed, July 20, 2011 12:30:47 AM
> Subject: Re: single repository status
> 
> FWIW;
> 
> When merging the branches most of the CWS information will  be
> lost anyways .. won't it? I have seen other projects using
> subversion  that take a snapshot and eventually update stuff
> from trunk, but when the  project is finished and merged back
> the branch history is not viewable from  the trunk, the
> commit message is normally the only trace left.
> 

SVN keeps the branches in the history and they will always be available even if 
not visible in the current global revision.
Tools like TortoiseSVN provide Revision Graphs that will map it out and make it 
easy to view and access the various branches in the history.
This is also why SVN advocates using URLs with Peg Revisions for certain 
functionality - e.g. svn:externals - which keeps the URL locked to a given 
portion of the repository history.
And it allows you to re-use branch names for different tasks if desired.

Of course, that is supposing that you delete the branch when it has been fully 
re-integrated; which if you are using merge tracking and "svn merge 
--reintegrate" you will need to do.

One of Subversion's goals is to not lose anything, period - including history.
So while there is still considerable work going on around merges (see info on SVN 
1.7), the history of the branches is never lost unless you explicitly dump the 
repository, edit the dump, and reload it - which for most is not worth the time 
& effort - but then, SVN didn't lose the information as you explicitly deleted 
it outside of SVN.

----- Original Message ----
> From: Michael Stahl <ms...@openoffice.org>
> i think the problem is that  Hg/git/otherDSCMs and SVN have fundamentally 
> different data models for  representing branching/merging; the fact that 
> so far nobody has come up with  a conversion tool that handles merges 
> well suggests to me that it's not an  easy problem to solve.

I think there are several issues at play. SVN historically (pre-SVN 1.5) did not 
do anything for tracking merges; so you had to track it all in the logs and that 
was project/repository dependent - even developer dependent. So that aspect 
makes it really hard to do merge tracking in conversion tools when there is no 
information to use - at least, moving from SVN to other things. I don't know 
about Hg/git/etc.

The fundamental issue is that the conversion tools need to be able to process 
all the information - all the meta data, etc - stored by the tool being 
converted from, and then be able to apply it appropriately to the tool being 
converted to. While non-trivial, that is simply mapping the various pieces of 
information between the tools when that information is available. cvs2svn could 
certainly write the merge tracking to SVN in the SVN Dump Files it generated if 
it understood the information and could do so in a useful manner; but then it 
also needs to be able to track the revision numbers well enough to be able to 
apply them in the produced dump file - also non-trivial. So at least for SVN, I 
think the issue is more that it is non-trivial work and the feature has not 
been around quite long enough that people implementing the conversion tools have 
had time to integrate them properly, and perhaps there has also not been enough 
demand for it yet either. I'm not sure CVS did much in way of merge tracking, so 
that might play into it as well as most converting to SVN/git/hg/etc are 
converting from CVS, not one of the others where merge tracking is more 
prevalent or even assumed.
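
To make the "track the revision numbers" part concrete: since SVN 1.5 merge
tracking is stored in the svn:mergeinfo property as a source path followed
by the merged revisions, with consecutive revisions written as ranges. A
rough sketch of producing one such line follows; the branch path and
revision numbers are made up, and a real converter would first have to map
each source changeset to its new SVN revision number.

```python
# Hypothetical sketch: format one svn:mergeinfo line from a list of merged
# revision numbers. Path and revisions are invented for illustration.

def format_mergeinfo(source_path, revisions):
    """Return 'path:ranges', e.g. '/branches/x:3-5,8'."""
    revs = sorted(set(revisions))
    if not revs:
        return ""
    ranges = []
    start = prev = revs[0]
    for rev in revs[1:]:
        if rev == prev + 1:
            prev = rev          # still inside a consecutive run
            continue
        ranges.append((start, prev))
        start = prev = rev
    ranges.append((start, prev))
    parts = [str(a) if a == b else f"{a}-{b}" for a, b in ranges]
    return f"{source_path}:{','.join(parts)}"


line = format_mergeinfo("/branches/cws_example", [12, 10, 11, 15, 20, 21])
print(line)
```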

As always, $0.02

Ben


Re: single repository status

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.
FWIW;

When merging the branches most of the CWS information will be
lost anyways .. won't it? I have seen other projects using
subversion that take a snapshot and eventually update stuff
from trunk, but when the project is finished and merged back
the branch history is not viewable from the trunk, the
commit message is normally the only trace left.

Perhaps we should do a "simple" trunk conversion and identify
the useful branches so we can merge them one by one? We
could even set a schedule for branch merging and, after a certain
date, just not look back.

While here.. what tool are you using to do the conversion?
A quick look would indicate

hg convert --dest-type svn hgreponame svnreponame

would do the trick, but then some people just write scripts
for this:
http://qa-ex-consultant.blogspot.com/2009/10/converting-mercurial-repo-to-subversion.html

Pedro.

--- On Tue, 7/19/11, Michael Stahl <ms...@openoffice.org> wrote:

> On 14.07.2011 18:03, Michael Stahl
> wrote:
> > On 09.07.2011 07:57, Greg Stein wrote:
> >> Once the tags are properly marked, then we can
> start testing the
> >> conversion to Subversion. Please note, however:
> the hg conversion
> >> script does *not* process tags. We will have to
> write that. We will
> >> also have to somehow manage construction of
> branches. We may also have
> >> to update the conversion tool to properly handle
> the "merge"
> >> changesets that Hg records.
> > 
> > after looking at the HG convert svn_sink a little,
> this looks rather
> > difficult to me.
> > 
> > well, tags should be easy.
> > 
> > but how to reconstruct the branching is kind of
> unclear to me.
> > the first ~260k changesets are linear and were
> initially taken from SVN,
> > so that shouldn't be a problem.
> > but then there are lots of HG merge changesets.
> 
> unfortunately it seems none of the tools that convert from
> HG or git to SVN can create SVN branches with SVN mergeinfo
> (necessary in order to be able to merge the branches back
> into the trunk).
> 
> there are some tools to convert from git that can create
> SVN branches, but they leave out the SVN mergeinfo;
> apparently the intent is to maintain a read-only mirror...
> 
> > in the HG repo there is sort of a "master" history
> that is basically a
> > sequence of CWS integration merge changesets, with the
> occasional
> > masterfix thrown in.
> > but this is not explicit: these are all just merges
> with 2 parents, and
> > it's not guaranteed which of these 2 is the CWS and
> which the master.
> > basically the requirement is to convert one of the
> thousands of paths
> > through this DAG from the common ancestor revision
> ~260k to revision
> > c904c1944462 (OOO340 head) into a SVN trunk.
> 
> to be more specific, the last revision from OOo SVN was the
> tag "last_svn_milestone" (263206  d0058b5891eb)
> (which is not actually a common ancestor, as some CWSes are
> based on older milestones)
> 
> the following merge commits are those on the DEV300
> master/"trunk" where i've noticed the "follow the first
> parent" heuristic would fail, and the second parent (after
> the arrow) should be taken instead.
> 
> 90552a19cdc4 -> dae1ffc5c15d
> 9572400cd241 -> ce1b12199f72
> c33f611c4373 -> 5d0c069f2570
> 13b3d2dae916 -> 62bf02dff30b
> 37dc1e423a1e -> 664c5f3a9291
> 
> 
> basically we have these options for converting to SVN:
> 
> 1. convert full history
>    requires writing tool to create SVN
> branches and mergeinfo
> 
> 2. convert trunk only, using follow-first-parent heuristic
>    with hacks where we want to follow second
> parent instead
> 
> 3. no history in SVN, just check in OOO340 tip
> 
> option 1. seems to be too much effort to me, and would have
> to be implemented by a real SVN wizard.
> 
> option 2. requires some effort, but should be doable; we
> still lose all CWS internal history though (e.g. CWS undoapi
> becomes a single 57,788 line changeset against 562 files).
> 
> after thinking about it for a while, a trunk-only history
> in SVN doesn't seem to be all that useful to me (you need to
> go over the network to access something incomplete...).
> 
> far more useful would be to have a read-only HG/git
> repository available that contains the _full_ history and
> all open CWS branches, which can be cloned and examined
> off-line.
> 
> opinions?
> 
> regards,
>  michael
> 
> 

Re: single repository status

Posted by Michael Stahl <ms...@openoffice.org>.
On 14.07.2011 18:03, Michael Stahl wrote:
> On 09.07.2011 07:57, Greg Stein wrote:
>> Once the tags are properly marked, then we can start testing the
>> conversion to Subversion. Please note, however: the hg conversion
>> script does *not* process tags. We will have to write that. We will
>> also have to somehow manage construction of branches. We may also have
>> to update the conversion tool to properly handle the "merge"
>> changesets that Hg records.
>
> after looking at the HG convert svn_sink a little, this looks rather
> difficult to me.
>
> well, tags should be easy.
>
> but how to reconstruct the branching is kind of unclear to me.
> the first ~260k changesets are linear and were initially taken from SVN,
> so that shouldn't be a problem.
> but then there are lots of HG merge changesets.

unfortunately it seems none of the tools that convert from HG or git to 
SVN can create SVN branches with SVN mergeinfo (necessary in order to be 
able to merge the branches back into the trunk).

there are some tools to convert from git that can create SVN branches, 
but they leave out the SVN mergeinfo; apparently the intent is to 
maintain a read-only mirror...

> in the HG repo there is sort of a "master" history that is basically a
> sequence of CWS integration merge changesets, with the occasional
> masterfix thrown in.
> but this is not explicit: these are all just merges with 2 parents, and
> it's not guaranteed which of these 2 is the CWS and which the master.
> basically the requirement is to convert one of the thousands of paths
> through this DAG from the common ancestor revision ~260k to revision
> c904c1944462 (OOO340 head) into a SVN trunk.

to be more specific, the last revision from OOo SVN was the tag 
"last_svn_milestone" (263206  d0058b5891eb)
(which is not actually a common ancestor, as some CWSes are based on 
older milestones)

the following merge commits are those on the DEV300 master/"trunk" where 
i've noticed the "follow the first parent" heuristic would fail, and the 
second parent (after the arrow) should be taken instead.

90552a19cdc4 -> dae1ffc5c15d
9572400cd241 -> ce1b12199f72
c33f611c4373 -> 5d0c069f2570
13b3d2dae916 -> 62bf02dff30b
37dc1e423a1e -> 664c5f3a9291
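
as an illustration only, the "follow the first parent, except at listed
merges" walk could look roughly like this. it is a sketch under
assumptions: the parent map would have to be extracted from the HG
repository first, and the changeset names below are invented, not real
OOo changesets.

```python
# Hypothetical sketch: choose a linear "trunk" path through a merge DAG.
# parents maps a changeset id to its (first, second) parents; the ids are
# made up for illustration.

def trunk_path(parents, head, root, second_parent_overrides=frozenset()):
    """Walk from head back to root, following the first parent of each
    changeset except at merges listed in second_parent_overrides."""
    path = [head]
    node = head
    while node != root:
        ps = parents.get(node, ())
        if not ps:
            raise ValueError(f"reached {node!r} without finding {root!r}")
        if node in second_parent_overrides and len(ps) > 1:
            node = ps[1]
        else:
            node = ps[0]
        path.append(node)
    path.reverse()  # oldest first, the order in which trunk commits would be made
    return path


toy_parents = {
    "m3": ("m2", "cws_b2"),
    "m2": ("cws_a1", "m1"),   # merge recorded "backwards": master is parent 2
    "m1": ("root",),
    "cws_a1": ("root",),
    "cws_b2": ("cws_b1",),
    "cws_b1": ("m1",),
}

naive = trunk_path(toy_parents, "m3", "root")
fixed = trunk_path(toy_parents, "m3", "root", second_parent_overrides={"m2"})
print(naive)
print(fixed)
```

in the toy graph the naive walk wanders into the CWS at "m2", while the
override keeps it on the master line; the five pairs above would play the
role of the override set.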


basically we have these options for converting to SVN:

1. convert full history
    requires writing tool to create SVN branches and mergeinfo

2. convert trunk only, using follow-first-parent heuristic
    with hacks where we want to follow second parent instead

3. no history in SVN, just check in OOO340 tip

option 1. seems to be too much effort to me, and would have to be 
implemented by a real SVN wizard.

option 2. requires some effort, but should be doable; we still lose all 
CWS internal history though (e.g. CWS undoapi becomes a single 57,788 
line changeset against 562 files).

after thinking about it for a while, a trunk-only history in SVN doesn't 
seem to be all that useful to me (you need to go over the network to 
access something incomplete...).

far more useful would be to have a read-only HG/git repository available 
that contains the _full_ history and all open CWS branches, which can be 
cloned and examined off-line.

opinions?

regards,
  michael


Re: single repository status

Posted by Mathias Bauer <Ma...@gmx.net>.
On 14.07.2011 18:03, Michael Stahl wrote:

> one thing we could do in any case is just not convert the CWSes; put up 
> a read-only HG or git repo with all CWSes somewhere, then use that to 
> rebase the CWS onto OOO340, then apply a patch to a SVN branch based on 
> OOO340, then merge with SVN trunk.

And that's what I suggested in the beginning. :-)

We can make sure that all files created by these patches will be part of
the software grant from Oracle by adding them to the list of files.

Or we can make the patches themselves a part of the software grant, thus
adding approx. 100 patch files to the list of granted files.
This would require a rebase of the CWSes, but that can be done quite fast. We
don't need to resolve conflicts now, that can be done later when we
apply the patches to the svn branches.

Regards,
Mathias

Re: single repository status

Posted by Michael Stahl <ms...@openoffice.org>.
On 09.07.2011 07:57, Greg Stein wrote:
> Hi all,
>
> I borrowed some space on some ASF equipment and used the
> 'fetch-all-cws.sh' script to pull down all of the CWS repositories (in
> addition to the OOO340 and master_l10n/OOO340 repositories). This
> consumed 77 Gb of disk space.
>
> The fetch-all-cws.sh script has been updated with all my fixes to make
> this happen, and the cws-list.txt file has been updated to reflect all
> CWSs that have actual content beyond OOO340.
>
> I have updated single-hg.sh to do a basic combination of the
> repositories. This produces a 2.7 Gb repository.
>
> However! The single-hg script does not apply any tags/bookmarks for
> the heads introduced by the pulls from the CWS repositories.
> Originally, I thought to apply bookmarks, but ... meh. We can just
> apply tags. It isn't like we are going to push to the repository and
> need the bookmark to float with the changes. So the next step is to
> update single-hg to apply the appropriate tags. At that point, I will
> publish a bundle so that most people can skip all the above steps and
> start playing with the repository that will feed into the next step
> (convert to svn).
>
> Once the tags are properly marked, then we can start testing the
> conversion to Subversion. Please note, however: the hg conversion
> script does *not* process tags. We will have to write that. We will
> also have to somehow manage construction of branches. We may also have
> to update the conversion tool to properly handle the "merge"
> changesets that Hg records.

after looking at the HG convert svn_sink a little, this looks rather 
difficult to me.

well, tags should be easy.

but how to reconstruct the branching is kind of unclear to me.
the first ~260k changesets are linear and were initially taken from SVN, 
so that shouldn't be a problem.
but then there are lots of HG merge changesets.

in the HG repo there is sort of a "master" history that is basically a 
sequence of CWS integration merge changesets, with the occasional 
masterfix thrown in.
but this is not explicit: these are all just merges with 2 parents, and 
it's not guaranteed which of these 2 is the CWS and which the master.
basically the requirement is to convert one of the thousands of paths 
through this DAG from the common ancestor revision ~260k to revision 
c904c1944462 (OOO340 head) into a SVN trunk.

> In the single repository that I constructed, there are 102 heads. I
> believe these will become branches.

well, they should...
these don't have a common ancestor from the "trunk" history; generally 
they contain several merges from the master into the CWS as well.
it was also quite common practice to merge one CWS into another.
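
for reference, "head" here means a changeset that is not a parent of any
other changeset. a small sketch of that definition, using an invented
parent map in which a CWS has merged from the master once:

```python
# Hypothetical sketch: find the heads of a changeset DAG.
# parents maps a changeset id to a tuple of its parents; ids are made up.

def find_heads(parents):
    """Return the changesets that are not a parent of any other changeset."""
    all_nodes = set(parents)
    referenced = set()
    for ps in parents.values():
        referenced.update(ps)
        all_nodes.update(ps)
    return sorted(all_nodes - referenced)


toy_parents = {
    "m1": (),
    "m2": ("m1",),
    "m3": ("m2",),
    "cws_x1": ("m1",),
    "cws_x2": ("cws_x1", "m2"),   # merge from the master into the CWS
}

heads = find_heads(toy_parents)
print(heads)
```

the CWS head "cws_x2" has both CWS and master changesets among its
ancestors, which is why a head does not map cleanly onto a single branch
point.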

i don't know the SVN implementation of branching/merging that well, and i
have no idea what this history should look like in SVN, or whether it's
even expressible.

one thing we could do in any case is just not convert the CWSes; put up 
a read-only HG or git repo with all CWSes somewhere, then use that to 
rebase the CWS onto OOO340, then apply a patch to a SVN branch based on 
OOO340, then merge with SVN trunk.

> We will need a script to take the single repository as input, and run
> the conversion process. I think there will multiple inputs to that
> process, so there will be additional input files to coordinate and
> drive the conversion. All of this should be placed into the tools/dev/
> directory. Help is wanted!
>
> Cheers,
> -g

-- 
'I have had at least two students tell me (in the words of one) that
  "you seduced me, you made me think computing and programming was
  sooo elegant and sooo beautiful, and now I am stuck with C++."'
  -- Matthias Felleisen


Re: single repository status

Posted by Greg Stein <gs...@gmail.com>.
On Jul 9, 2011 3:45 PM, "Mathias Bauer" <Ma...@gmx.net> wrote:
>
> On 09.07.2011 20:23, Greg Stein wrote:
>
> > I don't understand this. There are changesets in sbclean that are *not* in
> > OOO340. Why would we not want them?
> >
> > And what is the term "masterfixes" that you're using here?
>
> After DEV300_m106 was tagged and OOO340 has been branched off, some
> build breakers have been found. Fixes have been applied on both code
> lines, means: in different repositories. Thus their changesets are not
> identical, but the fix nevertheless exists in both repos.

Gotcha. Now I understand. Separate commits rather than a merge.

I'll make sure sbclean is removed from my local set of CWS repositories.

>
> "Masterfix" means that this is a change that was not done on a child
> workspace, but directly committed to the master code line of the repo.
> Sorry for using an insider term. I promise to improve. :-)

Not a problem! It is clear now, so when you use it again, I'll know what you
mean.

Moving forward, I suspect a large amount of work will be committed directly
to trunk. It is just much more efficient at moving things forward. And this
*is* version control, after all. There isn't any way for us to permanently
break things. :-)

Cheers,
-g

Re: single repository status

Posted by Mathias Bauer <Ma...@gmx.net>.
On 09.07.2011 20:23, Greg Stein wrote:

> I don't understand this. There are changesets in sbclean that are *not* in
> OOO340. Why would we not want them?
> 
> And what is the term "masterfixes" that you're using here?

After DEV300_m106 was tagged and OOO340 was branched off, some
build breakers were found. Fixes were applied on both code
lines, that is, in different repositories. Thus their changesets are not
identical, but the fix nevertheless exists in both repos.

"Masterfix" means that this is a change that was not done on a child
workspace, but directly committed to the master code line of the repo.
Sorry for using an insider term. I promise to improve. :-)
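
One hypothetical way to spot such duplicated fixes is to compare changesets by the content of their diffs rather than by changeset id. This is only a sketch under assumptions: patch_fingerprint is a made-up helper, the two diffs are invented examples, and nobody in this thread has said the conversion should work this way.

```python
import hashlib


def patch_fingerprint(diff_text: str) -> str:
    """Hash only the added/removed lines of a unified diff.

    The "diff" line, file headers, hunk offsets and context lines are
    ignored, so the same fix committed separately in two repositories
    gets the same fingerprint. Simplification: a removed line whose
    content itself starts with "-- " would be mistaken for a header.
    """
    kept = []
    for line in diff_text.splitlines():
        if line.startswith(("+++ ", "--- ")):
            continue  # file headers, not content
        if line.startswith(("+", "-")):
            kept.append(line.rstrip())
    return hashlib.sha1("\n".join(kept).encode("utf-8")).hexdigest()


# Invented example: the same fix committed on two code lines.
fix_on_dev300 = """\
diff -r 1111aaaa -r 2222bbbb module/source/file.cxx
--- a/module/source/file.cxx
+++ b/module/source/file.cxx
@@ -10,3 +10,3 @@
 void f()
-    return broken();
+    return fixed();
 // end
"""

fix_on_ooo340 = """\
diff -r 3333cccc -r 4444dddd module/source/file.cxx
--- a/module/source/file.cxx
+++ b/module/source/file.cxx
@@ -42,3 +42,3 @@
 void f()
-    return broken();
+    return fixed();
 // end
"""

print(patch_fingerprint(fix_on_dev300) == patch_fingerprint(fix_on_ooo340))  # True
```

A CWS whose every changeset has a fingerprint already present in OOO340 would then be a candidate for dropping from the list.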

Regards,
Mathias


Re: single repository status

Posted by Greg Stein <gs...@gmail.com>.
I don't understand this. There are changesets in sbclean that are *not* in
OOO340. Why would we not want them?

And what is the term "masterfixes" that you're using here?

Thanks,
-g

On Jul 9, 2011 7:16 AM, "Mathias Bauer" <Ma...@gmx.net> wrote:
> Hi Greg,
>
> you have reenabled "sbclean": the only differences from ooo340 it contains
> are some masterfixes on the dev300 code line that are in ooo340 anyway
> (otherwise the code couldn't have been built). That's the reason why I
> left it disabled. We can remove it from the list.
>
> There may be some other cws that contain these masterfixes; that's
> something we have to deal with later.
>
> Regards,
> Mathias
>
> On 09.07.2011 07:57, Greg Stein wrote:
>
>> Hi all,
>>
>> I borrowed some space on some ASF equipment and used the
>> 'fetch-all-cws.sh' script to pull down all of the CWS repositories (in
>> addition to the OOO340 and master_l10n/OOO340 repositories). This
>> consumed 77 Gb of disk space.
>>
>> The fetch-all-cws.sh script has been updated with all my fixes to make
>> this happen, and the cws-list.txt file has been updated to reflect all
>> CWSs that have actual content beyond OOO340.
>>
>> I have updated single-hg.sh to do a basic combination of the
>> repositories. This produces a 2.7 Gb repository.
>>
>> However! The single-hg script does not apply any tags/bookmarks for
>> the heads introduced by the pulls from the CWS repositories.
>> Originally, I thought to apply bookmarks, but ... meh. We can just
>> apply tags. It isn't like we are going to push to the repository and
>> need the bookmark to float with the changes. So the next step is to
>> update single-hg to apply the appropriate tags. At that point, I will
>> publish a bundle so that most people can skip all the above steps and
>> start playing with the repository that will feed into the next step
>> (convert to svn).
>>
>> Once the tags are properly marked, then we can start testing the
>> conversion to Subversion. Please note, however: the hg conversion
>> script does *not* process tags. We will have to write that. We will
>> also have to somehow manage construction of branches. We may also have
>> to update the conversion tool to properly handle the "merge"
>> changesets that Hg records.
>>
>> In the single repository that I constructed, there are 102 heads. I
>> believe these will become branches.
>>
>> We will need a script to take the single repository as input, and run
>> the conversion process. I think there will be multiple inputs to that
>> process, so there will be additional input files to coordinate and
>> drive the conversion. All of this should be placed into the tools/dev/
>> directory. Help is wanted!
>>
>> Cheers,
>> -g
>>
>

Re: single repository status

Posted by Stephan Bergmann <st...@googlemail.com>.
On Jul 9, 2011, at 1:16 PM, Mathias Bauer wrote:
> you have reenabled "sbclean": the only differences from ooo340 it contains
> are some masterfixes on the dev300 code line that are in ooo340 anyway
> (otherwise the code couldn't have been built). That's the reason why I
> left it disabled. We can remove it from the list.

Yes, CWS sbclean was my "dummy clean master CWS for testing," just tracking tip of DEV300.  Drop it.

-Stephan

Re: single repository status

Posted by Mathias Bauer <Ma...@gmx.net>.
Hi Greg,

you have reenabled "sbclean": the only differences from ooo340 it contains
are some masterfixes on the dev300 code line that are in ooo340 anyway
(otherwise the code couldn't have been built). That's the reason why I
left it disabled. We can remove it from the list.

There may be some other cws that contain these masterfixes; that's
something we have to deal with later.

Regards,
Mathias

On 09.07.2011 07:57, Greg Stein wrote:

> Hi all,
> 
> I borrowed some space on some ASF equipment and used the
> 'fetch-all-cws.sh' script to pull down all of the CWS repositories (in
> addition to the OOO340 and master_l10n/OOO340 repositories). This
> consumed 77 Gb of disk space.
> 
> The fetch-all-cws.sh script has been updated with all my fixes to make
> this happen, and the cws-list.txt file has been updated to reflect all
> CWSs that have actual content beyond OOO340.
> 
> I have updated single-hg.sh to do a basic combination of the
> repositories. This produces a 2.7 Gb repository.
> 
> However! The single-hg script does not apply any tags/bookmarks for
> the heads introduced by the pulls from the CWS repositories.
> Originally, I thought to apply bookmarks, but ... meh. We can just
> apply tags. It isn't like we are going to push to the repository and
> need the bookmark to float with the changes. So the next step is to
> update single-hg to apply the appropriate tags. At that point, I will
> publish a bundle so that most people can skip all the above steps and
> start playing with the repository that will feed into the next step
> (convert to svn).
> 
> Once the tags are properly marked, then we can start testing the
> conversion to Subversion. Please note, however: the hg conversion
> script does *not* process tags. We will have to write that. We will
> also have to somehow manage construction of branches. We may also have
> to update the conversion tool to properly handle the "merge"
> changesets that Hg records.
> 
> In the single repository that I constructed, there are 102 heads. I
> believe these will become branches.
> 
> We will need a script to take the single repository as input, and run
> the conversion process. I think there will be multiple inputs to that
> process, so there will be additional input files to coordinate and
> drive the conversion. All of this should be placed into the tools/dev/
> directory. Help is wanted!
> 
> Cheers,
> -g
>