Posted to dev@openoffice.apache.org by Rob Weir <ap...@robweir.com> on 2011/06/27 22:06:23 UTC

Getting to our first build

Today marks the start of our third week at Apache.  We've been doing a
lot of account set ups, exploring the web site infrastructure,
exploring the migration issues we'll be dealing with and getting to
know each other.  This is good.

Up on the wiki we have a project plan matrix:

https://cwiki.apache.org/confluence/display/OOOUSERS/Project+Planning

And one of those cells is for the dev work needed to get to a first
build.  Not "first release", which would require QA, IP checklist and
PPMC/PMC approval, but first build which is a bit simpler:

https://cwiki.apache.org/confluence/display/OOOUSERS/Build-Dev-Plan

I've seen a few threads on the list related to investigations of
various copyleft dependencies and whether they are indeed needed, and
whether there are good substitutes.  This is good.

But what I haven't seen is the higher level outline of how we are
going to get to that first build.

I think one approach would be to start with everything, which should
presumably build, and then subtract.  So check in everything from OOo
into SVN, verify that it builds.  That establishes a known state.
Then verify the IP.  Maybe use SVN properties to tag the files that
were covered by Oracle's SGA.  Anything not tagged needs to be
investigated.  Some things lead to requests for amending the Oracle
SGA. When we get those, we indicate so in an SVN property.  Some
things will be GPL/LGPL.  These also get tagged with properties
before being deleted.  We continue to iterate until all files
remaining in the repository have a property indicating that we've
proven their provenance. Ideally, as things are removed, we do so in a
way that we can always still build.  So we start in a well-defined
state and stay in a well-defined state.

The above approach also has the advantage of allowing some degree of
parallel processing, with multiple committers working on different
parts of the project.

Of course, if there is a better way of doing this, please let me know,
or even better update the plan on the wiki so we document what the
approach is.

Regards,

-Rob

Re: Getting to our first build

Posted by Robert Burrell Donkin <ro...@gmail.com>.

On Tue, Jun 28, 2011 at 5:07 PM, Greg Stein <gs...@gmail.com> wrote:
> Top-posting is just fine for replies where you're talking about the
> message in general. If you're replying to specific pieces, then yeah:
> in-line comments are best.
>
> We have no rules against top-posting at Apache. We want to communicate
> rather than get all crazy about the *form* of the communication.

+1

Robert

Re: Getting to our first build

Posted by Greg Stein <gs...@gmail.com>.

Top-posting is just fine for replies where you're talking about the
message in general. If you're replying to specific pieces, then yeah:
in-line comments are best.

We have no rules against top-posting at Apache. We want to communicate
rather than get all crazy about the *form* of the communication.

Cheers,
-g

On Tue, Jun 28, 2011 at 12:04, Pedro F. Giffuni <gi...@tutopia.com> wrote:
> (Sorry for top posting.. for now it's just more practical.)
>
> Perhaps some of the migration stuff can be done
> in the older OOo site?
> - replacing the GNU iconv header is trivial.
> - ICU needs to be updated to 4.8 before working on the
> regex replacement.
>
> This depends on people with commit privileges there,
> of course.
>
> Pedro.
>
> --- On Tue, 6/28/11, Rob Weir <ap...@robweir.com> wrote:
> ...
>> Hi Mathias,
>>
>> I don't know whether my approach is feasible either.
>> I know we can
>> set properties on files in SVN.  You can retrieve them
>> individually,
>> but I don't see a way to query them, e.g., list all files
>> that don't
>> have a license property, or download all files that have a
>> license
>> property set to Apache 2.0.
>>
>> So far, I think that you've been doing most of the code
>> investigations.
>>  So I'd trust your judgement on what the next steps should
>> be.  Do you
>> have any thoughts what work remains for the next 1 or 2
>> weeks?  For
>> example, is Oracle currently reviewing the additional SGA
>> requests?
>> Or do we need to request this still?
>>
>> If I understand the rules at Apache (and it is certainly
>> possible I
>> have this wrong, but in that case I'm sure someone will
>> quickly correct
>> me), a Podling can check in all of the code, including
>> parts that are
>> LGPL/GPL. We can make builds from that.  But we are
>> not permitted to
>> make a release or to graduate from a podling until we have
>> gone
>> through the IP checklist, including dealing with code that
>> has an
>> incompatible license.
>>
>> Of course, if you think you are close to having a "clean"
>> version of
>> OOo ready to check in, then I don't want to interrupt the
>> fine work
>> that you are already doing.  But in that case I think
>> it would help if
>> we had a "roadmap" for the next couple of weeks, of what
>> tasks
>> remain, so others can help as well.
>>
>> -Rob
>>
>>
>> On Tue, Jun 28, 2011 at 4:55 AM, Mathias Bauer <Ma...@gmx.net>
>> wrote:
>> > On 27.06.2011 22:06, Rob Weir wrote:
>> >>
>> >> I think one approach would be to start with
>> everything, which should
>> >> presumably build, and then subtract.  So check in
>> everything from OOo
>> >> into SVN, verify that it builds.  That
>> establishes a known state.
>> >> Then verify the IP.  Maybe use SVN properties to
>> tag the files that
>> >> were covered by Oracle's SGA.  Anything not
>> tagged needs to be
>> >> investigated.  Some things lead to requests for
>> amending the Oracle
>> >> SGA. When we get those, we indicate so in an SVN
>> property.  Some
>> >> things will be GPL/LGPL.  These also get
>> tagged with properties
>> >> before being deleted.  We continue to iterate
>> until all files
>> >> remaining in the repository have a property
>> indicating that we've
>> >> proven their provenance. Ideally, as things are
>> removed, we do so in a
>> >> way that we can always still build.  So we start
>> in a well-defined
>> >> state and stay in a well-defined state.
>> >
>> > I can't judge whether this approach is feasible. If it
>> is, I can provide
>> > information about IP from a developers POV. The files
>> that definitely are
>> > not owned by Oracle are already listed in the OOo
>> wiki. I tend to assume
>> > that all other files are under Oracle's copyright
>> until stated otherwise.
>> > But again, I can't judge whether we can go this way.
>> >
>> > Regards,
>> > Mathias
>> >
>>
>> Rr
>

Re: Getting to our first build

Posted by Robert Burrell Donkin <ro...@gmail.com>.

(Reintroducing myself, I'm an Apache Member with some knowledge of
releases, builds and legal stuff. I signed up to provide some hands on
help in these areas. I've also been involved with the Incubator for a
while so I might also jump in with .)

(At Apache, we conventionally avoid top posting and like to cut
content whilst preserving context. Good subjects, renamed when
necessary, are also seen to be a Good Thing. This tends to produce more
concise and inclusive discussion threads by preserving immediate
context for a particular point. So please forgive my editing...)

On Tue, Jun 28, 2011 at 12:34 PM, Rob Weir <ap...@robweir.com> wrote:
> On Tue, Jun 28, 2011 at 4:55 AM, Mathias Bauer <Ma...@gmx.net> wrote:
>> On 27.06.2011 22:06, Rob Weir wrote:

<moved>

>>> I think one approach would be to start with everything, which should
>>> presumably build, and then subtract.  So check in everything from OOo
>>> into SVN, verify that it builds.  That establishes a known state.
>>> Then verify the IP.  Maybe use SVN properties to tag the files that
>>> were covered by Oracle's SGA.  Anything not tagged needs to be
>>> investigated.  Some things lead to requests for amending the Oracle
>>> SGA. When we get those, we indicate so in an SVN property.  Some
>>> things will be GPL/LGPL.  These also get tagged with properties
>>> before being deleted.  We continue to iterate until all files
>>> remaining in the repository have a property indicating that we've
>>> proven their provenance. Ideally, as things are removed, we do so in a
>>> way that we can always still build.  So we start in a well-defined
>>> state and stay in a well-defined state.
>>
>> I can't judge whether this approach is feasible.

</moved>

> I don't know whether my approach is feasible either.  I know we can
> set properties on files in SVN.  You can retrieve them individually,
> but I don't see a way to query them, e.g., list all files that don't
> have a license property, or download all files that have a license
> property set to Apache 2.0.

IMHO new tools are going to be needed sooner or later. For example,
the Jakarta Project led to Ant and Maven. So, if a query tool is
needed for subversion, one can probably be written.
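
A minimal sketch of such a query tool, assuming the property data is
first exported with `svn proplist -v -R --xml` and that the list of all
files comes from elsewhere (for example `svn list -R`). The property
name `ip:provenance` and its values are hypothetical; the thread does
not settle on a naming scheme. The sample XML is hand-written for
illustration, not real repository output.

```python
import xml.etree.ElementTree as ET

# Hypothetical property name; nothing in the thread fixes one.
PROVENANCE_PROP = "ip:provenance"

# Hand-written sample in the shape of `svn proplist -v -R --xml` output.
SAMPLE_PROPLIST_XML = """<properties>
<target path="sw/a.cxx">
<property name="svn:eol-style">native</property>
<property name="ip:provenance">oracle-sga</property>
</target>
<target path="external/b.c">
<property name="ip:provenance">lgpl</property>
</target>
</properties>"""


def provenance_by_path(proplist_xml, prop_name=PROVENANCE_PROP):
    """Map each path that carries prop_name to the property's value."""
    root = ET.fromstring(proplist_xml)
    found = {}
    for target in root.iter("target"):
        for prop in target.iter("property"):
            if prop.get("name") == prop_name:
                found[target.get("path")] = (prop.text or "").strip()
    return found


def untagged(all_paths, tagged):
    """Return the paths that have no provenance property, sorted."""
    return sorted(p for p in all_paths if p not in tagged)


def with_value(tagged, value):
    """Return the paths whose provenance property equals value, sorted."""
    return sorted(p for p, v in tagged.items() if v == value)
```

With this, "list all files that don't have a license property" becomes
`untagged(all_paths, provenance_by_path(xml_text))`, and "all files
with a given license" becomes `with_value(...)`.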

FWIW Ant is a procedural build language. Maven is a declarative one.
IMHO to sustain a rich and diverse downstream ecology, OOo will need a
compositional build language layer.

Robert

Re: Getting to our first build

Posted by Mathias Bauer <Ma...@gmx.net>.

On 29.06.2011 11:58, Greg Stein wrote:
> On Wed, Jun 29, 2011 at 04:26, Mathias Bauer<Ma...@gmx.net>  wrote:
>> On 29.06.2011 00:07, Greg Stein wrote:
>>>
>>> On Tue, Jun 28, 2011 at 15:31, Rob Weir<ap...@robweir.com>    wrote:
>>>>
>>>> Let me summarize what I'm hearing the initial steps then are.
>>>>
>>>> 1)  We take the OOo source code with tag OOO340_m1 from
>>>> hg.services.openoffice.org, including the full history, and convert
>>>> that into a SVN repository, e.g.,:   hg convert --dest-type svn
>>>> hgreponame svnreponame
>>>>
>>>>
>>>> Who does this?  Is this something that can be done remotely, or do we
>>>> need an Oracle admin to do this for us?
>>>
>>> We would do this. We have all the access that we need (open source, yay!)
>>>
>>> I've started on a script to create this (local) Hg repository. See
>>> tools/dev/single-hg.sh for my first bits. I'm trying it out now, but
>>> it is probably going to take a while to run :-P  (I also have no idea
>>> about CWSs)
>>
>> Did you clone the repo at
>>
>> http://hg.services.openoffice.org/OOO340?
>>
>> This is the one we should use.
>
> As you can see from the script, it is designed around DEV300.
>
> I thought we wanted the latest development branch?

The OOO340 repo was branched off from DEV300 and after the branch 
nothing was added to DEV300, only to OOO340, as we have been in release 
stabilisation mode where only "release critical issues" have been worked on.

Regards,
Mathias

Re: Getting to our first build

Posted by Mathias Bauer <Ma...@gmx.net>.

On 29.06.2011 12:53, Reizinger Zoltán wrote:

> On 2011.06.29 at 12:23, Michael Stahl wrote:
>> i still think it makes sense to go a step further and actually merge
>> all finished CWSes into OOO340 using HG first, because that is by far
>> the easiest way and doesn't have any technical pitfalls.
>>
> This would not be good for the database part of OOo: the CWS hsqldb19 is
> finished and is waiting for DEV300 integration, targeting OOo 3.5.
> It contains the hsqldb 2.x.x database engine, which is incompatible with
> the currently used 1.8.0.10 version (it converts all data to the new
> format, and this is not reversible). A file in the new format still opens
> in an older version of OOo but is not usable; a warning comes up saying
> to use a newer version of OOo.
> If you merge it, it will cause a mess.
> Merging the hsqldb CWS needs to be discussed in the project at a later
> time. It causes incompatibilities, but hsqldb 2.x.x has more features,
> which is good for database users.
Indeed Michael's suggestion to merge finished CWS first is too much: we 
should create branches for them in svn, but merging must be decided on 
for each CWS individually. There may be other reasons why a CWS 
shouldn't be merged to master, e.g. because the work on it is still not 
finished and it will either break the master or introduce horrible bugs.

Thanks for the heads-up regarding CWS hsqldb19.

Regards,
Mathias

Re: Getting to our first build

Posted by Reizinger Zoltán <zr...@hdsnet.hu>.

On 2011.06.29 at 12:23, Michael Stahl wrote:
> On 29.06.2011 11:58, Greg Stein wrote:
>> On Wed, Jun 29, 2011 at 04:26, Mathias Bauer<Ma...@gmx.net>  
>> wrote:
>>> On 29.06.2011 00:07, Greg Stein wrote:
>>>>
>>>> On Tue, Jun 28, 2011 at 15:31, Rob Weir<ap...@robweir.com>    wrote:
>>>>>
>>>>> Let me summarize what I'm hearing the initial steps then are.
>>>>>
>>>>> 1)  We take the OOo source code with tag OOO340_m1 from
>>>>> hg.services.openoffice.org, including the full history, and convert
>>>>> that into a SVN repository, e.g.,:   hg convert --dest-type svn
>>>>> hgreponame svnreponame
>>>>>
>>>>>
>>>>> Who does this?  Is this something that can be done remotely, or do we
>>>>> need an Oracle admin to do this for us?
>>>>
>>>> We would do this. We have all the access that we need (open source, 
>>>> yay!)
>>>>
>>>> I've started on a script to create this (local) Hg repository. See
>>>> tools/dev/single-hg.sh for my first bits. I'm trying it out now, but
>>>> it is probably going to take a while to run :-P  (I also have no idea
>>>> about CWSs)
>>>
>>> Did you clone the repo at
>>>
>>> http://hg.services.openoffice.org/OOO340?
>>>
>>> This is the one we should use.
>>
>> As you can see from the script, it is designed around DEV300.
>>
>> I thought we wanted the latest development branch?
>
> actually no CWS was integrated on the development DEV300 after the 
> release OOO340 was branched off, while on OOO340 release relevant 
> CWSes were integrated.
> so OOO340 contains the latest and greatest stuff.
>
>>> Pulling the CWS should be faster: create a local clone of your 
>>> existing repo
>>> for each cws and pull from the CWS repo at hg.services.ooo. This 
>>> will pull
>>> only the change sets not in your local repo and create a second head
>>> revision in it. This revision could be moved over to svn or you 
>>> could create
>>> a patch from it against whatever version that is on the hg repo.
>>
>> Ah! That's an awesome improvement. Thanks. I'll incorporate that into
>> the scripting.
>>
>> I don't think we want patches. I continue to believe we want a single
>> Hg repository with "everything", and we convert that to Subversion,
>> and then load it into svn.apache.org.
>
> if it is possible to convert HG heads to SVN branches, that would be 
> the way to go.
>
> i still think it makes sense to go a step further and actually merge 
> all finished CWSes into OOO340 using HG first, because that is by far 
> the easiest way and doesn't have any technical pitfalls.
>
This would not be good for the database part of OOo: the CWS hsqldb19 is
finished and is waiting for DEV300 integration, targeting OOo 3.5.
It contains the hsqldb 2.x.x database engine, which is incompatible with
the currently used 1.8.0.10 version (it converts all data to the new
format, and this is not reversible). A file in the new format still opens
in an older version of OOo but is not usable; a warning comes up saying
to use a newer version of OOo.
If you merge it, it will cause a mess.
Merging the hsqldb CWS needs to be discussed in the project at a later
time. It causes incompatibilities, but hsqldb 2.x.x has more features,
which is good for database users.
Zoltan

Re: Getting to our first build

Posted by Greg Stein <gs...@gmail.com>.

On Wed, Jun 29, 2011 at 06:23, Michael Stahl <ms...@openoffice.org> wrote:
> On 29.06.2011 11:58, Greg Stein wrote:
>...
>> I don't think we want patches. I continue to believe we want a single
>> Hg repository with "everything", and we convert that to Subversion,
>> and then load it into svn.apache.org.
>
> if it is possible to convert HG heads to SVN branches, that would be the way
> to go.
>
> i still think it makes sense to go a step further and actually merge all
> finished CWSes into OOO340 using HG first, because that is by far the
> easiest way and doesn't have any technical pitfalls.

Right. We can let Hg tie together all of the branching that was done.
Then, we can port that over to svn for loading into the repository at
the ASF.

There are about 2000 merge commits in the main Hg repository (ie. revs
with two parents). These will need some special care. I haven't
thought much on the problem on how to represent these in svn. I don't
think it is a problem... we just need to ensure that svn:mergeinfo is
set properly.
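
As a rough sketch of how those merge revisions could be enumerated
before deciding how to represent them in svn: the function below takes
one line per changeset in the form "rev p1rev p2rev" and returns the
revisions with two parents. The input format is an assumption; on
recent Mercurial versions it could probably be produced with a log
template built from the {rev}, {p1rev} and {p2rev} keywords, where a
missing parent is reported as -1, but that should be verified against
the Mercurial version in use.

```python
def merge_revisions(log_text):
    """Return revision numbers that have two parents.

    log_text holds one changeset per line as "rev p1rev p2rev";
    a parent value of -1 means "no such parent" (assumed convention).
    """
    merges = []
    for line in log_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the log output
        rev, _p1, p2 = (int(field) for field in line.split())
        if p2 != -1:
            merges.append(rev)
    return merges


# Hand-written sample: revision 3 merges revisions 1 and 2.
SAMPLE_LOG = """0 -1 -1
1 0 -1
2 0 -1
3 1 2
4 3 -1
"""
```

Running `len(merge_revisions(text))` over the full log would give the
number of revisions that need svn:mergeinfo attention.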

Of course, the CWSs probably have more merge commits, but once we have
a pattern established, then we'll be fine.

I do note that the primary Hg repository has only one head ("tip").
And the one CWS that I looked at was similar. We may not have a
problem with dangling/anonymous heads.

re: OOO340 in your later email. Interesting. I've seen calls for
grabbing content from a different tag/tip/branch/whatever. It seems
that we could simply pull "all branches" and then sort out which we'd
like to call "trunk" in the svn repo. It seems that OOO340 has the
most up-to-date changes on it, so we'd make that trunk.

Cheers,
-g

Re: Getting to our first build

Posted by Michael Stahl <ms...@openoffice.org>.

On 29.06.2011 11:58, Greg Stein wrote:
> On Wed, Jun 29, 2011 at 04:26, Mathias Bauer<Ma...@gmx.net>  wrote:
>> On 29.06.2011 00:07, Greg Stein wrote:
>>>
>>> On Tue, Jun 28, 2011 at 15:31, Rob Weir<ap...@robweir.com>    wrote:
>>>>
>>>> Let me summarize what I'm hearing the initial steps then are.
>>>>
>>>> 1)  We take the OOo source code with tag OOO340_m1 from
>>>> hg.services.openoffice.org, including the full history, and convert
>>>> that into a SVN repository, e.g.,:   hg convert --dest-type svn
>>>> hgreponame svnreponame
>>>>
>>>>
>>>> Who does this?  Is this something that can be done remotely, or do we
>>>> need an Oracle admin to do this for us?
>>>
>>> We would do this. We have all the access that we need (open source, yay!)
>>>
>>> I've started on a script to create this (local) Hg repository. See
>>> tools/dev/single-hg.sh for my first bits. I'm trying it out now, but
>>> it is probably going to take a while to run :-P  (I also have no idea
>>> about CWSs)
>>
>> Did you clone the repo at
>>
>> http://hg.services.openoffice.org/OOO340?
>>
>> This is the one we should use.
>
> As you can see from the script, it is designed around DEV300.
>
> I thought we wanted the latest development branch?

actually no CWS was integrated on the development DEV300 after the 
release OOO340 was branched off, while on OOO340 release relevant CWSes 
were integrated.
so OOO340 contains the latest and greatest stuff.

>> Pulling the CWS should be faster: create a local clone of your existing repo
>> for each cws and pull from the CWS repo at hg.services.ooo. This will pull
>> only the change sets not in your local repo and create a second head
>> revision in it. This revision could be moved over to svn or you could create
>> a patch from it against whatever version that is on the hg repo.
>
> Ah! That's an awesome improvement. Thanks. I'll incorporate that into
> the scripting.
>
> I don't think we want patches. I continue to believe we want a single
> Hg repository with "everything", and we convert that to Subversion,
> and then load it into svn.apache.org.

if it is possible to convert HG heads to SVN branches, that would be the 
way to go.

i still think it makes sense to go a step further and actually merge all 
finished CWSes into OOO340 using HG first, because that is by far the 
easiest way and doesn't have any technical pitfalls.

-- 
"The evolution of languages: FORTRAN is a non-typed language.
  C is a weakly typed language.  Ada is a strongly typed language.
  C++ is a strongly hyped language." -- Ron Sercely

Re: Getting to our first build

Posted by Greg Stein <gs...@gmail.com>.

On Wed, Jun 29, 2011 at 04:26, Mathias Bauer <Ma...@gmx.net> wrote:
> On 29.06.2011 00:07, Greg Stein wrote:
>>
>> On Tue, Jun 28, 2011 at 15:31, Rob Weir<ap...@robweir.com>  wrote:
>>>
>>> Let me summarize what I'm hearing the initial steps then are.
>>>
>>> 1)  We take the OOo source code with tag OOO340_m1 from
>>> hg.services.openoffice.org, including the full history, and convert
>>> that into a SVN repository, e.g.,:   hg convert --dest-type svn
>>> hgreponame svnreponame
>>>
>>>
>>> Who does this?  Is this something that can be done remotely, or do we
>>> need an Oracle admin to do this for us?
>>
>> We would do this. We have all the access that we need (open source, yay!)
>>
>> I've started on a script to create this (local) Hg repository. See
>> tools/dev/single-hg.sh for my first bits. I'm trying it out now, but
>> it is probably going to take a while to run :-P  (I also have no idea
>> about CWSs)
>
> Did you clone the repo at
>
> http://hg.services.openoffice.org/OOO340?
>
> This is the one we should use.

As you can see from the script, it is designed around DEV300.

I thought we wanted the latest development branch?

> Pulling the CWS should be faster: create a local clone of your existing repo
> for each cws and pull from the CWS repo at hg.services.ooo. This will pull
> only the change sets not in your local repo and create a second head
> revision in it. This revision could be moved over to svn or you could create
> a patch from it against whatever version that is on the hg repo.

Ah! That's an awesome improvement. Thanks. I'll incorporate that into
the scripting.

I don't think we want patches. I continue to believe we want a single
Hg repository with "everything", and we convert that to Subversion,
and then load it into svn.apache.org.

Cheers,
-g

Re: Getting to our first build

Posted by Mathias Bauer <Ma...@gmx.net>.

On 29.06.2011 00:07, Greg Stein wrote:
> On Tue, Jun 28, 2011 at 15:31, Rob Weir<ap...@robweir.com>  wrote:
>> Let me summarize what I'm hearing the initial steps then are.
>>
>> 1)  We take the OOo source code with tag OOO340_m1 from
>> hg.services.openoffice.org, including the full history, and convert
>> that into a SVN repository, e.g.,:   hg convert --dest-type svn
>> hgreponame svnreponame
>>
>>
>> Who does this?  Is this something that can be done remotely, or do we
>> need an Oracle admin to do this for us?
>
> We would do this. We have all the access that we need (open source, yay!)
>
> I've started on a script to create this (local) Hg repository. See
> tools/dev/single-hg.sh for my first bits. I'm trying it out now, but
> it is probably going to take a while to run :-P  (I also have no idea
> about CWSs)

Did you clone the repo at

http://hg.services.openoffice.org/OOO340?

This is the one we should use.

Pulling the CWS should be faster: create a local clone of your existing 
repo for each cws and pull from the CWS repo at hg.services.ooo. This 
will pull only the change sets not in your local repo and create a 
second head revision in it. This revision could be moved over to svn or 
you could create a patch from it against whatever version that is on the 
hg repo.
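
The clone-then-pull steps above can be sketched as a small command
builder. It only assembles the hg command lines and runs nothing. The
directory naming and the CWS URL passed in are hypothetical; the real
CWS locations on hg.services.openoffice.org would need to be filled in.

```python
def cws_pull_commands(base_clone, cws_name, cws_url):
    """Build the hg commands that pull one CWS into its own local clone.

    base_clone: path of the already cloned main repository
    cws_name:   name of the child workspace, used for the clone directory
    cws_url:    location of the CWS repository (supplied by the caller)
    """
    clone_dir = f"{base_clone}-{cws_name}"
    return [
        # A local clone is cheap and keeps the base repository untouched.
        ["hg", "clone", base_clone, clone_dir],
        # Only the change sets missing from the local clone are transferred.
        ["hg", "-R", clone_dir, "pull", cws_url],
        # Show the resulting heads; the CWS should appear as a second head.
        ["hg", "-R", clone_dir, "heads"],
    ]
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`
once the paths and URL have been checked.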

Regards,
Mathias

Re: Getting to our first build

Posted by Greg Stein <gs...@gmail.com>.

On Tue, Jun 28, 2011 at 15:31, Rob Weir <ap...@robweir.com> wrote:
> Let me summarize what I'm hearing the initial steps then are.
>
> 1)  We take the OOo source code with tag OOO340_m1 from
> hg.services.openoffice.org, including the full history, and convert
> that into a SVN repository, e.g.,:   hg convert --dest-type svn
> hgreponame svnreponame
>
>
> Who does this?  Is this something that can be done remotely, or do we
> need an Oracle admin to do this for us?

We would do this. We have all the access that we need (open source, yay!)

I've started on a script to create this (local) Hg repository. See
tools/dev/single-hg.sh for my first bits. I'm trying it out now, but
it is probably going to take a while to run :-P  (I also have no idea
about CWSs)

The script allows all of us to recreate and test this step.

> 2) We 'svnadmin dump' the repository, and pass the resulting dumpfile
> to the Infrastructure team.
>
> Same question.  Who does the dump?

One of us. I'm sure we will have lots of volunteers.

> 3) Apache Infrastructure will schedule a "Subversion is read-only"
> period, probably on a weekend. They will then use 'svnadmin load' to
> load all the revisions into the Apache repository.
>
>
> I assume the ideal situation would be to have the svn dump file on
> an Oracle ftp server, so Infrastructure can grab it.

I suspect that one of us would run the whole process on
people.apache.org, and then point Infra at the resulting dumpfile.

> Are there any quality checks we need between steps 1 and 2 or between
> 2 and 3?  It would be nice, for example, to verify that things like
> EOL characters or encodings or mime types were right.  Anyone have a
> sense for what the most common ways to screw this up would be?

That's exactly why I've started on a script that we can all use,
review, modify, verify, etc. This should be a reproducible and
documented action.
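
One possible shape for such checks, as a sketch only: the helpers
below look at raw file contents and report line-ending styles and
whether the bytes decode as UTF-8. Which encodings and EOL styles the
OOo sources are actually supposed to use is not stated in this thread,
so the UTF-8 check is an assumption to adapt.

```python
def line_ending_counts(data):
    """Count CRLF, bare LF and bare CR line endings in a bytes object."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {"crlf": crlf, "lf": lf, "cr": cr}


def has_mixed_line_endings(data):
    """True if more than one line-ending style occurs in the data."""
    counts = line_ending_counts(data)
    return sum(1 for n in counts.values() if n) > 1


def is_valid_utf8(data):
    """True if the bytes decode cleanly as UTF-8 (assumed target encoding)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Running these over a checkout before and after conversion, and
comparing the results per file, would flag files whose line endings or
encoding changed along the way.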

Cheers,
-g

Re: Getting to our first build

Posted by Rob Weir <ap...@robweir.com>.

Let me summarize what I'm hearing the initial steps then are.

1)  We take the OOo source code with tag OOO340_m1 from
hg.services.openoffice.org, including the full history, and convert
that into a SVN repository, e.g.,:   hg convert --dest-type svn
hgreponame svnreponame


Who does this?  Is this something that can be done remotely, or do we
need an Oracle admin to do this for us?


2) We 'svnadmin dump' the repository, and pass the resulting dumpfile
to the Infrastructure team.

Same question.  Who does the dump?


3) Apache Infrastructure will schedule a "Subversion is read-only"
period, probably on a weekend. They will then use 'svnadmin load' to
load all the revisions into the Apache repository.


I assume the ideal situation would be to have the svn dump file on
an Oracle ftp server, so Infrastructure can grab it.

Are there any quality checks we need between steps 1 and 2 or between
2 and 3?  It would be nice, for example, to verify that things like
EOL characters or encodings or mime types were right.  Anyone have a
sense for what the most common ways to screw this up would be?
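
The three steps above can be sketched as a reproducible plan. The
function only builds the commands; `run_step` executes one of them.
All paths are hypothetical, the convert extension is assumed to be
enabled through `--config extensions.convert=`, and the converted
repository is assumed to end up at the destination path given to
hg convert. The final load would be run by Infrastructure, not by us.

```python
import subprocess


def migration_steps(hg_repo, svn_repo, dumpfile, target_repo):
    """Return (argv, stdin_path, stdout_path) tuples for the migration.

    stdin_path / stdout_path are None when no redirection is needed.
    """
    return [
        # 1) Convert the Mercurial history into a Subversion repository.
        (["hg", "--config", "extensions.convert=", "convert",
          "--dest-type", "svn", hg_repo, svn_repo], None, None),
        # 2) Dump the converted repository; svnadmin writes to stdout.
        (["svnadmin", "dump", svn_repo], None, dumpfile),
        # 3) Load the dumpfile into the target repository (Infra's step).
        (["svnadmin", "load", target_repo], dumpfile, None),
    ]


def run_step(step):
    """Run one step, wiring up file redirection where requested."""
    argv, stdin_path, stdout_path = step
    stdin = open(stdin_path, "rb") if stdin_path else None
    stdout = open(stdout_path, "wb") if stdout_path else None
    try:
        subprocess.run(argv, stdin=stdin, stdout=stdout, check=True)
    finally:
        for handle in (stdin, stdout):
            if handle is not None:
                handle.close()
```

Keeping the plan as data makes it easy to review and to insert checks
between steps 1 and 2 or between 2 and 3.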

-Rob

Re: Getting to our first build

Posted by Greg Stein <gs...@gmail.com>.

On Tue, Jun 28, 2011 at 12:59, Rob Weir <ap...@robweir.com> wrote:
> On Tue, Jun 28, 2011 at 12:35 PM, Mathias Bauer <Ma...@gmx.net> wrote:
>...
>> So let me try to summarize:
>>
>> We take the OOo source code with tag OOO340_m1 from
>> hg.services.openoffice.org, including the full history, means: we will just
>> import it from hg to svn. Then we use the lists of "naughty" files I have
>> created and remove the files with "svn remove" that may not stay in the
>> Apache repository.

Yup.
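
A sketch of how a list of "naughty" files might be turned into
removals, assuming the list is a plain text file with one
repository-relative path per line and `#` for comments. That format is
an assumption; the actual lists on the OOo wiki may look different.
The function builds `svn remove` commands without running them and
reports listed paths that are not in the working copy.

```python
def removal_commands(naughty_list_text, tracked_paths):
    """Return (commands, missing) for the paths named in the list.

    commands: one ["svn", "remove", path] per listed path that is tracked
    missing:  listed paths not found among tracked_paths
    """
    tracked = set(tracked_paths)
    wanted = []
    for line in naughty_list_text.splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#"):
            wanted.append(entry)
    commands = [["svn", "remove", p] for p in wanted if p in tracked]
    missing = [p for p in wanted if p not in tracked]
    return commands, missing


# Hand-written example list; the paths are illustrative only.
SAMPLE_LIST = """# incompatible license
external/b.c

external/gone.c
"""
```

Reviewing `missing` before committing guards against typos in the
list silently leaving a file in place.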

> I see it like this:
>
> 1) Coordinate the check in so we can do it on the Apache server if
> possible (to save time).  Also, we should disable the commit list
> notification emails during the check in.  Otherwise we'll cause
> everyone's email files to explode and catch on fire.

We need to get all of the content from Hg that we would like. Merged
into one fat repository. Convert that thing to a Subversion
repository. Then we 'svnadmin dump' the repository, and pass the
resulting dumpfile to the Infrastructure team. They will schedule a
"Subversion is read-only" period, probably on a weekend. They will
then use 'svnadmin load' to haul all the revisions into the Apache
repository. This is the most efficient mechanism, it will avoid people
committing during the load (creating potential burps in the revision
sequence for OOo), and it will avoid sending commit messages.

Again, please refer to:
  http://cmpilato.blogspot.com/2009/11/revisionist-history.html

I believe we want to pull all the bits together in a repeatable and
documented fashion before handing the dumpfile over to Infrastructure
for loading.

>...
> 4) At this point the code will be a mix:
>
> a) Oracle SGA-provided code.  No action needed on these.
>
> b) 3rd party code with a compatible license.  We need a list of these
> so we can provide proper notice in our releases, to respect the terms
> of the license.

Yes. These notices go into trunk/NOTICE, per the instructions of ALv2.

> c) 3rd party code with an incompatible license.  We need to understand
> these dependencies and agree on how to handle them.
>
> d) Oracle code that was not included in the initial SGA.  We need to
> ask Oracle to amend their SGA to include these files.
>
> I think we can handle some of this in parallel.

Yup.

>> In result we will have some files with LGPL or MPL in our repository (in the
>> history), but on "head" they will be removed. "head" will have only files
>> that are owned by Oracle (and will get ASL from Oracle) and those files that
>> are not owned by Oracle, but are part of the current OOo repository and have
>> a license that is compatible with ASL.
>>
>
> This is correct for incompatible 3rd party licensed code.  We would
> "svn delete" these.  But this does not need to be an immediate thing,
> where we quickly delete all GPL code as quickly as possible.  We can
> do it module by module, replacing pieces as we go.  The main
> restriction is that we cannot do a podling release with the
> incompatible code in it.  But we can spend a month or more, if needed,
> working in the repository, doing developer builds, etc.

This is correct (both Mathias and Rob).

>
>
>> Then we start to fix the build.
>>
>
> There are different opinions on this.  My preference is to never have
> the build be broken.  Keep it always in buildable, runnable condition.
>  Otherwise it is hard to expect that other project members will test
> their code before checking it in.  If the build is broken it tends to
> get even more broken very quickly.  I'd recommend that we clean up the
> GPL dependencies library by library, always keeping the build working.
>  Of course, that is in a perfect world.  But we should try to aim for
> that, I think.

Fully agreed.

> So in summary, I'm proposing that we check in the existing code, with
> history, and verify that we can build it.  That establishes a "stable
> build".  Then work on the provenance review, making additional Oracle
> requests and removing incompatible libraries as we go, but always
> aiming to keep the build stable.

We should strive very hard to make a *single* request from Oracle. Any
request involves the time/efforts/cost of their Legal team. In
fairness to them, we should attempt to limit our demand of their time.

Cheers,
-g

Re: Getting to our first build

Posted by Dave Fisher <da...@comcast.net>.

Hi Rob,

On Jun 28, 2011, at 9:59 AM, Rob Weir wrote:

> On Tue, Jun 28, 2011 at 12:35 PM, Mathias Bauer <Ma...@gmx.net> wrote:
>> On 28.06.2011 18:05, Greg Stein wrote:
>>> 
>>> On Tue, Jun 28, 2011 at 07:34, Rob Weir<ap...@robweir.com>  wrote:
>>>> 
>>>> ...
>>>> Hi Mathias,
>>>> 
>>>> I don't know whether my approach is feasible either.  I know we can
>>>> set properties on files in SVN.  You can retrieve them individually,
>>>> but I don't see a way to query them, e.g., list all files that don't
>>>> have a license property, or download all files that have a license
>>>> property set to Apache 2.0.
>>> 
>>> I'm not entirely sure about tagging like this. An interesting idea,
>>> definitely.
>>> 
>>> In any case, you're right in terms of Subversion's query capabilities.
>>> You can list properties of nodes, but you cannot form queries to
>>> return nodes with certain property configurations.
>>> 
>>> I somewhat prefer managing the IP aspect with separate lists of files,
>>> rather than injecting that information into the repository.
>>> 
>>>> So far, I think that you've been doing most of the code investigations.
>>>>  So I'd trust your judgement on what the next steps should be.  Do you
>>>> have any thoughts what work remains for the next 1 or 2 weeks?  For
>>>> example, is Oracle currently reviewing the additional SGA requests?
>>>> Or do we need to request this still?
>>> 
>>> Nobody has made a request. Nobody has produced a list of files to request.
>>> 
>>>> If I understand the rules at Apache (and it is certainly possible I
>>>> have this wrong, but in that case I'm sure someone will quickly correct
>>>> me), a Podling can check in all of the code, including parts that are
>>>> LGPL/GPL. We can make builds from that.  But we are not permitted to
>>>> make a release or to graduate from a podling until we have gone
>>>> through the IP checklist, including dealing with code that has an
>>>> incompatible license.
>>> 
>>> You have this entirely correct. Thanks!
>>> 
>>>> Of course, if you think you are close to having a "clean" version of
>>>> OOo ready to check in, then I don't want to interrupt the fine work
>>>> that you are already doing.  But in that case I think it would help if
>>>> we had a "roadmap" for the next couple of weeks, of what tasks
>>>> remain, so others can help as well.
>>> 
>>> I still believe that we would like *history* rather than simply
>>> copying over "tip" from the old repository. Having that history in one
>>> repository is so incredibly useful to so many people, that I cannot
>>> see why we would skip it. It costs us pain *now*, but think about how
>>> long this codebase will live? Will people a decade from now want to
>>> use two repositories to investigate history?
>> 
>> So let me try to summarize:
>> 
>> We take the OOo source code with tag OOO340_m1 from
>> hg.services.openoffice.org, including the full history, means: we will just
>> import it from hg to svn. Then we use the lists of "naughty" files I have
>> created and remove the files with "svn remove" that may not stay in the
>> Apache repository.
>> 
> 
> I see it like this:
> 
> 1) Coordinate the check in so we can do it on the Apache server if
> possible (to save time).  Also, we should disable the commit list
> notification emails during the check in.  Otherwise we'll cause
> everyone's email files to explode and catch on fire.
> 
> 2) Check in the OOO340_m1 code, with history.
> 
> 3) Verify that we can extract that source and build it.  I know that
> sounds like a trivial thing, but I think we can use this to encourage
> as many project members as possible to download the source, set up a
> build environment and confirm that they can build.  Having many
> committers with a working build environment will help us going
> forward.
> 
> 4) At this point the code will be a mix:
> 
> a) Oracle SGA-provided code.  No action needed on these.
> 
> b) 3rd party code with a compatible license.  We need a list of these
> so we can provide proper notice in our releases, to respect the terms
> of the license.
> 
> c) 3rd party code with an incompatible license.  We need to understand
> these dependencies and agree on how to handle them.
> 
> d) Oracle code that was not included in the initial SGA.  We need to
> ask Oracle to amend their SGA to include these files.
> 
> I think we can handle some of this in parallel.
> 
>> In result we will have some files with LGPL or MPL in our repository (in the
>> history), but on "head" they will be removed. "head" will have only files
>> that are owned by Oracle (and will get ASL from Oracle) and those files that
>> are not owned by Oracle, but are part of the current OOo repository and have
>> a license that is compatible with ASL.
>> 
> 
> This is correct for incompatible 3rd party licensed code.  We would
> "svn delete" these.  But this does not need to be an immediate thing,
> where we quickly delete all GPL code as quickly as possible.  We can
> do it module by module, replacing pieces as we go.  The main
> restriction is that we cannot do a podling release with the
> incompatible code in it.  But we can spend a month or more, if needed,
> working in the repository, doing developer builds, etc.

This is good. If the question of provenance ever comes up, having the files checked into SVN, then removed and replaced, gives us clear evidence rather than an absence of evidence.


> 
> 
>> Then we start to fix the build.
>> 
> 
> There are different opinions on this.  My preference is to never have
> the build be broken.  Keep it always in buildable, runnable condition.
> Otherwise it is hard to expect that other project members will test
> their code before checking it in.  If the build is broken it tends to
> get even more broken very quickly.  I'd recommend that we clean up the
> GPL dependencies library by library, always keeping the build working.
> Of course, that is in a perfect world.  But we should try to aim for
> that, I think.
> 
> So in summary, I'm proposing that we check in the existing code, with
> history, and verify that we can build it.  That establishes a "stable
> build".  Then work on the provenance review, making additional Oracle
> requests and removing incompatible libraries as we go, but always
> aiming to keep the build stable.

I think that this is a good plan.

Once we have a stable build we should make use of Continuous Integration - http://ci.apache.org/

The buildbot that is used for CMS is one example. I think that Hudson is being replaced by Jenkins.

Regards,
Dave



> 
> -Rob
> 
> 
>> Is that correct?
>> 
>> Regards,
>> Mathias
>> 
>> 
>> 
>>> 
>>> Cheers,
>>> -g
>>> 
>> 
>>

Re: Getting to our first build

Posted by Andrew Rist <an...@oracle.com>.

+1

(I realize I'm several messages behind here, but after a couple of days
OOTO I've been doing the Sisyphean task of working my way up the mountain
of ooo-dev messages)



On 6/28/2011 9:59 AM, Rob Weir wrote:
> On Tue, Jun 28, 2011 at 12:35 PM, Mathias Bauer<Ma...@gmx.net>  wrote:
>> On 28.06.2011 18:05, Greg Stein wrote:
>>> On Tue, Jun 28, 2011 at 07:34, Rob Weir<ap...@robweir.com>    wrote:
>>>> ...
>>>> Hi Mathias,
>>>>
>>>> I don't know whether my approach is feasible either.  I know we can
>>>> set properties on files in SVN.  You can retrieve them individually,
>>>> but I don't see a way to query them, e.g., list all files that don't
>>>> have a license property, or download all files that have a license
>>>> property set to Apache 2.0.
>>> I'm not entirely sure about tagging like this. An interesting idea,
>>> definitely.
>>>
>>> In any case, you're right in terms of Subversion's query capabilities.
>>> You can list properties of nodes, but you cannot form queries to
>>> return nodes with certain property configurations.
>>>
>>> I somewhat prefer managing the IP aspect with separate lists of files,
>>> rather than injecting that information into the repository.
>>>
>>>> So far, I think that you've been doing most of the code investigations.
>>>>   So I'd trust your judgement on what the next steps should be.  Do you
>>>> have any thoughts what work remains for the next 1 or 2 weeks?  For
>>>> example, is Oracle currently reviewing the additional SGA requests?
>>>> Or do we need to request this still?
>>> Nobody has made a request. Nobody has produced a list of files to request.
>>>
>>>> If I understand the rules at Apache (and it is certainly possible I
>>>> have this wrong, but in that case I'm sure someone will quickly correct
>>>> me), a Podling can check in all of the code, including parts that are
>>>> LGPL/GPL. We can make builds from that.  But we are not permitted to
>>>> make a release or to graduate from a podling until we have gone
>>>> through the IP checklist, including dealing with code that has an
>>>> incompatible license.
>>> You have this entirely correct. Thanks!
>>>
>>>> Of course, if you think you are close to having a "clean" version of
>>>> OOo ready to check in, then I don't want to interrupt the fine work
>>>> that you are already doing.  But in that case I think it would help if
>>>> we had a "roadmap" for the next couple of weeks, of what tasks
>>>> remain, so others can help as well.
>>> I still believe that we would like *history* rather than simply
>>> copying over "tip" from the old repository. Having that history in one
>>> repository is so incredibly useful to so many people, that I cannot
>>> see why we would skip it. It costs us pain *now*, but think about how
>>> long this codebase will live? Will people a decade from now want to
>>> use two repositories to investigate history?
>> So let me try to summarize:
>>
>> We take the OOo source code with tag OOO340_m1 from
>> hg.services.openoffice.org, including the full history, means: we will just
>> import it from hg to svn. Then we use the lists of "naughty" files I have
>> created and remove the files with "svn remove" that may not stay in the
>> Apache repository.
>>
> I see it like this:
>
> 1) Coordinate the check in so we can do it on the Apache server if
> possible (to save time).  Also, we should disable the commit list
> notification emails during the check in.  Otherwise we'll cause
> everyone's email files to explode and catch on fire.
>
> 2) Check in the OOO340_m1 code, with history.
>
> 3) Verify that we can extract that source and build it.  I know that
> sounds like a trivial thing, but I think we can use this to encourage
> as many project members as possible to download the source, set up a
> build environment and confirm that they can build.  Having many
> committers with a working build environment will help us going
> forward.
>
> 4) At this point the code will be a mix:
>
> a) Oracle SGA-provided code.  No action needed on these.
>
> b) 3rd party code with a compatible license.  We need a list of these
> so we can provide proper notice in our releases, to respect the terms
> of the license.
>
> c) 3rd party code with an incompatible license.  We need to understand
> these dependencies and agree on how to handle them.
>
> d) Oracle code that was not included in the initial SGA.  We need to
> ask Oracle to amend their SGA to include these files.
>
> I think we can handle some of this in parallel.
>
>> In result we will have some files with LGPL or MPL in our repository (in the
>> history), but on "head" they will be removed. "head" will have only files
>> that are owned by Oracle (and will get ASL from Oracle) and those files that
>> are not owned by Oracle, but are part of the current OOo repository and have
>> a license that is compatible with ASL.
>>
> This is correct for incompatible 3rd party licensed code.  We would
> "svn delete" these.  But this does not need to be an immediate thing,
> where we quickly delete all GPL code as quickly as possible.  We can
> do it module by module, replacing pieces as we go.  The main
> restriction is that we cannot do a podling release with the
> incompatible code in it.  But we can spend a month or more, if needed,
> working in the repository, doing developer builds, etc.
>
>
>> Then we start to fix the build.
>>
> There are different opinions on this.  My preference is to never have
> the build be broken.  Keep it always in buildable, runnable condition.
>   Otherwise it is hard to expect that other project members will test
> their code before checking it in.  If the build is broken it tends to
> get even more broken very quickly.  I'd recommend that we clean up the
> GPL dependencies library by library, always keeping the build working.
>   Of course, that is in a perfect world.  But we should try to aim for
> that, I think.
>
> So in summary, I'm proposing that we check in the existing code, with
> history, and verify that we can build it.  That establishes a "stable
> build".  Then work on the provenance review, making additional Oracle
> requests and removing incompatible libraries as we go, but always
> aiming to keep the build stable.
>
> -Rob
>
>
>> Is that correct?
>>
>> Regards,
>> Mathias
>>
>>
>>
>>> Cheers,
>>> -g
>>>
>>

Re: Getting to our first build

Posted by Rob Weir <ap...@robweir.com>.

On Tue, Jun 28, 2011 at 12:35 PM, Mathias Bauer <Ma...@gmx.net> wrote:
> On 28.06.2011 18:05, Greg Stein wrote:
>>
>> On Tue, Jun 28, 2011 at 07:34, Rob Weir<ap...@robweir.com>  wrote:
>>>
>>> ...
>>> Hi Mathias,
>>>
>>> I don't know whether my approach is feasible either.  I know we can
>>> set properties on files in SVN.  You can retrieve them individually,
>>> but I don't see a way to query them, e.g., list all files that don't
>>> have a license property, or download all files that have a license
>>> property set to Apache 2.0.
>>
>> I'm not entirely sure about tagging like this. An interesting idea,
>> definitely.
>>
>> In any case, you're right in terms of Subversion's query capabilities.
>> You can list properties of nodes, but you cannot form queries to
>> return nodes with certain property configurations.
>>
>> I somewhat prefer managing the IP aspect with separate lists of files,
>> rather than injecting that information into the repository.
>>
>>> So far, I think that you've been doing most of the code investigations.
>>>  So I'd trust your judgement on what the next steps should be.  Do you
>>> have any thoughts what work remains for the next 1 or 2 weeks?  For
>>> example, is Oracle currently reviewing the additional SGA requests?
>>> Or do we need to request this still?
>>
>> Nobody has made a request. Nobody has produced a list of files to request.
>>
>>> If I understand the rules at Apache (and it is certainly possible I
>>> have this wrong, but in that case I'm sure someone will quickly correct
>>> me), a Podling can check in all of the code, including parts that are
>>> LGPL/GPL. We can make builds from that.  But we are not permitted to
>>> make a release or to graduate from a podling until we have gone
>>> through the IP checklist, including dealing with code that has an
>>> incompatible license.
>>
>> You have this entirely correct. Thanks!
>>
>>> Of course, if you think you are close to having a "clean" version of
>>> OOo ready to check in, then I don't want to interrupt the fine work
>>> that you are already doing.  But in that case I think it would help if
>>> we had a "roadmap" for the next couple of weeks, of what tasks
>>> remain, so others can help as well.
>>
>> I still believe that we would like *history* rather than simply
>> copying over "tip" from the old repository. Having that history in one
>> repository is so incredibly useful to so many people, that I cannot
>> see why we would skip it. It costs us pain *now*, but think about how
>> long this codebase will live? Will people a decade from now want to
>> use two repositories to investigate history?
>
> So let me try to summarize:
>
> We take the OOo source code with tag OOO340_m1 from
> hg.services.openoffice.org, including the full history, means: we will just
> import it from hg to svn. Then we use the lists of "naughty" files I have
> created and remove the files with "svn remove" that may not stay in the
> Apache repository.
>

I see it like this:

1) Coordinate the check in so we can do it on the Apache server if
possible (to save time).  Also, we should disable the commit list
notification emails during the check in.  Otherwise we'll cause
everyone's email files to explode and catch on fire.

2) Check in the OOO340_m1 code, with history.

3) Verify that we can extract that source and build it.  I know that
sounds like a trivial thing, but I think we can use this to encourage
as many project members as possible to download the source, set up a
build environment and confirm that they can build.  Having many
committers with a working build environment will help us going
forward.

4) At this point the code will be a mix:

a) Oracle SGA-provided code.  No action needed on these.

b) 3rd party code with a compatible license.  We need a list of these
so we can provide proper notice in our releases, to respect the terms
of the license.

c) 3rd party code with an incompatible license.  We need to understand
these dependencies and agree on how to handle them.

d) Oracle code that was not included in the initial SGA.  We need to
ask Oracle to amend their SGA to include these files.

I think we can handle some of this in parallel.
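The four-way split in step 4 could be sketched as a small triage script. This is only an illustration, assuming the review produces three inputs that do not exist in this form yet: a set of SGA-covered paths, a set of other Oracle-owned paths, and a mapping from third-party paths to license names. The license groupings are placeholders, not a statement of Apache policy.

```python
from enum import Enum


class Category(Enum):
    SGA = "a) Oracle SGA-provided code"
    COMPATIBLE = "b) 3rd party, compatible license"
    INCOMPATIBLE = "c) 3rd party, incompatible license"
    NEEDS_SGA_AMENDMENT = "d) Oracle code not in the initial SGA"
    UNREVIEWED = "not yet reviewed"


# Placeholder groupings for illustration only (assumption).
COMPATIBLE_LICENSES = {"Apache-2.0", "MIT", "BSD-3-Clause"}
INCOMPATIBLE_LICENSES = {"GPL", "LGPL"}


def classify(path, sga_paths, oracle_paths, third_party_licenses):
    """Return the Category for one repository-relative path."""
    if path in sga_paths:
        return Category.SGA
    if path in oracle_paths:
        # Oracle-owned but not covered by the SGA: needs an amendment request.
        return Category.NEEDS_SGA_AMENDMENT
    license_name = third_party_licenses.get(path)
    if license_name in COMPATIBLE_LICENSES:
        return Category.COMPATIBLE
    if license_name in INCOMPATIBLE_LICENSES:
        return Category.INCOMPATIBLE
    return Category.UNREVIEWED


def triage(paths, sga_paths, oracle_paths, third_party_licenses):
    """Group all paths by category so each bucket can be worked in parallel."""
    buckets = {category: [] for category in Category}
    for path in sorted(paths):
        category = classify(path, sga_paths, oracle_paths, third_party_licenses)
        buckets[category].append(path)
    return buckets
```

Anything that lands in the "not yet reviewed" bucket is what still needs investigation.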

> In result we will have some files with LGPL or MPL in our repository (in the
> history), but on "head" they will be removed. "head" will have only files
> that are owned by Oracle (and will get ASL from Oracle) and those files that
> are not owned by Oracle, but are part of the current OOo repository and have
> a license that is compatible with ASL.
>

This is correct for incompatible 3rd party licensed code.  We would
"svn delete" these.  But this does not need to happen immediately,
with all GPL code deleted as quickly as possible.  We can
do it module by module, replacing pieces as we go.  The main
restriction is that we cannot do a podling release with the
incompatible code in it.  But we can spend a month or more, if needed,
working in the repository, doing developer builds, etc.

> Then we start to fix the build.
>

There are different opinions on this.  My preference is to never have
the build be broken.  Keep it always in buildable, runnable condition.
 Otherwise it is hard to expect that other project members will test
their code before checking it in.  If the build is broken it tends to
get even more broken very quickly.  I'd recommend that we clean up the
GPL dependencies library by library, always keeping the build working.
 Of course, that is in a perfect world.  But we should try to aim for
that, I think.

So in summary, I'm proposing that we check in the existing code, with
history, and verify that we can build it.  That establishes a "stable
build".  Then work on the provenance review, making additional Oracle
requests and removing incompatible libraries as we go, but always
aiming to keep the build stable.

-Rob

> Is that correct?
>
> Regards,
> Mathias
>
>
>
>>
>> Cheers,
>> -g
>>
>
>

Re: Getting to our first build

Posted by Mathias Bauer <Ma...@gmx.net>.

On 28.06.2011 18:05, Greg Stein wrote:
> On Tue, Jun 28, 2011 at 07:34, Rob Weir<ap...@robweir.com>  wrote:
>> ...
>> Hi Mathias,
>>
>> I don't know whether my approach is feasible either.  I know we can
>> set properties on files in SVN.  You can retrieve them individually,
>> but I don't see a way to query them, e.g., list all files that don't
>> have a license property, or download all files that have a license
>> property set to Apache 2.0.
>
> I'm not entirely sure about tagging like this. An interesting idea, definitely.
>
> In any case, you're right in terms of Subversion's query capabilities.
> You can list properties of nodes, but you cannot form queries to
> return nodes with certain property configurations.
>
> I somewhat prefer managing the IP aspect with separate lists of files,
> rather than injecting that information into the repository.
>
>> So far, I think that you've been doing most of the code investigations.
>>   So I'd trust your judgement on what the next steps should be.  Do you
>> have any thoughts what work remains for the next 1 or 2 weeks?  For
>> example, is Oracle currently reviewing the additional SGA requests?
>> Or do we need to request this still?
>
> Nobody has made a request. Nobody has produced a list of files to request.
>
>> If I understand the rules at Apache (and it is certainly possible I
>> have this wrong, but in that case I'm sure someone will quickly correct
>> me), a Podling can check in all of the code, including parts that are
>> LGPL/GPL. We can make builds from that.  But we are not permitted to
>> make a release or to graduate from a podling until we have gone
>> through the IP checklist, including dealing with code that has an
>> incompatible license.
>
> You have this entirely correct. Thanks!
>
>> Of course, if you think you are close to having a "clean" version of
>> OOo ready to check in, then I don't want to interrupt the fine work
>> that you are already doing.  But in that case I think it would help if
>> we had a "roadmap" for the next couple of weeks, of what tasks
>> remain, so others can help as well.
>
> I still believe that we would like *history* rather than simply
> copying over "tip" from the old repository. Having that history in one
> repository is so incredibly useful to so many people, that I cannot
> see why we would skip it. It costs us pain *now*, but think about how
> long this codebase will live? Will people a decade from now want to
> use two repositories to investigate history?

So let me try to summarize:

We take the OOo source code with tag OOO340_m1 from
hg.services.openoffice.org, including the full history, which means we
will just import it from hg to svn. Then we use the lists of "naughty"
files I have created and "svn remove" the files that may not stay
in the Apache repository.
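
The removal step could be driven directly from those lists. A minimal sketch, assuming each list is a plain text file with one repository-relative path per line (the file format and the batch size are assumptions); it only builds the command lines so they can be reviewed before anything is run.

```python
import shlex


def read_file_list(text):
    """Return the paths in a list, skipping blank lines and '#' comments."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            paths.append(line)
    return paths


def svn_remove_commands(paths, batch_size=50):
    """Yield 'svn remove' command lines, a limited number of paths per command."""
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        # shlex.quote protects paths containing spaces or shell metacharacters.
        yield "svn remove " + " ".join(shlex.quote(path) for path in batch)


if __name__ == "__main__":
    # Hypothetical list contents, for illustration only.
    example = "# files that may not stay\nsome module/file a.c\nother/file_b.h\n"
    for command in svn_remove_commands(read_file_list(example), batch_size=1):
        print(command)
```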

As a result we will have some files with LGPL or MPL in our repository (in
the history), but on "head" they will be removed. "head" will have only 
files that are owned by Oracle (and will get ASL from Oracle) and those 
files that are not owned by Oracle, but are part of the current OOo 
repository and have a license that is compatible with ASL.

Then we start to fix the build.

Is that correct?

Regards,
Mathias



>
> Cheers,
> -g
>

Re: Getting to our first build

Posted by Robert Burrell Donkin <ro...@gmail.com>.

On Tue, Jun 28, 2011 at 5:05 PM, Greg Stein <gs...@gmail.com> wrote:
> On Tue, Jun 28, 2011 at 07:34, Rob Weir <ap...@robweir.com> wrote:

<snip>

>> Of course, if you think you are close to having a "clean" version of
>> OOo ready to check in, then I don't want to interrupt the fine work
>> that you are already doing.  But in that case I think it would help if
>> we had a "roadmap" for the next couple of weeks, of what tasks
>> remain, so others can help as well.
>
> I still believe that we would like *history* rather than simply
> copying over "tip" from the old repository. Having that history in one
> repository is so incredibly useful to so many people, that I cannot
> see why we would skip it.

+1

Perhaps the history could be checked into a read only directory. As
the IP is checked and cleaned, resources could be copied across into
the new working directory structure. Scripting magic could be used to
compose a build from these two sources.
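
As a rough sketch of that "scripting magic", assuming two hypothetical trees (a read-only "legacy" import and a "clean" working directory) that share the same relative paths: the composed build takes each file from the clean tree when it exists there and falls back to the legacy tree otherwise.

```python
def resolve_sources(legacy_paths, clean_paths):
    """Map each relative path to the tree the build should take it from."""
    sources = {}
    for path in legacy_paths:
        sources[path] = "legacy"
    for path in clean_paths:
        # A cleaned copy always wins over the read-only legacy copy.
        sources[path] = "clean"
    return sources


def remaining_legacy(sources):
    """List the files the build still pulls from the legacy tree."""
    return sorted(path for path, origin in sources.items() if origin == "legacy")


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    sources = resolve_sources(
        legacy_paths=["sw/a.cxx", "sw/b.cxx", "ext/lib.c"],
        clean_paths=["sw/a.cxx"],
    )
    print(remaining_legacy(sources))
```

The list of files still coming from the legacy tree doubles as a measure of how much IP review is left.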

Opinions? Objections? Improvements?

Robert

Re: Getting to our first build

Posted by Greg Stein <gs...@gmail.com>.

On Tue, Jun 28, 2011 at 07:34, Rob Weir <ap...@robweir.com> wrote:
>...
> Hi Mathias,
>
> I don't know whether my approach is feasible either.  I know we can
> set properties on files in SVN.  You can retrieve them individually,
> but I don't see a way to query them, e.g., list all files that don't
> have a license property, or download all files that have a license
> property set to Apache 2.0.

I'm not entirely sure about tagging like this. An interesting idea, definitely.

In any case, you're right in terms of Subversion's query capabilities.
You can list properties of nodes, but you cannot form queries to
return nodes with certain property configurations.

I somewhat prefer managing the IP aspect with separate lists of files,
rather than injecting that information into the repository.
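
With a separate list, the two queries mentioned above become simple lookups. A sketch only, assuming a hypothetical tab-separated manifest with one "path, license name" pair per line; neither the format nor the license labels are an agreed convention.

```python
def parse_manifest(text):
    """Parse 'path<TAB>license' lines into a dict, skipping blanks and comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        path, _, license_name = line.partition("\t")
        entries[path.strip()] = license_name.strip()
    return entries


def files_without_license(all_paths, manifest):
    """Paths that are missing from the manifest or have an empty license."""
    return [path for path in all_paths if not manifest.get(path)]


def files_with_license(manifest, license_name):
    """Paths whose recorded license matches license_name exactly."""
    return sorted(path for path, name in manifest.items() if name == license_name)
```

For example, files_without_license() answers "list all files that don't have a license", and files_with_license(manifest, "Apache-2.0") gives the set to export.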

> So far, I think that you've been doing most of the code investigations.
>  So I'd trust your judgement on what the next steps should be.  Do you
> have any thoughts what work remains for the next 1 or 2 weeks?  For
> example, is Oracle currently reviewing the additional SGA requests?
> Or do we need to request this still?

Nobody has made a request. Nobody has produced a list of files to request.

> If I understand the rules at Apache (and it is certainly possible I
> have this wrong, but in that case I'm sure someone will quickly correct
> me), a Podling can check in all of the code, including parts that are
> LGPL/GPL. We can make builds from that.  But we are not permitted to
> make a release or to graduate from a podling until we have gone
> through the IP checklist, including dealing with code that has an
> incompatible license.

You have this entirely correct. Thanks!

> Of course, if you think you are close to having a "clean" version of
> OOo ready to check in, then I don't want to interrupt the fine work
> that you are already doing.  But in that case I think it would help if
> we had a "roadmap" for the next couple of weeks, of what tasks
> remain, so others can help as well.

I still believe that we would like *history* rather than simply
copying over "tip" from the old repository. Having that history in one
repository is so incredibly useful to so many people, that I cannot
see why we would skip it. It costs us pain *now*, but think about how
long this codebase will live? Will people a decade from now want to
use two repositories to investigate history?

Cheers,
-g

Re: ICU update

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.

--- On Wed, 6/29/11, Eike Rathke <oo...@erack.de> wrote:
...

> Hi Pedro,
> 
> On Tuesday, 2011-06-28 09:43:12 -0700, Pedro F. Giffuni
> wrote:
> 
> > FWIW, ICU 4.8 runs out of memory when building here.
> 
> How much (real+virtual) memory, which platform and
> compiler?
> 

Well, 2G RAM + 4G swap but I think the problem is that
I am hitting a per-user system limit.
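
One way to check would be to print the resource limits the build process inherits. A minimal sketch, assuming a Unix-like system and assuming the limit in question is one of the standard process resource limits (data size, address space, stack); it uses Python's standard "resource" module and skips any limit the platform does not define.

```python
import resource


def process_limits():
    """Return {limit name: (soft, hard)} for the limits this platform defines."""
    names = ["RLIMIT_DATA", "RLIMIT_AS", "RLIMIT_STACK"]
    limits = {}
    for name in names:
        limit_id = getattr(resource, name, None)
        if limit_id is None:
            continue  # not available on this platform
        limits[name] = resource.getrlimit(limit_id)
    return limits


def describe(value):
    """Format a limit in MiB, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // (1024 * 1024)} MiB"


if __name__ == "__main__":
    for name, (soft, hard) in process_limits().items():
        print(f"{name}: soft={describe(soft)} hard={describe(hard)}")
```

A soft limit well below the 2G RAM + 4G swap would point at the limit rather than at ICU itself.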

This is part of a series of updates to the OOo
external dependencies that I am doing on FreeBSD's
ports tree, so I am checking with the maintainers
and making sure I don't break other applications.
I will send a report to the list when I finish
updating my system. For now I'd recommend using
ICU 4.6.1.

Pedro.

Re: ICU update

Posted by Eike Rathke <oo...@erack.de>.

Hi Pedro,

On Tuesday, 2011-06-28 09:43:12 -0700, Pedro F. Giffuni wrote:

> FWIW, ICU 4.8 runs out of memory when building here.

How much (real+virtual) memory, which platform and compiler?

  Eike

-- 
 PGP/OpenPGP/GnuPG encrypted mail preferred in all private communication.
 Key ID: 0x293C05FD - 997A 4C60 CE41 0149 0DB3  9E96 2F1A D073 293C 05FD

ICU update (was Re: Getting to our first build)

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.

--- On Tue, 6/28/11, Pedro F. Giffuni <gi...@tutopia.com> wrote:
...
> 
> Perhaps some of the migration stuff can be done
> in the older OOo site?
...
> - ICU needs to be updated to 4.8 before working on the
> regex replacement.
> 

FWIW, ICU 4.8 runs out of memory when building here.
ICU 4.6.1 is just fine.

Pedro.

Re: Getting to our first build

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.

(Sorry for top posting.. for now it's just more practical.)

Perhaps some of the migration stuff can be done
in the older OOo site?
- replacing the GNU iconv header is trivial.
- ICU needs to be updated to 4.8 before working on the
regex replacement.

This depends on people with commit privileges there,
of course.

Pedro.

--- On Tue, 6/28/11, Rob Weir <ap...@robweir.com> wrote:
...
> Hi Mathias,
> 
> I don't know whether my approach is feasible either.  I know we can
> set properties on files in SVN.  You can retrieve them individually,
> but I don't see a way to query them, e.g., list all files that don't
> have a license property, or download all files that have a license
> property set to Apache 2.0.
> 
> So far, I think that you've been doing most of the code investigations.
> So I'd trust your judgement on what the next steps should be.  Do you
> have any thoughts what work remains for the next 1 or 2 weeks?  For
> example, is Oracle currently reviewing the additional SGA requests?
> Or do we need to request this still?
> 
> If I understand the rules at Apache (and it is certainly possible I
> have this wrong, but in that case I'm sure someone will quickly correct
> me), a Podling can check in all of the code, including parts that are
> LGPL/GPL. We can make builds from that.  But we are not permitted to
> make a release or to graduate from a podling until we have gone
> through the IP checklist, including dealing with code that has an
> incompatible license.
> 
> Of course, if you think you are close to having a "clean" version of
> OOo ready to check in, then I don't want to interrupt the fine work
> that you are already doing.  But in that case I think it would help if
> we had a "roadmap" for the next couple of weeks, of what tasks
> remain, so others can help as well.
> 
> -Rob
> 
> 
> On Tue, Jun 28, 2011 at 4:55 AM, Mathias Bauer <Ma...@gmx.net>
> wrote:
> > On 27.06.2011 22:06, Rob Weir wrote:
> >>
> >> I think one approach would be to start with everything, which should
> >> presumably build, and then subtract.  So check in everything from OOo
> >> into SVN, verify that it builds.  That establishes a known state.
> >> Then verify the IP.  Maybe use SVN properties to tag the files that
> >> were covered by Oracle's SGA.  Anything not tagged needs to be
> >> investigated.  Some things lead to requests for amending the Oracle
> >> SGA. When we get those, we indicate so in an SVN property.  Some
> >> things will be GPL/LGPL.  These also get tagged with properties
> >> before being deleted.  We continue to iterate until all files
> >> remaining in the repository have a property indicating that we've
> >> proven their provenance. Ideally, as things are removed, we do so in a
> >> way that we can always still build.  So we start in a well-defined
> >> state and stay in a well-defined state.
> >
> > I can't judge whether this approach is feasible. If it is, I can provide
> > information about IP from a developer's POV. The files that definitely are
> > not owned by Oracle are already listed in the OOo wiki. I tend to assume
> > that all other files are under Oracle's copyright until stated otherwise.
> > But again, I can't judge whether we can go this way.
> >
> > Regards,
> > Mathias
> >
> 

Re: Getting to our first build

Posted by Rob Weir <ap...@robweir.com>.

Hi Mathias,

I don't know whether my approach is feasible either.  I know we can
set properties on files in SVN.  You can retrieve them individually,
but I don't see a way to query them, e.g., list all files that don't
have a license property, or download all files that have a license
property set to Apache 2.0.

So far, I think that you've been doing most of the code investigations.
So I'd trust your judgement on what the next steps should be.  Do you
have any thoughts on what work remains for the next 1 or 2 weeks?  For
example, is Oracle currently reviewing the additional SGA requests?
Or do we still need to request this?

If I understand the rules at Apache (and it is certainly possible I
have this wrong, but in that case I'm sure someone will quickly correct
me), a Podling can check in all of the code, including parts that are
LGPL/GPL. We can make builds from that.  But we are not permitted to
make a release or to graduate from a podling until we have gone
through the IP checklist, including dealing with code that has an
incompatible license.

Of course, if you think you are close to having a "clean" version of
OOo ready to check in, then I don't want to interrupt the fine work
that you are already doing.  But in that case I think it would help if
we had a "roadmap" for the next couple of weeks, of what tasks
remain, so others can help as well.

-Rob

On Tue, Jun 28, 2011 at 4:55 AM, Mathias Bauer <Ma...@gmx.net> wrote:
> On 27.06.2011 22:06, Rob Weir wrote:
>>
>> I think one approach would be to start with everything, which should
>> presumably build, and then subtract.  So check in everything from OOo
>> into SVN, verify that it builds.  That establishes a known state.
>> Then verify the IP.  Maybe use SVN properties to tag the files that
>> were covered by Oracle's SGA.  Anything not tagged needs to be
>> investigated.  Some things lead to requests for amending the Oracle
>> SGA. When we get those, we indicate so in an SVN property.  Some
>> things will be GPL/LGPL.  These also get tagged with properties
>> before being deleted.  We continue to iterate until all files
>> remaining in the repository have a property indicating that we've
>> proven their provenance. Ideally, as things are removed, we do so in a
>> way that we can always still build.  So we start in a well-defined
>> state and stay in a well-defined state.
>
> I can't judge whether this approach is feasible. If it is, I can provide
> information about IP from a developer's POV. The files that definitely are
> not owned by Oracle are already listed in the OOo wiki. I tend to assume
> that all other files are under Oracle's copyright until stated otherwise.
> But again, I can't judge whether we can go this way.
>
> Regards,
> Mathias
>

Re: Getting to our first build

Posted by Mathias Bauer <Ma...@gmx.net>.

On 27.06.2011 22:06, Rob Weir wrote:
> I think one approach would be to start with everything, which should
> presumably build, and then subtract.  So check in everything from OOo
> into SVN, verify that it builds.  That establishes a known state.
> Then verify the IP.  Maybe use SVN properties to tag the files that
> were covered by Oracle's SGA.  Anything not tagged needs to be
> investigated.  Some things lead to requests for amending the Oracle
> SGA. When we get those, we indicate so in an SVN property.  Some
> things will be GPL/LGPL.  These also get tagged with properties
> before being deleted.  We continue to iterate until all files
> remaining in the repository have a property indicating that we've
> proven their provenance. Ideally, as things are removed, we do so in a
> way that we can always still build.  So we start in a well-defined
> state and stay in a well-defined state.

I can't judge whether this approach is feasible. If it is, I can provide 
information about IP from a developer's POV. The files that definitely
are not owned by Oracle are already listed in the OOo wiki. I tend to 
assume that all other files are under Oracle's copyright until stated 
otherwise. But again, I can't judge whether we can go this way.

Regards,
Mathias
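
The tag-and-subtract workflow quoted above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
property name "ip:provenance", its values, and the sample paths are all
hypothetical, and a plain dictionary stands in for real SVN properties
(which would normally be set with `svn propset` and read with
`svn propget`).

```python
# Hypothetical model of the tag-and-subtract workflow; nothing here talks to SVN.
PROP = "ip:provenance"  # assumed property name, not an agreed convention

# Assumed values: files covered by the Oracle SGA, or by an amended SGA, are cleared.
CLEARED = {"oracle-sga", "oracle-sga-amended"}
COPYLEFT = {"gpl", "lgpl"}


def classify(tree):
    """Split a {path: {property: value}} mapping into three sorted lists."""
    keep, remove, investigate = [], [], []
    for path, props in sorted(tree.items()):
        value = props.get(PROP)
        if value in CLEARED:
            keep.append(path)          # provenance proven, stays in the repository
        elif value in COPYLEFT:
            remove.append(path)        # tagged first, then deleted
        else:
            investigate.append(path)   # untagged or unknown: needs investigation
    return keep, remove, investigate


def provenance_proven(tree):
    """True once every remaining file carries a cleared provenance tag."""
    return all(props.get(PROP) in CLEARED for props in tree.values())


# Sample paths are invented for illustration.
tree = {
    "sw/source/core/doc.cxx": {PROP: "oracle-sga"},
    "external/somelib/lib.c": {PROP: "lgpl"},
    "extras/template.ott": {},
}
keep, remove, investigate = classify(tree)
print("keep:", keep)
print("remove:", remove)
print("investigate:", investigate)
print("done:", provenance_proven(tree))
```

Under these assumptions, the iteration ends when `investigate` and
`remove` are both empty, which is the same condition that
`provenance_proven` checks.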

Re: Getting to our first build

Posted by "Pedro F. Giffuni" <gi...@tutopia.com>.

--- On Mon, 6/27/11, Rob Weir <ap...@robweir.com> wrote:

...
> 
> And one of those cells is for the dev work needed to get to
> a first build.  Not "first release", which would require QA,
> IP checklist and PPMC/PMC approval, but first build which
> is a bit simpler:
> 
> https://cwiki.apache.org/confluence/display/OOOUSERS/Build-Dev-Plan
> 
> I've seen a few threads on the list related to
> investigations of various copyleft dependencies and
> whether they are indeed needed, and
> whether there are good substitutes.  This is good.
> 
And it's not over at all, but I wanted to wait until
we have something in SVN to discuss ;-).

> But what I haven't seen is the higher level outline of how
> we are going to get to that first build.
> 

I thought we were waiting for Oracle to go over the list
of files, and that the GPL stuff wasn't going to be
transferred to SVN at all.

> I think one approach would be to start with everything,
> which should presumably build, and then subtract.
> So check in everything from OOo into SVN, verify that
> it builds.  That establishes a known state.

I like that approach too. I noticed there's this:
http://svn.services.openoffice.org/ooo/branches/OOO320/

But then I think first build = last Oracle OOo (m340?)

> Then verify the IP.

This has already been done except for some of the external
dependencies.

I think there are two possibilities:

1- Wait for Oracle legal to OK the SGA and import only
the granted files. This way we start "clean".
2- Don't wait, but instead import the m340 tree and
start working on the dependencies while Oracle legal
does their part (soon?). This way we start somewhat
tainted but we can work now and we never get broken
builds.

> 
> Of course, if there is a better way of doing this, please
> let me know,

I have no answers, but ultimately it all depends on
the person(s) in charge(?).

Pedro.