Posted to dev@openoffice.apache.org by Dave Fisher <da...@comcast.net> on 2011/07/04 03:07:13 UTC

Re: svn commit: r1142528 - in /incubator/ooo/trunk/tools/dev: fetch-all-web.sh web-list.txt

This is a script and text file for fetching and maintaining an svn checkout of many OOo projects' Kenai webcontent.

I followed the same pattern for the script and text file as Greg did for the CWS Mercurial pulls.

dave$ ./fetch-all-web.sh web-list.txt ~/Documents/webtest
============ './projects' exists. Updating ...
At revision 3.
============ './www' exists. Updating ...
At revision 53.
============ './download' exists. Updating ...
At revision 296.
============ './development' exists. Updating ...
At revision 15.
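
For readers following along, here is a minimal sketch of how an entry in web-list.txt maps to a repository URL. The function name list_web_urls is hypothetical and not part of the commit. The committed script concatenates the entry as-is, leading "./" included; this sketch strips the "./" so the result matches the URL form given in the web-list.txt comment, on the assumption that both forms point at the same repository.

```sh
#!/bin/sh
# Sketch only: print the checkout URL for each project in a web-list file.
# list_web_urls is a hypothetical helper, not part of fetch-all-web.sh.
REPOS='https://svn.openoffice.org/svn/'
REPOS2='~webcontent'

list_web_urls() {
  # Only lines that begin with a literal "./" are project entries;
  # comments and blank lines are ignored.
  grep '^\./' "$1" | while read -r webproject; do
    name=${webproject#./}   # drop the leading "./"
    printf '%s -> %s%s%s\n' "$webproject" "$REPOS" "$name" "$REPOS2"
  done
}
```

Calling list_web_urls web-list.txt prints one "entry -> URL" line per project without contacting the server, which is a cheap way to check the list before a long checkout.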

Regards,
Dave

On Jul 3, 2011, at 5:48 PM, wave@apache.org wrote:

> Author: wave
> Date: Mon Jul  4 00:48:01 2011
> New Revision: 1142528
> 
> URL: http://svn.apache.org/viewvc?rev=1142528&view=rev
> Log:
> A script for pulling webcontent from Kenai's svn repos plus the start to the web-project list. The script follows the pattern of fetch-all-cws.sh. It is a similar process.
> 
> Added:
>    incubator/ooo/trunk/tools/dev/fetch-all-web.sh   (with props)
>    incubator/ooo/trunk/tools/dev/web-list.txt   (with props)
> 
> Added: incubator/ooo/trunk/tools/dev/fetch-all-web.sh
> URL: http://svn.apache.org/viewvc/incubator/ooo/trunk/tools/dev/fetch-all-web.sh?rev=1142528&view=auto
> ==============================================================================
> --- incubator/ooo/trunk/tools/dev/fetch-all-web.sh (added)
> +++ incubator/ooo/trunk/tools/dev/fetch-all-web.sh Mon Jul  4 00:48:01 2011
> @@ -0,0 +1,81 @@
> +#!/bin/sh
> +#
> +# Licensed to the Apache Software Foundation (ASF) under one
> +# or more contributor license agreements.  See the NOTICE file
> +# distributed with this work for additional information
> +# regarding copyright ownership.  The ASF licenses this file
> +# to you under the Apache License, Version 2.0 (the
> +# "License"); you may not use this file except in compliance
> +# with the License.  You may obtain a copy of the License at
> +#
> +#   http://www.apache.org/licenses/LICENSE-2.0
> +#
> +# Unless required by applicable law or agreed to in writing,
> +# software distributed under the License is distributed on an
> +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> +# KIND, either express or implied.  See the License for the
> +# specific language governing permissions and limitations
> +# under the License.
> +#
> +
> +#
> +# Use this script to fetch the webcontent of each project
> +# listed in the specified file (typically, web-list.txt).
> +#
> +# See https://cwiki.apache.org/confluence/display/OOOUSERS/OOo-Sitemap
> +# for a note on the checkout from the Kenai svn repository.
> +#
> +# USAGE:
> +#   $ ./fetch-all-web.sh WEB-LIST WORK-DIR
> +#
> +#     WEB-LIST is a file containing the list of Projects to fetch
> +#       (see the file tools/dev/web-list.txt)
> +#     WORK-DIR each project's webcontent will be created in a
> +#       subdirectory of WORK-DIR
> +#
> +#  Future steps will include scripts to transform the content for
> +#  the Apache CMS or a Confluence Wiki import
> +#
> +
> +if test "$#" != 2; then
> +  echo "USAGE: $0 WEB-LIST WORK-DIR"
> +  exit 1
> +fi
> +
> +REPOS='https://svn.openoffice.org/svn/'
> +REPOS2='~webcontent'
> +
> +# Make the work directory, in case it does not exist
> +if test ! -e "$2"; then
> +  mkdir "$2"
> +fi
> +
> +# Turn the parameters into absolute paths
> +work=`(cd "$2" ; pwd)`
> +
> +webdir=`dirname "$1"`
> +webfile=`basename "$1"`
> +weblist=`(cd "$webdir" ; pwd)`/$webfile
> +
> +
> +for webproject in `grep '^\./' "$weblist"` ; do
> +  cd "$work"
> +
> +  webrepos=${REPOS}${webproject}${REPOS2}
> +
> +  if test -d "$webproject" ; then
> +    echo "============ '$webproject' exists. Updating ..."
> +    cd "$webproject"
> +    svn update
> +
> +  elif test -e "$webproject" ; then
> +    echo "ERROR: '$webproject' exists and is not a directory."
> +    exit 1
> +
> +  # no working copy yet: do a fresh checkout
> +  else
> +    echo "============ '$webproject' is being created ..."
> +    svn co "$webrepos" "$webproject"
> +  fi
> +
> +done
> 
> Propchange: incubator/ooo/trunk/tools/dev/fetch-all-web.sh
> ------------------------------------------------------------------------------
>    svn:eol-style = native
> 
> Propchange: incubator/ooo/trunk/tools/dev/fetch-all-web.sh
> ------------------------------------------------------------------------------
>    svn:executable = *
> 
> Added: incubator/ooo/trunk/tools/dev/web-list.txt
> URL: http://svn.apache.org/viewvc/incubator/ooo/trunk/tools/dev/web-list.txt?rev=1142528&view=auto
> ==============================================================================
> --- incubator/ooo/trunk/tools/dev/web-list.txt (added)
> +++ incubator/ooo/trunk/tools/dev/web-list.txt Mon Jul  4 00:48:01 2011
> @@ -0,0 +1,35 @@
> +#
> +# Licensed to the Apache Software Foundation (ASF) under one
> +# or more contributor license agreements.  See the NOTICE file
> +# distributed with this work for additional information
> +# regarding copyright ownership.  The ASF licenses this file
> +# to you under the Apache License, Version 2.0 (the
> +# "License"); you may not use this file except in compliance
> +# with the License.  You may obtain a copy of the License at
> +#
> +#   http://www.apache.org/licenses/LICENSE-2.0
> +#
> +# Unless required by applicable law or agreed to in writing,
> +# software distributed under the License is distributed on an
> +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> +# KIND, either express or implied.  See the License for the
> +# specific language governing permissions and limitations
> +# under the License.
> +#
> +
> +#
> +# This file contains a list of every project's webcontent currently
> +# hosted on Oracle's Kenai svn repository at:
> +#   https://svn.openoffice.org/svn/<$projectname>~webcontent
> +#
> +# The webcontent repositories that should not be pulled are commented out,
> +# with a short explanation why.
> +#
> +# Note: for automated processing of this file, use only lines that
> +# begin with "./".
> +#
> +
> +./projects
> +./www
> +./download
> +./development
> 
> Propchange: incubator/ooo/trunk/tools/dev/web-list.txt
> ------------------------------------------------------------------------------
>    svn:eol-style = native
> 
> 


Re: svn commit: r1142528 - in /incubator/ooo/trunk/tools/dev: fetch-all-web.sh web-list.txt

Posted by Dave Fisher <da...@comcast.net>.
On Jul 6, 2011, at 10:42 AM, Greg Stein wrote:

> This is cool. I have one basic question: do you want the latest
> content, or do you want full history?

Well, I think the history was erased in the conversion to Kenai: I am seeing most everything in the initial revision from when the content was imported. There is a clear revision 1 for most of www.

So yes, history will be important, but there is not a whole lot of it.

> 
> If "latest content", then you could use "svn export" rather than "svn
> checkout". However, export won't pick up changes from upstream. The
> script would need to skip existing directories, rather than update
> them.

I followed the pattern of your hg script, so that an aborted checkout can be restarted at any point and still succeed. Some of these projects are huge.

For now we are going to work with these and they may change (Kay Schenk has commit rights over in OOo/Kenai.)

http://openoffice.org/projects/www/sources/webcontent/show

> 
> If you want history, then we'll want to use something like svnsync to
> copy history into local repositories. (or svnrdump from the upcoming
> 1.7 release)
> 
> Thoughts?

Not sure. I'd like to know what those close to the OOo content think about this.

I'm not sure if we will do things like this:

(1) Copy the webcontent into the project's svn as data.

(2) Transform it into the format that we want to maintain in the ApacheCMS also in svn.

(3) Add the proper extensions and views to the Apache CMS and our local lib to
	(a) create html web content.
	(b) create odf content.
	(c) create pdf content

or

(1) Copy the webcontent into a scratch area and transform it into a directory structure to be imported into svn.

(2)  Add the proper extensions and views to the Apache CMS and our local lib to
	(a) create html web content.
	(b) create odf content.
	(c) create pdf content


I was thinking of (1)/(2) until we get a better handle on the process.

But maybe it would be best to get everything into the archive.

If we do (1)(2)(3) then we'll need to schedule a weekend process with Infra. Correct? When will svnrdump be available, and will it work with OOo/Kenai? What does that process look like?

Regards,
Dave

> 
> Cheers,
> -g
> 


Re: svn commit: r1142528 - in /incubator/ooo/trunk/tools/dev: fetch-all-web.sh web-list.txt

Posted by Greg Stein <gs...@gmail.com>.
This is cool. I have one basic question: do you want the latest
content, or do you want full history?

If "latest content", then you could use "svn export" rather than "svn
checkout". However, export won't pick up changes from upstream. The
script would need to skip existing directories, rather than update
them.
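
A rough sketch of that "latest content" variant, assuming the same web-list format and URL scheme as the committed script. The function name export_all_web and the SVN override variable are hypothetical, added here only so the loop can be dry-run without touching the server.

```sh
#!/bin/sh
# Sketch only: export each listed project once and skip anything that
# already exists. export_all_web and SVN are hypothetical names.
SVN=${SVN:-svn}
REPOS='https://svn.openoffice.org/svn/'
REPOS2='~webcontent'

export_all_web() {
  weblist=$1
  work=$2
  mkdir -p "$work" || return 1
  for webproject in $(grep '^\./' "$weblist"); do
    name=${webproject#./}
    target="$work/$name"
    if test -e "$target"; then
      # An export has no working-copy metadata, so it cannot be updated.
      echo "skip: $target"
    else
      echo "export: $target"
      $SVN export "${REPOS}${name}${REPOS2}" "$target" || return 1
    fi
  done
}
```

One caveat with this approach: an export that is interrupted leaves a partial directory behind, and the next run would skip it. That directory has to be removed by hand before rerunning.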

If you want history, then we'll want to use something like svnsync to
copy history into local repositories. (or svnrdump from the upcoming
1.7 release)
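
For the history case, a minimal svnsync sketch might look like the following. The function name mirror_repo and the SVNADMIN/SVNSYNC override variables are hypothetical, and whether the Kenai server permits this kind of read access is an open question, so treat it as an outline rather than a tested procedure.

```sh
#!/bin/sh
# Sketch only: mirror one remote repository, with history, into a local
# repository. mirror_repo, SVNADMIN and SVNSYNC are hypothetical names.
SVNADMIN=${SVNADMIN:-svnadmin}
SVNSYNC=${SVNSYNC:-svnsync}

mirror_repo() {
  src_url=$1
  mirror_dir=$2   # must be an absolute path; it is used in a file:// URL

  if test ! -d "$mirror_dir"; then
    $SVNADMIN create "$mirror_dir" || return 1
    # svnsync writes revision properties on the mirror, so the mirror
    # needs a pre-revprop-change hook that permits those changes.
    hook="$mirror_dir/hooks/pre-revprop-change"
    printf '#!/bin/sh\nexit 0\n' > "$hook" || return 1
    chmod +x "$hook" || return 1
    $SVNSYNC init "file://$mirror_dir" "$src_url" || return 1
  fi

  # Copies any revisions not yet mirrored, so it is safe to run again.
  $SVNSYNC sync "file://$mirror_dir"
}
```

Because sync only copies revisions the mirror does not have yet, rerunning mirror_repo after an interrupted sync should pick up where it stopped.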

Thoughts?

Cheers,
-g
