Posted to dev@nutch.apache.org by Julien Nioche <li...@gmail.com> on 2010/06/24 13:17:39 UTC

Re: svnpubsub for the Tika web site

What about doing the same for Nutch? Any reason not to?

J.

On 21 June 2010 15:46, Mattmann, Chris A (388J) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> +1million
>
> Been wishing for this for a while! :)
>
> Cheers,
> Chris
>
>
>
> On 6/21/10 3:02 AM, "Jukka Zitting" <ju...@gmail.com> wrote:
>
> Hi,
>
> The PDFBox web site [1] is now managed using the new svnpubsub
> mechanism set up by the infra team. Basically, the generated web site
> is committed to svn along with the site sources, and the svnpubsub
> magic will automatically publish the latest changes as soon as they've
> been committed. No more hours waiting for the rsync delay or wondering
> if the CI build setup works correctly. :-) See PDFBOX-623 [2] for the
> basic site update process now used by PDFBox.
>
> I'd like to set up a similar system also for Tika. We already have a
> Maven generated site, so it'll be easy to duplicate the setup from
> PDFBox.
>
> WDYT?
>
> [1] http://pdfbox.apache.org/
> [2] https://issues.apache.org/jira/browse/PDFBOX-623
>
> BR,
>
> Jukka Zitting
>
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: Chris.Mattmann@jpl.nasa.gov
> WWW:   http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>


-- 
DigitalPebble Ltd

Open Source Solutions for Text Engineering
http://www.digitalpebble.com

Re: svnpubsub for the Tika web site

Posted by Julien Nioche <li...@gmail.com>.
Hi,

> +1 Julien. Totally agree here. The process will be slightly different with
> Nutch as opposed to Tika since Nutch uses Forrest and Tika uses Maven, but
> totally doable.
>

Yes, the pages are still generated with Forrest on our side, AFAIK, so that
should not affect publication with svnpubsub.


>
> If you have time today could you file an INFRA ticket and ask for svnpubsub
> to be set up for Nutch? Probably a good idea to point the infra@ folks to
> the documentation on the wiki on how to publish the Nutch website to give
> them a start.
>

Done - see https://issues.apache.org/jira/browse/NUTCH-834 and
https://issues.apache.org/jira/browse/INFRA-2822

I have not moved anything from nutch/trunk to nutch/site, though; it is not
clear whether that is a prerequisite.

Thanks

J.


>
>
>
> On 6/24/10 4:17 AM, "Julien Nioche" <li...@gmail.com> wrote:
>
> What about doing the same for Nutch? Any reason not to?
>
> J.
>
> On 21 June 2010 15:46, Mattmann, Chris A (388J) <
> chris.a.mattmann@jpl.nasa.gov> wrote:
>
> +1million
>
> Been wishing for this for a while! :)
>
> Cheers,
> Chris
>
>
>
> On 6/21/10 3:02 AM, "Jukka Zitting" <ju...@gmail.com> wrote:
>
> Hi,
>
> The PDFBox web site [1] is now managed using the new svnpubsub
> mechanism set up by the infra team. Basically, the generated web site
> is committed to svn along with the site sources, and the svnpubsub
> magic will automatically publish the latest changes as soon as they've
> been committed. No more hours waiting for the rsync delay or wondering
> if the CI build setup works correctly. :-) See PDFBOX-623 [2] for the
> basic site update process now used by PDFBox.
>
> I'd like to set up a similar system also for Tika. We already have a
> Maven generated site, so it'll be easy to duplicate the setup from
> PDFBox.
>
> WDYT?
>
> [1] http://pdfbox.apache.org/
> [2] https://issues.apache.org/jira/browse/PDFBOX-623
>
> BR,
>
> Jukka Zitting
>
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: Chris.Mattmann@jpl.nasa.gov
> WWW:   http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>


-- 
DigitalPebble Ltd

Open Source Solutions for Text Engineering
http://www.digitalpebble.com

Re: svnpubsub for the Tika web site

Posted by Alex McLintock <al...@gmail.com>.
Let's do it. The website is pretty broken right now (the API docs are
missing), so someone needs to work on it. Might as well do it this way.

(Not actually voting since I am not currently a committer :-)

Alex

On 24 June 2010 12:17, Julien Nioche <li...@gmail.com> wrote:
> What about doing the same for Nutch? Any reason not to?
>
> J.
>
> On 21 June 2010 15:46, Mattmann, Chris A (388J)
> <ch...@jpl.nasa.gov> wrote:
>>
>> +1million
>>
>> Been wishing for this for a while! :)
>>
>> Cheers,
>> Chris
>>
>>
>>
>> On 6/21/10 3:02 AM, "Jukka Zitting" <ju...@gmail.com> wrote:
>>
>> Hi,
>>
>> The PDFBox web site [1] is now managed using the new svnpubsub
>> mechanism set up by the infra team. Basically, the generated web site
>> is committed to svn along with the site sources, and the svnpubsub
>> magic will automatically publish the latest changes as soon as they've
>> been committed. No more hours waiting for the rsync delay or wondering
>> if the CI build setup works correctly. :-) See PDFBOX-623 [2] for the
>> basic site update process now used by PDFBox.
>>
>> I'd like to set up a similar system also for Tika. We already have a
>> Maven generated site, so it'll be easy to duplicate the setup from
>> PDFBox.
>>
>> WDYT?
>>
>> [1] http://pdfbox.apache.org/
>> [2] https://issues.apache.org/jira/browse/PDFBOX-623
>>
>> BR,
>>
>> Jukka Zitting
>>
>>
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: Chris.Mattmann@jpl.nasa.gov
>> WWW:   http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>
>
>
> --
> DigitalPebble Ltd
>
> Open Source Solutions for Text Engineering
> http://www.digitalpebble.com
>

Re: svnpubsub for the Tika web site

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
+1 Julien. Totally agree here. The process will be slightly different with Nutch as opposed to Tika since Nutch uses Forrest and Tika uses Maven, but totally doable.

If you have time today could you file an INFRA ticket and ask for svnpubsub to be set up for Nutch? Probably a good idea to point the infra@ folks to the documentation on the wiki on how to publish the Nutch website to give them a start.

If you don't have time today, I could probably do this tomorrow.

Cheers,
Chris



On 6/24/10 4:17 AM, "Julien Nioche" <li...@gmail.com> wrote:

What about doing the same for Nutch? Any reason not to?

J.

On 21 June 2010 15:46, Mattmann, Chris A (388J) <ch...@jpl.nasa.gov> wrote:
+1million

Been wishing for this for a while! :)

Cheers,
Chris



On 6/21/10 3:02 AM, "Jukka Zitting" <ju...@gmail.com> wrote:

Hi,

The PDFBox web site [1] is now managed using the new svnpubsub
mechanism set up by the infra team. Basically, the generated web site
is committed to svn along with the site sources, and the svnpubsub
magic will automatically publish the latest changes as soon as they've
been committed. No more hours waiting for the rsync delay or wondering
if the CI build setup works correctly. :-) See PDFBOX-623 [2] for the
basic site update process now used by PDFBox.

I'd like to set up a similar system also for Tika. We already have a
Maven generated site, so it'll be easy to duplicate the setup from
PDFBox.

WDYT?

[1] http://pdfbox.apache.org/
[2] https://issues.apache.org/jira/browse/PDFBOX-623

BR,

Jukka Zitting



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




