Posted to user@nutch.apache.org by Stefan Groschupf <sg...@media-style.com> on 2006/04/28 15:23:41 UTC

Admin Gui beta test (was Re: ATB: Heritrix)

Hi there,

Since building the GUI is somewhat complicated, I was thinking about
providing a ready-to-use binary.
This might help us get some of the beta testers we are currently
looking for.
Any thoughts?

However, I am afraid that this would hit my server too hard, and I have to
pay for traffic. :-/
Does anyone have an idea where we can mirror this file for free?
Any volunteer is very welcome.

Thanks.
Stefan




On 28.04.2006 at 15:14, Aled Jones wrote:

> Thanks for your replies, guys.  I hadn't realised that the admin GUI was
> already in development.
> We should be able to cope till it gets released ;-)
>
> Thanks again
> Aled
>
>> -----Original Message-----
>> From: Dan Morrill [mailto:ralph.morrill@baker.edu]
>> Sent: 28 April 2006 14:07
>> To: nutch-user@lucene.apache.org
>> Subject: RE: Heritrix
>>
>> Aled,
>>
>> I used Heritrix before moving over to Nutch. While it is an
>> excellent program with lots of good things to offer, it
>> didn't quite meet my needs, and when I was designing the
>> architecture it had too many dependencies for me to be comfortable with.
>>
>> If you want to run an internet archive, though, Heritrix cannot
>> be beaten; if you want to run a search engine, Nutch is a
>> good choice.
>>
>> My personal opinion.
>> r/d
>>
>> -----Original Message-----
>> From: Aled Jones [mailto:Aled.Jones@comtec-europe.co.uk]
>> Sent: Friday, April 28, 2006 1:59 AM
>> To: nutch-user@lucene.apache.org
>> Subject: Heritrix
>>
>> Hi
>>
>> Has anyone used Heritrix (http://crawler.archive.org/) as a
>> crawler?  How does it compare with the Nutch crawler?  Can
>> Nutch serve its crawled results?  The main reason I'm interested
>> is that it has a web UI that might make maintenance easier for
>> the IT guys, although I know that some of you guys are working
>> on an interface.
>>
>> Cheers
>> Aled
>>
>>
>> ###########################################
>>
>> This message has been scanned by F-Secure Anti-Virus for
>> Microsoft Exchange.
>> For more information, connect to http://www.f-secure.com/
>> **************************************************************
>> **********
>> This e-mail and any attachments are strictly confidential and
>> intended solely for the addressee. They may contain
>> information which is covered by legal, professional or other
>> privilege. If you are not the intended addressee, you must
>> not copy the e-mail or the attachments, or use them for any
>> purpose or disclose their contents to any other person. To do
>> so may be unlawful. If you have received this transmission in
>> error, please notify us as soon as possible and delete the
>> message and attachments from all places in your computer
>> where they are stored.
>>
>> Although we have scanned this e-mail and any attachments for
>> viruses, it is your responsibility to ensure that they are
>> actually virus free.
>>
>>
>>
>>

---------------------------------------------
blog: http://www.find23.org
company: http://www.media-style.com



Re: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by "Insurance Squared Inc." <gc...@insurancesquared.com>.
I'd prefer not to make a long-term commitment, but if a 2 Mbit connection
is good enough and this is a short-term thing, I'll step up if no one
else can.  I could probably make a longer-term commitment in a few
weeks.  Worst case, I can host it at home.

glenn


gekkokid wrote:

> what about putting it on sourceforge?
> http://sourceforge.net/projects/nutch
>
>
>
> ----- Original Message ----- From: "Doug Cutting" <cu...@apache.org>
> To: <nu...@lucene.apache.org>
> Sent: Saturday, April 29, 2006 12:18 AM
> Subject: Re: Admin Gui beta test (was Re: ATB: Heritrix)
>
>
>> Stefan Groschupf wrote:
>>
>>>> If that fails, you could attach it to a page on the Wiki, no?
>>>
>>>
>>> Is that a good idea? The file is that big and we already got the  
>>> request to use the apache mirror servers.
>>
>>
>> The Apache mirrors are really for signed, Apache releases, which this 
>> is not.  Is it too big for the wiki?
>>
>> Doug
>>
>

Re: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by gekkokid <me...@gekkokid.org.uk>.
What about putting it on SourceForge?
http://sourceforge.net/projects/nutch



----- Original Message ----- 
From: "Doug Cutting" <cu...@apache.org>
To: <nu...@lucene.apache.org>
Sent: Saturday, April 29, 2006 12:18 AM
Subject: Re: Admin Gui beta test (was Re: ATB: Heritrix)


> Stefan Groschupf wrote:
>>> If that fails, you could attach it to a page on the Wiki, no?
>> 
>> Is that a good idea? The file is that big and we already got the  
>> request to use the apache mirror servers.
> 
> The Apache mirrors are really for signed, Apache releases, which this is 
> not.  Is it too big for the wiki?
> 
> Doug
>

Re: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by Doug Cutting <cu...@apache.org>.
Stefan Groschupf wrote:
>> If that fails, you could attach it to a page on the Wiki, no?
> 
> Is that a good idea? The file is that big and we already got the  
> request to use the apache mirror servers.

The Apache mirrors are really for signed, Apache releases, which this is 
not.  Is it too big for the wiki?

Doug

Re: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by gekkokid <me...@gekkokid.org.uk>.
use yousendit.com and post the link here :)

----- Original Message ----- 
From: "Karsten Dello" <de...@mi.fu-berlin.de>
To: <nu...@lucene.apache.org>
Sent: Tuesday, May 02, 2006 8:50 PM
Subject: Re: Admin Gui beta test (was Re: ATB: Heritrix)


> Hi Stefan,
>
> did you find a solution?
> I'd really like to give the admin gui a try.
>
> Cheers
>
> Karsten
>
>
> PS: My offer to host that file is still open :-)
>
> Stefan Groschupf wrote:
>
>>
>>>> I think it should be possible to put your binary at the Apache  site, 
>>>> probably Doug will be the right person to talk to ...
>>>
>>>
>>> Have you tried attaching it to a Jira issue?
>>
>> The nutch -xxxtar.gz is 67MB. The maximum file upload size is 10.00 Mb .
>>
>>>
>>> If that fails, you could attach it to a page on the Wiki, no?
>>
>> Is that a good idea? The file is that big and we already got the  request 
>> to use the apache mirror servers.
>> Anyway I already got some offline offers from people, just was  thinking 
>> it is a good idea to leave that running under the nutch  project flag.
>>
>>
>>
>>
>>
>>
>>
>
>
> 


Re: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by Karsten Dello <de...@mi.fu-berlin.de>.
Hi Stefan,

Did you find a solution?
I'd really like to give the admin gui a try.

Cheers

Karsten


PS: My offer to host that file is still open :-)

Stefan Groschupf wrote:

> 
>>> I think it should be possible to put your binary at the Apache  site, 
>>> probably Doug will be the right person to talk to ...
>>
>>
>> Have you tried attaching it to a Jira issue?
> 
> The nutch -xxxtar.gz is 67MB. The maximum file upload size is 10.00 Mb .
> 
>>
>> If that fails, you could attach it to a page on the Wiki, no?
> 
> Is that a good idea? The file is that big and we already got the  
> request to use the apache mirror servers.
> Anyway I already got some offline offers from people, just was  thinking 
> it is a good idea to leave that running under the nutch  project flag.
> 
> 
> 
> 
> 
> 
> 



Re: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by Stefan Groschupf <sg...@media-style.com>.
>> I think it should be possible to put your binary at the Apache  
>> site, probably Doug will be the right person to talk to ...
>
> Have you tried attaching it to a Jira issue?
The nutch -xxxtar.gz is 67 MB. The maximum file upload size is 10.00 MB.
>
> If that fails, you could attach it to a page on the Wiki, no?
Is that a good idea? The file is quite big, and we already got the
suggestion to use the Apache mirror servers.
Anyway, I have already received some offline offers from people; I just
thought it would be a good idea to keep this running under the Nutch
project flag.









Re: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by Doug Cutting <cu...@apache.org>.
Andrzej Bialecki wrote:
> I think it should be possible to put your binary at the Apache site, 
> probably Doug will be the right person to talk to ...

Have you tried attaching it to a Jira issue?

If that fails, you could attach it to a page on the Wiki, no?

Doug

Re: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by Andrzej Bialecki <ab...@getopt.org>.
Stefan Groschupf wrote:
> Hi there,
>
> Since building the GUI is somewhat complicated, I was thinking about
> providing a ready-to-use binary.
> This might help us get some of the beta testers we are currently
> looking for.
> Any thoughts?
>
> However, I am afraid that this would hit my server too hard, and I have to
> pay for traffic. :-/
> Does anyone have an idea where we can mirror this file for free?
> Any volunteer is very welcome.

I think it should be possible to put your binary on the Apache site;
Doug is probably the right person to talk to ...

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



RE: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by Dan Morrill <ra...@baker.edu>.
Stefan -

I can host the file at http://www.oaktreesecurity.com if you would like. I
have about 2 GB of bandwidth a month and use maybe 10 MB, so I think I
can accommodate it. I am more than happy to host a free-standing binary.

Do you have a Windows-compatible version (or will it run in Cygwin), or is
it Linux only?

r/d




Re: Admin Gui beta test (was Re: ATB: Heritrix)

Posted by sudhendra seshachala <su...@yahoo.com>.
Hi Stefan
  I would be willing to host the app.
  I have a virtual dedicated server from GoDaddy with Fedora Core 2, and the Apache web server and Tomcat are running.
  The IP address is http://68.178.249.66. Right now, on the web server side, I only have a default page (hosted by GoDaddy) running,
  but I can make sure the Admin GUI is running. I might need some help, but it should not be a problem at all.
   
   
  Thanks
  Sudhi
  






  Sudhi Seshachala
  http://sudhilogs.blogspot.com/
   


			