Posted to dev@nutch.apache.org by Paul Tomblin <pt...@xcski.com> on 2009/09/22 15:35:11 UTC
Where should I do this?
I want to output to a file or database every url/filename that's crawled,
along with the status. I figure I can do this with a plugin, but I'm not
sure where to slot it into the plugin hierarchy. Any suggestions?
--
http://www.linkedin.com/in/paultomblin
Re: Event search engine
Posted by Brian Ulicny <bu...@alum.mit.edu>.
Here's a recently announced event search engine:
http://searchengineland.com/what-where-when-travel-local-search-combine-goby-com-26395
Just heard of it today.
Brian Ulicny
On Wed, 23 Sep 2009 09:27 +0200, "Michael Wechner"
<mi...@wyona.com> wrote:
> Mitia Notaras wrote:
> > Hi there,
> >
> > The two event search engines I found are down:
> > betherebesquare.com
> > and
> > BusyTonight.com
> >
> > I would like your advice:
> > Is it difficult to build one?
>
> I guess it depends on the details of the requirements. Do you have a
> requirements sheet?
> > I have knowledge of web techniques (HTML, PHP, JS),
> > but not much Apache.
>
> Yes, I think one needs some Lucene/Nutch (and more general Java/webapp)
> knowledge.
>
> Cheers
>
> Michael
> >
> > Thanks.
> > Mitia
> >
> > PS: from the mail exchanges I see, it seems complex.
> >
> >
> >
>
--
Brian Ulicny
bulicny at alum dot mit dot edu
home: 781-721-5746
fax: 360-361-5746
Re: Event search engine
Posted by Michael Wechner <mi...@wyona.com>.
Mitia Notaras wrote:
> Hi there,
>
> The two event search engines I found are down:
> betherebesquare.com
> and
> BusyTonight.com
>
> I would like your advice:
> Is it difficult to build one?
I guess it depends on the details of the requirements. Do you have a
requirements sheet?
> I have knowledge of web techniques (HTML, PHP, JS),
> but not much Apache.
Yes, I think one needs some Lucene/Nutch (and more general Java/webapp)
knowledge.
Cheers
Michael
>
> Thanks.
> Mitia
>
> PS: from the mail exchanges I see, it seems complex.
>
>
>
Event search engine
Posted by Mitia Notaras <mi...@orange.fr>.
Hi there,
The two event search engines I found are down:
betherebesquare.com
and
BusyTonight.com
I would like your advice:
Is it difficult to build one?
I have knowledge of web techniques (HTML, PHP, JS),
but not much Apache.
Thanks.
Mitia
PS: from the mail exchanges I see, it seems complex.
Re: Where should I do this?
Posted by Sandeep Tata <sa...@gmail.com>.
One possibility is to just parse the output of "bin/nutch
readdb <crawldb> -dump <out_dir>", which writes a text dump of the crawldb.
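A rough sketch of that parsing step, in Python. It assumes each record in the dump starts with a line of the form "<url><TAB>Version: N" followed by a "Status: N (name)" line; that layout is an assumption and may differ between Nutch versions, so check it against your own dump first. The function name parse_crawldb_dump is made up for this example.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed record layout (verify against your own dump):
#   http://example.com/<TAB>Version: 7
#   Status: 2 (db_fetched)
#   Fetch time: ...
RECORD_START = re.compile(r"^(\S+)\tVersion: \d+\s*$")
STATUS_LINE = re.compile(r"^Status: (\d+) \((\w+)\)\s*$")


def parse_crawldb_dump(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (url, status_name) pairs from the lines of a crawldb text dump."""
    url = None
    for line in lines:
        line = line.rstrip("\n")
        start = RECORD_START.match(line)
        if start:
            # First line of a new record: remember the URL.
            url = start.group(1)
            continue
        status = STATUS_LINE.match(line)
        if status and url is not None:
            # Emit the pair once the status line for this URL is seen.
            yield url, status.group(2)
            url = None
```

You would open whatever text file the dump step wrote under the output directory and pass the file object to parse_crawldb_dump, then write each (url, status) pair to a file or database as needed.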
On Tue, Sep 22, 2009 at 6:35 AM, Paul Tomblin <pt...@xcski.com> wrote:
> I want to output to a file or database every url/filename that's crawled,
> along with the status. I figure I can do this with a plugin, but I'm not
> sure where to slot it into the plugin hierarchy. Any suggestions?
>
> --
> http://www.linkedin.com/in/paultomblin
>