Posted to dev@nutch.apache.org by Paul Tomblin <pt...@xcski.com> on 2009/09/22 15:35:11 UTC

Where should I do this?

I want to output to a file or database every url/filename that's crawled,
along with the status.  I figure I can do this with a plugin, but I'm not
sure where to slot it into the plugin hierarchy.  Any suggestions?

-- 
http://www.linkedin.com/in/paultomblin

Re: Event search engine

Posted by Brian Ulicny <bu...@alum.mit.edu>.
Here's a recently announced event search engine:

http://searchengineland.com/what-where-when-travel-local-search-combine-goby-com-26395

Just heard of it today.

Brian Ulicny

On Wed, 23 Sep 2009 09:27 +0200, "Michael Wechner"
<mi...@wyona.com> wrote:
> Mitia Notaras wrote:
> > Hi there,
> >
> > The two event search engines I found are down:
> > betherebesquare.com
> > and
> > BusyTonight.com
> >
> > I would like your advice:
> > Is it difficult to build one?
> 
> I guess it depends on the details of the requirements. Do you have a 
> requirements sheet?
> > I have knowledge of web technologies (HTML, PHP, JS),
> > but not much Apache.
> 
> Yes, I think one needs some Lucene/Nutch (and more general Java/webapp)
> knowledge.
> 
> Cheers
> 
> Michael
> >
> > Thanks.
> > Mitia
> >
> > PS: from the mail exchanges I see, it seems complex.
> >
> >
> >
> 
-- 
  Brian Ulicny
  bulicny at alum dot mit dot edu
  home: 781-721-5746
  fax: 360-361-5746



Re: Event search engine

Posted by Michael Wechner <mi...@wyona.com>.
Mitia Notaras wrote:
> Hi there,
>
> The two event search engines I found are down:
> betherebesquare.com
> and
> BusyTonight.com
>
> I would like your advice:
> Is it difficult to build one?

I guess it depends on the details of the requirements. Do you have a 
requirements sheet?
> I have knowledge of web technologies (HTML, PHP, JS),
> but not much Apache.

Yes, I think one needs some Lucene/Nutch (and more general Java/webapp)
knowledge.

Cheers

Michael
>
> Thanks.
> Mitia
>
> PS: from the mail exchanges I see, it seems complex.
>
>
>


Event search engine

Posted by Mitia Notaras <mi...@orange.fr>.
Hi there,

The two event search engines I found are down:
betherebesquare.com
and
BusyTonight.com

I would like your advice:
Is it difficult to build one?
I have knowledge of web technologies (HTML, PHP, JS),
but not much Apache.

Thanks.
Mitia

PS: from the mail exchanges I see, it seems complex.




Re: Where should I do this?

Posted by Sandeep Tata <sa...@gmail.com>.
One possibility is to just parse and use the output from "bin/nutch
readdb <crawldb> -dump"
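
A minimal sketch of that approach, in Python. It assumes the text dump
uses the layout commonly seen with Nutch 1.x, where each record starts
with a line of the form "<url><TAB>Version: N" followed by a line such as
"Status: 2 (db_fetched)". That layout, the function name parse_dump, and
the sample records below are assumptions for illustration and are not
confirmed in this thread, so check them against a real dump first.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed record layout (verify against your own dump):
#   <url>\tVersion: 7
#   Status: 2 (db_fetched)
#   ...further fields...
STATUS_RE = re.compile(r"^Status:\s*(\d+)\s*\((\w+)\)")


def parse_dump(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (url, status_name) pairs from the lines of a crawldb text dump."""
    url = None
    for raw in lines:
        line = raw.rstrip("\n")
        if "\tVersion:" in line:
            # Start of a new record: the URL is everything before the tab.
            url = line.split("\t", 1)[0]
        elif url is not None:
            match = STATUS_RE.match(line)
            if match:
                yield url, match.group(2)
                url = None  # wait for the next record header


# Hypothetical sample records, made up for this example.
SAMPLE = (
    "http://example.com/\tVersion: 7\n"
    "Status: 2 (db_fetched)\n"
    "Fetch time: Tue Sep 22 15:35:11 UTC 2009\n"
    "\n"
    "http://example.com/missing\tVersion: 7\n"
    "Status: 3 (db_gone)\n"
)

if __name__ == "__main__":
    # Emit one tab-separated "url<TAB>status" line per record.
    for url, status in parse_dump(SAMPLE.splitlines()):
        print(f"{url}\t{status}")
```

To use it on real data, pass an open file instead of SAMPLE. Note that
readdb's -dump option expects an output directory argument, and the text
typically ends up in a part file (such as part-00000) inside it.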

On Tue, Sep 22, 2009 at 6:35 AM, Paul Tomblin <pt...@xcski.com> wrote:
> I want to output to a file or database every url/filename that's crawled,
> along with the status.  I figure I can do this with a plugin, but I'm not
> sure where to slot it into the plugin hierarchy.  Any suggestions?
>
> --
> http://www.linkedin.com/in/paultomblin
>