Posted to user@nutch.apache.org by Roannel Fernández Hernández <ro...@uci.cu> on 2018/06/29 04:23:38 UTC

Events out-of-the-box


Hi folks,

I'm using Nutch 1.14 and I have to send notifications to a RabbitMQ queue when every step starts and ends. So, my question is: do I have to change the code to achieve this, or is there an easier way? How can I do this?

If the code has to be changed, I think it would be a good idea to provide out-of-the-box events for each step. We could even pass the counters from the context to each event. I don't know, it's just an idea.

What do you think, guys?

Regards

UCIENCIA 2018: III Conferencia Científica Internacional de la Universidad de las Ciencias Informáticas. 
September 24-26, 2018 http://uciencia.uci.cu http://eventos.uci.cu
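
A minimal Java sketch of the idea in the message above: wrap each step so that one event is published when it starts and another, carrying the step's counters, when it ends. The names `StepEventSink`, `InMemorySink`, `StepNotifier` and the counter names are hypothetical and are not part of Nutch; a real setup would replace the in-memory sink with one that publishes to a RabbitMQ queue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical destination for step events (for example, a RabbitMQ queue). */
interface StepEventSink {
    void publish(String message);
}

/** Collects messages in memory so the sketch runs without a broker. */
class InMemorySink implements StepEventSink {
    final List<String> messages = new ArrayList<>();

    @Override public void publish(String message) {
        messages.add(message);
    }
}

/** Runs a step and emits one event before it and one after it. */
class StepNotifier {
    /** Hypothetical step: does some work and returns its counters. */
    interface Step {
        Map<String, Long> run();
    }

    private final StepEventSink sink;

    StepNotifier(StepEventSink sink) {
        this.sink = sink;
    }

    Map<String, Long> runStep(String name, Step step) {
        sink.publish("step=" + name + " event=start");
        Map<String, Long> counters = step.run();
        // The end event carries the counters collected by the step.
        sink.publish("step=" + name + " event=end counters=" + counters);
        return counters;
    }
}

class StepNotifierDemo {
    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        StepNotifier notifier = new StepNotifier(sink);
        notifier.runStep("fetch", () -> {
            Map<String, Long> counters = new LinkedHashMap<>();
            counters.put("success", 42L);  // made-up values for illustration
            counters.put("notfound", 3L);
            return counters;
        });
        sink.messages.forEach(System.out::println);
    }
}
```

Keeping the sink behind an interface means the steps themselves never depend on the messaging library, so users who do not enable events pay nothing beyond two method calls per step.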

RE: [MASSMAIL]RE: Events out-of-the-box

Posted by Yossi Tamari <yo...@pipl.com>.
Hi Roannel,

I am not using Nutch's ability to emit events, and was not even aware of it. I just read https://issues.apache.org/jira/browse/NUTCH-2132?attachmentOrder=desc where they basically had the same discussion.
If the general capability already exists, it seems like a good idea to add more functionality to it. I also agree with Sebastian's comments there: the default behaviour should not change, and there shouldn't be a performance cost, especially for those not using this feature (which should be easy in this case, since the events fire only when a step begins or ends).

BTW, from an architectural point of view, I see these, and the ones in the original ticket, as Logging Events, not Integration Events, and as such I think the approach should have been to use structured log messages instead of RabbitMQ. I'm not sure how much support log4j has for structured log messages, but I know that newer log libraries have good support for sending the same message as text to a normal log and as a structured message to another appender.

	Yossi.
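
A minimal Java sketch of the structured-logging approach described in the message above: one log call, rendered as plain text by one handler and as a structured (JSON) message by another. It uses `java.util.logging` rather than log4j, purely so that the example is self-contained; the class names and the `step`/`event` fields are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Formatter;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Keeps formatted records in memory; stands in for a file or queue appender. */
class CapturingHandler extends Handler {
    final List<String> lines = new ArrayList<>();

    CapturingHandler(Formatter formatter) {
        setFormatter(formatter);
    }

    @Override public void publish(LogRecord record) {
        if (isLoggable(record)) {
            lines.add(getFormatter().format(record));
        }
    }

    @Override public void flush() { }

    @Override public void close() { }
}

/** Renders the event parameters (step, event) as a human-readable line. */
class PlainTextFormatter extends Formatter {
    @Override public String format(LogRecord record) {
        Object[] p = record.getParameters();
        return "Step " + p[0] + ": " + p[1];
    }
}

/** Renders the same parameters as a JSON object (no escaping; values assumed simple). */
class JsonEventFormatter extends Formatter {
    @Override public String format(LogRecord record) {
        Object[] p = record.getParameters();
        return "{\"step\":\"" + p[0] + "\",\"event\":\"" + p[1] + "\"}";
    }
}

class StructuredEventDemo {
    /** Logs one step event; the fields travel as record parameters, not as text. */
    static void emit(Logger logger, String step, String event) {
        logger.log(Level.INFO, "step event", new Object[] {step, event});
    }

    public static void main(String[] args) {
        Logger logger = Logger.getAnonymousLogger();
        logger.setUseParentHandlers(false);  // keep the demo output to our two handlers
        CapturingHandler text = new CapturingHandler(new PlainTextFormatter());
        CapturingHandler json = new CapturingHandler(new JsonEventFormatter());
        logger.addHandler(text);
        logger.addHandler(json);

        emit(logger, "fetch", "start");
        System.out.println(text.lines.get(0));
        System.out.println(json.lines.get(0));
    }
}
```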

> -----Original Message-----
> From: Roannel Fernández Hernández <ro...@uci.cu>
> Sent: 05 July 2018 04:05
> To: user@nutch.apache.org
> Subject: Re: [MASSMAIL]RE: Events out-of-the-box
> 
> Hi Yossi
> 
> Thanks for your answer. I've been testing your idea of the appender, but I think
> it is too hard to get the counters from the context this way. I truly believe Nutch
> should provide some main events out-of-the-box using the included publisher
> component. Do you agree with me?
> 
> Regards


Re: [MASSMAIL]RE: Events out-of-the-box

Posted by Roannel Fernández Hernández <ro...@uci.cu>.
Hi Yossi

Thanks for your answer. I've been testing your idea of the appender, but I think it is too hard to get the counters from the context this way. I truly believe Nutch should provide some main events out-of-the-box using the included publisher component. Do you agree with me?

Regards

----- Original Message -----
> From: "Yossi Tamari" <yo...@pipl.com>
> To: user@nutch.apache.org
> Sent: Friday, June 29, 2018 2:09:52
> Subject: [MASSMAIL]RE: Events out-of-the-box
> 
> This is not something I have actually done, but you should be able to achieve
> this by adding a log4j appender for RabbitMQ (such as
> https://github.com/plant42/rabbitmq-log4j-appender), and configuring the
> relevant loggers and filters to send only the logging events you need to
> that appender.
> BTW, if you just want "fetching started/fetching ended" style messages, you
> can simply add them to the crawl script, no need to touch the Java code.

RE: Events out-of-the-box

Posted by Yossi Tamari <yo...@pipl.com>.
This is not something I have actually done, but you should be able to achieve this by adding a log4j appender for RabbitMQ (such as https://github.com/plant42/rabbitmq-log4j-appender), and configuring the relevant loggers and filters to send only the logging events you need to that appender.
BTW, if you just want "fetching started/fetching ended" style messages, you can simply add them to the crawl script, no need to touch the Java code.
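
A minimal Java sketch of the "appender plus filter" idea above: a handler that stands in for the RabbitMQ appender, with a filter that lets through only the messages marking a step boundary. It uses `java.util.logging` instead of log4j so that it runs without extra libraries; the class names, the matching rule and the sample log messages are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Stand-in for a RabbitMQ appender: keeps accepted messages in a list. */
class QueueStandInHandler extends Handler {
    final List<String> sent = new ArrayList<>();

    @Override public void publish(LogRecord record) {
        // isLoggable applies both the handler level and the attached filter.
        if (isLoggable(record)) {
            sent.add(record.getMessage());
        }
    }

    @Override public void flush() { }

    @Override public void close() { }
}

class StepFilterDemo {
    /** Hypothetical rule: forward only messages that mark a step boundary. */
    static boolean isStepBoundary(LogRecord record) {
        String msg = record.getMessage();
        return msg != null && (msg.contains(": starting") || msg.contains(": finished"));
    }

    /** Attaches the filtered handler to the given logger and returns it. */
    static QueueStandInHandler attach(Logger logger) {
        QueueStandInHandler handler = new QueueStandInHandler();
        handler.setFilter(StepFilterDemo::isStepBoundary);
        logger.setUseParentHandlers(false);
        logger.addHandler(handler);
        return handler;
    }

    public static void main(String[] args) {
        Logger logger = Logger.getAnonymousLogger();
        QueueStandInHandler handler = attach(logger);
        logger.info("Fetcher: starting");            // forwarded
        logger.info("fetching http://example.com/"); // dropped by the filter
        logger.info("Fetcher: finished");            // forwarded
        handler.sent.forEach(System.out::println);
    }
}
```

Matching on message text is fragile, and a log message does not carry the job counters, so this approach suits simple start/end notifications better than events that need to include counters.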
