Posted to user@nutch.apache.org by Vijith <vi...@gmail.com> on 2012/05/11 11:37:40 UTC

Separate logger for nutch

Hi,

How can I create a separate, project-specific log in addition to the
existing log?
I am running Nutch in deploy mode.
I also want some URLs filtered by my URL filter to be stored in an external
flat file. How can I achieve this?

-- 
*Thanks & Regards*
*Vijith V*

Re: Separate logger for nutch

Posted by Vijith <vi...@gmail.com>.
@Markus
I tried that approach, editing the log4j.properties file. It worked fine in
local mode (in Eclipse), but no file is generated in deploy mode (running the
job file).

Is this the only way to run Nutch in deploy mode (i.e. running the job
file in Hadoop)?
Is it possible to run Nutch in deploy mode without using the job file (like
a local-mode run from Eclipse)?
I believe if that is possible, the separate logging may work as well.

On Fri, May 11, 2012 at 8:33 PM, Markus Jelsma
<ma...@openindex.io> wrote:



-- 
*Thanks & Regards*
*Vijith V*

Re: Separate logger for nutch

Posted by Markus Jelsma <ma...@openindex.io>.
 Hi

 Nutch uses Log4j, and with it you can write log output from different
 classes or different log levels to different output files. I'm sure this
 will work with Nutch in local mode, so I believe you can make it happen
 with Hadoop as well, but it may be tricky, or not possible.

 Cheers
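
 The setup Markus describes can be sketched as a conf/log4j.properties
 fragment. This is a minimal example under assumptions: com.example.MyUrlFilter
 is a hypothetical plugin class, and the appender name and file are invented.

```properties
# Route log output from a (hypothetical) custom class to its own file,
# without also copying it into the default hadoop.log appender.
log4j.logger.com.example.MyUrlFilter=INFO,myfilter
log4j.additivity.com.example.MyUrlFilter=false

log4j.appender.myfilter=org.apache.log4j.RollingFileAppender
log4j.appender.myfilter.File=${hadoop.log.dir}/myfilter.log
log4j.appender.myfilter.MaxFileSize=10MB
log4j.appender.myfilter.layout=org.apache.log4j.PatternLayout
log4j.appender.myfilter.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
```

 Note that in deploy mode the task JVMs are typically configured by Hadoop's
 own Log4j setup rather than the copy in the job file, which is likely why a
 configuration that works in local mode produces no file on the cluster.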


 On Fri, 11 May 2012 14:59:13 +0200, Ferdy Galema 
 <fe...@kalooga.com> wrote:
> [...]

-- 
 Markus Jelsma - CTO - Openindex

Re: Separate logger for nutch

Posted by Ferdy Galema <fe...@kalooga.com>.
There is; every task gets run in a temporary working directory, but in
general the output is cleaned up after the task completes. If you want to
save "side data" you have to figure out a workaround. This page should give
you a few pointers:
http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Task+Side-Effect+Files
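
The "side-effect files" approach from that tutorial can be outlined in Java
roughly as below (old mapred API, matching the r0.20.2 docs). This is an
illustrative, untested sketch; the mapper class and file name are invented.

```java
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;

// Hypothetical mapper that records rejected URLs in a per-task side file.
public class FilterMapper extends MapReduceBase {
  private FSDataOutputStream sideFile;

  public void configure(JobConf job) {
    try {
      // getWorkOutputPath() is the task's temporary working directory;
      // files written under it are promoted to the job output directory
      // only when the task commits, so failed or speculative attempts
      // cannot leave half-written files behind.
      Path dir = FileOutputFormat.getWorkOutputPath(job);
      sideFile = FileSystem.get(job).create(new Path(dir, "filtered-urls"));
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  // Called from map() for every URL the filter rejects.
  void recordFiltered(Text url) throws IOException {
    sideFile.writeBytes(url.toString() + "\n");
  }

  public void close() throws IOException {
    sideFile.close();
  }
}
```

Writing directly to a fixed HDFS path from inside a task, by contrast, races
with speculative execution and retries, which is why the side-effect-file
mechanism exists.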

On Fri, May 11, 2012 at 2:36 PM, Vijith <vi...@gmail.com> wrote:

> [...]

Re: Separate logger for nutch

Posted by Vijith <vi...@gmail.com>.
Thanks Ferdy.
So does this mean that there is no way Nutch can connect to a flat file,
database, etc. while in deploy mode?


On Fri, May 11, 2012 at 5:44 PM, Ferdy Galema <fe...@kalooga.com> wrote:

> [...]



-- 
*Thanks & Regards*
*Vijith V*

Re: Separate logger for nutch

Posted by Ferdy Galema <fe...@kalooga.com>.
When running Hadoop in deploy mode the actual tasks are run by the
MapReduce framework, so you have to check the MapReduce "user" logs. Either
use the JobTracker web interface or check them directly on the nodes, in
HADOOP_HOME/logs/userlogs or something like that.

On Fri, May 11, 2012 at 1:11 PM, Vijith <vi...@gmail.com> wrote:

> [...]

Re: Separate logger for nutch

Posted by Vijith <vi...@gmail.com>.
I have tried this with a separate logger and a PrintWriter object.
It works in local mode but not in deploy mode.
I am running the Nutch job file. It runs and generates the Hadoop log
without any errors, but the files are not created on any of the nodes.

On Fri, May 11, 2012 at 3:07 PM, Vijith <vi...@gmail.com> wrote:

> [...]


-- 
*Thanks & Regards*
*Vijith V*