Posted to user@hadoop.apache.org by Christian Schneider <cs...@gmail.com> on 2013/03/19 11:03:57 UTC

How to Archive the Task Logs (Stdout, Stderr, Syslogs)

Hi,
I found out that these logs are stored directly on the task nodes.

We need to keep them for a long time (several months or, better, a
year). What is a good way of doing that?

With my current knowledge I would write a cron job that picks up all the
files every few minutes.
But I guess that's not the best approach...

Best Regards,
Christian.

Re: How to Archive the Task Logs (Stdout, Stderr, Syslogs)

Posted by Christian Schneider <cs...@gmail.com>.
Hi Jagat,
could you give me a short hint on which source and sink I should use?

It would be good to have a 1:1 copy of the log folder of the task nodes,
something like:
/<hostname of the TaskTracker>/var/log/hadoop-0.20-mapreduce/userlogs/job_201303181503_0248/attempt_201303181503_0248_m_000023_0/*
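
For comparison, here is a minimal sketch of the cron-style alternative, assuming Python is available on each node. It is not a Flume configuration. It mirrors a userlogs folder into an archive root under the layout /<hostname>/<original path>. The function name archive_task_logs, the helper _is_current, and the archive_root parameter are made up for illustration and are not part of Hadoop or Flume.

```python
import os
import shutil
import socket


def archive_task_logs(userlogs_dir, archive_root, hostname=None):
    """Mirror userlogs_dir into archive_root/<hostname>/<original path>.

    A file is copied only when it is missing from the archive or when its
    size or modification time differs, so repeated runs are cheap.
    Returns the list of destination paths written in this run.
    """
    hostname = hostname or socket.gethostname()
    source_root = os.path.abspath(userlogs_dir)
    copied = []
    # Note: os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(source_root):
        # Drop the drive (if any) and the leading separator so that the
        # source path can be re-rooted below <archive_root>/<hostname>/.
        relative = os.path.splitdrive(dirpath)[1].lstrip(os.sep)
        target_dir = os.path.join(archive_root, hostname, relative)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            if _is_current(src, dst):
                continue
            os.makedirs(target_dir, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also copies the modification time
            copied.append(dst)
    return copied


def _is_current(src, dst):
    """Return True when dst exists with the same size and mtime as src."""
    if not os.path.exists(dst):
        return False
    src_stat, dst_stat = os.stat(src), os.stat(dst)
    return (src_stat.st_size == dst_stat.st_size
            and int(src_stat.st_mtime) == int(dst_stat.st_mtime))
```

A script calling archive_task_logs("/var/log/hadoop-0.20-mapreduce/userlogs", "<some archive mount>") could be run from cron every few minutes. Where the archive lives (for example an NFS mount) is left open here. Logs of attempts that are still running are simply copied again on a later run, once their size or modification time has changed.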

Best Regards,
Christian.


2013/3/19 Christian Schneider <cs...@gmail.com>

> Hi Jagat,
> Thank you. That sounds good. I will have a look at it.
>
> Best Regards,
> Christian.
>
>
> 2013/3/19 Jagat Singh <ja...@gmail.com>
>
>> Hello,
>>
>> You should be looking at Flume.
>>
>> It's made for this.
>>
>> http://flume.apache.org/
>>
>> Thanks,
>>
>> Jagat Singh
>>
>>
>> On Tue, Mar 19, 2013 at 9:03 PM, Christian Schneider <
>> cschneiderpublic@gmail.com> wrote:
>>
>>> Hi,
>>> I found out that these logs are stored directly on the task nodes.
>>>
>>> We need to keep them for a long time (several months or, better, a
>>> year). What is a good way of doing that?
>>>
>>> With my current knowledge I would write a cron job that picks up all the
>>> files every few minutes.
>>> But I guess that's not the best approach...
>>>
>>> Best Regards,
>>> Christian.
>>>
>>
>>
>

Re: How to Archive the Task Logs (Stdout, Stderr, Syslogs)

Posted by Christian Schneider <cs...@gmail.com>.
Hi Jagat,
Thank you. That sounds good. I will have a look at it.

Best Regards,
Christian.


2013/3/19 Jagat Singh <ja...@gmail.com>

> Hello,
>
> You should be looking at Flume.
>
> It's made for this.
>
> http://flume.apache.org/
>
> Thanks,
>
> Jagat Singh
>
>
> On Tue, Mar 19, 2013 at 9:03 PM, Christian Schneider <
> cschneiderpublic@gmail.com> wrote:
>
>> Hi,
>> I found out that these logs are stored directly on the task nodes.
>>
>> We need to keep them for a long time (several months or, better, a
>> year). What is a good way of doing that?
>>
>> With my current knowledge I would write a cron job that picks up all the
>> files every few minutes.
>> But I guess that's not the best approach...
>>
>> Best Regards,
>> Christian.
>>
>
>

Re: How to Archive the Task Logs (Stdout, Stderr, Syslogs)

Posted by Jagat Singh <ja...@gmail.com>.
Hello,

You should be looking at Flume.

It's made for this.

http://flume.apache.org/

Thanks,

Jagat Singh

On Tue, Mar 19, 2013 at 9:03 PM, Christian Schneider <
cschneiderpublic@gmail.com> wrote:

> Hi,
> I found out that these logs are stored directly on the task nodes.
>
> We need to keep them for a long time (several months or, better, a
> year). What is a good way of doing that?
>
> With my current knowledge I would write a cron job that picks up all the
> files every few minutes.
> But I guess that's not the best approach...
>
> Best Regards,
> Christian.
>
