Posted to user@tez.apache.org by Johannes Zillmann <jz...@googlemail.com> on 2014/10/02 10:20:49 UTC

post task hook

Hey guys,

Is there a post-task hook in Tez (like the OutputCommitter in MapReduce)?
I'd like to perform certain actions (such as accessing task logs) once a task completes, whether or not it succeeded, and whether the user-provided processor actually ran or the task failed before reaching it.

Johannes

Re: post task hook

Posted by "Jianfeng (Jeff) Zhang" <jz...@hortonworks.com>.
Hi Johannes,

In that case I think the post task hook you mentioned would have to be on
the client side; otherwise we still could not see the logs on the client
side if the hook lived in the runtime API (Processor/Input/Output). Tez's
client API does not expose much task information to the client. We only
expose DAGStatus and VertexStatus to users.

And let us know if you hit a case where exceptions do not propagate to the
client.



Best Regards,
Jeff Zhang


On Wed, Oct 8, 2014 at 2:31 PM, Johannes Zillmann <jz...@googlemail.com>
wrote:

> Hey Jeff,
>
> so the reason for copying the task logs is indeed better error diagnostics.
> In my experience, MapReduce/Tez usually reported only 30 to 50% of the
> exceptions to the client.
> So if all of TEZ-1240 is done, that might not be much of an issue any more…
>
> thanks
> Johannes

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: post task hook

Posted by Johannes Zillmann <jz...@googlemail.com>.
Ok, will do!

Johannes

On 08 Oct 2014, at 21:56, Bikas Saha <bi...@hortonworks.com> wrote:

> Ideally, we would like to see Tez do a good job of reporting errors.
> So if you find cases where this is not happening, please open JIRAs
> for them. Users should not have to work around Tez issues.
> 
> Bikas


RE: post task hook

Posted by Bikas Saha <bi...@hortonworks.com>.
Ideally, we would like to see Tez do a good job of reporting errors.
So if you find cases where this is not happening, please open JIRAs
for them. Users should not have to work around Tez issues.

Bikas

-----Original Message-----
From: Johannes Zillmann [mailto:jzillmann@googlemail.com]
Sent: Tuesday, October 07, 2014 11:31 PM
To: user@tez.apache.org
Subject: Re: post task hook

Hey Jeff,

so the reason for copying the task logs is indeed better error diagnostics.
In my experience, MapReduce/Tez usually reported only 30 to 50% of the
exceptions to the client.
So if all of TEZ-1240 is done, that might not be much of an issue any more.

thanks
Johannes



Re: post task hook

Posted by Johannes Zillmann <jz...@googlemail.com>.
Hey Jeff,

so the reason for copying the task logs is indeed better error diagnostics. In my experience, MapReduce/Tez usually reported only 30 to 50% of the exceptions to the client.
So if all of TEZ-1240 is done, that might not be much of an issue any more…

thanks
Johannes


On 08 Oct 2014, at 08:24, Jianfeng (Jeff) Zhang <jz...@hortonworks.com> wrote:

> Hi Johannes,
> 
> Currently you can see the diagnostics in the client if the task fails in the processor. Is that what you want?
> 
> Here's the JIRA tracking this: https://issues.apache.org/jira/browse/TEZ-1240
> 
> If you find any exception that is not caught, please create a ticket under it.
> 
> 
> 
> Best Regards,
> Jeff Zhang


Re: post task hook

Posted by "Jianfeng (Jeff) Zhang" <jz...@hortonworks.com>.
Hi Johannes,

Currently you can see the diagnostics in the client if the task fails in
the processor. Is that what you want?

Here's the JIRA tracking this:
https://issues.apache.org/jira/browse/TEZ-1240

If you find any exception that is not caught, please create a ticket under
it.



Best Regards,
Jeff Zhang


On Wed, Oct 8, 2014 at 2:08 PM, Johannes Zillmann <jz...@googlemail.com>
wrote:

> Hey Mr. Zhang,
>
> The main use case is fetching the task logs when a task fails. I can do
> that in a try-catch block in the processor itself, but that has two
> disadvantages:
> - the log might not be complete
> - if the task fails outside user-provided code (outside of the
> processor), we don’t capture it
>
> Johannes

Re: post task hook

Posted by Johannes Zillmann <jz...@googlemail.com>.
Hey Mr. Zhang,

The main use case is fetching the task logs when a task fails. I can do that in a try-catch block in the processor itself, but that has two disadvantages:
- the log might not be complete
- if the task fails outside user-provided code (outside of the processor), we don’t capture it
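
The try-catch approach described above might look roughly like the following. This is a minimal sketch in plain Java, not Tez code: `UserWork`, `postTaskHook` and `runWithHook` are hypothetical names invented for illustration. It shows why the approach only covers failures inside the wrapped code; anything that fails before `runWithHook` is entered never reaches the hook.

```java
import java.util.ArrayList;
import java.util.List;

class PostTaskHookSketch {

    /** Hypothetical stand-in for the user's processor logic; not a Tez interface. */
    interface UserWork {
        void run() throws Exception;
    }

    /** Records hook invocations so the behaviour of the sketch can be observed. */
    static final List<String> hookLog = new ArrayList<>();

    /** Hypothetical post-task action, e.g. copying whatever log output exists so far. */
    static void postTaskHook(boolean succeeded) {
        hookLog.add(succeeded ? "hook:success" : "hook:failure");
    }

    /** Runs the user work and calls the hook afterwards, on success or failure. */
    static void runWithHook(UserWork work) throws Exception {
        boolean succeeded = false;
        try {
            work.run();
            succeeded = true;
        } finally {
            // Runs in both cases, but only for failures raised inside work.run().
            postTaskHook(succeeded);
        }
    }

    public static void main(String[] args) {
        try {
            runWithHook(() -> { /* successful work */ });
            runWithHook(() -> { throw new IllegalStateException("simulated failure"); });
        } catch (Exception e) {
            // The original exception still propagates after the hook has run.
            System.out.println("task failed: " + e.getMessage());
        }
        System.out.println(hookLog);
    }
}
```

Because the hook runs while the task is still inside the processor, any log lines written after it (for example by framework cleanup) would be missing, which matches the first disadvantage listed above.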

Johannes

On 08 Oct 2014, at 01:58, Jianfeng (Jeff) Zhang <jz...@hortonworks.com> wrote:

> Hi Johannes,
> 
> You can do some post-task work in the Processor; please refer to SimpleProcessor, which has a postOp() method. It can only do a limited set of things, though, and cannot do things like accessing task logs.
> Could you let us know why you want to customize the post task hook?
>  
> 
> Best Regards,
> Jeff Zhang


Re: post task hook

Posted by "Jianfeng (Jeff) Zhang" <jz...@hortonworks.com>.
Hi Johannes,

You can do some post-task work in the Processor; please refer to
SimpleProcessor, which has a postOp() method. It can only do a limited set
of things, though, and cannot do things like accessing task logs.
Could you let us know why you want to customize the post task hook?
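
To illustrate one limit of a postOp()-style hook, here is a small sketch in plain Java. `SketchProcessor` is a hypothetical class written for this example, not the Tez `SimpleProcessor`; it assumes the hook is simply called after the main work with no `finally`, which should be checked against the actual SimpleProcessor source. Under that assumption the hook is skipped whenever the main work throws.

```java
import java.util.ArrayList;
import java.util.List;

class PostOpOrderSketch {

    /** Hypothetical pre/run/post base class; only mimics the general shape. */
    static abstract class SketchProcessor {
        final List<String> calls = new ArrayList<>();

        /** Assumed ordering: preOp, run, postOp, with no finally block. */
        final void execute() throws Exception {
            preOp();
            run();
            postOp(); // never reached if run() throws
        }

        void preOp() { calls.add("preOp"); }

        abstract void run() throws Exception;

        void postOp() { calls.add("postOp"); }
    }

    /** Executes a processor that optionally fails and returns the recorded calls. */
    static List<String> callsFor(boolean fail) {
        SketchProcessor p = new SketchProcessor() {
            @Override
            void run() throws Exception {
                calls.add("run");
                if (fail) {
                    throw new IllegalStateException("simulated task failure");
                }
            }
        };
        try {
            p.execute();
        } catch (Exception e) {
            // In a real framework the failure would be reported by the runtime.
        }
        return p.calls;
    }

    public static void main(String[] args) {
        System.out.println(callsFor(false)); // prints [preOp, run, postOp]
        System.out.println(callsFor(true));  // prints [preOp, run]
    }
}
```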


Best Regards,
Jeff Zhang


On Thu, Oct 2, 2014 at 4:20 PM, Johannes Zillmann <jz...@googlemail.com>
wrote:

> Hey guys,
>
> Is there a post-task hook in Tez (like the OutputCommitter in MapReduce)?
> I'd like to perform certain actions (such as accessing task logs) once a
> task completes, whether or not it succeeded, and whether the user-provided
> processor actually ran or the task failed before reaching it.
>
> Johannes
