Posted to user@tez.apache.org by Johannes Zillmann <jz...@googlemail.com> on 2014/10/02 10:20:49 UTC
post task hook
Hey guys,
Is there a post-task hook in Tez (like the OutputCommitter that MapReduce had)?
I'd like to perform certain actions (such as accessing task logs) once a task completes, whether or not the task was successful, and whether the user-provided processor was executed or the task failed before that point.
Johannes
Re: post task hook
Posted by "Jianfeng (Jeff) Zhang" <jz...@hortonworks.com>.
Hi Johannes,
In that case I think the post-task hook you mentioned would have to be on
the client side; otherwise we still could not see the logs on the client
side if the hook lived in the runtime API (Processor/Input/Output). Tez's
client API does not expose much task information to the client. We only
expose DAGStatus and VertexStatus to users.
And let us know if you run into cases where exceptions do not propagate to
the client.
Best Regards,
Jeff Zhang
On Wed, Oct 8, 2014 at 2:31 PM, Johannes Zillmann <jz...@googlemail.com>
wrote:
> Hey Jeff,
>
> so the reason for copying the task logs is indeed better error diagnostics.
> In my experience, MapReduce/Tez usually reported only 30 to 50% of the
> exceptions to the client.
> So if all of TEZ-1240 is done, that might not be much of an issue any more…
>
> thanks
> Johannes
--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.
Re: post task hook
Posted by Johannes Zillmann <jz...@googlemail.com>.
Ok, will do!
Johannes
On 08 Oct 2014, at 21:56, Bikas Saha <bi...@hortonworks.com> wrote:
> Ideally, we would like to see Tez do a good job of reporting errors.
> So if you find cases where this is not happening, please open JIRAs
> for them. Users should not have to work around Tez issues.
>
> Bikas
RE: post task hook
Posted by Bikas Saha <bi...@hortonworks.com>.
Ideally, we would like to see Tez do a good job of reporting errors.
So if you find cases where this is not happening, please open JIRAs
for them. Users should not have to work around Tez issues.
Bikas
-----Original Message-----
From: Johannes Zillmann [mailto:jzillmann@googlemail.com]
Sent: Tuesday, October 07, 2014 11:31 PM
To: user@tez.apache.org
Subject: Re: post task hook
Hey Jeff,
so the reason for copying the task logs is indeed better error diagnostics.
In my experience, MapReduce/Tez usually reported only 30 to 50% of the
exceptions to the client.
So if all of TEZ-1240 is done, that might not be much of an issue any more.
thanks
Johannes
Re: post task hook
Posted by Johannes Zillmann <jz...@googlemail.com>.
Hey Jeff,
so the reason for copying the task logs is indeed better error diagnostics. In my experience, MapReduce/Tez usually reported only 30 to 50% of the exceptions to the client.
So if all of TEZ-1240 is done, that might not be much of an issue any more…
thanks
Johannes
On 08 Oct 2014, at 08:24, Jianfeng (Jeff) Zhang <jz...@hortonworks.com> wrote:
> Hi Johannes,
>
> Currently you can see the diagnostics on the client if the task fails in the processor. Is that what you want?
>
> Here's the JIRA tracking this: https://issues.apache.org/jira/browse/TEZ-1240
>
> If you find any exception that is not caught, please create a ticket under this.
>
>
>
> Best Regards,
> Jeff Zhang
Re: post task hook
Posted by "Jianfeng (Jeff) Zhang" <jz...@hortonworks.com>.
Hi Johannes,
Currently you can see the diagnostics on the client if the task fails in
the processor. Is that what you want?
Here's the JIRA tracking this:
https://issues.apache.org/jira/browse/TEZ-1240
If you find any exception that is not caught, please create a ticket under
it.
Best Regards,
Jeff Zhang
On Wed, Oct 8, 2014 at 2:08 PM, Johannes Zillmann <jz...@googlemail.com>
wrote:
> Hey Mr. Zhang,
>
> so the main use case is fetching the task logs in case the task fails. I
> can do that in a try-catch block in the processor itself, but that has two
> disadvantages:
> - the log might not be complete
> - if the task fails outside user-provided code (outside of the
> processor), we don't capture it
>
> Johannes
Re: post task hook
Posted by Johannes Zillmann <jz...@googlemail.com>.
Hey Mr. Zhang,
so the main use case is fetching the task logs in case the task fails. I can do that in a try-catch block in the processor itself, but that has two disadvantages:
- the log might not be complete
- if the task fails outside user-provided code (outside of the processor), we don't capture it
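To make the two disadvantages concrete, here is a minimal sketch in plain Java. It does not use the Tez API: TryFinallySketch, FailAt, runAttempt and userProcessor are hypothetical names, and the "framework setup" step is only a stand-in for whatever runs before the user-provided processor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Tez.
class TryFinallySketch {

    // Stages at which the simulated task attempt can fail.
    enum FailAt { NONE, FRAMEWORK_SETUP, USER_CODE }

    // Simulates one task attempt and returns what the in-processor hook collected.
    static List<String> runAttempt(FailAt failAt) {
        List<String> collected = new ArrayList<>();
        try {
            // Stand-in for framework code that runs before the processor.
            if (failAt == FailAt.FRAMEWORK_SETUP) {
                throw new IllegalStateException("setup failed");
            }
            userProcessor(failAt == FailAt.USER_CODE, collected);
        } catch (RuntimeException e) {
            // Stand-in for the framework's own error reporting.
            // The in-processor hook has no chance to run from here.
        }
        return collected;
    }

    // User-provided processor with a try-catch-finally "post task hook".
    static void userProcessor(boolean fail, List<String> collected) {
        try {
            if (fail) {
                throw new IllegalStateException("user code failed");
            }
            collected.add("status=ok");
        } catch (RuntimeException e) {
            collected.add("status=failed: " + e.getMessage());
            throw e; // still let the failure propagate
        } finally {
            // Runs for success and failure, but only once the processor was entered.
            collected.add("hook ran");
        }
    }

    public static void main(String[] args) {
        for (FailAt failAt : FailAt.values()) {
            System.out.println(failAt + " -> " + runAttempt(failAt));
        }
    }
}
```

In this sketch the FRAMEWORK_SETUP case collects nothing, which corresponds to the second point above. The hook also runs while the task is still executing, so anything written to the log after it returns would be missing, which corresponds to the first point.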
Johannes
On 08 Oct 2014, at 01:58, Jianfeng (Jeff) Zhang <jz...@hortonworks.com> wrote:
> Hi Johannes,
>
> You can do some post-task work in the Processor; please refer to SimpleProcessor, which has a postOp() method. It can only do a limited set of things, though, and cannot do things like accessing task logs.
> Could you let us know your purpose in customizing the post-task hook?
>
>
> Best Regards,
> Jeff Zhang
Re: post task hook
Posted by "Jianfeng (Jeff) Zhang" <jz...@hortonworks.com>.
Hi Johannes,
You can do some post-task work in the Processor; please refer to
SimpleProcessor, which has a postOp() method. It can only do a limited set
of things, though, and cannot do things like accessing task logs.
Could you let us know your purpose in customizing the post-task hook?
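As a rough illustration of why a hook like postOp() is limited, here is a minimal sketch in plain Java. It is not the Tez API: SketchProcessor, PostOpSketch and runTask are hypothetical names, and the call order (pre step, run, post step, with no finally block) is an assumption made for the example rather than something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Tez.
class PostOpSketch {

    // Runs one simulated task and returns the sequence of steps that executed.
    static List<String> runTask(boolean fail) {
        SketchProcessor p = new SketchProcessor() {
            @Override
            void run() throws Exception {
                events.add("run");
                if (fail) {
                    throw new IllegalStateException("simulated task failure");
                }
            }
        };
        try {
            p.execute();
        } catch (Exception e) {
            p.events.add("failed: " + e.getMessage());
        }
        return p.events;
    }

    public static void main(String[] args) {
        System.out.println(runTask(false));
        System.out.println(runTask(true));
    }
}

// Stand-in for a processor base class with pre and post hooks.
abstract class SketchProcessor {
    final List<String> events = new ArrayList<>();

    // Assumed driver: the post step is reached only if run() returns normally.
    final void execute() throws Exception {
        preOp();
        run();
        postOp();
    }

    void preOp() { events.add("preOp"); }

    abstract void run() throws Exception;

    void postOp() { events.add("postOp"); }
}
```

Under that assumption, a failing run() means the post step never executes, so a hook of this kind cannot act as a "whether or not the task was successful" callback.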
Best Regards,
Jeff Zhang
On Thu, Oct 2, 2014 at 4:20 PM, Johannes Zillmann <jz...@googlemail.com>
wrote:
> Hey guys,
>
> Is there a post-task hook in Tez (like the OutputCommitter that MapReduce
> had)?
> I'd like to perform certain actions (such as accessing task logs) once a
> task completes, whether or not the task was successful, and whether the
> user-provided processor was executed or the task failed before that point.
>
> Johannes