Posted to dev@tez.apache.org by Prabhu Joseph <pr...@gmail.com> on 2020/12/23 05:32:34 UTC

Tez + YARN ATSv2

Hi All,

  I am trying to integrate Tez with ATSv2, as we are hitting scalability
issues with ATS 1.5. It looks like we need a HistoryLoggingService
implementation to emit Tez HistoryEvents from Tez jobs to ATSv2, and a change
in the Tez UI code to use the ATSv2 REST API to fetch timeline events.
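
A rough sketch of the general shape such a logging service might take: events are buffered and handed to a writer in batches. This is not Tez code. The class name, the `sink` callable, and the dictionary event format are all hypothetical stand-ins, and a real implementation would extend Tez's HistoryLoggingService in Java and write through the ATSv2 client.

```python
from typing import Callable, Dict, List

# Hypothetical event shape: a plain dict standing in for a Tez history event.
Event = Dict[str, str]


class BatchingHistoryLogger:
    """Buffers events and passes them to a sink in batches (illustrative only)."""

    def __init__(self, sink: Callable[[List[Event]], None], batch_size: int = 2):
        self._sink = sink              # assumed writer, e.g. a timeline client wrapper
        self._batch_size = batch_size  # flush once this many events are pending
        self._pending: List[Event] = []

    def handle(self, event: Event) -> None:
        """Queue one event and flush when the batch is full."""
        self._pending.append(event)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send any pending events to the sink as a single batch."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._sink(batch)


if __name__ == "__main__":
    sent: List[List[Event]] = []
    logger = BatchingHistoryLogger(sent.append, batch_size=2)
    logger.handle({"type": "DAG_SUBMITTED", "dag": "dag_1"})
    logger.handle({"type": "DAG_FINISHED", "dag": "dag_1"})
    print(len(sent))
```

Batching is only one possible design choice here; it is shown because the per-event REST write path is where the scale concerns in this thread arise.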

I want to check whether someone has already done this; if so, could you
share it?

Thanks,
Prabhu Joseph

Re: Tez + YARN ATSv2

Posted by Sreenath Somarajapuram <ss...@cloudera.com.INVALID>.
Hi,

One main reason why Tez UI wasn't updated to use ATS V2 was that ATS V2
didn't provide a way to build the Tez UI landing page, which displays
all DAGs. I don't know how it is right now, but back then all ATS V2 APIs
needed an application id, and we could only query for data under a specific
application.
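
To illustrate the difference, here is a small sketch that builds the two kinds of query URL. The path layouts follow my reading of the Hadoop timeline server documentation and should be checked against your Hadoop version. The base address, cluster name, and application id are made-up placeholders, and TEZ_DAG_ID is assumed to be the entity type the landing page lists.

```python
def ats_v1_entities_url(base: str, entity_type: str) -> str:
    """ATS v1 style: list entities of a type with no application in the path."""
    return f"{base}/ws/v1/timeline/{entity_type}"


def ats_v2_app_entities_url(base: str, cluster: str, app_id: str,
                            entity_type: str) -> str:
    """ATS v2 style: generic entity queries are scoped to one application."""
    return (f"{base}/ws/v2/timeline/clusters/{cluster}"
            f"/apps/{app_id}/entities/{entity_type}")


if __name__ == "__main__":
    base = "http://timeline.example.com:8188"  # placeholder address
    print(ats_v1_entities_url(base, "TEZ_DAG_ID"))
    print(ats_v2_app_entities_url(
        base, "cluster1", "application_1600000000000_0001", "TEZ_DAG_ID"))
```

The v1 URL can serve an "all DAGs" page directly, while the v2 URL needs an application id before it can return anything.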

One possible fix that we discussed was to ask the user to enter an
application id on opening Tez UI. But that didn't seem good enough, and
further development didn't happen.

Thanks,
Sreenath

On Wed, Dec 23, 2020 at 1:50 PM Prabhu Joseph <pr...@gmail.com>
wrote:

> Thanks, that's very useful.
>
> On Wed, Dec 23, 2020, 12:46 PM Jonathan Eagles <je...@gmail.com> wrote:
>
>> This feature was in progress for some time but progress was halted.
>>
>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/TEZ-3820
>>
>> Many of the stakeholders can be found on that ticket
>>
>> The issue regarding scale depends on usage. I run atsv1.5 and write
>> 300,000 history files a day. But they are written directly to hdfs and not
>> the timeline server REST api. The rest app cannot handle the scale or
>> bandwidth after around 8000 jobs per day IIRC. Reading from the timeline
>> server has similar issues. We serve the tez UI, but bulk history files are
>> parsed into a hive dB for serious queries.
>>
>> On Tue, Dec 22, 2020, 11:33 PM Prabhu Joseph <pr...@gmail.com>
>> wrote:
>>
>>> Hi All,
>>>
>>>   I am trying to integrate TEZ with ATSv2 as we are hitting scalability
>>> issues with ATS1.5. Looks we need a HistoryLoggingService implementation
>>> to
>>> emit Tez HistoryEvents from Tez jobs to ATSv2 and change in Tez UI code
>>> to
>>> use ATS2 REST API to fetch timeline events.
>>>
>>> Want to check if someone has already done this, if so can you share the
>>> same.
>>>
>>> Thanks,
>>> Prabhu Joseph
>>>
>>

Re: Tez + YARN ATSv2

Posted by Prabhu Joseph <pr...@gmail.com>.
Thanks, that's very useful.

On Wed, Dec 23, 2020, 12:46 PM Jonathan Eagles <je...@gmail.com> wrote:

> This feature was in progress for some time but progress was halted.
>
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/TEZ-3820
>
> Many of the stakeholders can be found on that ticket
>
> The issue regarding scale depends on usage. I run atsv1.5 and write
> 300,000 history files a day. But they are written directly to hdfs and not
> the timeline server REST api. The rest app cannot handle the scale or
> bandwidth after around 8000 jobs per day IIRC. Reading from the timeline
> server has similar issues. We serve the tez UI, but bulk history files are
> parsed into a hive dB for serious queries.
>
> On Tue, Dec 22, 2020, 11:33 PM Prabhu Joseph <pr...@gmail.com>
> wrote:
>
>> Hi All,
>>
>>   I am trying to integrate TEZ with ATSv2 as we are hitting scalability
>> issues with ATS1.5. Looks we need a HistoryLoggingService implementation
>> to
>> emit Tez HistoryEvents from Tez jobs to ATSv2 and change in Tez UI code to
>> use ATS2 REST API to fetch timeline events.
>>
>> Want to check if someone has already done this, if so can you share the
>> same.
>>
>> Thanks,
>> Prabhu Joseph
>>
>

Re: Tez + YARN ATSv2

Posted by Jonathan Eagles <je...@gmail.com>.
This feature was in progress for some time but progress was halted.

https://issues.apache.org/jira/plugins/servlet/mobile#issue/TEZ-3820

Many of the stakeholders can be found on that ticket.

The issue regarding scale depends on usage. I run ATSv1.5 and write 300,000
history files a day, but they are written directly to HDFS and not through the
timeline server REST API. The REST app cannot handle the scale or bandwidth
beyond around 8,000 jobs per day, IIRC. Reading from the timeline server has
similar issues. We serve the Tez UI, but bulk history files are parsed into
a Hive DB for serious queries.
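
For a sense of scale, the daily figures above can be converted to average per-second rates. This is only back-of-the-envelope arithmetic on the numbers quoted in this message; it assumes load is spread evenly over the day, which real clusters rarely do, and files and jobs are different units that are not directly comparable.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(count_per_day: float) -> float:
    """Average rate per second for a daily count, assuming even load."""
    return count_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # 300,000 history files per day written directly to HDFS
    print(f"{per_second(300_000):.2f} files/s")
    # roughly 8,000 jobs per day through the timeline server REST path
    print(f"{per_second(8_000):.3f} jobs/s")
```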

On Tue, Dec 22, 2020, 11:33 PM Prabhu Joseph <pr...@gmail.com>
wrote:

> Hi All,
>
>   I am trying to integrate TEZ with ATSv2 as we are hitting scalability
> issues with ATS1.5. Looks we need a HistoryLoggingService implementation to
> emit Tez HistoryEvents from Tez jobs to ATSv2 and change in Tez UI code to
> use ATS2 REST API to fetch timeline events.
>
> Want to check if someone has already done this, if so can you share the
> same.
>
> Thanks,
> Prabhu Joseph
>
