
Posted to yarn-dev@hadoop.apache.org by Sangjin Lee <sj...@apache.org> on 2016/06/20 17:26:45 UTC

[DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Hi all,

I’d like to open a discussion on merging the Timeline Service v.2 feature
to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We have been developing the
feature in a feature branch (YARN-2928 [3]) for a while, and we are
reasonably confident that it meets the criteria to be merged onto trunk.
We'd love for folks to get their hands on it and provide feedback so that
we can make it production-ready.

In a nutshell, Timeline Service v.2 delivers significant scalability and
usability improvements based on a new architecture. You can browse the
requirements/design doc, the storage schema doc, the new entity/data model,
the YARN documentation, and also discussions on subsequent milestones on
YARN-2928 [1].

What we would like to merge to trunk is termed "alpha 1" (milestone 1). The
feature has a complete end-to-end read/write flow, and you should be able
to start setting it up and testing it. At a high level, the following are
the key features that have been implemented:

- distributed writers (collectors) as NM aux services
- HBase storage
- new entity model that includes flows
- setting the flow context via YARN app tags
- real time metrics aggregation to the application level and the flow level
- rich REST API that supports filters, complex conditionals, limits,
content selection, etc.
- YARN generic events and system metrics
- integration with Distributed Shell and MapReduce
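
To make the flow-related items above concrete, here is a minimal sketch of
how a flow context carried in app tags could drive aggregation to the
application level and the flow level. It is not the Timeline Service code:
the tag prefixes, function names, and sample data are made up for
illustration, and plain summation is assumed as the aggregation operation.

```python
from collections import defaultdict

# Hypothetical tag prefixes, chosen for illustration only.
FLOW_NAME_PREFIX = "flow_name:"
FLOW_RUN_PREFIX = "flow_run_id:"


def flow_context(app_tags):
    """Extract (flow name, flow run id) from an application's tags."""
    name, run_id = None, None
    for tag in app_tags:
        if tag.startswith(FLOW_NAME_PREFIX):
            name = tag[len(FLOW_NAME_PREFIX):]
        elif tag.startswith(FLOW_RUN_PREFIX):
            run_id = tag[len(FLOW_RUN_PREFIX):]
    return name, run_id


def aggregate(apps):
    """Sum container metrics per application, then per flow run."""
    app_totals = {}
    flow_totals = defaultdict(lambda: defaultdict(int))
    for app_id, app in apps.items():
        # Application level: add up the metrics of all its containers.
        totals = defaultdict(int)
        for container_metrics in app["containers"]:
            for metric, value in container_metrics.items():
                totals[metric] += value
        app_totals[app_id] = dict(totals)
        # Flow level: add the application totals to the flow run it belongs to.
        flow = flow_context(app["tags"])
        for metric, value in totals.items():
            flow_totals[flow][metric] += value
    return app_totals, {flow: dict(m) for flow, m in flow_totals.items()}


# Made-up sample data: two applications that belong to the same flow run.
apps = {
    "app_1": {
        "tags": ["flow_name:daily_etl", "flow_run_id:42"],
        "containers": [{"cpu_ms": 100, "mem_mb": 512}, {"cpu_ms": 50, "mem_mb": 256}],
    },
    "app_2": {
        "tags": ["flow_name:daily_etl", "flow_run_id:42"],
        "containers": [{"cpu_ms": 25, "mem_mb": 128}],
    },
}
app_totals, flow_totals = aggregate(apps)
```

In this sketch both applications carry the same flow name and run id, so
their totals roll up into a single flow-run entry. The real service performs
this work in its collectors and storage layer rather than in one in-memory
pass.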

There are a total of 139 subtasks that were completed as part of this
effort.

We paid close attention to ensure that Timeline Service v.2 does not impact
existing functionality when it is disabled (which is the default).

I'd like to call out a couple of things to discuss in particular.

*First*, if the merge vote is approved, to which branch should this be
merged and what would be the release version? My preference is that *it
would be merged to branch "trunk" and be part of 3.0.0-alpha1* if approved.
Since 3.0.0-alpha1 is actively in progress, I wanted to get your thoughts
on this.

*Second*, Timeline Service v.2 introduces a dependency on HBase from YARN.
It is not a cyclical dependency (as HBase does not really depend on YARN).
However, the version of Hadoop that HBase currently supports lags behind
the Hadoop version that Timeline Service is based on, so there is a
potential for subtle dependency conflicts. We made some efforts to isolate
the issue (see [4] and [5]). The HBase folks have also been responsive in
keeping up with the trunk as much as they can. Nonetheless, this is
something to keep in mind.

I would love to get your thoughts on these and more before we open a real
voting thread. Thanks!

Regards,
Sangjin

[1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
[2] MAPREDUCE-6331: https://issues.apache.org/jira/browse/MAPREDUCE-6331
[3] YARN-2928 commits: https://github.com/apache/hadoop/commits/YARN-2928
[4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
[5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Karthik Kambatla <ka...@cloudera.com>.

For the catch up, I meant during Hadoop Summit next week.

On Tue, Jun 21, 2016 at 10:28 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> The reasons for my asking about alternate implementations: (1) ease of
> trying it out for Yarn devs and iteration for bug fixes, improvements and
> (2) ease of trying it for app-writers/users to figure out if they should
> use the ATS. Again, personally, I don't see this as necessary for the merge
> itself, but more so for adoption.
>
> A test implementation would be enough for #1, and would partially address
> #2. A more substantial implementation would be nice, but I guess we need to
> look at the ROI to decide whether adding that is a good idea.
>
> On completeness, I agree. Further, for some backend implementations, it is
> possible that a particular aggregation/query might be possible but too
> expensive to turn on. What are your thoughts on provisions for the admin to
> turn off some queries/aggregations?
>
> Orthogonal: is there interest here to catch up on ATS specifically one of
> the days? May be, during the breaks or after the sessions?
>
> On Tue, Jun 21, 2016 at 6:15 PM, Li Lu <ll...@hortonworks.com> wrote:
>
>> HDFS or other non-HBase implementations are very helpful. We didn’t focus
>> on those implementations in the first milestone because we would like to
>> have one working version as a starting point. We can certainly add more
>> implementations when the feature gets more mature.
>>
>> This said, one of my concerns when building these storage implementations
>> is “completeness”. We have added a lot of supports to data aggregation. As
>> of today, part of the aggregation (flow run aggregation) may be performed
>> as HBase coprocessors. When implementing comparable storage impls, it is
>> worth noting that one may want to provide some equivalent things to perform
>> those aggregations (to really make one implementation “complete enough”,
>> or, “interchangeable” to the existing HBase impl).
>>
>> Li Lu
>> > On Jun 21, 2016, at 15:51, Sangjin Lee <sj...@apache.org> wrote:
>> >
>> > Thanks Karthik and Tsuyoshi. Regarding alternate implementations, I'd
>> like
>> > to get a better sense of what you're thinking of. Are you interested in
>> > strictly a test implementation (e.g. perfectly fine in a single node
>> setup)
>> > or a more substantial implementation (may not scale but needs to work
>> in a
>> > more realistic setup)?
>> >
>> > Regards,
>> > Sangjin
>> >
>> > On Tue, Jun 21, 2016 at 2:51 PM, J. Rottinghuis <jrottinghuis@gmail.com
>> >
>> > wrote:
>> >
>> >> Thanks Karthik and Tsuyoshi for bringing up good points.
>> >>
>> >> I've opened https://issues.apache.org/jira/browse/YARN-5281 to track
>> this
>> >> discussion and capture all the merits and challenges in one single
>> place.
>> >>
>> >> Thanks,
>> >>
>> >> Joep
>> >>
>> >> On Tue, Jun 21, 2016 at 8:21 AM, Tsuyoshi Ozawa <oz...@apache.org>
>> wrote:
>> >>
>> >>> Thanks Sangjin for starting the discussion.
>> >>>
>> >>>>> *First*, if the merge vote is approved, to which branch should this
>> be
>> >>> merged and what would be the release version?
>> >>>
>> >>> As you mentioned, I think it's reasonable for us to target trunk and
>> >>> 3.0.0-alpha.
>> >>>
>> >>>>> Slightly unrelated to the merge, do we plan to support any other
>> >> simpler
>> >>> backend for users to try out, in addition to HBase? LevelDB?
>> >>>> We can however, potentially change the Local File System based
>> >>> implementation to a HDFS based implementation and have it as an
>> alternate
>> >>> for non-production use,
>> >>>
>> >>> In Apache Big Data 2016 NA, some users also mentioned that they need
>> HDFS
>> >>> implementation. Currently it's pending, but I and Varun tried to work
>> to
>> >>> support HDFS backend(YARN-3874). As Karthik mentioned, it's useful for
>> >>> early users to try v2.0 APIs though it's doesn't scale. IMHO, it's
>> useful
>> >>> for small cluster(e.g. smaller than 10 machines). After merging the
>> >> current
>> >>> implementation into trunk, I'm interested in resuming YARN-3874
>> >> work(maybe
>> >>> Varun is also interested in).
>> >>>
>> >>> Regards,
>> >>> - Tsuyoshi
>> >>>
>> >>> On Tue, Jun 21, 2016 at 5:07 PM, Varun saxena <
>> varun.saxena@huawei.com>
>> >>> wrote:
>> >>>> Thanks Karthik for sharing your views.
>> >>>>
>> >>>> With regards to merging, it would help to have clear documentation on
>> >> how
>> >>> to setup and use ATS.
>> >>>> --> We do have documentation on this. You and others who are
>> interested
>> >>> can check out YARN-5174 which is the latest documentation related JIRA
>> >> for
>> >>> ATSv2.
>> >>>>
>> >>>> Slightly unrelated to the merge, do we plan to support any other
>> >> simpler
>> >>> backend for users to try out, in addition to HBase? LevelDB?
>> >>>> --> We do have a File System based implementation but it is strictly
>> >> for
>> >>> test purposes (as we write data into a local file). It does not
>> support
>> >> all
>> >>> the features of Timeline Service v.2 as well.
>> >>>> Regarding LevelDB, Timeline Service v.2 has distributed writers and
>> >> Level
>> >>> DB writes data (log files or SSTable files) to local file system. This
>> >>> means there will be no easy way to have a LevelDB based implementation
>> >>> because we would not know where to read the data from, especially
>> while
>> >>> fetching flow level information.
>> >>>> We can however, potentially change the Local File System based
>> >>> implementation to a HDFS based implementation and have it as an
>> alternate
>> >>> for non-production use, if there is a potential need for it, based on
>> >>> community feedback. This however, would have to be further discussed
>> with
>> >>> the team.
>> >>>>
>> >>>> Regards,
>> >>>> Varun Saxena.
>> >>>>
>> >>>> -----Original Message-----
>> >>>> From: Karthik Kambatla [mailto:kasha@cloudera.com]
>> >>>> Sent: 21 June 2016 10:29
>> >>>> To: Sangjin Lee
>> >>>> Cc: yarn-dev@hadoop.apache.org
>> >>>> Subject: Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to
>> >> trunk
>> >>>>
>> >>>> Firstly, thanks Sangjin and others for driving this major feature.
>> >>>>
>> >>>> Merging to trunk and including in 3.0.0-alpha1 seems reasonable, as
>> it
>> >>> will give early access to downstream users.
>> >>>>
>> >>>> With regards to merging, it would help to have clear documentation on
>> >> how
>> >>> to setup and use ATS.
>> >>>>
>> >>>> Slightly unrelated to the merge, do we plan to support any other
>> >> simpler
>> >>> backend for users to try out, in addition to HBase? LevelDB? I
>> understand
>> >>> this wouldn't scale, but would it help with initial adoption and
>> feedback
>> >>> from early users?
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Mon, Jun 20, 2016 at 10:26 AM, Sangjin Lee <sj...@apache.org>
>> >> wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> I’d like to open a discussion on merging the Timeline Service v.2
>> >>>>> feature to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We have been
>> >>>>> developing the feature in a feature branch (YARN-2928 [3]) for a
>> >>>>> while, and we are reasonably confident that the state of the feature
>> >>>>> meets the criteria to be merged onto trunk and we'd love folks to
>> get
>> >>>>> their hands on it and provide valuable feedback so that we can make
>> it
>> >>> production-ready.
>> >>>>>
>> >>>>> In a nutshell, Timeline Service v.2 delivers significant scalability
>> >>>>> and usability improvements based on a new architecture. You can
>> browse
>> >>>>> the requirements/design doc, the storage schema doc, the new
>> >>>>> entity/data model, the YARN documentation, and also discussions on
>> >>>>> subsequent milestones on
>> >>>>> YARN-2928 [1].
>> >>>>>
>> >>>>> What we would like to merge to trunk is termed "alpha 1" (milestone
>> >>>>> 1). The feature has a complete end-to-end read/write flow, and you
>> >>>>> should be able to start setting it up and testing it. At a high
>> level,
>> >>>>> the following are the key features that have been implemented:
>> >>>>>
>> >>>>> - distributed writers (collectors) as NM aux services
>> >>>>> - HBase storage
>> >>>>> - new entity model that includes flows
>> >>>>> - setting the flow context via YARN app tags
>> >>>>> - real time metrics aggregation to the application level and the
>> flow
>> >>>>> level
>> >>>>> - rich REST API that supports filters, complex conditionals, limits,
>> >>>>> content selection, etc.
>> >>>>> - YARN generic events and system metrics
>> >>>>> - integration with Distributed Shell and MapReduce
>> >>>>>
>> >>>>> There are a total of 139 subtasks that were completed as part of
>> this
>> >>>>> effort.
>> >>>>>
>> >>>>> We paid close attention to ensure that once disabled Timeline
>> Service
>> >>>>> v.2 does not impact existing functionality when disabled (by
>> default).
>> >>>>>
>> >>>>> I'd like to call out a couple of things to discuss in particular.
>> >>>>>
>> >>>>> *First*, if the merge vote is approved, to which branch should this
>> be
>> >>>>> merged and what would be the release version? My preference is that
>> >>>>> *it would be merged to branch "trunk" and be part of 3.0.0-alpha1*
>> if
>> >>> approved.
>> >>>>> Since the 3.0.0-alpha1 is in active progress, I wanted to get your
>> >>>>> thoughts on this.
>> >>>>>
>> >>>>> *Second*, Timeline Service v.2 introduces a dependency on HBase from
>> >>> YARN.
>> >>>>> It is not a cyclical dependency (as HBase does not really depend on
>> >>> YARN).
>> >>>>> However, the version of Hadoop that HBase currently supports lags
>> >>>>> behind the Hadoop version that Timeline Service is based on, so
>> there
>> >>>>> is a potential for subtle dependency conflicts. We made some efforts
>> >>>>> to isolate the issue (see [4] and [5]). The HBase folks have also
>> been
>> >>>>> responsive in keeping up with the trunk as much as they can.
>> >>>>> Nonetheless, this is something to keep in mind.
>> >>>>>
>> >>>>> I would love to get your thoughts on these and more before we open a
>> >>>>> real voting thread. Thanks!
>> >>>>>
>> >>>>> Regards,
>> >>>>> Sangjin
>> >>>>>
>> >>>>> [1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
>> >>>>> [2] MAPREDUCE-6331:
>> >>>>> https://issues.apache.org/jira/browse/MAPREDUCE-6331
>> >>>>> [3] YARN-2928 commits:
>> >>>>> https://github.com/apache/hadoop/commits/YARN-2928
>> >>>>> [4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
>> >>>>> [5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071
>> >>>>>
>> >>>>
>> >>>> ---------------------------------------------------------------------
>> >>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>> >>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>> >>>>
>> >>>
>> >>
>>
>>
>

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Sangjin Lee <sj...@apache.org>.

Thanks everyone for chiming in on the discussion. Since no blockers were
raised, I'll go ahead and start a vote thread.

Regards,
Sangjin

On Fri, Jun 24, 2016 at 11:32 AM, 俊平堵 <ju...@apache.org> wrote:

> Thanks Sangjin, Li and for sharing your points also. Yes. That's my
> original point that we shouldn't bind the merge of YARN-2928 to trunk with
> any alpha release in the short term. Actually, from this ATS v2 merge case,
> we can see the value of keeping trunk independent of short-term releases as
> the bar of trunk merging is different from alpha release.
> Let's discuss 3.0.0-alpha release plan and scope in other threads and
> focus on merging ATS v2 to trunk here. Again, big +1 to merge ATS v2 to
> trunk.
>
> 2016-06-24 10:37 GMT-07:00 Li Lu <ll...@hortonworks.com>:
>
>>
>> On Jun 24, 2016, at 09:59, Sangjin Lee <sjlee@apache.org<mailto:
>> sjlee@apache.org>> wrote:
>>
>> Also for my understanding, the implication of merging it to trunk is that
>> it would be included in 3.0.0-alpha1 (unless 3.0.0-alpha1 gets cut before
>> the merge), right?
>> Thanks Sangjin and yes, if the 3.0.0-alpha branch is cut after we merge,
>> that will be included? That said, maybe we do not want to strongly couple
>> the merge plan with release plans now since YARN-2928 not yet merged in
>> trunk?
>>
>
>

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by 俊平堵 <ju...@apache.org>.

Thanks Sangjin and Li for sharing your points. Yes, that was my original
point: we shouldn't bind the merge of YARN-2928 to trunk to any alpha
release in the short term. In fact, this ATS v2 merge case shows the value
of keeping trunk independent of short-term releases, as the bar for merging
to trunk is different from the bar for an alpha release.
Let's discuss the 3.0.0-alpha release plan and scope in other threads and
focus on merging ATS v2 to trunk here. Again, big +1 to merge ATS v2 to trunk.

2016-06-24 10:37 GMT-07:00 Li Lu <ll...@hortonworks.com>:

>
> On Jun 24, 2016, at 09:59, Sangjin Lee <sjlee@apache.org<mailto:
> sjlee@apache.org>> wrote:
>
> Also for my understanding, the implication of merging it to trunk is that
> it would be included in 3.0.0-alpha1 (unless 3.0.0-alpha1 gets cut before
> the merge), right?
> Thanks Sangjin and yes, if the 3.0.0-alpha branch is cut after we merge,
> that will be included? That said, maybe we do not want to strongly couple
> the merge plan with release plans now since YARN-2928 not yet merged in
> trunk?
>

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Sangjin Lee <sj...@apache.org>.

Yes of course. The discussion is solely about the merge to trunk.

Thanks,
Sangjin

On Fri, Jun 24, 2016 at 10:37 AM, Li Lu <ll...@hortonworks.com> wrote:

>
> On Jun 24, 2016, at 09:59, Sangjin Lee <sjlee@apache.org<mailto:
> sjlee@apache.org>> wrote:
>
> Also for my understanding, the implication of merging it to trunk is that
> it would be included in 3.0.0-alpha1 (unless 3.0.0-alpha1 gets cut before
> the merge), right?
> Thanks Sangjin and yes, if the 3.0.0-alpha branch is cut after we merge,
> that will be included? That said, maybe we do not want to strongly couple
> the merge plan with release plans now since YARN-2928 not yet merged in
> trunk?
>

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Li Lu <ll...@hortonworks.com>.

On Jun 24, 2016, at 09:59, Sangjin Lee <sj...@apache.org> wrote:

> Also for my understanding, the implication of merging it to trunk is that
> it would be included in 3.0.0-alpha1 (unless 3.0.0-alpha1 gets cut before
> the merge), right?

Thanks Sangjin, and yes, if the 3.0.0-alpha branch is cut after we merge,
it will be included. That said, maybe we do not want to strongly couple
the merge plan with release plans now, since YARN-2928 is not yet merged
into trunk?

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Sangjin Lee <sj...@apache.org>.

Thanks Junping for raising a good point. Yes, the fact that security is not
implemented in the current version is definitely a concern. FYI, the
timeline service v.2 documentation (attached on YARN-2928) makes it clear
that it is alpha 1 and should only be used for testing or evaluation. We
can make it more explicit in the documentation and warn users specifically
that security is not implemented. How does that sound?

Also for my understanding, the implication of merging it to trunk is that
it would be included in 3.0.0-alpha1 (unless 3.0.0-alpha1 gets cut before
the merge), right?

Thanks,
Sangjin

On Thu, Jun 23, 2016 at 8:30 PM, Tsuyoshi Ozawa <oz...@apache.org> wrote:

> Hi Junping,
>
> Thanks for your good suggestion.
>
> > However, my concern to release it in 3.0.0-alpha (even as an alpha
> feature) is we haven't provide any security support in ATS v2 yet.
> > Enabling this feature without understanding the risk here could be a
> disaster to end-user (even in a test cluster).
>
> You're right. Can we document and clarify that it's still  "alpha 1",
> and it doesn't have security features. I also think ATS 1.5 supports
> security features, so it's good for production - we should document it
> officially.
>
> Thanks,
> - Tsuyoshi
>
> On Thu, Jun 23, 2016 at 5:45 PM, 俊平堵 <ju...@apache.org> wrote:
> > Big +1 on merging ATS-v2 to trunk. However, my concern to release it in
> > 3.0.0-alpha (even as an alpha feature) is we haven't provide any security
> > support in ATS v2 yet. Enabling this feature without understanding the
> risk
> > here could be a disaster to end-user (even in a test cluster).
> >
> > Kudos to everyone who contributes patches, include: Sangjin, Li,
> Vrushali,
> > Naga, Varun, Joep and Zhijie.
> >
> > Thanks,
> >
> > Junping
> >
> > 2016-06-23 13:32 GMT-07:00 Sangjin Lee <sj...@apache.org>:
> >>
> >> Thanks folks for the good discussion!
> >>
> >> I'm going to keep it open for a few more days as I'd love to get
> feedback
> >> from more people. I am thinking of opening a voting thread right after
> the
> >> Hadoop Summit next week if there are no objections. Thanks!
> >>
> >> Regards,
> >> Sangjin
> >>
> >> On Tue, Jun 21, 2016 at 9:51 PM, Li Lu <ll...@hortonworks.com> wrote:
> >>
> >> > I agree that having non-Hbase impls may attract more potential users
> to
> >> > ATS. Actually I remember we do have some JIRAs for HDFS
> implementations.
> >> > With regard to aggregation, yes, if there are more options on storage
> >> > implementations we really need to find some ways to describe their
> >> > implications to different kinds of aggressions.
> >> >
> >> > +1 for the idea of some group chats! The break after the ATS talk may
> be
> >> > a
> >> > good candidate?
> >> >
> >> > Li Lu
> >> >
> >> > On Jun 21, 2016, at 21:28, Karthik Kambatla <ka...@cloudera.com>
> wrote:
> >> >
> >> > The reasons for my asking about alternate implementations: (1) ease of
> >> > trying it out for Yarn devs and iteration for bug fixes, improvements
> >> > and
> >> > (2) ease of trying it for app-writers/users to figure out if they
> should
> >> > use the ATS. Again, personally, I don't see this as necessary for the
> >> > merge
> >> > itself, but more so for adoption.
> >> >
> >> > A test implementation would be enough for #1, and would partially
> >> > address
> >> > #2. A more substantial implementation would be nice, but I guess we
> need
> >> > to
> >> > look at the ROI to decide whether adding that is a good idea.
> >> >
> >> > On completeness, I agree. Further, for some backend implementations,
> it
> >> > is
> >> > possible that a particular aggregation/query might be possible but too
> >> > expensive to turn on. What are your thoughts on provisions for the
> admin
> >> > to
> >> > turn off some queries/aggregations?
> >> >
> >> > Orthogonal: is there interest here to catch up on ATS specifically one
> >> > of
> >> > the days? May be, during the breaks or after the sessions?
> >> >
> >> > On Tue, Jun 21, 2016 at 6:15 PM, Li Lu <ll...@hortonworks.com> wrote:
> >> >
> >> >> HDFS or other non-HBase implementations are very helpful. We didn’t
> >> >> focus
> >> >> on those implementations in the first milestone because we would like
> >> >> to
> >> >> have one working version as a starting point. We can certainly add
> more
> >> >> implementations when the feature gets more mature.
> >> >>
> >> >> This said, one of my concerns when building these storage
> >> >> implementations
> >> >> is “completeness”. We have added a lot of supports to data
> aggregation.
> >> >> As
> >> >> of today, part of the aggregation (flow run aggregation) may be
> >> >> performed
> >> >> as HBase coprocessors. When implementing comparable storage impls, it
> >> >> is
> >> >> worth noting that one may want to provide some equivalent things to
> >> >> perform
> >> >> those aggregations (to really make one implementation “complete
> >> >> enough”,
> >> >> or, “interchangeable” to the existing HBase impl).
> >> >>
> >> >> Li Lu
> >> >> > On Jun 21, 2016, at 15:51, Sangjin Lee <sj...@apache.org> wrote:
> >> >> >
> >> >> > Thanks Karthik and Tsuyoshi. Regarding alternate implementations,
> I'd
> >> >> like
> >> >> > to get a better sense of what you're thinking of. Are you
> interested
> >> >> > in
> >> >> > strictly a test implementation (e.g. perfectly fine in a single
> node
> >> >> setup)
> >> >> > or a more substantial implementation (may not scale but needs to
> work
> >> >> in a
> >> >> > more realistic setup)?
> >> >> >
> >> >> > Regards,
> >> >> > Sangjin
> >> >> >
> >> >> > On Tue, Jun 21, 2016 at 2:51 PM, J. Rottinghuis
> >> >> > <jrottinghuis@gmail.com
> >> >> >
> >> >> > wrote:
> >> >> >
> >> >> >> Thanks Karthik and Tsuyoshi for bringing up good points.
> >> >> >>
> >> >> >> I've opened https://issues.apache.org/jira/browse/YARN-5281 to
> track
> >> >> this
> >> >> >> discussion and capture all the merits and challenges in one single
> >> >> place.
> >> >> >>
> >> >> >> Thanks,
> >> >> >>
> >> >> >> Joep
> >> >> >>
> >> >> >> On Tue, Jun 21, 2016 at 8:21 AM, Tsuyoshi Ozawa <ozawa@apache.org
> >
> >> >> wrote:
> >> >> >>
> >> >> >>> Thanks Sangjin for starting the discussion.
> >> >> >>>
> >> >> >>>>> *First*, if the merge vote is approved, to which branch should
> >> >> >>>>> this
> >> >> be
> >> >> >>> merged and what would be the release version?
> >> >> >>>
> >> >> >>> As you mentioned, I think it's reasonable for us to target trunk
> >> >> >>> and
> >> >> >>> 3.0.0-alpha.
> >> >> >>>
> >> >> >>>>> Slightly unrelated to the merge, do we plan to support any
> other
> >> >> >> simpler
> >> >> >>> backend for users to try out, in addition to HBase? LevelDB?
> >> >> >>>> We can however, potentially change the Local File System based
> >> >> >>> implementation to a HDFS based implementation and have it as an
> >> >> alternate
> >> >> >>> for non-production use,
> >> >> >>>
> >> >> >>> In Apache Big Data 2016 NA, some users also mentioned that they
> >> >> >>> need
> >> >> HDFS
> >> >> >>> implementation. Currently it's pending, but I and Varun tried to
> >> >> >>> work
> >> >> to
> >> >> >>> support HDFS backend(YARN-3874). As Karthik mentioned, it's
> useful
> >> >> >>> for
> >> >> >>> early users to try v2.0 APIs though it's doesn't scale. IMHO,
> it's
> >> >> useful
> >> >> >>> for small cluster(e.g. smaller than 10 machines). After merging
> the
> >> >> >> current
> >> >> >>> implementation into trunk, I'm interested in resuming YARN-3874
> >> >> >> work(maybe
> >> >> >>> Varun is also interested in).
> >> >> >>>
> >> >> >>> Regards,
> >> >> >>> - Tsuyoshi
> >> >> >>>
> >> >> >>> On Tue, Jun 21, 2016 at 5:07 PM, Varun saxena <
> >> >> varun.saxena@huawei.com>
> >> >> >>> wrote:
> >> >> >>>> Thanks Karthik for sharing your views.
> >> >> >>>>
> >> >> >>>> With regards to merging, it would help to have clear
> documentation
> >> >> >>>> on
> >> >> >> how
> >> >> >>> to setup and use ATS.
> >> >> >>>> --> We do have documentation on this. You and others who are
> >> >> interested
> >> >> >>> can check out YARN-5174 which is the latest documentation related
> >> >> >>> JIRA
> >> >> >> for
> >> >> >>> ATSv2.
> >> >> >>>>
> >> >> >>>> Slightly unrelated to the merge, do we plan to support any other
> >> >> >> simpler
> >> >> >>> backend for users to try out, in addition to HBase? LevelDB?
> >> >> >>>> --> We do have a File System based implementation but it is
> >> >> >>>> strictly
> >> >> >> for
> >> >> >>> test purposes (as we write data into a local file). It does not
> >> >> support
> >> >> >> all
> >> >> >>> the features of Timeline Service v.2 as well.
> >> >> >>>> Regarding LevelDB, Timeline Service v.2 has distributed writers
> >> >> >>>> and
> >> >> >> Level
> >> >> >>> DB writes data (log files or SSTable files) to local file system.
> >> >> >>> This
> >> >> >>> means there will be no easy way to have a LevelDB based
> >> >> >>> implementation
> >> >> >>> because we would not know where to read the data from, especially
> >> >> while
> >> >> >>> fetching flow level information.
> >> >> >>>> We can however, potentially change the Local File System based
> >> >> >>> implementation to a HDFS based implementation and have it as an
> >> >> alternate
> >> >> >>> for non-production use, if there is a potential need for it,
> based
> >> >> >>> on
> >> >> >>> community feedback. This however, would have to be further
> >> >> >>> discussed
> >> >> with
> >> >> >>> the team.
> >> >> >>>>
> >> >> >>>> Regards,
> >> >> >>>> Varun Saxena.
> >> >> >>>>
> >> >> >>>> -----Original Message-----
> >> >> >>>> From: Karthik Kambatla [mailto:kasha@cloudera.com]
> >> >> >>>> Sent: 21 June 2016 10:29
> >> >> >>>> To: Sangjin Lee
> >> >> >>>> Cc: yarn-dev@hadoop.apache.org
> >> >> >>>> Subject: Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2)
> to
> >> >> >> trunk
> >> >> >>>>
> >> >> >>>> Firstly, thanks Sangjin and others for driving this major
> feature.
> >> >> >>>>
> >> >> >>>> Merging to trunk and including in 3.0.0-alpha1 seems reasonable,
> >> >> >>>> as
> >> >> it
> >> >> >>> will give early access to downstream users.
> >> >> >>>>
> >> >> >>>> With regards to merging, it would help to have clear
> documentation
> >> >> >>>> on
> >> >> >> how
> >> >> >>> to setup and use ATS.
> >> >> >>>>
> >> >> >>>> Slightly unrelated to the merge, do we plan to support any other
> >> >> >> simpler
> >> >> >>> backend for users to try out, in addition to HBase? LevelDB? I
> >> >> understand
> >> >> >>> this wouldn't scale, but would it help with initial adoption and
> >> >> feedback
> >> >> >>> from early users?
> >> >> >>>>
> >> >> >>>>
> >> >> >>>>
> >> >> >>>>
> >> >> >>>>
> >> >> >>>> On Mon, Jun 20, 2016 at 10:26 AM, Sangjin Lee <sjlee@apache.org
> >
> >> >> >> wrote:
> >> >> >>>>
> >> >> >>>>> Hi all,
> >> >> >>>>>
> >> >> >>>>> I’d like to open a discussion on merging the Timeline Service
> v.2
> >> >> >>>>> feature to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We have
> >> >> >>>>> been
> >> >> >>>>> developing the feature in a feature branch (YARN-2928 [3]) for
> a
> >> >> >>>>> while, and we are reasonably confident that the state of the
> >> >> >>>>> feature
> >> >> >>>>> meets the criteria to be merged onto trunk and we'd love folks
> to
> >> >> get
> >> >> >>>>> their hands on it and provide valuable feedback so that we can
> >> >> >>>>> make
> >> >> it
> >> >> >>> production-ready.
> >> >> >>>>>
> >> >> >>>>> In a nutshell, Timeline Service v.2 delivers significant
> >> >> >>>>> scalability
> >> >> >>>>> and usability improvements based on a new architecture. You can
> >> >> browse
> >> >> >>>>> the requirements/design doc, the storage schema doc, the new
> >> >> >>>>> entity/data model, the YARN documentation, and also discussions
> >> >> >>>>> on
> >> >> >>>>> subsequent milestones on
> >> >> >>>>> YARN-2928 [1].
> >> >> >>>>>
> >> >> >>>>> What we would like to merge to trunk is termed "alpha 1"
> >> >> >>>>> (milestone
> >> >> >>>>> 1). The feature has a complete end-to-end read/write flow, and
> >> >> >>>>> you
> >> >> >>>>> should be able to start setting it up and testing it. At a high
> >> >> level,
> >> >> >>>>> the following are the key features that have been implemented:
> >> >> >>>>>
> >> >> >>>>> - distributed writers (collectors) as NM aux services
> >> >> >>>>> - HBase storage
> >> >> >>>>> - new entity model that includes flows
> >> >> >>>>> - setting the flow context via YARN app tags
> >> >> >>>>> - real time metrics aggregation to the application level and
> the
> >> >> flow
> >> >> >>>>> level
> >> >> >>>>> - rich REST API that supports filters, complex conditionals,
> >> >> >>>>> limits,
> >> >> >>>>> content selection, etc.
> >> >> >>>>> - YARN generic events and system metrics
> >> >> >>>>> - integration with Distributed Shell and MapReduce
> >> >> >>>>>
> >> >> >>>>> There are a total of 139 subtasks that were completed as part
> of
> >> >> this
> >> >> >>>>> effort.
> >> >> >>>>>
> >> >> >>>>> We paid close attention to ensure that once disabled Timeline
> >> >> Service
> >> >> >>>>> v.2 does not impact existing functionality when disabled (by
> >> >> default).
> >> >> >>>>>
> >> >> >>>>> I'd like to call out a couple of things to discuss in
> particular.
> >> >> >>>>>
> >> >> >>>>> *First*, if the merge vote is approved, to which branch should
> >> >> >>>>> this
> >> >> be
> >> >> >>>>> merged and what would be the release version? My preference is
> >> >> >>>>> that
> >> >> >>>>> *it would be merged to branch "trunk" and be part of
> >> >> >>>>> 3.0.0-alpha1*
> >> >> if
> >> >> >>> approved.
> >> >> >>>>> Since the 3.0.0-alpha1 is in active progress, I wanted to get
> >> >> >>>>> your
> >> >> >>>>> thoughts on this.
> >> >> >>>>>
> >> >> >>>>> *Second*, Timeline Service v.2 introduces a dependency on HBase
> >> >> >>>>> from YARN. It is not a cyclical dependency (as HBase does not
> >> >> >>>>> really depend on YARN). However, the version of Hadoop that
> >> >> >>>>> HBase currently supports lags behind the Hadoop version that
> >> >> >>>>> Timeline Service is based on, so there is a potential for subtle
> >> >> >>>>> dependency conflicts. We made some efforts to isolate the issue
> >> >> >>>>> (see [4] and [5]). The HBase folks have also been responsive in
> >> >> >>>>> keeping up with the trunk as much as they can. Nonetheless, this
> >> >> >>>>> is something to keep in mind.
> >> >> >>>>>
> >> >> >>>>> I would love to get your thoughts on these and more before we
> >> >> >>>>> open a real voting thread. Thanks!
> >> >> >>>>>
> >> >> >>>>> Regards,
> >> >> >>>>> Sangjin
> >> >> >>>>>
> >> >> >>>>> [1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
> >> >> >>>>> [2] MAPREDUCE-6331:
> >> >> >>>>> https://issues.apache.org/jira/browse/MAPREDUCE-6331
> >> >> >>>>> [3] YARN-2928 commits:
> >> >> >>>>> https://github.com/apache/hadoop/commits/YARN-2928
> >> >> >>>>> [4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
> >> >> >>>>> [5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071
> >> >> >>>>>
> >> >> >>>>
> >> >> >>>>
> >> >> >>>>
> >> >> >>>> ---------------------------------------------------------------------
> >> >> >>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
> >> >> >>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
> >> >> >>>>
> >> >> >>>
> >> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >
> >
>

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Tsuyoshi Ozawa <oz...@apache.org>.

Hi Junping,

Thanks for your good suggestion.

> However, my concern about releasing it in 3.0.0-alpha (even as an alpha
> feature) is that we haven't provided any security support in ATS v2 yet.
> Enabling this feature without understanding the risk could be a disaster
> for end users (even in a test cluster).

You're right. Can we document and clarify that it's still "alpha 1" and
that it doesn't have security features yet? I also think ATS 1.5 supports
security features, so it's suitable for production; we should document
that officially.

Thanks,
- Tsuyoshi

On Thu, Jun 23, 2016 at 5:45 PM, 俊平堵 <ju...@apache.org> wrote:
> Big +1 on merging ATS v2 to trunk. However, my concern about releasing it
> in 3.0.0-alpha (even as an alpha feature) is that we haven't provided any
> security support in ATS v2 yet. Enabling this feature without understanding
> the risk could be a disaster for end users (even in a test cluster).
>
> Kudos to everyone who contributed patches, including Sangjin, Li, Vrushali,
> Naga, Varun, Joep and Zhijie.
>
> Thanks,
>
> Junping


Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by 俊平堵 <ju...@apache.org>.

Big +1 on merging ATS v2 to trunk. However, my concern about releasing it in
3.0.0-alpha (even as an alpha feature) is that we haven't provided any
security support in ATS v2 yet. Enabling this feature without understanding
the risk could be a disaster for end users (even in a test cluster).

Kudos to everyone who contributed patches, including Sangjin, Li, Vrushali,
Naga, Varun, Joep and Zhijie.

Thanks,

Junping

2016-06-23 13:32 GMT-07:00 Sangjin Lee <sj...@apache.org>:

> Thanks folks for the good discussion!
>
> I'm going to keep it open for a few more days as I'd love to get feedback
> from more people. I am thinking of opening a voting thread right after the
> Hadoop Summit next week if there are no objections. Thanks!
>
> Regards,
> Sangjin

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Sangjin Lee <sj...@apache.org>.

Thanks folks for the good discussion!

I'm going to keep it open for a few more days as I'd love to get feedback
from more people. I am thinking of opening a voting thread right after the
Hadoop Summit next week if there are no objections. Thanks!

Regards,
Sangjin

On Tue, Jun 21, 2016 at 9:51 PM, Li Lu <ll...@hortonworks.com> wrote:

> I agree that having non-HBase impls may attract more potential users to
> ATS. Actually I remember we do have some JIRAs for HDFS implementations.
> With regard to aggregation, yes, if there are more options on storage
> implementations we really need to find some ways to describe their
> implications for different kinds of aggregations.
>
> +1 for the idea of some group chats! The break after the ATS talk may be a
> good candidate?
>
> Li Lu
>
> On Jun 21, 2016, at 21:28, Karthik Kambatla <ka...@cloudera.com> wrote:
>
> The reasons for my asking about alternate implementations: (1) ease of
> trying it out for Yarn devs and iteration for bug fixes, improvements and
> (2) ease of trying it for app-writers/users to figure out if they should
> use the ATS. Again, personally, I don't see this as necessary for the merge
> itself, but more so for adoption.
>
> A test implementation would be enough for #1, and would partially address
> #2. A more substantial implementation would be nice, but I guess we need to
> look at the ROI to decide whether adding that is a good idea.
>
> On completeness, I agree. Further, for some backend implementations, it is
> possible that a particular aggregation/query might be possible but too
> expensive to turn on. What are your thoughts on provisions for the admin to
> turn off some queries/aggregations?
>
> Orthogonal: is there interest here in catching up on ATS specifically on
> one of the days? Maybe during the breaks or after the sessions?
>
> On Tue, Jun 21, 2016 at 6:15 PM, Li Lu <ll...@hortonworks.com> wrote:
>
>> HDFS or other non-HBase implementations are very helpful. We didn’t focus
>> on those implementations in the first milestone because we would like to
>> have one working version as a starting point. We can certainly add more
>> implementations when the feature gets more mature.
>>
>> That said, one of my concerns when building these storage implementations
>> is “completeness”. We have added a lot of support for data aggregation. As
>> of today, part of the aggregation (flow run aggregation) may be performed
>> as HBase coprocessors. When implementing comparable storage impls, it is
>> worth noting that one may want to provide some equivalent mechanism to
>> perform those aggregations (to really make an implementation “complete
>> enough”, or “interchangeable” with the existing HBase impl).
>>
>> Li Lu
>> > On Jun 21, 2016, at 15:51, Sangjin Lee <sj...@apache.org> wrote:
>> >
>> > Thanks Karthik and Tsuyoshi. Regarding alternate implementations, I'd
>> > like to get a better sense of what you're thinking of. Are you
>> > interested in strictly a test implementation (e.g. perfectly fine in a
>> > single node setup) or a more substantial implementation (may not scale
>> > but needs to work in a more realistic setup)?
>> >
>> > Regards,
>> > Sangjin
>> >
>> > On Tue, Jun 21, 2016 at 2:51 PM, J. Rottinghuis <jrottinghuis@gmail.com>
>> > wrote:
>> >
>> >> Thanks Karthik and Tsuyoshi for bringing up good points.
>> >>
>> >> I've opened https://issues.apache.org/jira/browse/YARN-5281 to track
>> this
>> >> discussion and capture all the merits and challenges in one single
>> place.
>> >>
>> >> Thanks,
>> >>
>> >> Joep
>> >>
>> >> On Tue, Jun 21, 2016 at 8:21 AM, Tsuyoshi Ozawa <oz...@apache.org>
>> wrote:
>> >>
>> >>> Thanks Sangjin for starting the discussion.
>> >>>
>> >>>>> *First*, if the merge vote is approved, to which branch should this
>> be
>> >>> merged and what would be the release version?
>> >>>
>> >>> As you mentioned, I think it's reasonable for us to target trunk and
>> >>> 3.0.0-alpha.
>> >>>
>> >>>>> Slightly unrelated to the merge, do we plan to support any other
>> >> simpler
>> >>> backend for users to try out, in addition to HBase? LevelDB?
>> >>>> We can however, potentially change the Local File System based
>> >>> implementation to a HDFS based implementation and have it as an
>> alternate
>> >>> for non-production use,
>> >>>
>> >>> In Apache Big Data 2016 NA, some users also mentioned that they need
>> HDFS
>> >>> implementation. Currently it's pending, but I and Varun tried to work
>> to
>> >>> support HDFS backend(YARN-3874). As Karthik mentioned, it's useful for
>> >>> early users to try v2.0 APIs though it doesn't scale. IMHO, it's
>> useful
>> >>> for small cluster(e.g. smaller than 10 machines). After merging the
>> >> current
>> >>> implementation into trunk, I'm interested in resuming YARN-3874
>> >> work(maybe
>> >>> Varun is also interested in).
>> >>>
>> >>> Regards,
>> >>> - Tsuyoshi
>> >>>
>> >>> On Tue, Jun 21, 2016 at 5:07 PM, Varun saxena <
>> varun.saxena@huawei.com>
>> >>> wrote:
>> >>>> Thanks Karthik for sharing your views.
>> >>>>
>> >>>> With regards to merging, it would help to have clear documentation on
>> >> how
>> >>> to setup and use ATS.
>> >>>> --> We do have documentation on this. You and others who are
>> interested
>> >>> can check out YARN-5174 which is the latest documentation related JIRA
>> >> for
>> >>> ATSv2.
>> >>>>
>> >>>> Slightly unrelated to the merge, do we plan to support any other
>> >> simpler
>> >>> backend for users to try out, in addition to HBase? LevelDB?
>> >>>> --> We do have a File System based implementation but it is strictly
>> >> for
>> >>> test purposes (as we write data into a local file). It does not
>> support
>> >> all
>> >>> the features of Timeline Service v.2 as well.
>> >>>> Regarding LevelDB, Timeline Service v.2 has distributed writers and
>> >> Level
>> >>> DB writes data (log files or SSTable files) to local file system. This
>> >>> means there will be no easy way to have a LevelDB based implementation
>> >>> because we would not know where to read the data from, especially
>> while
>> >>> fetching flow level information.
>> >>>> We can however, potentially change the Local File System based
>> >>> implementation to a HDFS based implementation and have it as an
>> alternate
>> >>> for non-production use, if there is a potential need for it, based on
>> >>> community feedback. This however, would have to be further discussed
>> with
>> >>> the team.
>> >>>>
>> >>>> Regards,
>> >>>> Varun Saxena.
>> >>>>
>> >>>> -----Original Message-----
>> >>>> From: Karthik Kambatla [mailto:kasha@cloudera.com]
>> >>>> Sent: 21 June 2016 10:29
>> >>>> To: Sangjin Lee
>> >>>> Cc: yarn-dev@hadoop.apache.org
>> >>>> Subject: Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to
>> >> trunk
>> >>>>
>> >>>> Firstly, thanks Sangjin and others for driving this major feature.
>> >>>>
>> >>>> Merging to trunk and including in 3.0.0-alpha1 seems reasonable, as
>> it
>> >>> will give early access to downstream users.
>> >>>>
>> >>>> With regards to merging, it would help to have clear documentation on
>> >> how
>> >>> to setup and use ATS.
>> >>>>
>> >>>> Slightly unrelated to the merge, do we plan to support any other
>> >> simpler
>> >>> backend for users to try out, in addition to HBase? LevelDB? I
>> understand
>> >>> this wouldn't scale, but would it help with initial adoption and
>> feedback
>> >>> from early users?
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Mon, Jun 20, 2016 at 10:26 AM, Sangjin Lee <sj...@apache.org>
>> >> wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> I’d like to open a discussion on merging the Timeline Service v.2
>> >>>>> feature to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We have been
>> >>>>> developing the feature in a feature branch (YARN-2928 [3]) for a
>> >>>>> while, and we are reasonably confident that the state of the feature
>> >>>>> meets the criteria to be merged onto trunk and we'd love folks to
>> get
>> >>>>> their hands on it and provide valuable feedback so that we can make
>> it
>> >>> production-ready.
>> >>>>>
>> >>>>> In a nutshell, Timeline Service v.2 delivers significant scalability
>> >>>>> and usability improvements based on a new architecture. You can
>> browse
>> >>>>> the requirements/design doc, the storage schema doc, the new
>> >>>>> entity/data model, the YARN documentation, and also discussions on
>> >>>>> subsequent milestones on
>> >>>>> YARN-2928 [1].
>> >>>>>
>> >>>>> What we would like to merge to trunk is termed "alpha 1" (milestone
>> >>>>> 1). The feature has a complete end-to-end read/write flow, and you
>> >>>>> should be able to start setting it up and testing it. At a high
>> level,
>> >>>>> the following are the key features that have been implemented:
>> >>>>>
>> >>>>> - distributed writers (collectors) as NM aux services
>> >>>>> - HBase storage
>> >>>>> - new entity model that includes flows
>> >>>>> - setting the flow context via YARN app tags
>> >>>>> - real time metrics aggregation to the application level and the
>> flow
>> >>>>> level
>> >>>>> - rich REST API that supports filters, complex conditionals, limits,
>> >>>>> content selection, etc.
>> >>>>> - YARN generic events and system metrics
>> >>>>> - integration with Distributed Shell and MapReduce
>> >>>>>
>> >>>>> There are a total of 139 subtasks that were completed as part of
>> this
>> >>>>> effort.
>> >>>>>
>> >>>>> We paid close attention to ensure that once disabled Timeline
>> Service
>> >>>>> v.2 does not impact existing functionality when disabled (by
>> default).
>> >>>>>
>> >>>>> I'd like to call out a couple of things to discuss in particular.
>> >>>>>
>> >>>>> *First*, if the merge vote is approved, to which branch should this
>> be
>> >>>>> merged and what would be the release version? My preference is that
>> >>>>> *it would be merged to branch "trunk" and be part of 3.0.0-alpha1*
>> if
>> >>> approved.
>> >>>>> Since the 3.0.0-alpha1 is in active progress, I wanted to get your
>> >>>>> thoughts on this.
>> >>>>>
>> >>>>> *Second*, Timeline Service v.2 introduces a dependency on HBase from
>> >>> YARN.
>> >>>>> It is not a cyclical dependency (as HBase does not really depend on
>> >>> YARN).
>> >>>>> However, the version of Hadoop that HBase currently supports lags
>> >>>>> behind the Hadoop version that Timeline Service is based on, so
>> there
>> >>>>> is a potential for subtle dependency conflicts. We made some efforts
>> >>>>> to isolate the issue (see [4] and [5]). The HBase folks have also
>> been
>> >>>>> responsive in keeping up with the trunk as much as they can.
>> >>>>> Nonetheless, this is something to keep in mind.
>> >>>>>
>> >>>>> I would love to get your thoughts on these and more before we open a
>> >>>>> real voting thread. Thanks!
>> >>>>>
>> >>>>> Regards,
>> >>>>> Sangjin
>> >>>>>
>> >>>>> [1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
>> >>>>> [2] MAPREDUCE-6331:
>> >>>>> https://issues.apache.org/jira/browse/MAPREDUCE-6331
>> >>>>> [3] YARN-2928 commits:
>> >>>>> https://github.com/apache/hadoop/commits/YARN-2928
>> >>>>> [4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
>> >>>>> [5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071
>> >>>>>
>> >>>>
>> >>>> ---------------------------------------------------------------------
>> >>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>> >>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>> >>>>
>> >>>
>> >>
>>
>>
>
>

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Li Lu <ll...@hortonworks.com>.

I agree that having non-HBase impls may attract more potential users to ATS. Actually, I remember we do have some JIRAs for HDFS implementations. With regard to aggregation, yes: if there are more storage implementation options, we really need to find a way to describe what each one implies for the different kinds of aggregations.

+1 for the idea of some group chats! The break after the ATS talk may be a good candidate?

Li Lu

On Jun 21, 2016, at 21:28, Karthik Kambatla <ka...@cloudera.com> wrote:

The reasons for my asking about alternate implementations: (1) ease of trying it out for YARN devs and of iterating on bug fixes and improvements, and (2) ease of trying it out for app writers/users to figure out whether they should use ATS. Again, personally, I don't see this as necessary for the merge itself, but more so for adoption.

A test implementation would be enough for #1, and would partially address #2. A more substantial implementation would be nice, but I guess we need to look at the ROI to decide whether adding that is a good idea.

On completeness, I agree. Further, for some backend implementations, it is possible that a particular aggregation/query might be possible but too expensive to turn on. What are your thoughts on provisions for the admin to turn off some queries/aggregations?

Orthogonal: is there interest here in catching up on ATS specifically on one of the days? Maybe during the breaks or after the sessions?

On Tue, Jun 21, 2016 at 6:15 PM, Li Lu <ll...@hortonworks.com> wrote:
HDFS or other non-HBase implementations are very helpful. We didn’t focus on those implementations in the first milestone because we would like to have one working version as a starting point. We can certainly add more implementations when the feature gets more mature.

That said, one of my concerns when building these storage implementations is “completeness”. We have added a lot of support for data aggregation. As of today, part of the aggregation (flow run aggregation) may be performed by HBase coprocessors. When implementing comparable storage impls, it is worth noting that one may want to provide some equivalent mechanism to perform those aggregations (to really make an implementation “complete enough”, or “interchangeable” with the existing HBase impl).

Li Lu
> On Jun 21, 2016, at 15:51, Sangjin Lee <sj...@apache.org> wrote:
>
> Thanks Karthik and Tsuyoshi. Regarding alternate implementations, I'd like
> to get a better sense of what you're thinking of. Are you interested in
> strictly a test implementation (e.g. perfectly fine in a single node setup)
> or a more substantial implementation (may not scale but needs to work in a
> more realistic setup)?
>
> Regards,
> Sangjin
>
> On Tue, Jun 21, 2016 at 2:51 PM, J. Rottinghuis <jr...@gmail.com>
> wrote:
>
>> Thanks Karthik and Tsuyoshi for bringing up good points.
>>
>> I've opened https://issues.apache.org/jira/browse/YARN-5281 to track this
>> discussion and capture all the merits and challenges in one single place.
>>
>> Thanks,
>>
>> Joep
>>
>> On Tue, Jun 21, 2016 at 8:21 AM, Tsuyoshi Ozawa <oz...@apache.org> wrote:
>>
>>> Thanks Sangjin for starting the discussion.
>>>
>>>>> *First*, if the merge vote is approved, to which branch should this be
>>> merged and what would be the release version?
>>>
>>> As you mentioned, I think it's reasonable for us to target trunk and
>>> 3.0.0-alpha.
>>>
>>>>> Slightly unrelated to the merge, do we plan to support any other
>> simpler
>>> backend for users to try out, in addition to HBase? LevelDB?
>>>> We can however, potentially change the Local File System based
>>> implementation to a HDFS based implementation and have it as an alternate
>>> for non-production use,
>>>
>>> In Apache Big Data 2016 NA, some users also mentioned that they need HDFS
>>> implementation. Currently it's pending, but I and Varun tried to work to
>>> support HDFS backend(YARN-3874). As Karthik mentioned, it's useful for
>>> early users to try v2.0 APIs though it doesn't scale. IMHO, it's useful
>>> for small cluster(e.g. smaller than 10 machines). After merging the
>> current
>>> implementation into trunk, I'm interested in resuming YARN-3874
>> work(maybe
>>> Varun is also interested in).
>>>
>>> Regards,
>>> - Tsuyoshi
>>>
>>> On Tue, Jun 21, 2016 at 5:07 PM, Varun saxena <va...@huawei.com>
>>> wrote:
>>>> Thanks Karthik for sharing your views.
>>>>
>>>> With regards to merging, it would help to have clear documentation on
>> how
>>> to setup and use ATS.
>>>> --> We do have documentation on this. You and others who are interested
>>> can check out YARN-5174 which is the latest documentation related JIRA
>> for
>>> ATSv2.
>>>>
>>>> Slightly unrelated to the merge, do we plan to support any other
>> simpler
>>> backend for users to try out, in addition to HBase? LevelDB?
>>>> --> We do have a File System based implementation but it is strictly
>> for
>>> test purposes (as we write data into a local file). It does not support
>> all
>>> the features of Timeline Service v.2 as well.
>>>> Regarding LevelDB, Timeline Service v.2 has distributed writers and
>> Level
>>> DB writes data (log files or SSTable files) to local file system. This
>>> means there will be no easy way to have a LevelDB based implementation
>>> because we would not know where to read the data from, especially while
>>> fetching flow level information.
>>>> We can however, potentially change the Local File System based
>>> implementation to a HDFS based implementation and have it as an alternate
>>> for non-production use, if there is a potential need for it, based on
>>> community feedback. This however, would have to be further discussed with
>>> the team.
>>>>
>>>> Regards,
>>>> Varun Saxena.
>>>>
>>>> -----Original Message-----
>>>> From: Karthik Kambatla [mailto:kasha@cloudera.com]
>>>> Sent: 21 June 2016 10:29
>>>> To: Sangjin Lee
>>>> Cc: yarn-dev@hadoop.apache.org
>>>> Subject: Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to
>> trunk
>>>>
>>>> Firstly, thanks Sangjin and others for driving this major feature.
>>>>
>>>> Merging to trunk and including in 3.0.0-alpha1 seems reasonable, as it
>>> will give early access to downstream users.
>>>>
>>>> With regards to merging, it would help to have clear documentation on
>> how
>>> to setup and use ATS.
>>>>
>>>> Slightly unrelated to the merge, do we plan to support any other
>> simpler
>>> backend for users to try out, in addition to HBase? LevelDB? I understand
>>> this wouldn't scale, but would it help with initial adoption and feedback
>>> from early users?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jun 20, 2016 at 10:26 AM, Sangjin Lee <sj...@apache.org>
>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I’d like to open a discussion on merging the Timeline Service v.2
>>>>> feature to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We have been
>>>>> developing the feature in a feature branch (YARN-2928 [3]) for a
>>>>> while, and we are reasonably confident that the state of the feature
>>>>> meets the criteria to be merged onto trunk and we'd love folks to get
>>>>> their hands on it and provide valuable feedback so that we can make it
>>> production-ready.
>>>>>
>>>>> In a nutshell, Timeline Service v.2 delivers significant scalability
>>>>> and usability improvements based on a new architecture. You can browse
>>>>> the requirements/design doc, the storage schema doc, the new
>>>>> entity/data model, the YARN documentation, and also discussions on
>>>>> subsequent milestones on
>>>>> YARN-2928 [1].
>>>>>
>>>>> What we would like to merge to trunk is termed "alpha 1" (milestone
>>>>> 1). The feature has a complete end-to-end read/write flow, and you
>>>>> should be able to start setting it up and testing it. At a high level,
>>>>> the following are the key features that have been implemented:
>>>>>
>>>>> - distributed writers (collectors) as NM aux services
>>>>> - HBase storage
>>>>> - new entity model that includes flows
>>>>> - setting the flow context via YARN app tags
>>>>> - real time metrics aggregation to the application level and the flow
>>>>> level
>>>>> - rich REST API that supports filters, complex conditionals, limits,
>>>>> content selection, etc.
>>>>> - YARN generic events and system metrics
>>>>> - integration with Distributed Shell and MapReduce
>>>>>
>>>>> There are a total of 139 subtasks that were completed as part of this
>>>>> effort.
>>>>>
>>>>> We paid close attention to ensure that once disabled Timeline Service
>>>>> v.2 does not impact existing functionality when disabled (by default).
>>>>>
>>>>> I'd like to call out a couple of things to discuss in particular.
>>>>>
>>>>> *First*, if the merge vote is approved, to which branch should this be
>>>>> merged and what would be the release version? My preference is that
>>>>> *it would be merged to branch "trunk" and be part of 3.0.0-alpha1* if
>>> approved.
>>>>> Since the 3.0.0-alpha1 is in active progress, I wanted to get your
>>>>> thoughts on this.
>>>>>
>>>>> *Second*, Timeline Service v.2 introduces a dependency on HBase from
>>> YARN.
>>>>> It is not a cyclical dependency (as HBase does not really depend on
>>> YARN).
>>>>> However, the version of Hadoop that HBase currently supports lags
>>>>> behind the Hadoop version that Timeline Service is based on, so there
>>>>> is a potential for subtle dependency conflicts. We made some efforts
>>>>> to isolate the issue (see [4] and [5]). The HBase folks have also been
>>>>> responsive in keeping up with the trunk as much as they can.
>>>>> Nonetheless, this is something to keep in mind.
>>>>>
>>>>> I would love to get your thoughts on these and more before we open a
>>>>> real voting thread. Thanks!
>>>>>
>>>>> Regards,
>>>>> Sangjin
>>>>>
>>>>> [1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
>>>>> [2] MAPREDUCE-6331:
>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-6331
>>>>> [3] YARN-2928 commits:
>>>>> https://github.com/apache/hadoop/commits/YARN-2928
>>>>> [4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
>>>>> [5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>>>>
>>>
>>

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Karthik Kambatla <ka...@cloudera.com>.

The reasons for my asking about alternate implementations: (1) ease of
trying it out for YARN devs and of iterating on bug fixes and improvements,
and (2) ease of trying it out for app writers/users to figure out whether
they should use ATS. Again, personally, I don't see this as necessary for
the merge itself, but more so for adoption.

A test implementation would be enough for #1, and would partially address
#2. A more substantial implementation would be nice, but I guess we need to
look at the ROI to decide whether adding that is a good idea.

On completeness, I agree. Further, for some backend implementations, it is
possible that a particular aggregation/query might be possible but too
expensive to turn on. What are your thoughts on provisions for the admin to
turn off some queries/aggregations?
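
To make the question above concrete, here is a minimal sketch of one way
such a provision could look. It is an assumption for illustration only:
`BackendProfile`, `run_query`, and the aggregation names are hypothetical
and are not existing YARN or Timeline Service APIs or configuration keys.
The idea is that each backend declares what it can compute, and an
admin-maintained set switches off whatever is too expensive.

```python
from dataclasses import dataclass, field


# Hypothetical names throughout: none of these are real YARN configuration
# keys or Timeline Service classes.
@dataclass
class BackendProfile:
    name: str
    # Aggregations this backend is able to compute at all.
    supported: frozenset
    # Aggregations the admin has switched off because they cost too much.
    disabled_by_admin: set = field(default_factory=set)


def run_query(profile, aggregation):
    """Return a status string instead of running an unsupported or disabled query."""
    if aggregation not in profile.supported:
        return f"{aggregation}: not supported by {profile.name}"
    if aggregation in profile.disabled_by_admin:
        return f"{aggregation}: disabled by admin on {profile.name}"
    return f"{aggregation}: ok"
```

With this shape, a request for a disabled aggregation is rejected up front
with an explicit reason rather than triggering an expensive scan.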

Orthogonal: is there interest here in catching up on ATS specifically on one
of the days? Maybe during the breaks or after the sessions?

On Tue, Jun 21, 2016 at 6:15 PM, Li Lu <ll...@hortonworks.com> wrote:

> HDFS or other non-HBase implementations are very helpful. We didn’t focus
> on those implementations in the first milestone because we would like to
> have one working version as a starting point. We can certainly add more
> implementations when the feature gets more mature.
>
> This said, one of my concerns when building these storage implementations
> is “completeness”. We have added a lot of support for data aggregation. As
> of today, part of the aggregation (flow run aggregation) may be performed
> as HBase coprocessors. When implementing comparable storage impls, it is
> worth noting that one may want to provide some equivalent things to perform
> those aggregations (to really make one implementation “complete enough”,
> or, “interchangeable” to the existing HBase impl).
>
> Li Lu
> > On Jun 21, 2016, at 15:51, Sangjin Lee <sj...@apache.org> wrote:
> >
> > Thanks Karthik and Tsuyoshi. Regarding alternate implementations, I'd
> like
> > to get a better sense of what you're thinking of. Are you interested in
> > strictly a test implementation (e.g. perfectly fine in a single node
> setup)
> > or a more substantial implementation (may not scale but needs to work in
> a
> > more realistic setup)?
> >
> > Regards,
> > Sangjin
> >
> > On Tue, Jun 21, 2016 at 2:51 PM, J. Rottinghuis <jr...@gmail.com>
> > wrote:
> >
> >> Thanks Karthik and Tsuyoshi for bringing up good points.
> >>
> >> I've opened https://issues.apache.org/jira/browse/YARN-5281 to track
> this
> >> discussion and capture all the merits and challenges in one single
> place.
> >>
> >> Thanks,
> >>
> >> Joep
> >>
> >> On Tue, Jun 21, 2016 at 8:21 AM, Tsuyoshi Ozawa <oz...@apache.org>
> wrote:
> >>
> >>> Thanks Sangjin for starting the discussion.
> >>>
> >>>>> *First*, if the merge vote is approved, to which branch should this
> be
> >>> merged and what would be the release version?
> >>>
> >>> As you mentioned, I think it's reasonable for us to target trunk and
> >>> 3.0.0-alpha.
> >>>
> >>>>> Slightly unrelated to the merge, do we plan to support any other
> >> simpler
> >>> backend for users to try out, in addition to HBase? LevelDB?
> >>>> We can however, potentially change the Local File System based
> >>> implementation to a HDFS based implementation and have it as an
> alternate
> >>> for non-production use,
> >>>
> >>> In Apache Big Data 2016 NA, some users also mentioned that they need
> HDFS
> >>> implementation. Currently it's pending, but I and Varun tried to work
> to
> >>> support HDFS backend(YARN-3874). As Karthik mentioned, it's useful for
> >>> early users to try v2.0 APIs though it doesn't scale. IMHO, it's
> useful
> >>> for small cluster(e.g. smaller than 10 machines). After merging the
> >> current
> >>> implementation into trunk, I'm interested in resuming YARN-3874
> >> work(maybe
> >>> Varun is also interested in).
> >>>
> >>> Regards,
> >>> - Tsuyoshi
> >>>
> >>> On Tue, Jun 21, 2016 at 5:07 PM, Varun saxena <varun.saxena@huawei.com
> >
> >>> wrote:
> >>>> Thanks Karthik for sharing your views.
> >>>>
> >>>> With regards to merging, it would help to have clear documentation on
> >> how
> >>> to setup and use ATS.
> >>>> --> We do have documentation on this. You and others who are
> interested
> >>> can check out YARN-5174 which is the latest documentation related JIRA
> >> for
> >>> ATSv2.
> >>>>
> >>>> Slightly unrelated to the merge, do we plan to support any other
> >> simpler
> >>> backend for users to try out, in addition to HBase? LevelDB?
> >>>> --> We do have a File System based implementation but it is strictly
> >> for
> >>> test purposes (as we write data into a local file). It does not support
> >> all
> >>> the features of Timeline Service v.2 as well.
> >>>> Regarding LevelDB, Timeline Service v.2 has distributed writers and
> >> Level
> >>> DB writes data (log files or SSTable files) to local file system. This
> >>> means there will be no easy way to have a LevelDB based implementation
> >>> because we would not know where to read the data from, especially while
> >>> fetching flow level information.
> >>>> We can however, potentially change the Local File System based
> >>> implementation to a HDFS based implementation and have it as an
> alternate
> >>> for non-production use, if there is a potential need for it, based on
> >>> community feedback. This however, would have to be further discussed
> with
> >>> the team.
> >>>>
> >>>> Regards,
> >>>> Varun Saxena.
> >>>>
> >>>> -----Original Message-----
> >>>> From: Karthik Kambatla [mailto:kasha@cloudera.com]
> >>>> Sent: 21 June 2016 10:29
> >>>> To: Sangjin Lee
> >>>> Cc: yarn-dev@hadoop.apache.org
> >>>> Subject: Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to
> >> trunk
> >>>>
> >>>> Firstly, thanks Sangjin and others for driving this major feature.
> >>>>
> >>>> Merging to trunk and including in 3.0.0-alpha1 seems reasonable, as it
> >>> will give early access to downstream users.
> >>>>
> >>>> With regards to merging, it would help to have clear documentation on
> >> how
> >>> to setup and use ATS.
> >>>>
> >>>> Slightly unrelated to the merge, do we plan to support any other
> >> simpler
> >>> backend for users to try out, in addition to HBase? LevelDB? I
> understand
> >>> this wouldn't scale, but would it help with initial adoption and
> feedback
> >>> from early users?
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Mon, Jun 20, 2016 at 10:26 AM, Sangjin Lee <sj...@apache.org>
> >> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I’d like to open a discussion on merging the Timeline Service v.2
> >>>>> feature to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We have been
> >>>>> developing the feature in a feature branch (YARN-2928 [3]) for a
> >>>>> while, and we are reasonably confident that the state of the feature
> >>>>> meets the criteria to be merged onto trunk and we'd love folks to get
> >>>>> their hands on it and provide valuable feedback so that we can make
> it
> >>> production-ready.
> >>>>>
> >>>>> In a nutshell, Timeline Service v.2 delivers significant scalability
> >>>>> and usability improvements based on a new architecture. You can
> browse
> >>>>> the requirements/design doc, the storage schema doc, the new
> >>>>> entity/data model, the YARN documentation, and also discussions on
> >>>>> subsequent milestones on
> >>>>> YARN-2928 [1].
> >>>>>
> >>>>> What we would like to merge to trunk is termed "alpha 1" (milestone
> >>>>> 1). The feature has a complete end-to-end read/write flow, and you
> >>>>> should be able to start setting it up and testing it. At a high
> level,
> >>>>> the following are the key features that have been implemented:
> >>>>>
> >>>>> - distributed writers (collectors) as NM aux services
> >>>>> - HBase storage
> >>>>> - new entity model that includes flows
> >>>>> - setting the flow context via YARN app tags
> >>>>> - real time metrics aggregation to the application level and the flow
> >>>>> level
> >>>>> - rich REST API that supports filters, complex conditionals, limits,
> >>>>> content selection, etc.
> >>>>> - YARN generic events and system metrics
> >>>>> - integration with Distributed Shell and MapReduce
> >>>>>
> >>>>> There are a total of 139 subtasks that were completed as part of this
> >>>>> effort.
> >>>>>
> >>>>> We paid close attention to ensure that once disabled Timeline Service
> >>>>> v.2 does not impact existing functionality when disabled (by
> default).
> >>>>>
> >>>>> I'd like to call out a couple of things to discuss in particular.
> >>>>>
> >>>>> *First*, if the merge vote is approved, to which branch should this
> be
> >>>>> merged and what would be the release version? My preference is that
> >>>>> *it would be merged to branch "trunk" and be part of 3.0.0-alpha1* if
> >>> approved.
> >>>>> Since the 3.0.0-alpha1 is in active progress, I wanted to get your
> >>>>> thoughts on this.
> >>>>>
> >>>>> *Second*, Timeline Service v.2 introduces a dependency on HBase from
> >>> YARN.
> >>>>> It is not a cyclical dependency (as HBase does not really depend on
> >>> YARN).
> >>>>> However, the version of Hadoop that HBase currently supports lags
> >>>>> behind the Hadoop version that Timeline Service is based on, so there
> >>>>> is a potential for subtle dependency conflicts. We made some efforts
> >>>>> to isolate the issue (see [4] and [5]). The HBase folks have also
> been
> >>>>> responsive in keeping up with the trunk as much as they can.
> >>>>> Nonetheless, this is something to keep in mind.
> >>>>>
> >>>>> I would love to get your thoughts on these and more before we open a
> >>>>> real voting thread. Thanks!
> >>>>>
> >>>>> Regards,
> >>>>> Sangjin
> >>>>>
> >>>>> [1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
> >>>>> [2] MAPREDUCE-6331:
> >>>>> https://issues.apache.org/jira/browse/MAPREDUCE-6331
> >>>>> [3] YARN-2928 commits:
> >>>>> https://github.com/apache/hadoop/commits/YARN-2928
> >>>>> [4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
> >>>>> [5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071
> >>>>>
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
> >>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
> >>>>
> >>>
> >>
>
>

Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Li Lu <ll...@hortonworks.com>.

HDFS or other non-HBase implementations are very helpful. We didn’t focus on those implementations in the first milestone because we would like to have one working version as a starting point. We can certainly add more implementations when the feature gets more mature. 

That said, one of my concerns when building these storage implementations is “completeness”. We have added a lot of support for data aggregation. As of today, part of the aggregation (flow run aggregation) may be performed by HBase coprocessors. When implementing comparable storage impls, it is worth noting that one may want to provide some equivalent mechanism to perform those aggregations (to really make an implementation “complete enough”, or “interchangeable” with the existing HBase impl).
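
As a rough sketch of what an "equivalent" could mean for a backend without coprocessors, the snippet below rolls application-level metrics up to the flow-run level in plain code. It rests on assumptions that are not taken from the HBase implementation: that flow-run aggregation amounts to summing the latest value of each metric across the applications in a run, and that records arrive as simple tuples. The function name `aggregate_flow_runs` and the record layout are hypothetical.

```python
from collections import defaultdict


def aggregate_flow_runs(app_metrics):
    """Sum the latest per-application metric values into flow-run totals.

    app_metrics: iterable of (flow, run_id, app_id, metric, value) tuples,
    in arrival order. This record layout is an assumption for illustration.
    """
    # Keep only the most recent value reported by each application.
    latest = {}
    for flow, run_id, app_id, metric, value in app_metrics:
        latest[(flow, run_id, app_id, metric)] = value

    # Roll the per-application values up to (flow, run, metric).
    totals = defaultdict(int)
    for (flow, run_id, _app_id, metric), value in latest.items():
        totals[(flow, run_id, metric)] += value
    return dict(totals)
```

A backend that cannot push this work into the store would have to run something like it at write or read time, which is where the cost question comes in.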

Li Lu
> On Jun 21, 2016, at 15:51, Sangjin Lee <sj...@apache.org> wrote:
> 
> Thanks Karthik and Tsuyoshi. Regarding alternate implementations, I'd like
> to get a better sense of what you're thinking of. Are you interested in
> strictly a test implementation (e.g. perfectly fine in a single node setup)
> or a more substantial implementation (may not scale but needs to work in a
> more realistic setup)?
> 
> Regards,
> Sangjin
> 
> On Tue, Jun 21, 2016 at 2:51 PM, J. Rottinghuis <jr...@gmail.com>
> wrote:
> 
>> Thanks Karthik and Tsuyoshi for bringing up good points.
>> 
>> I've opened https://issues.apache.org/jira/browse/YARN-5281 to track this
>> discussion and capture all the merits and challenges in one single place.
>> 
>> Thanks,
>> 
>> Joep
>> 
>> On Tue, Jun 21, 2016 at 8:21 AM, Tsuyoshi Ozawa <oz...@apache.org> wrote:
>> 
>>> Thanks Sangjin for starting the discussion.
>>> 
>>>>> *First*, if the merge vote is approved, to which branch should this be
>>> merged and what would be the release version?
>>> 
>>> As you mentioned, I think it's reasonable for us to target trunk and
>>> 3.0.0-alpha.
>>> 
>>>>> Slightly unrelated to the merge, do we plan to support any other
>> simpler
>>> backend for users to try out, in addition to HBase? LevelDB?
>>>> We can however, potentially change the Local File System based
>>> implementation to a HDFS based implementation and have it as an alternate
>>> for non-production use,
>>> 
>>> In Apache Big Data 2016 NA, some users also mentioned that they need HDFS
>>> implementation. Currently it's pending, but I and Varun tried to work to
>>> support HDFS backend(YARN-3874). As Karthik mentioned, it's useful for
>>> early users to try v2.0 APIs though it doesn't scale. IMHO, it's useful
>>> for small cluster(e.g. smaller than 10 machines). After merging the
>> current
>>> implementation into trunk, I'm interested in resuming YARN-3874
>> work(maybe
>>> Varun is also interested in).
>>> 
>>> Regards,
>>> - Tsuyoshi
>>> 
>>> On Tue, Jun 21, 2016 at 5:07 PM, Varun saxena <va...@huawei.com>
>>> wrote:
>>>> Thanks Karthik for sharing your views.
>>>> 
>>>> With regards to merging, it would help to have clear documentation on
>> how
>>> to setup and use ATS.
>>>> --> We do have documentation on this. You and others who are interested
>>> can check out YARN-5174 which is the latest documentation related JIRA
>> for
>>> ATSv2.
>>>> 
>>>> Slightly unrelated to the merge, do we plan to support any other
>> simpler
>>> backend for users to try out, in addition to HBase? LevelDB?
>>>> --> We do have a File System based implementation but it is strictly
>> for
>>> test purposes (as we write data into a local file). It does not support
>> all
>>> the features of Timeline Service v.2 as well.
>>>> Regarding LevelDB, Timeline Service v.2 has distributed writers and
>> Level
>>> DB writes data (log files or SSTable files) to local file system. This
>>> means there will be no easy way to have a LevelDB based implementation
>>> because we would not know where to read the data from, especially while
>>> fetching flow level information.
>>>> We can however, potentially change the Local File System based
>>> implementation to a HDFS based implementation and have it as an alternate
>>> for non-production use, if there is a potential need for it, based on
>>> community feedback. This however, would have to be further discussed with
>>> the team.
>>>> 
>>>> Regards,
>>>> Varun Saxena.
>>>> 
>>>> -----Original Message-----
>>>> From: Karthik Kambatla [mailto:kasha@cloudera.com]
>>>> Sent: 21 June 2016 10:29
>>>> To: Sangjin Lee
>>>> Cc: yarn-dev@hadoop.apache.org
>>>> Subject: Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to
>> trunk
>>>> 
>>>> Firstly, thanks Sangjin and others for driving this major feature.
>>>> 
>>>> Merging to trunk and including in 3.0.0-alpha1 seems reasonable, as it
>>> will give early access to downstream users.
>>>> 
>>>> With regards to merging, it would help to have clear documentation on
>> how
>>> to setup and use ATS.
>>>> 
>>>> Slightly unrelated to the merge, do we plan to support any other
>> simpler
>>> backend for users to try out, in addition to HBase? LevelDB? I understand
>>> this wouldn't scale, but would it help with initial adoption and feedback
>>> from early users?
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Mon, Jun 20, 2016 at 10:26 AM, Sangjin Lee <sj...@apache.org>
>> wrote:
>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I’d like to open a discussion on merging the Timeline Service v.2
>>>>> feature to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We have been
>>>>> developing the feature in a feature branch (YARN-2928 [3]) for a
>>>>> while, and we are reasonably confident that the state of the feature
>>>>> meets the criteria to be merged onto trunk and we'd love folks to get
>>>>> their hands on it and provide valuable feedback so that we can make it
>>> production-ready.
>>>>> 
>>>>> In a nutshell, Timeline Service v.2 delivers significant scalability
>>>>> and usability improvements based on a new architecture. You can browse
>>>>> the requirements/design doc, the storage schema doc, the new
>>>>> entity/data model, the YARN documentation, and also discussions on
>>>>> subsequent milestones on
>>>>> YARN-2928 [1].
>>>>> 
>>>>> What we would like to merge to trunk is termed "alpha 1" (milestone
>>>>> 1). The feature has a complete end-to-end read/write flow, and you
>>>>> should be able to start setting it up and testing it. At a high level,
>>>>> the following are the key features that have been implemented:
>>>>> 
>>>>> - distributed writers (collectors) as NM aux services
>>>>> - HBase storage
>>>>> - new entity model that includes flows
>>>>> - setting the flow context via YARN app tags
>>>>> - real time metrics aggregation to the application level and the flow
>>>>> level
>>>>> - rich REST API that supports filters, complex conditionals, limits,
>>>>> content selection, etc.
>>>>> - YARN generic events and system metrics
>>>>> - integration with Distributed Shell and MapReduce
>>>>> 
>>>>> There are a total of 139 subtasks that were completed as part of this
>>>>> effort.
>>>>> 
>>>>> We paid close attention to ensure that once disabled Timeline Service
>>>>> v.2 does not impact existing functionality when disabled (by default).
>>>>> 
>>>>> I'd like to call out a couple of things to discuss in particular.
>>>>> 
>>>>> *First*, if the merge vote is approved, to which branch should this be
>>>>> merged and what would be the release version? My preference is that
>>>>> *it would be merged to branch "trunk" and be part of 3.0.0-alpha1* if
>>> approved.
>>>>> Since the 3.0.0-alpha1 is in active progress, I wanted to get your
>>>>> thoughts on this.
>>>>> 
>>>>> *Second*, Timeline Service v.2 introduces a dependency on HBase from
>>> YARN.
>>>>> It is not a cyclical dependency (as HBase does not really depend on
>>> YARN).
>>>>> However, the version of Hadoop that HBase currently supports lags
>>>>> behind the Hadoop version that Timeline Service is based on, so there
>>>>> is a potential for subtle dependency conflicts. We made some efforts
>>>>> to isolate the issue (see [4] and [5]). The HBase folks have also been
>>>>> responsive in keeping up with the trunk as much as they can.
>>>>> Nonetheless, this is something to keep in mind.
>>>>> 
>>>>> I would love to get your thoughts on these and more before we open a
>>>>> real voting thread. Thanks!
>>>>> 
>>>>> Regards,
>>>>> Sangjin
>>>>> 
>>>>> [1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
>>>>> [2] MAPREDUCE-6331:
>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-6331
>>>>> [3] YARN-2928 commits:
>>>>> https://github.com/apache/hadoop/commits/YARN-2928
>>>>> [4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
>>>>> [5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071
>>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>>>> 
>>> 
>> 



Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by Sangjin Lee <sj...@apache.org>.

Thanks Karthik and Tsuyoshi. Regarding alternate implementations, I'd like
to get a better sense of what you're thinking of. Are you interested in
strictly a test implementation (e.g. perfectly fine in a single node setup)
or a more substantial implementation (may not scale but needs to work in a
more realistic setup)?

Regards,
Sangjin


Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk

Posted by "J. Rottinghuis" <jr...@gmail.com>.

Thanks Karthik and Tsuyoshi for bringing up good points.

I've opened https://issues.apache.org/jira/browse/YARN-5281 to track this
discussion and capture all the merits and challenges in one single place.

Thanks,

Joep

On Tue, Jun 21, 2016 at 8:21 AM, Tsuyoshi Ozawa <oz...@apache.org> wrote:

> Thanks Sangjin for starting the discussion.
>
> >> *First*, if the merge vote is approved, to which branch should this be
> merged and what would be the release version?
>
> As you mentioned, I think it's reasonable for us to target trunk and
> 3.0.0-alpha.
>
> >> Slightly unrelated to the merge, do we plan to support any other simpler
> backend for users to try out, in addition to HBase? LevelDB?
> > We can however, potentially change the Local File System based
> implementation to a HDFS based implementation and have it as an alternate
> for non-production use,
>
> At Apache Big Data 2016 NA, some users also mentioned that they need an HDFS
> implementation. It is currently pending, but Varun and I tried to work on
> supporting an HDFS backend (YARN-3874). As Karthik mentioned, it's useful for
> early users to try v2.0 APIs even though it doesn't scale. IMHO, it's useful
> for small clusters (e.g. fewer than 10 machines). After merging the current
> implementation into trunk, I'm interested in resuming the YARN-3874 work
> (maybe Varun is interested as well).
>
> Regards,
> - Tsuyoshi
>
> On Tue, Jun 21, 2016 at 5:07 PM, Varun saxena <va...@huawei.com>
> wrote:
> > Thanks Karthik for sharing your views.
> >
> > With regards to merging, it would help to have clear documentation on how
> to setup and use ATS.
> > --> We do have documentation on this. You and others who are interested
> can check out YARN-5174 which is the latest documentation related JIRA for
> ATSv2.
> >
> > Slightly unrelated to the merge, do we plan to support any other simpler
> backend for users to try out, in addition to HBase? LevelDB?
> > --> We do have a File System based implementation but it is strictly for
> test purposes (as we write data into a local file). It also does not support
> all the features of Timeline Service v.2.
> > Regarding LevelDB, Timeline Service v.2 has distributed writers and Level
> DB writes data (log files or SSTable files) to local file system. This
> means there will be no easy way to have a LevelDB based implementation
> because we would not know where to read the data from, especially while
> fetching flow level information.
> > We can however, potentially change the Local File System based
> implementation to a HDFS based implementation and have it as an alternate
> for non-production use, if there is a potential need for it, based on
> community feedback. This however, would have to be further discussed with
> the team.
> >
> > Regards,
> > Varun Saxena.
> >
> > -----Original Message-----
> > From: Karthik Kambatla [mailto:kasha@cloudera.com]
> > Sent: 21 June 2016 10:29
> > To: Sangjin Lee
> > Cc: yarn-dev@hadoop.apache.org
> > Subject: Re: [DISCUSS] merging YARN-2928 (Timeline Service v.2) to trunk
> >
> > Firstly, thanks Sangjin and others for driving this major feature.
> >
> > Merging to trunk and including in 3.0.0-alpha1 seems reasonable, as it
> will give early access to downstream users.
> >
> > With regards to merging, it would help to have clear documentation on how
> to setup and use ATS.
> >
> > Slightly unrelated to the merge, do we plan to support any other simpler
> backend for users to try out, in addition to HBase? LevelDB? I understand
> this wouldn't scale, but would it help with initial adoption and feedback
> from early users?
> >
> >
> >
> >
> >
> > On Mon, Jun 20, 2016 at 10:26 AM, Sangjin Lee <sj...@apache.org> wrote:
> >
> >> Hi all,
> >>
> >> I’d like to open a discussion on merging the Timeline Service v.2
> >> feature to trunk (YARN-2928 and MAPREDUCE-6331) [1][2]. We have been
> >> developing the feature in a feature branch (YARN-2928 [3]) for a
> >> while, and we are reasonably confident that the state of the feature
> >> meets the criteria to be merged onto trunk and we'd love folks to get
> >> their hands on it and provide valuable feedback so that we can make it
> production-ready.
> >>
> >> In a nutshell, Timeline Service v.2 delivers significant scalability
> >> and usability improvements based on a new architecture. You can browse
> >> the requirements/design doc, the storage schema doc, the new
> >> entity/data model, the YARN documentation, and also discussions on
> >> subsequent milestones on
> >> YARN-2928 [1].
> >>
> >> What we would like to merge to trunk is termed "alpha 1" (milestone
> >> 1). The feature has a complete end-to-end read/write flow, and you
> >> should be able to start setting it up and testing it. At a high level,
> >> the following are the key features that have been implemented:
> >>
> >> - distributed writers (collectors) as NM aux services
> >> - HBase storage
> >> - new entity model that includes flows
> >> - setting the flow context via YARN app tags
> >> - real time metrics aggregation to the application level and the flow
> >> level
> >> - rich REST API that supports filters, complex conditionals, limits,
> >> content selection, etc.
> >> - YARN generic events and system metrics
> >> - integration with Distributed Shell and MapReduce
> >>
> >> There are a total of 139 subtasks that were completed as part of this
> >> effort.
> >>
> >> We paid close attention to ensure that Timeline Service v.2 does not
> >> impact existing functionality when disabled (which is the default).
> >>
> >> I'd like to call out a couple of things to discuss in particular.
> >>
> >> *First*, if the merge vote is approved, to which branch should this be
> >> merged and what would be the release version? My preference is that
> >> *it would be merged to branch "trunk" and be part of 3.0.0-alpha1* if
> approved.
> >> Since the 3.0.0-alpha1 is in active progress, I wanted to get your
> >> thoughts on this.
> >>
> >> *Second*, Timeline Service v.2 introduces a dependency on HBase from
> YARN.
> >> It is not a cyclical dependency (as HBase does not really depend on
> YARN).
> >> However, the version of Hadoop that HBase currently supports lags
> >> behind the Hadoop version that Timeline Service is based on, so there
> >> is a potential for subtle dependency conflicts. We made some efforts
> >> to isolate the issue (see [4] and [5]). The HBase folks have also been
> >> responsive in keeping up with the trunk as much as they can.
> >> Nonetheless, this is something to keep in mind.
> >>
> >> I would love to get your thoughts on these and more before we open a
> >> real voting thread. Thanks!
> >>
> >> Regards,
> >> Sangjin
> >>
> >> [1] YARN-2928: https://issues.apache.org/jira/browse/YARN-2928
> >> [2] MAPREDUCE-6331:
> >> https://issues.apache.org/jira/browse/MAPREDUCE-6331
> >> [3] YARN-2928 commits:
> >> https://github.com/apache/hadoop/commits/YARN-2928
> >> [4] YARN-5045: https://issues.apache.org/jira/browse/YARN-5045
> >> [5] YARN-5071: https://issues.apache.org/jira/browse/YARN-5071
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
> > For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
> >
>