Posted to general@incubator.apache.org by Roman Shaposhnik <rv...@apache.org> on 2014/11/01 00:06:18 UTC

[DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Hi!

I would like to propose that HTrace be considered for the
Apache Incubator. The proposal is attached and
is also available on the wiki:
    https://wiki.apache.org/incubator/HTraceProposal

Please let me know what you think, and
don't hesitate to edit the proposal on the wiki
based on the feedback from this thread.

Thanks,
Roman.

== Abstract ==
HTrace is a tracing framework intended for use with distributed
systems written in Java.

== Proposal ==
HTrace is an aid for understanding system behavior and for reasoning about
performance issues in distributed systems. HTrace is primarily a low-overhead
library that a Java distributed system can incorporate to generate
‘breadcrumbs’ or ‘traces’ along the path of execution, even as it crosses
processes and machines. HTrace also includes tools and glue for collecting,
processing, and visualizing captured execution traces, so that after the fact
one can analyze where time was spent and what resources were consumed.
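
To make the ‘breadcrumb’ idea concrete, here is a minimal, language-neutral
sketch written in Python. It is not HTrace's API: the names Span, start_trace,
continue_trace, and the "traceId:spanId" header format are all hypothetical,
chosen only to show how a trace id and a parent span id can travel with a
request from one process to the next.

```python
import itertools
import time
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical id source for the sketch; a real tracer would more likely
# use random 64-bit ids so that processes need not coordinate.
_ids = itertools.count(1)


@dataclass
class Span:
    """One timed unit of work within a trace (illustrative only)."""
    trace_id: int
    span_id: int
    parent_id: Optional[int]
    description: str
    start: float = field(default_factory=time.time)
    stop: Optional[float] = None

    def finish(self) -> "Span":
        # Record when the unit of work ended.
        self.stop = time.time()
        return self

    def to_header(self) -> str:
        # The context that would be attached to an outgoing request.
        return f"{self.trace_id}:{self.span_id}"


def start_trace(description: str) -> Span:
    # A root span opens a new trace and has no parent.
    return Span(next(_ids), next(_ids), None, description)


def continue_trace(header: str, description: str) -> Span:
    # The receiving side keeps the trace id and records the caller as parent.
    trace_id, parent_id = (int(part) for part in header.split(":"))
    return Span(trace_id, next(_ids), parent_id, description)


# "Client" process: open a trace and serialize its context.
root = start_trace("client.get")
header = root.to_header()

# "Server" process: in a real system the header would arrive over the wire.
server_span = continue_trace(header, "server.handleGet").finish()
root.finish()
```

Because both spans share a trace id and the server span names the client span
as its parent, a collector that receives them from different machines can
stitch them back into a single picture of the request.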

== Background ==
Distributed systems are made up of multiple software components running on
multiple computers connected by networks. Debugging or profiling operations
that run over non-trivial distributed systems -- figuring out execution paths
and which services, machines, and libraries participated in the processing of
a request -- can be involved.

== Rationale ==
Rather than have each distributed system build its own custom ‘tracing’
library, ideally all would use a single project that provides the necessary
primitives and saves each project from building its own visualization and
processing tools anew.

Google described “...[a] large-scale distributed systems tracing infrastructure”
in Dapper, a Large-Scale Distributed Systems Tracing Infrastructure. The paper
tells a compelling story of what is possible when disparate systems standardize
on a single tracing library and cooperate, ‘passing the baton’, filling out
trace context as executions cross systems.

HTrace aims to provide a rough equivalent in open source of the described core
Dapper tools and library.  As it is adopted by more projects, there will be a
‘network effect’ as HTrace will provide a more comprehensive view of activity
on the cluster.  For example, as HDFS gets HTrace support, we can connect this
with the HTrace support in HBase to follow HBase requests as they enter HDFS.

Given that the success of HTrace depends on its being integrated by many
projects, HTrace should be perceived as unhampered, free of any commercial,
political, or legal ‘taint’. Being an Apache project would help in this regard.

== Initial Goals ==
HTrace is a small project of narrow scope but with a grand vision:
  * Move the HTrace source and repository to Apache, a vendor-neutral
location. Currently HTrace resides at a Cloudera-hosted repository.
  * Add past contributors as committers and institute Apache governance.
  * Evangelize and encourage HTrace adoption. Initially we will
continue to focus on the Hadoop space, since that is where most of the
initial contributors work and where HTrace was first
deployed.
  * Build out the standalone visualization tool that ships with HTrace.
  * Build more community and add more committers.

== Current Status ==
Currently HTrace has a viable Java trace library that can be integrated into
a system to create ‘traces’.  The work that needs to be done on this library
is mostly bug fixes, ease-of-use improvements, and performance tweaks.  In the
future, we may add libraries for languages other than Java.

HTrace has means of dumping traces to the filesystem, to Twitter’s Zipkin
(a tracing sink and visualization system,
https://github.com/twitter/zipkin), or to Apache HBase.  Executions can be
viewed either in Zipkin or in pygraph
(https://code.google.com/p/python-graph/).

Since the initial sprint in the summer of 2012, which saw HTrace patches
proposed for Apache HDFS and committed to Apache HBase, development has been
sporadic: mostly a developer or two adding a feature or fixing bugs. HTrace is
currently undergoing a new “spurt” of development, with the effort to get
HTrace added to Apache HDFS revived and a new standalone viewing facility
being added to HTrace itself.

HTrace has been integrated by Apache Phoenix.


=== Meritocracy ===
HTrace, up to now, has been run by Apache committers and PMC members. We want
to build out a diverse developer and user community and run the HTrace project
in the Apache way.  Users and new contributors will be treated with respect and
welcomed; they will earn merit in the project by tendering quality patches
and support that move the project forward.  Those with a proven track record
of support and quality patches will be encouraged to become committers.

=== Community ===
There are just a few developers involved at the moment. If our project is
accepted by the Incubator, building community will be a primary initial goal.

=== Core Developers ===

Core developers include Apache members and members of the Hadoop and HBase
PMCs. Of those listed, all have contributed to HTrace. Three of the seven are
from Cloudera. The remainder are Hortonworks, NTTData, Google, and Facebook
employees.

=== Alignment ===
HTrace has been integrated into Apache HBase and Apache Phoenix.  Integration
into Apache HDFS is currently being worked on. Approaching the Apache YARN
project would be a likely next integration.


== Known Risks ==
As noted above, development has been sporadic up to now.  It may remain so.

HTrace is not the primary focus of any of the current list of contributors.
For all of us it is a side effort, so HTrace may lack sufficient impetus.

For HTrace to tell a compelling story, it needs to be taken up by the
significant projects that make up a traced distributed system.  For example,
if YARN and HBase take on HTrace but HDFS does not, then the HDFS portions of
an end-to-end operation will be opaque, compromising our ability to tell a
good story around an execution. Because the picture painted has gaps, HTrace
may be set aside as ineffective.

=== Orphaned products ===
The proposers have a vested interest in making HTrace succeed, driving its
development and its adoption by the projects we all work on. Its spread
will shed light on difficult-to-understand interactions among the various
systems we work on. A working, integrated HTrace will add a useful
debugging mechanism to those Apache projects.


=== Inexperience with Open Source ===
The majority of the proposers have day jobs that have them working near
full-time on (Apache) open source projects. A few of us have helped carry
other projects through the Incubator.  HTrace to date has been developed as
an open source project.

=== Homogenous Developers ===
The initial group of committers is small, but we already have a healthy
diversity of participating companies.  We are concentrated in the Bay Area, but
a Japanese contributor makes for a good counterbalance.

=== Reliance on Salaried Developers ===
Most of the contributors are paid to work in the Hadoop ecosystem.
While we might wander from our current employers, we probably won’t
go far from the Hadoop tree.  Whoever the Hadoop employer, a
successful HTrace project is plainly in everyone’s interest.
At least one of the developers has already changed employers, but
his interest in seeing HTrace succeed persists.

=== Relationships with Other Apache Products ===
For HTrace to succeed, it is critical we build good relations with
other distributed systems projects.  We intend to initially build
on relations we already have in place, mostly in the Hadoop space.

The HTrace project has been incorporated by Apache HBase and
Apache Phoenix. It is currently being actively integrated into
Apache HDFS.

We do not know of any equivalent or near-equivalent project
in the Apache space.

The Dapper paper notes prior work, in particular the X-Trace project
from the Berkeley RAD Lab.

==== How HTrace relates to Zipkin ====
Zipkin is an Apache-licensed project from Twitter. It is a complete
tracing tool with trace collectors, trace viewers, and tools to help
you generate traces. It is written in Scala.  If your project is
not written in Scala, or if it is written in Java and you cannot afford
a Scala dependency, you need, at a minimum, an alternate means of
generating traces. HTrace provides this facility for Java, as well as
bridging tools to feed traces to Zipkin for query and display.

The projects complement each other.

=== An Excessive Fascination with the Apache Brand ===
While we intend to leverage the Apache ‘branding’ when talking to other
projects as a testament to our project’s ‘neutrality’, we have no plans
to use the Apache brand in press releases or to post billboards
advertising acceptance of HTrace into the Apache Incubator.


== Documentation ==
See [[http://htrace.org|htrace.org]] for the current state of the HTrace
project and documentation.

  * How to enable tracing in
[[http://hbase.apache.org/book/tracing.html|HBase using HTrace]]
  * Elliott Clark on
[[http://files.meetup.com/1350427/HBase%20Meetup%20-%20Zipkin.pptx|tracing in HBase]]

== Initial Source ==
Jonathan Leavitt and Todd Lipcon built the first versions of HTrace in the
summer of 2012.  Jonathan was Todd’s summer intern at Cloudera.


== Source and Intellectual Property Submission Plan ==
We know of no legal encumbrances in the way of transferring the source to Apache.

== External Dependencies ==
HTrace includes third-party libraries. These include guava, jetty, junit,
protobuf, hbase, and thrift.  All dependencies are Apache licensed or carry
licenses that are acceptable: e.g. junit is EPL (Eclipse Public License v1.0)
and protobuf is BSD licensed.

== Cryptography ==
N/A

== Required Resources ==

=== Mailing lists ===
  * private@htrace.incubator.apache.org (moderated subscriptions)
  * commits@htrace.incubator.apache.org
  * dev@htrace.incubator.apache.org
  * issues@htrace.incubator.apache.org
  * user@htrace.incubator.apache.org

=== Git Repository ===
https://git-wip-us.apache.org/repos/asf/incubator-htrace.git

=== Issue Tracking ===
JIRA HTrace (HTRACE)

=== Other Resources ===
A means of setting up regular builds for HTrace on builds.apache.org

== Initial Committers ==
  * Colin McCabe (cmccabe@apache.org)
  * Elliott Clark (eclark@apache.org)
  * Jonathan Leavitt (jon.s.leavitt@gmail.com) -- CLA being submitted
  * Masatake Iwasaki (iwasakims@gmail.com) -- CLA being submitted
  * Michael Stack (stack@apache.org)
  * Nick Dimiduk (ndimiduk@apache.org)
  * Todd Lipcon (todd@apache.org)


== Affiliations ==
  * Colin McCabe - Cloudera
  * Elliott Clark - Facebook
  * Jonathan Leavitt - Google
  * Masatake Iwasaki - NTTData
  * Michael Stack - Cloudera
  * Nick Dimiduk - Hortonworks
  * Todd Lipcon - Cloudera

== Sponsors ==

=== Champion ===
Roman Shaposhnik

=== Nominated Mentors ===
  * Michael Stack - Apache Member
  * Todd Lipcon - Apache Member

We will be soliciting more mentors as part of the proposal process.

=== Sponsoring Entity ===
We propose that the Apache Incubator sponsor this project.

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Branko Čibej <br...@apache.org>.
On 03.11.2014 19:12, Stack wrote:
> On Mon, Nov 3, 2014 at 8:01 AM, Branko Čibej <br...@apache.org> wrote:
>
>> On 03.11.2014 16:49, Stack wrote:
>>> On Sun, Nov 2, 2014 at 6:19 PM, Naresh Agarwal <na...@inmobi.com>
>>> wrote:
>>>
>>>> Just curious if HTrace is aimed only for Hadoop infrastructure/Hadoop based
>>>> applications or it can be used in any Java based systems?
>>>>
>>>>
>>> HTrace's provenance is Hadoop but the only hadoop 'taint' in HTrace is the
>>> leading 'H' in its name; it should be fit for any java distributed systems.
>>>
>>> Lets make this more plain in the proposal.
>> Would it hurt to remove the H from the project name, then?
>>
>> (I won't propose replacing it with "D", that would be really confusing.)
>>
>> -- Brane
>>
>> P.S.: Ooh ... "Distributed Tracing" -> Distress, which happens to be
>> what an admin feels when her distributed app goes wonkers ...
>>
> HTrace is 'fairly' generic and the github project is what comes up when you
> search. That said, I think your suggestion pretty great, funny. Its
> probably a bad name for a software project though? (Distrace?)

If Subversion and Git are OK, then Distress fits right in, wouldn't you
say? :)

-- Brane




Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Stack <st...@duboce.net>.
On Mon, Nov 3, 2014 at 8:01 AM, Branko Čibej <br...@apache.org> wrote:

> On 03.11.2014 16:49, Stack wrote:
> > On Sun, Nov 2, 2014 at 6:19 PM, Naresh Agarwal <na...@inmobi.com>
> > wrote:
> >
> >> Just curious if HTrace is aimed only for Hadoop infrastructure/Hadoop based
> >> applications or it can be used in any Java based systems?
> >>
> >>
> > HTrace's provenance is Hadoop but the only hadoop 'taint' in HTrace is the
> > leading 'H' in its name; it should be fit for any java distributed systems.
> >
> > Lets make this more plain in the proposal.
>
> Would it hurt to remove the H from the project name, then?
>
> (I won't propose replacing it with "D", that would be really confusing.)
>
> -- Brane
>
> P.S.: Ooh ... "Distributed Tracing" -> Distress, which happens to be
> what an admin feels when her distributed app goes wonkers ...
>

HTrace is 'fairly' generic and the github project is what comes up when you
search. That said, I think your suggestion is pretty great, and funny. It's
probably a bad name for a software project though? (Distrace?)

St.Ack

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Colin McCabe <cm...@alumni.cmu.edu>.
Really awesome to see this getting off the ground!  Thanks, Roman!

Re: the name.  I don't feel super-strongly about this, but I prefer
the name "HTrace."  It's short and sweet, and very easy to Google for.
"Distress" is not going to be easy to Google for, and might tend to
turn up the wrong kinds of results.  I think this is pretty important
for a new project trying to get mindshare.

I think if anything, having Hadoop associated with the project should
help it to grow, since people know that it's being used by a real (and
really big) distributed systems project rather than just being an
experiment.  Of course, we should make it absolutely clear in the
documentation, project site, mailing list, etc. that there are no
dependencies on Hadoop jars and that any Java project (and maybe
later, non-Java projects) can use this.

best,
Colin

On Mon, Nov 3, 2014 at 8:01 AM, Branko Čibej <br...@apache.org> wrote:
> On 03.11.2014 16:49, Stack wrote:
>> On Sun, Nov 2, 2014 at 6:19 PM, Naresh Agarwal <na...@inmobi.com>
>> wrote:
>>
>>> Just curious if HTrace is aimed only for Hadoop infrastructure/Hadoop based
>>> applications or it can be used in any Java based systems?
>>>
>>>
>> HTrace's provenance is Hadoop but the only hadoop 'taint' in HTrace is the
>> leading 'H' in its name; it should be fit for any java distributed systems.
>>
>> Lets make this more plain in the proposal.
>
> Would it hurt to remove the H from the project name, then?
>
> (I won't propose replacing it with "D", that would be really confusing.)
>
> -- Brane
>
> P.S.: Ooh ... "Distributed Tracing" -> Distress, which happens to be
> what an admin feels when her distributed app goes wonkers ...
>
> -- Brane
>



Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Branko Čibej <br...@apache.org>.
On 03.11.2014 16:49, Stack wrote:
> On Sun, Nov 2, 2014 at 6:19 PM, Naresh Agarwal <na...@inmobi.com>
> wrote:
>
>> Just curious if HTrace is aimed only for Hadoop infrastructure/Hadoop based
>> applications or it can be used in any Java based systems?
>>
>>
> HTrace's provenance is Hadoop but the only hadoop 'taint' in HTrace is the
> leading 'H' in its name; it should be fit for any java distributed systems.
>
> Lets make this more plain in the proposal.

Would it hurt to remove the H from the project name, then?

(I won't propose replacing it with "D", that would be really confusing.)

-- Brane

P.S.: Ooh ... "Distributed Tracing" -> Distress, which happens to be
what an admin feels when her distributed app goes wonkers ...

-- Brane



Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Stack <st...@duboce.net>.
On Sun, Nov 2, 2014 at 6:19 PM, Naresh Agarwal <na...@inmobi.com>
wrote:

> Just curious if HTrace is aimed only for Hadoop infrastructure/Hadoop based
> applications or it can be used in any Java based systems?
>
>
HTrace's provenance is Hadoop, but the only Hadoop 'taint' in HTrace is the
leading 'H' in its name; it should be fit for any Java distributed system.

Let's make this more plain in the proposal.

Thanks Naresh,
St.Ack



> Thanks
> Naresh
>
> On Mon, Nov 3, 2014 at 1:34 AM, Andrew Purtell <ap...@apache.org>
> wrote:
>
> > Really great to see an incubation proposal for HTrace. If you need another
> > mentor, please consider me.
> >
> > I don't think you need to list "HTrace is not the primary focus of any of
> > the current list of contributors" as a risk. One can say that about many
> > (perhaps the majority) of contributors to Apache projects. We would hope
> > the incubation process develops a healthy community that sustains a level
> > of contribution that keeps the project moving forward, as we would hope for
> > all incubation candidates.
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
> --
> _____________________________________________________________
> The information contained in this communication is intended solely for the
> use of the individual or entity to whom it is addressed and others
> authorized to receive it. It may contain confidential or legally privileged
> information. If you are not the intended recipient you are hereby notified
> that any disclosure, copying, distribution or taking any action in reliance
> on the contents of this information is strictly prohibited and may be
> unlawful. If you have received this communication in error, please notify
> us immediately by responding to this email and then delete it from your
> system. The firm is neither liable for the proper and complete transmission
> of the information contained in this communication nor for any delay in its
> receipt.
>

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Naresh Agarwal <na...@inmobi.com>.
Just curious: is HTrace aimed only at Hadoop infrastructure and Hadoop-based
applications, or can it be used in any Java-based system?

Thanks
Naresh

On Mon, Nov 3, 2014 at 1:34 AM, Andrew Purtell <ap...@apache.org> wrote:

> Really great to see an incubation proposal for HTrace. If you need another
> mentor, please consider me.
>
> I don't think you need to list "HTrace is not the primary focus of any of
> the current list of contributors" as a risk. One can say that about many
> (perhaps the majority) of contributors to Apache projects. We would hope
> the incubation process develops a healthy community that sustains a level
> of contribution that keeps the project moving forward, as we would hope for
> all incubation candidates.

-- 
_____________________________________________________________
The information contained in this communication is intended solely for the 
use of the individual or entity to whom it is addressed and others 
authorized to receive it. It may contain confidential or legally privileged 
information. If you are not the intended recipient you are hereby notified 
that any disclosure, copying, distribution or taking any action in reliance 
on the contents of this information is strictly prohibited and may be 
unlawful. If you have received this communication in error, please notify 
us immediately by responding to this email and then delete it from your 
system. The firm is neither liable for the proper and complete transmission 
of the information contained in this communication nor for any delay in its 
receipt.

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Andrew Purtell <ap...@apache.org>.
Really great to see an incubation proposal for HTrace. If you need another
mentor, please consider me.

I don't think you need to list "HTrace is not the primary focus of any of
the current list of contributors" as a risk. One can say that about many
(perhaps the majority) of contributors to Apache projects. We would hope
the incubation process develops a healthy community that sustains a level
of contribution that keeps the project moving forward, as we would hope for
all incubation candidates.





-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Tue, Nov 4, 2014 at 1:50 AM, Steve Loughran <st...@hortonworks.com> wrote:
> the code inside is all org.htrace; changing that would be painful for both
> the developers and the current users

Sure. But we'd have to migrate to org.apache anyway -- I don't think
it matters then.

> who owns htrace.org?

I believe it belongs to one of the devs. That said, why is it even relevant?

Thanks,
Roman.

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Steve Loughran <st...@hortonworks.com>.
the code inside is all org.htrace; changing that would be painful for both
the developers and the current users

who owns htrace.org?

On 3 November 2014 19:27, Roman Shaposhnik <rv...@apache.org> wrote:

> Hi!
>
> Thanks for the positive feedback and volunteering. I think the
> more mentors the merrier -- all the folks who volunteered
> please add your names to the wiki.
>
> As for the name, personally, I really like Distrace. That said,
> I'd leave this bikeshed to be painted for later ;-)
>
> Andrew, great point on the wording: I'll update the proposal.
>
> Finally, since I'm currently on vacation, I'll let this thread
> go for a little longer and will start the official VOTE in a few
> days.
>
> Thanks,
> Roman.
>
> On Fri, Oct 31, 2014 at 4:06 PM, Roman Shaposhnik <rv...@apache.org> wrote:
> > Hi!
> >
> > I would like to propose HTrace to be consider for
> > Apache Incubator. The proposal is attached and
> > is also available on the wiki:
> >     https://wiki.apache.org/incubator/HTraceProposal
> >
> > Please let me know what do you guys think and also
> > don't hesitate to massage the proposal on the wiki
> > based on the feedback from this thread.
> >
> > Thanks,
> > Roman.
> >
> > == Abstract ==
> > HTrace is a tracing framework intended for use with distributed
> > systems written in java.
> >
> > == Proposal ==
> > HTrace is an aid for understanding system behavior and for reasoning
> > about performance
> > issues in distributed systems. HTrace is primarily a low impedance
> > library that a java
> > distributed system can incorporate to generate ‘breadcrumbs’ or
> > ‘traces’ along the path
> > of execution, even as it crosses processes and machines. HTrace also
> > includes various
> > tools and glue for collecting, processing and ‘visualizing’ captured
> > execution traces
> > for analysis ex post facto of where time was spent and what resources
> > were consumed.
> >
> > == Background ==
> > Distributed systems are made up of multiple software components
> > running on multiple
> > computers connected by networks. Debugging or profiling operations run
> > over non-trivial
> > distributed systems -- figuring execution paths and what services,
> machines, and
> > libraries participated in the processing of a request -- can be involved.
> >
> > == Rationale ==
> > Rather than have each distributed system build its own custom
> > ‘tracing’ libraries,
> > ideally all would use a single project that provides necessary
> > primitives and saves
> > each project building its own visualizations and processing tools anew.
> >
> > Google described “...[a] large-scale distributed systems tracing
> infrastructure”
> > in Dapper, a Large-Scale Distributed Systems Tracing Infrastructure. The
> paper
> > tells a compelling story of what is possible when disparate systems
> standardize
> > on a single tracing library and cooperate, ‘passing the baton’, filling
> out
> > trace context as executions cross systems.
> >
> > HTrace aims to provide a rough equivalent in open source of the
> > described core Dapper tools and library.  As it is adopted by more
> > projects, there will be a ‘network effect’ as HTrace will provide a more
> > comprehensive view of activity on the cluster.  For example, as HDFS
> > gets HTrace support, we can connect this with the HTrace support in
> > HBase to follow HBase requests as they enter HDFS.
> >
> > Given the success of HTrace depends on its being integrated by many
> > projects, HTrace should be perceived as unhampered, free of any
> > commercial, political, or legal ‘taint’. Being an Apache project would
> > help in this regard.
> >
> > == Initial Goals ==
> > HTrace is a small project of narrow scope but with a grand vision:
> >   * Move the HTrace source and repository to Apache, a vendor-neutral
> > location. Currently HTrace resides at a Cloudera-hosted repository.
> >   * Add past contributors as committers and institute Apache governance.
> >   * Evangelize and encourage HTrace diffusion. Initially we will
> > continue a focus on the Hadoop space since that is where most of the
> > initial contributors work and it is where HTrace has been initially
> > deployed.
> >   * Building out the standalone visualization tool that ships with
> > HTrace.
> >   * Build more community and add more committers
> >
> > == Current Status ==
> > Currently HTrace has a viable Java trace library that can be interpolated
> > to create ‘traces’.  The work that needs to be done on this library is
> > mostly bug fixes, ease-of-use improvements, and performance tweaks.  In
> > the future, we may add libraries for other languages besides Java.
> >
> > HTrace has means of dumping traces to the filesystem, Twitters’ Zipkin
> > (a tracing sink and visualization system developed by Twitter
> > https://github.com/twitter/zipkin), or Apache HBase.  Executions can be
> > viewed either in Zipkin or in pygraph
> > (https://code.google.com/p/python-graph/).
> >
> > Since the initial sprint in the summer of 2012 which saw HTrace patches
> > proposed for Apache HDFS and committed to Apache HBase, development has
> > been sporadic; mostly a single developer or two adding a feature or bug
> > fixing. HTrace is currently undergoing a new “spurt” of development with
> > the effort to get HTrace added to Apache HDFS revived and a new
> > standalone viewing facility being added in to HTrace itself.
> >
> > HTrace has been integrated by Apache Phoenix.
> >
> >
> > === Meritocracy ===
> > HTrace, up to this, has been run by Apache committers and PMC members.
> > We want to build out a diverse developer and user community and run the
> > HTrace project in the Apache way.  Users and new contributors will be
> > treated with respect and welcomed; they will earn merit in the project
> > by tendering quality patches and support that move the project forward.
> > Those with a proven support and quality patch track record will be
> > encouraged to become committers.
> >
> > === Community ===
> > There are just a few developers involved at the moment. If our project
> > is accepted
> > by incubator, building community would be a primary initial goal.
> >
> > === Core Developers ===
> >
> > Core developers include Apache members and members of the Hadoop and
> > HBase PMCs.
> > Of those listed, all have contributed to HTrace. Half are from Cloudera.
> > The remainder are Hortonworks, NTTData, Google, and Facebook employees.
> >
> > === Alignment ===
> > HTrace has been integrated into Apache HBase and Apache Phoenix.
> > Integration into Apache HDFS is currently being worked on. Approaching
> > the Apache YARN project would be a likely next integration.
> >
> >
> > == Known Risks ==
> > As noted above, development has been sporadic up to this.  It may
> > continue so.
> >
> > HTrace is not the primary focus of any of the current list of
> > contributors. It is for all a side effort.  HTrace may lack sufficient
> > impetus with such a state of affairs.
> >
> > For HTrace to tell a compelling story, it needs to be taken up by
> > significant projects that make up a traced distributed system.  For
> > example, say YARN and HBase take on HTrace but HDFS does not, then the
> > HDFS portions of an end-to-end operation will render opaque compromising
> > our being able to tell a good story around an execution. Because the
> > picture painted has gaps, HTrace may be left aside as ineffective.
> >
> > === Orphaned products ===
> > The proposers have a vested interest in making HTrace succeed, driving
> > its development and its insertion into projects we all work on. Its
> > dispersion will shine light on difficult to understand interactions
> > amongst the various systems we all work on. A working, integrated HTrace
> > will add a useful debugging mechanism to the Apache projects we all work
> > on.
> >
> >
> > === Inexperience with Open Source ===
> > The majority of the proposers here have day jobs that has them working
> > near full-time on (Apache) open source projects. A few of us have helped
> > carry other projects through incubator.  HTrace to date has been
> > developed as an open source project.
> >
> > === Homogenous Developers ===
> > The initial group of committers is small but already we have a healthy
> > diversity of participating companies.  We are bay-area challenged but
> > a Japanese contributor makes for a good counter balance.
> >
> > === Reliance on Salaried Developers ===
> > Most of the contributors are paid to work in the Hadoop ecosystem.
> > While we might wander from our current employers, we probably won’t
> > go far from the Hadoop tree.  Whoever the Hadoop employer, it is
> > plain a successful HTrace project is in everyone’s interest.
> > At least one of the developers has already changed employers but
> > his interest in seeing HTrace succeed prevails.
> >
> > === Relationships with Other Apache Products ===
> > For HTrace to succeed, it is critical we build good relations with
> > other distributed systems projects.  We intend to initially build
> > on relations we already have in place, mostly in the Hadoop space.
> >
> > The HTrace project has been incorporated by Apache HBase and
> > Apache Phoenix. It is currently being actively integrated into
> > Apache HDFS.
> >
> > We do not know of any equivalent or near-equivalent project
> > in the Apache space.
> >
> > The Dapper paper notes precedent, in particular, the Berkeley
> > Rad Lab X-Trace project.
> >
> > ==== How HTrace relates to Zipkin ====
> > Zipkin is an Apache Licensed project from Twitter. It is a complete
> > tracing tool with trace collectors, trace viewers and tools to help
> > you generate traces. It is written in Scala.  If your project is
> > not Scala or if it is Java and you cannot afford a Scala dependency,
> > at a minimum, you need an alternate means of generating traces.
> > HTrace provides this facility for Java as well as bridging tools
> > to feed traces to Zipkin for query and display.
> >
> > The projects complement each other.
> >
> > === A Excessive Fascination with the Apache Brand ===
> > While we intend to leverage the Apache ‘branding’ when talking to other
> > projects as testament of our project’s ‘neutrality’, we have no plans
> > for making use of Apache brand in press releases nor posting billboards
> > advertising acceptance of HTrace into Apache Incubator.
> >
> >
> > == Documentation ==
> > See [[http://htrace.org|htrace.org]] for the current state of the HTrace
> > project and documentation.
> >
> > How to enable tracing in
> > [[http://hbase.apache.org/book/tracing.html|HBase using HTrace]]
> > Elliott Clark on
> > [[http://files.meetup.com/1350427/HBase%20Meetup%20-%20Zipkin.pptx|tracing in HBase]]
> >
> > == Initial Source ==
> > Jonathan Leavitt and Todd Lipcon built the first versions of HTrace in
> > the summer of 2012.  Jonathan was Todd’s summer intern at Cloudera.
> >
> >
> > == Source and Intellectual Property Submission Plan ==
> > We know of no legal encumberments in the way of transfer of source to
> > Apache.
> >
> > == External Dependencies ==
> > HTrace includes third party libs. These include guava, jetty, junit,
> > protobuf, hbase, and thrift.  All dependencies are Apache licensed or
> > licenses that are palatable: e.g. junit is EPL (Eclipse Public License
> > v1.0) and ProtoBufs are BSD licensed.
> >
> > Cryptography
> > N/A
> >
> > == Required Resources ==
> >
> > === Mailing lists ===
> >   * private@htrace.incubator.apache.org (moderated subscriptions)
> >   * commits@htrace.incubator.apache.org
> >   * dev@htrace.incubator.apache.org
> >   * issues@htrace.incubator.apache.org
> >   * user@htrace.incubator.apache.org
> >
> > === Git Repository ===
> > https://git-wip-us.apache.org/repos/asf/incubator-htrace.git
> >
> > === Issue Tracking ===
> > JIRA HTrace (HTRACE)
> >
> > === Other Resources ===
> > Means of setting up regular builds for htrace on builds.apache.org
> >
> > == Initial Committers ==
> >   * Colin McCabe (cmccabe@apache.org)
> >   * Elliott Clark (eclark@apache.org)
> >   * Jonathan Leavitt (jon.s.leavitt@gmail.com) -- CLA being submitted
> >   * Masatake Iwasaki (iwasakims@gmail.com) -- CLA being submitted
> >   * Michael Stack (stack@apache.org)
> >   * Nick Dimiduk (ndimiduk@apache.org)
> >   * Todd Lipcon (todd@apache.org)
> >
> >
> > == Affiliations ==
> >   * Colin McCabe - Cloudera
> >   * Elliott Clark - Facebook
> >   * Jonathan Leavitt - Google
> >   * Masatake Iwasaki - NTTData
> >   * Michael Stack - Cloudera
> >   * Nick Dimiduk - Hortonworks
> >   * Todd Lipcon - Cloudera
> >
> > == Sponsors ==
> >
> > === Champion ===
> > Roman Shaposhnik
> >
> > === Nominated Mentors ===
> >   * Michael Stack - Apache Member
> >   * Todd Lipcon - Apache Member
> >
> > We will be soliciting more mentors as part of the proposal process.
> >
> > === Sponsoring Entity ===
> > We would like to propose Apache incubator to sponsor this project.
>

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Stack <st...@duboce.net>.
On Mon, Nov 3, 2014 at 2:15 PM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

> On Nov 3, 2014 at 22:06, "Stack" <st...@duboce.net> wrote:
> >
> > On Mon, Nov 3, 2014 at 12:04 PM, Jean-Louis MONTEIRO
> > <jeanouii@gmail.com> wrote:
> >
> > > BTW, wondering how to get the Apache Sirona community involved or if
> > > there is a possible common road where the 2 projects could join.
> > >
> > >
> > How do you see the projects relating?  How would the recording and
> > viewing of traces in Sirona look?  Do you foresee a zipkin like viewer
> > in Sirona?
>
> We just started but we have agents to get data on nodes then we push in a
> store (cassandra mainly today). Gui is not yet complete for tracing part.
>

OK. When the time comes, it would be great to hook up Apache Sirona as a
Trace Sink (HTrace already has a few Sink implementations: zipkin, files,
hbase).
Thanks,
St.Ack

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Romain Manni-Bucau <rm...@gmail.com>.
On Nov 3, 2014 at 22:06, "Stack" <st...@duboce.net> wrote:
>
> On Mon, Nov 3, 2014 at 12:04 PM, Jean-Louis MONTEIRO <je...@gmail.com>
> wrote:
>
> > BTW, wondering how to get the Apache Sirona community involved or if
> > there is a possible common road where the 2 projects could join.
> >
> >
> How do you see the projects relating?  How would the recording and viewing
> of traces in Sirona look?  Do you foresee a zipkin like viewer in Sirona?

We have only just started, but we have agents that collect data on the nodes
and push it to a store (mainly Cassandra today). The GUI is not yet complete
for the tracing part.

> Thanks,
> St.Ack

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Stack <st...@duboce.net>.
On Mon, Nov 3, 2014 at 12:04 PM, Jean-Louis MONTEIRO <je...@gmail.com>
wrote:

> BTW, wondering how to get the Apache Sirona community involved or if there
> is a possible common road where the 2 projects could join.
>
>
How do you see the projects relating?  How would the recording and viewing
of traces in Sirona look?  Do you foresee a zipkin like viewer in Sirona?
Thanks,
St.Ack

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Romain Manni-Bucau <rm...@gmail.com>.
2014-11-03 20:04 GMT+00:00 Jean-Louis MONTEIRO <je...@gmail.com>:
> BTW, wondering how to get the Apache Sirona community involved or if there
> is a possible common road where the 2 projects could join.
>

+1. Having a single solution would be awesome, and Sirona is just
starting to get there, so the community will be happy to get help on
this part.

> 2014-11-03 20:27 GMT+01:00 Roman Shaposhnik <rv...@apache.org>:
>
>> Hi!
>>
>> Thanks for the positive feedback and volunteering. I think the
>> more mentors the merrier -- all the folks who volunteered
>> please add your names to the wiki.
>>
>> As for the name, personally, I really like Distrace. That said,
>> I'd leave this bikeshed to be painted for later ;-)
>>
>> Andrew, great point on the wording: I'll update the proposal.
>>
>> Finally, since I'm currently on vacation, I'll let this thread
>> go for a little longer and will start the official VOTE in a few
>> days.
>>
>> Thanks,
>> Roman.
>>
>
>
> --
> Jean-Louis


Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Jean-Louis MONTEIRO <je...@gmail.com>.
BTW, wondering how to get the Apache Sirona community involved or if there
is a possible common road where the 2 projects could join.

2014-11-03 20:27 GMT+01:00 Roman Shaposhnik <rv...@apache.org>:

> Hi!
>
> Thanks for the positive feedback and volunteering. I think the
> more mentors the merrier -- all the folks who volunteered
> please add your names to the wiki.
>
> As for the name, personally, I really like Distrace. That said,
> I'd leave this bikeshed to be painted for later ;-)
>
> Andrew, great point on the wording: I'll update the proposal.
>
> Finally, since I'm currently on vacation, I'll let this thread
> go for a little longer and will start the official VOTE in a few
> days.
>
> Thanks,
> Roman.
>
> On Fri, Oct 31, 2014 at 4:06 PM, Roman Shaposhnik <rv...@apache.org> wrote:
> > Hi!
> >
> > I would like to propose HTrace to be consider for
> > Apache Incubator. The proposal is attached and
> > is also available on the wiki:
> >     https://wiki.apache.org/incubator/HTraceProposal
> >
> > Please let me know what do you guys think and also
> > don't hesitate to massage the proposal on the wiki
> > based on the feedback from this thread.
> >
> > Thanks,
> > Roman.
> >
> > == Abstract ==
> > HTrace is a tracing framework intended for use with distributed
> > systems written in java.
> >
> > == Proposal ==
> > HTrace is an aid for understanding system behavior and for reasoning
> > about performance issues in distributed systems. HTrace is primarily a
> > low impedance library that a java distributed system can incorporate to
> > generate ‘breadcrumbs’ or ‘traces’ along the path of execution, even as
> > it crosses processes and machines. HTrace also includes various tools
> > and glue for collecting, processing and ‘visualizing’ captured execution
> > traces for analysis ex post facto of where time was spent and what
> > resources were consumed.
> >
> > == Background ==
> > Distributed systems are made up of multiple software components running
> > on multiple computers connected by networks. Debugging or profiling
> > operations run over non-trivial distributed systems -- figuring
> > execution paths and what services, machines, and libraries participated
> > in the processing of a request -- can be involved.
> >
> > == Rationale ==
> > Rather than have each distributed system build its own custom ‘tracing’
> > libraries, ideally all would use a single project that provides
> > necessary primitives and saves each project building its own
> > visualizations and processing tools anew.
> >
> > Google described “...[a] large-scale distributed systems tracing
> > infrastructure” in Dapper, a Large-Scale Distributed Systems Tracing
> > Infrastructure. The paper tells a compelling story of what is possible
> > when disparate systems standardize on a single tracing library and
> > cooperate, ‘passing the baton’, filling out trace context as executions
> > cross systems.
> >
> > HTrace aims to provide a rough equivalent in open source of the
> > described core Dapper tools and library.  As it is adopted by more
> > projects, there will be a ‘network effect’ as HTrace will provide a more
> > comprehensive view of activity on the cluster.  For example, as HDFS
> > gets HTrace support, we can connect this with the HTrace support in
> > HBase to follow HBase requests as they enter HDFS.
> >
> > Given the success of HTrace depends on its being integrated by many
> > projects, HTrace should be perceived as unhampered, free of any
> > commercial, political, or legal ‘taint’. Being an Apache project would
> > help in this regard.
> >
> > == Initial Goals ==
> > HTrace is a small project of narrow scope but with a grand vision:
> >   * Move the HTrace source and repository to Apache, a vendor-neutral
> > location. Currently HTrace resides at a Cloudera-hosted repository.
> >   * Add past contributors as committers and institute Apache governance.
> >   * Evangelize and encourage HTrace diffusion. Initially we will
> > continue a focus on the Hadoop space since that is where most of the
> > initial contributors work and it is where HTrace has been initially
> > deployed.
> >   * Building out the standalone visualization tool that ships with
> > HTrace.
> >   * Build more community and add more committers
> >
> > == Current Status ==
> > Currently HTrace has a viable Java trace library that can be interpolated
> > to create ‘traces’.  The work that needs to be done on this library is
> > mostly bug fixes, ease-of-use improvements, and performance tweaks.  In
> > the future, we may add libraries for other languages besides Java.
> >
> > HTrace has means of dumping traces to the filesystem, Twitters’ Zipkin
> > (a tracing sink and visualization system developed by Twitter
> > https://github.com/twitter/zipkin), or Apache HBase.  Executions can be
> > viewed either in Zipkin or in pygraph
> > (https://code.google.com/p/python-graph/).
> >
> > Since the initial sprint in the summer of 2012 which saw HTrace patches
> > proposed for Apache HDFS and committed to Apache HBase, development has
> > been sporadic; mostly a single developer or two adding a feature or bug
> > fixing. HTrace is currently undergoing a new “spurt” of development with
> > the effort to get HTrace added to Apache HDFS revived and a new
> > standalone viewing facility being added in to HTrace itself.
> >
> > HTrace has been integrated by Apache Phoenix.
> >
> >
> > === Meritocracy ===
> > HTrace, up to this, has been run by Apache committers and PMC members.
> > We want to build out a diverse developer and user community and run the
> > HTrace project in the Apache way.  Users and new contributors will be
> > treated with respect and welcomed; they will earn merit in the project
> > by tendering quality patches and support that move the project forward.
> > Those with a proven support and quality patch track record will be
> > encouraged to become committers.
> >
> > === Community ===
> > There are just a few developers involved at the moment. If our project
> > is accepted by incubator, building community would be a primary initial
> > goal.
> >
> > === Core Developers ===
> >
> > Core developers include Apache members and members of the Hadoop and
> > HBase PMCs. Of those listed, all have contributed to HTrace. Half are
> > from Cloudera. The remainder are Hortonworks, NTTData, Google, and
> > Facebook employees.
> >
> > === Alignment ===
> > HTrace has been integrated into Apache HBase and Apache Phoenix.
> > Integration into Apache HDFS is currently being worked on. Approaching
> > the Apache YARN project would be a likely next integration.
> >
> >
> > == Known Risks ==
> > As noted above, development has been sporadic up to this.  It may
> > continue so.
> >
> > HTrace is not the primary focus of any of the current list of
> > contributors. It is for all a side effort.  HTrace may lack sufficient
> > impetus with such a state of affairs.
> >
> > For HTrace to tell a compelling story, it needs to be taken up by
> > significant projects that make up a traced distributed system.  For
> > example, say YARN and HBase take on HTrace but HDFS does not, then the
> > HDFS portions of an end-to-end operation will render opaque, compromising
> > our being able to tell a good story around an execution. Because the
> > picture painted has gaps, HTrace may be left aside as ineffective.
> >
> > === Orphaned products ===
> > The proposers have a vested interest in making HTrace succeed, driving
> > its development and its insertion into projects we all work on. Its
> > dispersion will shine light on difficult to understand interactions
> > amongst the various systems we all work on. A working, integrated HTrace
> > will add a useful debugging mechanism to the Apache projects we all work
> > on.
> >
> >
> > === Inexperience with Open Source ===
> > The majority of the proposers here have day jobs that have them working
> > near full-time on (Apache) open source projects. A few of us have helped
> > carry other projects through incubator.  HTrace to date has been
> > developed as an open source project.
> >
> > === Homogenous Developers ===
> > The initial group of committers is small but already we have a healthy
> > diversity of participating companies.  We are bay-area challenged but
> > a Japanese contributor makes for a good counter balance.
> >
> > === Reliance on Salaried Developers ===
> > Most of the contributors are paid to work in the Hadoop ecosystem.
> > While we might wander from our current employers, we probably won’t
> > go far from the Hadoop tree.  Whoever the Hadoop employer, it is
> > plain a successful HTrace project is in everyone’s interest.
> > At least one of the developers has already changed employers but
> > his interest in seeing HTrace succeed prevails.
> >
> > === Relationships with Other Apache Products ===
> > For HTrace to succeed, it is critical we build good relations with
> > other distributed systems projects.  We intend to initially build
> > on relations we already have in place, mostly in the Hadoop space.
> >
> > The HTrace project has been incorporated by Apache HBase and
> > Apache Phoenix. It is currently being actively integrated into
> > Apache HDFS.
> >
> > We do not know of any equivalent or near-equivalent project
> > in the Apache space.
> >
> > The Dapper paper notes precedent, in particular, the Berkeley
> > Rad Lab X-Trace project.
> >
> > ==== How HTrace relates to Zipkin ====
> > Zipkin is an Apache Licensed project from Twitter. It is a complete
> > tracing tool with trace collectors, trace viewers and tools to help
> > you generate traces. It is written in Scala.  If your project is
> > not Scala or if it is Java and you cannot afford a Scala dependency,
> > at a minimum, you need an alternate means of generating traces.
> > HTrace provides this facility for Java as well as bridging tools
> > to feed traces to Zipkin for query and display.
> >
> > The projects complement each other.
> >
> > === An Excessive Fascination with the Apache Brand ===
> > While we intend to leverage the Apache ‘branding’ when talking to other
> > projects as testament of our project’s ‘neutrality’, we have no plans
> > for making use of Apache brand in press releases nor posting billboards
> > advertising acceptance of HTrace into Apache Incubator.
> >
> >
> > == Documentation ==
> > See [[http://htrace.org|htrace.org]] for the current state of the HTrace
> > project and documentation.
> >
> > How to enable tracing in
> > [[http://hbase.apache.org/book/tracing.html|HBase using HTrace]]
> > Elliott Clark on
> > [[http://files.meetup.com/1350427/HBase%20Meetup%20-%20Zipkin.pptx|tracing in HBase]]
> >
> > == Initial Source ==
> > Jonathan Leavitt and Todd Lipcon built the first versions of HTrace in
> > the summer of 2012.  Jonathan was Todd’s summer intern at Cloudera.
> >
> >
> > == Source and Intellectual Property Submission Plan ==
> > We know of no legal encumbrances in the way of transfer of source to
> > Apache.
> >
> > == External Dependencies ==
> > HTrace includes third party libs. These include guava, jetty, junit,
> > protobuf, hbase, and thrift.  All dependencies are Apache licensed or
> > under licenses that are palatable: e.g. junit is EPL (Eclipse Public
> > License v1.0) and ProtoBufs are BSD licensed.
> >
> > == Cryptography ==
> > N/A
> >
> > == Required Resources ==
> >
> > === Mailing lists ===
> >   * private@htrace.incubator.apache.org (moderated subscriptions)
> >   * commits@htrace.incubator.apache.org
> >   * dev@htrace.incubator.apache.org
> >   * issues@htrace.incubator.apache.org
> >   * user@htrace.incubator.apache.org
> >
> > === Git Repository ===
> > https://git-wip-us.apache.org/repos/asf/incubator-htrace.git
> >
> > === Issue Tracking ===
> > JIRA HTrace (HTRACE)
> >
> > === Other Resources ===
> > Means of setting up regular builds for htrace on builds.apache.org
> >
> > == Initial Committers ==
> >   * Colin McCabe (cmccabe@apache.org)
> >   * Elliott Clark (eclark@apache.org)
> >   * Jonathan Leavitt (jon.s.leavitt@gmail.com) -- CLA being submitted
> >   * Masatake Iwasaki (iwasakims@gmail.com) -- CLA being submitted
> >   * Michael Stack (stack@apache.org)
> >   * Nick Dimiduk (ndimiduk@apache.org)
> >   * Todd Lipcon (todd@apache.org)
> >
> >
> > == Affiliations ==
> >   * Colin McCabe - Cloudera
> >   * Elliott Clark - Facebook
> >   * Jonathan Leavitt - Google
> >   * Masatake Iwasaki - NTTData
> >   * Michael Stack - Cloudera
> >   * Nick Dimiduk - Hortonworks
> >   * Todd Lipcon - Cloudera
> >
> > == Sponsors ==
> >
> > === Champion ===
> > Roman Shaposhnik
> >
> > === Nominated Mentors ===
> >   * Michael Stack - Apache Member
> >   * Todd Lipcon - Apache Member
> >
> > We will be soliciting more mentors as part of the proposal process.
> >
> > === Sponsoring Entity ===
> > We would like to propose Apache incubator to sponsor this project.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>
>


-- 
Jean-Louis

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Roman Shaposhnik <rv...@apache.org>.
Hi!

Thanks for the positive feedback and volunteering. I think the
more mentors the merrier -- all the folks who volunteered
please add your names to the wiki.

As for the name, personally, I really like Distrace. That said,
I'd leave this bikeshed to be painted later ;-)

Andrew, great point on the wording: I'll update the proposal.

Finally, since I'm currently on vacation, I'll let this thread
go for a little longer and will start the official VOTE in a few
days.

Thanks,
Roman.


---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Abiola A Balogun <a_...@icloud.com>.
Ho

is

> On Oct 31, 2557 BE, at 21:58, Jake Farrell <jf...@apache.org> wrote:
> 
> Hey Roman
> Great to see more tools to feed Zipkin. Dapper and Thrift, what's not to
> love? If you need more mentors, please count me in.
> 
> -Jake
> 
>> On Fri, Oct 31, 2014 at 7:06 PM, Roman Shaposhnik <rv...@apache.org> wrote:
>> 
>> Hi!
>> 
>> I would like to propose HTrace to be consider for
>> Apache Incubator. The proposal is attached and
>> is also available on the wiki:
>>    https://wiki.apache.org/incubator/HTraceProposal
>> 
>> Please let me know what do you guys think and also
>> don't hesitate to massage the proposal on the wiki
>> based on the feedback from this thread.
>> 
>> Thanks,
>> Roman.
>> 
>> == Abstract ==
>> HTrace is a tracing framework intended for use with distributed
>> systems written in java.
>> 
>> == Proposal ==
>> HTrace is an aid for understanding system behavior and for reasoning
>> about performance
>> issues in distributed systems. HTrace is primarily a low impedance
>> library that a java
>> distributed system can incorporate to generate ‘breadcrumbs’ or
>> ‘traces’ along the path
>> of execution, even as it crosses processes and machines. HTrace also
>> includes various
>> tools and glue for collecting, processing and ‘visualizing’ captured
>> execution traces
>> for analysis ex post facto of where time was spent and what resources
>> were consumed.
>> 
>> == Background ==
>> Distributed systems are made up of multiple software components
>> running on multiple
>> computers connected by networks. Debugging or profiling operations run
>> over non-trivial
>> distributed systems -- figuring execution paths and what services,
>> machines, and
>> libraries participated in the processing of a request -- can be involved.
>> 
>> == Rationale ==
>> Rather than have each distributed system build its own custom
>> ‘tracing’ libraries,
>> ideally all would use a single project that provides necessary
>> primitives and saves
>> each project building its own visualizations and processing tools anew.
>> 
>> Google described “...[a] large-scale distributed systems tracing
>> infrastructure”
>> in Dapper, a Large-Scale Distributed Systems Tracing Infrastructure. The
>> paper
>> tells a compelling story of what is possible when disparate systems
>> standardize
>> on a single tracing library and cooperate, ‘passing the baton’, filling out
>> trace context as executions cross systems.
>> 
>> HTrace aims to provide a rough equivalent in open source of the described
>> core
>> Dapper tools and library.  As it is adopted by more projects, there will
>> be a
>> ‘network effect’ as HTrace will provide a more comprehensive view of
>> activity
>> on the cluster.  For example, as HDFS gets HTrace support, we can connect
>> this
>> with the HTrace support in HBase to follow HBase requests as they enter
>> HDFS.
>> 
>> Given the success of HTrace depends on its being integrated by many
>> projects,
>> HTrace should be perceived as unhampered, free of any commercial,
>> political,
>> or legal ‘taint’. Being an Apache project would help in this regard.
>> 
>> == Initial Goals ==
>> HTrace is a small project of narrow scope but with a grand vision:
>>  * Move the HTrace source and repository to Apache, a vendor-neutral
>> location. Currently HTrace resides at a Cloudera-hosted repository.
>>  * Add past contributors as committers and institute Apache governance.
>>  * Evangelize and encourage HTrace diffusion. Initially we will
>> continue a focus on the Hadoop space since that is where most of the
>> initial contributors work and it is where HTrace has been initially
>> deployed.
>>  * Building out the standalone visualization tool that ships with HTrace.
>>  * Build more community and add more committers
>> 
>> == Current Status ==
>> Currently HTrace has a viable Java trace library that can be interpolated
>> to create ‘traces’.  The work that needs to be done on this library is
>> mostly
>> bug fixes, ease-of-use improvements, and performance tweaks.  In the
>> future,
>> we may add libraries for other languages besides Java.
>> 
>> HTrace has means of dumping traces to the filesystem, Twitters’ Zipkin
>> (a tracing
>> sink and visualization system developed by Twitter
>> https://github.com/twitter/zipkin),
>> or Apache HBase.  Executions can be viewed either in Zipkin or in pygraph
>> (https://code.google.com/p/python-graph/).
>> 
>> Since the initial sprint in the summer of 2012 which saw HTrace patches
>> proposed
>> for Apache HDFS and committed to Apache HBase, development has been
>> sporadic;
>> mostly a single developer or two adding a feature or bug fixing. HTrace is
>> currently undergoing a new “spurt” of development with the effort to get
>> HTrace
>> added to Apache HDFS revived and a new standalone viewing facility being
>> added
>> in to HTrace itself.
>> 
>> HTrace has been integrated by Apache Phoenix.
>> 
>> 
>> === Meritocracy ===
>> HTrace, up to this, has been run by Apache committers and PMC members.
>> We want to
>> build out a diverse developer and user community and run the HTrace
>> project in
>> the Apache way.  Users and new contributors will be treated with respect
>> and
>> welcomed; they will earn merit in the project by tendering quality patches
>> and support that move the project forward.  Those with a proven support and
>> quality patch track record will be encouraged to become committers.
>> 
>> === Community ===
>> There are just a few developers involved at the moment. If our project
>> is accepted
>> by incubator, building community would be a primary initial goal.
>> 
>> === Core Developers ===
>> 
>> Core developers include Apache members and members of the Hadoop and
>> HBase PMCs.
>> Of those listed, all have contributed to HTrace. Half are from Cloudera.
>> The remainder are Hortonworks, NTTData, Google, and Facebook employees.
>> 
>> === Alignment ===
>> HTrace has been integrated into Apache HBase and Apache Phoenix.
>> Integration
>> into Apache HDFS is currently being worked on. Approaching the Apache YARN
>> project would be a likely next integration.
>> 
>> 
>> == Known Risks ==
>> As noted above, development has been sporadic up to this.  It may continue
>> so.
>> 
>> HTrace is not the primary focus of any of the current list of contributors.
>> It is for all a side effort.  HTrace may lack sufficient impetus with such
>> a state of affairs.
>> 
>> For HTrace to tell a compelling story, it needs to be taken up by
>> significant
>> projects that make up a traced distributed system.  For example, say YARN
>> and
>> HBase take on HTrace but HDFS does not, then the HDFS portions of an
>> end-to-end
>> operation will render opaque compromising our being able to tell a good
>> story
>> around an execution. Because the picture painted has gaps, HTrace may be
>> left
>> aside as ineffective.
>> 
>> === Orphaned products ===
>> The proposers have a vested interest in making HTrace succeed, driving its
>> development and its insertion into projects we all work on. Its dispersion
>> will shine light on difficult to understand interactions amongst the
>> various
>> systems we all work on. A working, integrated HTrace will add a useful
>> debugging mechanism to the Apache projects we all work on.
>> 
>> 
>> === Inexperience with Open Source ===
>> The majority of the proposers here have day jobs that has them working near
>> full-time on (Apache) open source projects. A few of us have helped carry
>> other projects through incubator.  HTrace to date has been developed as
>> an open source project.
>> 
>> === Homogenous Developers ===
>> The initial group of committers is small but already we have a healthy
>> diversity of participating companies.  We are bay-area challenged but
>> a Japanese contributor makes for a good counter balance.
>> 
>> === Reliance on Salaried Developers ===
>> Most of the contributors are paid to work in the Hadoop ecosystem.
>> While we might wander from our current employers, we probably won’t
>> go far from the Hadoop tree.  Whoever the Hadoop employer, it is
>> plain a successful HTrace project is in everyone’s interest.
>> At least one of the developers has already changed employers but
>> his interest in seeing HTrace succeed prevails.
>> 
>> === Relationships with Other Apache Products ===
>> For HTrace to succeed, it is critical we build good relations with
>> other distributed systems projects.  We intend to initially build
>> on relations we already have in place, mostly in the Hadoop space.
>> 
>> The HTrace project has been incorporated by Apache HBase and
>> Apache Phoenix. It is currently being actively integrated into
>> Apache HDFS.
>> 
>> We do not know of any equivalent or near-equivalent project
>> in the Apache space.
>> 
>> The Dapper paper notes precedent, in particular, the Berkeley
>> Rad Lab X-Trace project.
>> 
>> ==== How HTrace relates to Zipkin ====
>> Zipkin is an Apache Licensed project from Twitter. It is a complete
>> tracing tool with trace collectors, trace viewers and tools to help
>> you generate traces. It is written in Scala.  If your project is
>> not Scala or if it is Java and you cannot afford a Scala dependency,
>> at a minimum, you need an alternate means of generating traces.
>> HTrace provides this facility for Java as well as bridging tools
>> to feed traces to Zipkin for query and display.
>> 
>> The projects complement each other.
>> 
>> === A Excessive Fascination with the Apache Brand ===
>> While we intend to leverage the Apache ‘branding’ when talking to other
>> projects as testament of our project’s ‘neutrality’, we have no plans
>> for making use of Apache brand in press releases nor posting billboards
>> advertising acceptance of HTrace into Apache Incubator.
>> 
>> 
>> == Documentation ==
>> See [[http://htrace.org|htrace.org]] for the current state of the HTrace
>> project and documentation.
>> 
>> How to enable tracing in
>> [[http://hbase.apache.org/book/tracing.html|HBase using HTrace]]
>> Elliott Clark on
>> [[http://files.meetup.com/1350427/HBase%20Meetup%20-%20Zipkin.pptx|tracing
>> in HBase]]
>> 
>> == Initial Source ==
>> Jonathan Leavitt and Todd Lipcon built the first versions of HTrace in the
>> summer of 2012.  Jonathan was Todd’s summer intern at Cloudera.
>> 
>> 
>> == Source and Intellectual Property Submission Plan ==
>> We know of no legal encumberments in the way of transfer of source to
>> Apache.
>> 
>> == External Dependencies ==
>> HTrace includes third party libs. These include guava, jetty, junit,
>> protobuf,
>> hbase, and thrift.  All dependencies are Apache licensed or licenses that
>> are
>> palatable: e.g. junit is EPL (Eclipse Public License v1.0) and
>> ProtoBufs are BSD licensed.
>> 
>> Cryptography
>> N/A
>> 
>> == Required Resources ==
>> 
>> === Mailing lists ===
>>  * private@htrace.incubator.apache.org (moderated subscriptions)
>>  * commits@htrace.incubator.apache.org
>>  * dev@htrace.incubator.apache.org
>>  * issues@htrace.incubator.apache.org
>>  * user@htrace.incubator.apache.org
>> 
>> === Git Repository ===
>> https://git-wip-us.apache.org/repos/asf/incubator-htrace.git
>> 
>> === Issue Tracking ===
>> JIRA HTrace (HTRACE)
>> 
>> === Other Resources ===
>> Means of setting up regular builds for htrace on builds.apache.org
>> 
>> == Initial Committers ==
>>  * Colin McCabe (cmccabe@apache.org)
>>  * Elliott Clark (eclark@apache.org)
>>  * Jonathan Leavitt (jon.s.leavitt@gmail.com) -- CLA being submitted
>>  * Masatake Iwasaki (iwasakims@gmail.com) -- CLA being submitted
>>  * Michael Stack (stack@apache.org)
>>  * Nick Dimiduk (ndimiduk@apache.org)
>>  * Todd Lipcon (todd@apache.org)
>> 
>> 
>> == Affiliations ==
>>  * Colin McCabe - Cloudera
>>  * Elliott Clark - Facebook
>>  * Jonathan Leavitt - Google
>>  * Masatake Iwasaki - NTTData
>>  * Michael Stack - Cloudera
>>  * Nick Dimiduk - Hortonworks
>>  * Todd Lipcon - Cloudera
>> 
>> == Sponsors ==
>> 
>> === Champion ===
>> Roman Shaposhnik
>> 
>> === Nominated Mentors ===
>>  * Michael Stack - Apache Member
>>  * Todd Lipcon - Apache Member
>> 
>> We will be soliciting more mentors as part of the proposal process.
>> 
>> === Sponsoring Entity ===
>> We propose that the Apache Incubator sponsor this project.
>> 

Re: [DISCUSS] [PROPOSAL] HTrace for Apache Incubator

Posted by Jake Farrell <jf...@apache.org>.
Hey Roman,
Great to see more tools to feed Zipkin. Dapper and Thrift, what's not to
love? If you need more mentors, please count me in.

-Jake