Posted to dev@htrace.apache.org by Mike Drob <md...@apache.org> on 2017/08/16 17:00:49 UTC

[DISCUSS] Attic podling Apache HTrace?

Hi folks,

Want to bring up a potentially uncomfortable topic for some. Is it time to
retire/attic the project?

We've seen a minimal amount of activity in the past year. The last release
had two bug fixes, and had been pending for several months before somebody
reminded me to push the artifacts to subversion from the staging directory.

I'd love to see a renewed set of activity here, but I don't think there is
a ton of interest going on.

HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
which is a good sign, but I haven't heard much from them recently. I
definitely do not think we are at the point where a lack of releases and
activity is a sign of super advanced maturity and stability.

Your thoughts?

Mike

Re: [DISCUSS] Attic podling Apache HTrace?

Posted by Masatake Iwasaki <iw...@oss.nttdata.co.jp>.
Hi Mike,

Thanks for putting this issue up.

 > Want to bring up a potentially uncomfortable topic for some. Is it time to
 > retire/attic the project?

I would like to keep the project alive.
While we have been silent for months,
many of the committers are still working on projects
using HTrace (such as Hadoop and HBase), and
we are capable of making a new release if new major issues are found.

 > HBase is still on version 3. So is Accumulo, I think. Hadoop is on 4.1,
 > which is a good sign, but I haven't heard much from them recently.

I will look into HBASE-14451 again and try to make it move forward.
Since one of the intents of the big change in HTrace-4 is
better end-to-end tracing (e.g. from HBase to HDFS),
bumping HTrace in HBase up to 4 would reveal the next task.

Regards,
Masatake Iwasaki

Re: [DISCUSS] Attic podling Apache HTrace?

Posted by Adrian Cole <ad...@gmail.com>.
> What are the likely alternatives for downstream projects that want distributed tracing?
Yes, for general purpose or RPC, but I think HTrace is still
positioned well for data services specifically.

> Do we think the field still has a big gap that HTrace can solve?
When at Twitter (a couple yrs ago now), I know the data team preferred
htrace even though we had zipkin. Most of the tracing projects out
there do not focus on data services, or only recently do. While HTrace
may not be great at filling gaps in traditional RPC (as others do this
well enough), it probably does still have compelling advantages in
data services. I think the main holdback is getting the word out
and/or showing examples where the model and UI really shine in
HTrace's sweet spot (data services).

my 2p

Re: [DISCUSS] Attic podling Apache HTrace?

Posted by Sean Busbey <bu...@apache.org>.
What are the likely alternatives for downstream projects that want distributed tracing?

Do we think the field still has a big gap that HTrace can solve?

Re: [DISCUSS] Attic podling Apache HTrace?

Posted by Colin McCabe <cm...@apache.org>.
Thanks for bringing this up, Mike.

The original vision for HTrace was trying to unify a bunch of disparate
Hadoop components with a unified tracing layer.  This would allow us to
debug slowness or odd behavior in a much better way.  We started from
that vision and deduced the need to build a frontend API (htrace-core),
backend data store (htrace-hbase, htrace-htraced, etc.), and web UI
(htrace-web).

I still think that vision is valid, but achieving it was a lot harder
than we expected, for a couple of reasons.

First of all, I think building all those components needed someone (or
maybe several someones) to work on it full time.  We tried to do it part
time with a few HDFS and HBase committers.  Ultimately this didn't scale
as much as we needed it to.

Secondly, we were hoping for a lot of buy-in from Hadoop vendors and big
tech companies that used Hadoop.  Unfortunately, we didn't really get
that.  The Hadoop vendors were preoccupied with other things.  Big tech
companies seem to have mostly developed their own internal systems using bits
and pieces of open source.  I think this is another area where we just
needed more budget.  In retrospect, having meetups and reaching out to
potential users is something we needed to do.  There are some other
projects that have been a lot better with this than we have.

I think we should try to refocus on some core use-cases.  Basically,
decide what we want to achieve and find the shortest path to that.  If
that involves using other projects, then that's fine-- as long as they
are open source projects compatible with the ideals of the ASF.

Off the top of my head, I can think of a few core use-cases:

* Why is my HDFS request slow?   Figure out if there are disk issues or
network issues.

* Why is my HBase request slow?  Follow HBase requests into the HDFS
layer.

* Who is making the most requests to HDFS?

* What average speed is Hadoop getting from its S3 requests?  How often
do we hit our local caches, versus going over the network?

best,
Colin


On Thu, Aug 17, 2017, at 10:04, Stack wrote:
> Thanks Mike for starting this thread.
> 
> Activity over the last year is here [1].
> 
> Is there any testimony other than evangelizing presentations on how htrace
> has provided a benefit?
> 
> HTrace needs a bit of work. In order of import:
> 
> 1. A complete viewer (punt and use zipkin instead?)
> 2. Hooked up systems that tell wholesome trace stories: hdfs is incomplete,
> hbase is broke, accumulo/unknown, phoenix/custom-htrace... who else?
> 3. Work needs to be done so an operator can easily enable/disable trace and
> easily obtain views without impinging upon general perf
> It could do w/ an API cleanup (v5.0.0?) and study of the fact that it is
> painstaking manual work adding it into a system (and that it is
> subsequently easily damaged by code movement). It needs a particular type
> of barker to drive it cross-project since the cross-project realm is when
> it starts to come into its own (and each project in its turn will resist
> since the benefit is not immediate), etc.
> 
> None of the above is under active dev.
> 
> St.Ack
> 
> 1. https://github.com/apache/incubator-htrace/graphs/commit-activity

Re: [DISCUSS] Attic podling Apache HTrace?

Posted by Stack <st...@duboce.net>.
Thanks Mike for starting this thread.

Activity over the last year is here [1].

Is there any testimony other than evangelizing presentations on how htrace
has provided a benefit?

HTrace needs a bit of work. In order of import:

1. A complete viewer (punt and use zipkin instead?)
2. Hooked up systems that tell wholesome trace stories: hdfs is incomplete,
hbase is broke, accumulo/unknown, phoenix/custom-htrace... who else?
3. Work needs to be done so an operator can easily enable/disable trace and
easily obtain views without impinging upon general perf

It could do w/ an API cleanup (v5.0.0?) and study of the fact that it is
painstaking manual work adding it into a system (and that it is
subsequently easily damaged by code movement). It needs a particular type
of barker to drive it cross-project since the cross-project realm is when
it starts to come into its own (and each project in its turn will resist
since the benefit is not immediate), etc.

None of the above is under active dev.

St.Ack

1. https://github.com/apache/incubator-htrace/graphs/commit-activity


