Posted to dev@htrace.apache.org by "Colin P. McCabe" <cm...@apache.org> on 2015/01/16 09:30:50 UTC

HTrace integration for more HDFS client operations

Hi all,

I've got some good news that I figured I'd post to the list!  Today I added
a bunch of HTrace integration to HDFS, in
https://issues.apache.org/jira/browse/HDFS-7189.  This patch adds tracing
for a whole host of DFS client operations, such as rename and delete.
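
As a rough illustration of what tracing a client operation looks like, here is a minimal Java sketch.  ToyTracer and ToyDfsClient are made-up names for this example; they are not the HTrace or HDFS APIs, and the method bodies are placeholders.  The sketch only shows the general shape: open a span around each call and close it when the call returns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tracing library; NOT the HTrace API.
final class ToyTracer {
    // Descriptions of closed spans, in the order they were closed.
    static final List<String> finished = new ArrayList<>();

    // Opens a span; closing the returned scope records the span.
    static Scope startSpan(String description) {
        return new Scope(description);
    }

    static final class Scope implements AutoCloseable {
        private final String description;

        Scope(String description) {
            this.description = description;
        }

        @Override
        public void close() {
            finished.add(description);
        }
    }
}

// Hypothetical client; the method bodies are placeholders for real RPCs.
final class ToyDfsClient {
    boolean rename(String src, String dst) {
        // try-with-resources closes the span even if the call throws.
        try (ToyTracer.Scope scope = ToyTracer.startSpan("rename")) {
            return !src.equals(dst);
        }
    }

    boolean delete(String path) {
        try (ToyTracer.Scope scope = ToyTracer.startSpan("delete")) {
            return !path.isEmpty();
        }
    }
}
```

Using try-with-resources means each operation gets exactly one span that ends when the operation does, regardless of how the method exits.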

Obviously this will be helpful for HDFS users, and it should also increase
our ability to follow HBase operations all the way back into HDFS via
HTrace, for example when HBase is deleting or moving a WAL.

The last big piece of HTrace integration for HDFS is integration into the
output stream (i.e. the write path).  This should be coming soon, so stay
tuned.

cheers,
Colin

Re: HTrace integration for more HDFS client operations

Posted by "Colin P. McCabe" <cm...@apache.org>.
On Fri, Jan 16, 2015 at 10:03 AM, Nick Dimiduk <nd...@gmail.com> wrote:

> This reminds me: have we tested the compatibility of this new release with
> previous versions? For instance, if we upgrade HBase to the incubator
> release but not HDFS, will tracing work only as far as that?


So, for this 3.1.0 release, the story is pretty simple.  The previous
releases were in a different namespace and the jars had a different name,
so HBase and HDFS can use different versions if they want to.  There will
be no conflicts.  Of course, if HBase and HDFS don't use the same version,
HDFS's spans won't be "parented" to HBase's spans.  But there will be no
crashes or other problems like that.

The situation for the future is more complex.  Of course, HBase pulls in
jars from Hadoop.  One of those jars is going to be our htrace-core jar.
The HDFS client and HBase's daemons are going to want to use the same
version of htrace.

I know that HBase likes to provide compatibility with as many versions of
Hadoop as it can.  Basically HBase is going to have to look at the oldest
version of Apache HTrace that Hadoop might ask it to use, and verify that
that works.

It might help to look at the stuff we're trying to get rid of in the API:
1. We'd like to get rid of the Span#addKVAnnotation method which takes
byte[], in favor of the one which takes String.
2. We'd like to get rid of the public MilliSpan constructor.
MilliSpan#Builder is more flexible and future-proof.  If we want to add new
parameters, we don't want a combinatorial explosion of constructors (we
learned this in Hadoop).
3. We'd like to get rid of Span#getParentId, because it assumes that there
is a single parent for each span, an assumption we're trying to move away
from.
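
To make points 2 and 3 concrete, here is a minimal Java sketch of the builder approach.  SketchSpan, its fields, and its builder methods are hypothetical; they do not mirror the real MilliSpan or MilliSpan#Builder.  The idea is that a new optional parameter becomes one more builder method instead of another constructor overload, and that parents are held as a list so nothing assumes a single parent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical span record; NOT the real MilliSpan.
final class SketchSpan {
    final String description;
    final long beginMillis;
    final List<Long> parents; // zero, one, or many parent span IDs

    // Private: callers must go through the Builder.
    private SketchSpan(Builder b) {
        this.description = b.description;
        this.beginMillis = b.beginMillis;
        this.parents = Collections.unmodifiableList(new ArrayList<>(b.parents));
    }

    static final class Builder {
        private String description = "";
        private long beginMillis = 0L;
        private final List<Long> parents = new ArrayList<>();

        Builder description(String description) {
            this.description = description;
            return this;
        }

        Builder begin(long beginMillis) {
            this.beginMillis = beginMillis;
            return this;
        }

        // Adding a parent is repeatable, so multiple parents need no new API.
        Builder addParent(long parentId) {
            this.parents.add(parentId);
            return this;
        }

        SketchSpan build() {
            return new SketchSpan(this);
        }
    }
}
```

A caller would write something like `new SketchSpan.Builder().description("rename").addParent(1L).build()`, and existing callers keep compiling when a new builder method is added later.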

#2 and #3 shouldn't be a problem for HBase because there's no reason for
HBase to directly create MilliSpans, or call getParentId.  I bet there
might be some cases where we're calling the byte[] version of
addKVAnnotation, though.
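
For callers that still pass byte[], the migration might look roughly like the following Java sketch.  AnnotationSketch is a hypothetical stand-in class, not the real Span interface, and decoding the bytes as UTF-8 is an assumption made for this example; only the method name addKVAnnotation comes from the discussion above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a span's key/value annotations; NOT the real Span.
final class AnnotationSketch {
    private final Map<String, String> annotations = new LinkedHashMap<>();

    // Preferred overload: plain strings.
    void addKVAnnotation(String key, String value) {
        annotations.put(key, value);
    }

    // Legacy overload, kept only so old callers still compile.
    // Assumption: the bytes are UTF-8 encoded text.
    @Deprecated
    void addKVAnnotation(byte[] key, byte[] value) {
        addKVAnnotation(new String(key, StandardCharsets.UTF_8),
                        new String(value, StandardCharsets.UTF_8));
    }

    // Read-only view of the annotations, in insertion order.
    Map<String, String> getKVAnnotations() {
        return Collections.unmodifiableMap(annotations);
    }
}
```

Marking the byte[] overload @Deprecated makes the compiler flag remaining call sites, which is one way to find the cases mentioned above before the method is removed.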

So tl;dr: When we update HBase to use the new Apache jar, let's be careful
NOT to use any of these deprecated APIs.  Then we should be able to remove
those from the next release without creating any compat problems for HBase.

best,
Colin



Re: HTrace integration for more HDFS client operations

Posted by Nick Dimiduk <nd...@gmail.com>.
This reminds me: have we tested the compatibility of this new release with
previous versions? For instance, if we upgrade HBase to the incubator
release but not HDFS, will tracing work only as far as that? Worse, will we
get exceptions tossed from a mismatch in the guts? Everything should be fine
from an over-the-wire point of view, but we should check the in-process story.

-n

On Fri, Jan 16, 2015 at 9:44 AM, Stack <st...@duboce.net> wrote:

> You the man CPMcC.
> St.Ack

Re: HTrace integration for more HDFS client operations

Posted by Stack <st...@duboce.net>.
You the man CPMcC.
St.Ack


Re: HTrace integration for more HDFS client operations

Posted by Nick Dimiduk <nd...@gmail.com>.
Nice one Colin!
