Posted to common-issues@hadoop.apache.org by "Chris Nauroth (JIRA)" <ji...@apache.org> on 2015/11/04 20:48:27 UTC

[jira] [Commented] (HADOOP-12547) Deprecate hadoop-pipes

    [ https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990266#comment-14990266 ] 

Chris Nauroth commented on HADOOP-12547:
----------------------------------------

Some of this discussion has not been constructive.  I urge everyone to stick to the technical points of the debate.

I'm still weighing this, but I have a few other points to mention for consideration.

Part of the argument presented here for deprecation/removal is that development has halted.  It's worth noting that the flow of patches for MapReduce itself has slowed significantly since completion of YARN/MRv2.  By extension, a C++ wrapper over MapReduce is going to see even fewer contributions.  I don't think patch count alone is a sufficient measure to justify the elimination (or the existence) of a component.

I have no direct experience with my users using hadoop-pipes, but I also don't see it as a hindrance to maintain if someone like Yahoo does find it useful.  Another part of the argument for removal was reduced build times.  I do not see this component causing a significant delay in build times though.  Granted, that's partly due to the lack of tests.

A more telling problem is the lack of tests.  Maybe I'm mistaken, but has the documentation vanished too?  These are gaps that don't speak well to the long-term viability of the component.  If we cannot come to consensus on removal, then we need to commit to filling those gaps.

As a matter of process, I disagree with adding libwebhdfs as a rider to this proposal.  I don't think the two are in a comparable state.  However, I do agree that libwebhdfs is a much more viable candidate for removal.  We have evidence that Pipes was at least used by someone at some time, worked correctly, and satisfied its design goals.  I don't believe we have any evidence that anyone has ever used libwebhdfs, it still doesn't build properly in recent releases, and it does not satisfy its design goal of providing a library with no JVM dependency.  (This can be viewed as just a bug, but there is also not overwhelming support for bothering to fix it.)

> Deprecate hadoop-pipes
> ----------------------
>
>                 Key: HADOOP-12547
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12547
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>
> Development appears to have stopped on hadoop-pipes upstream for the last few years, aside from very basic maintenance.  Hadoop streaming seems to be a better alternative, since it supports more programming languages and is better implemented.
> There were no responses to a message on the mailing list asking for users of Hadoop pipes... and in my experience, I have never seen anyone use this.  We should remove it to reduce our maintenance burden and build times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)