Posted to mapreduce-user@hadoop.apache.org by "Rananavare, Sunil" <su...@unify.com> on 2014/01/20 19:46:23 UTC

hadoop problem tracking

Hi, we are new to Hadoop and are trying to use open source Apache Hadoop (2.2.0) with IBM JDK V7.0. We had to compile the Hadoop components (e.g. Pig, Hive, etc.) with the IBM JDK, since the original distribution worked with the Oracle JRE but failed on the IBM JRE. However, we ran into a few compilation errors. Here are a few examples of the changes we needed:


a.      Requires that jetty_util is added to the pom.xml; see https://issues.apache.org/jira/browse/HADOOP-10110 and follow the patch instructions there.

b.      Requires that serialization package reference in XmlEditsVisitor.java is replaced (com.sun.org.apache.xml.internal.serialize -> org.apache.xml.serialize) (for instance https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014066399#77777777-0000-0000-0000-000014066555).

c.      Requires that Sun security package is replaced with corresponding IBM package in KeyStoreTestUtil.java (sun.security.x509 -> com.ibm.security.x509) (for example http://www-01.ibm.com/support/docview.wss?uid=swg1IV20285).

d.      Requires the fix for a bug that HortonWorks discovered and corrected in FloatSplitter.java in build 2.2.0.6.0-76 (lowClausePrefix + Double.toString(curUpper) -> lowClausePrefix + Double.toString(curLower)).
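
To illustrate why item (d) matters, here is a minimal, self-contained sketch of how a float range gets cut into lower/upper bound clauses. It is not the actual FloatSplitter.java source: the class name FloatSplitSketch, the split() method, the column name "price" and the String[] pairs are all made up for illustration. Only the names lowClausePrefix, curLower and curUpper come from the fix described above. As far as we understand it, the point is that each split's lower clause must use curLower; with curUpper there, the lower and upper bound of a split are the same value.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the Hadoop implementation.
public class FloatSplitSketch {

    // Returns {lowerClause, upperClause} pairs covering [minVal, maxVal].
    // Assumes numSplits > 0 and minVal < maxVal (no validation here).
    static List<String[]> split(String col, double minVal, double maxVal, int numSplits) {
        String lowClausePrefix = col + " >= ";
        String highClausePrefix = col + " < ";
        double splitSize = (maxVal - minVal) / numSplits;

        List<String[]> splits = new ArrayList<String[]>();
        double curLower = minVal;
        double curUpper = curLower + splitSize;

        while (curUpper < maxVal) {
            splits.add(new String[] {
                // The corrected line: the lower bound uses curLower, not curUpper.
                lowClausePrefix + Double.toString(curLower),
                highClausePrefix + Double.toString(curUpper)
            });
            curLower = curUpper;
            curUpper += splitSize;
        }

        // Last split: closed upper bound so that maxVal itself is included.
        splits.add(new String[] {
            lowClausePrefix + Double.toString(curLower),
            col + " <= " + Double.toString(maxVal)
        });
        return splits;
    }

    public static void main(String[] args) {
        // Print one "lower AND upper" condition per split.
        for (String[] s : split("price", 0.0, 10.0, 4)) {
            System.out.println(s[0] + " AND " + s[1]);
        }
    }
}
```

In this sketch, with the lower clause built from curUpper, the first split would read "price >= 2.5 AND price < 2.5" and match no rows, which is how we understand the effect of the original bug.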

After researching a bit, it appears that these issues are known and require some minor code changes (as mentioned above). I am assuming that some of these suggested code changes are probably being approved or are in the process of being incorporated into future Hadoop releases (0.2.3 or 3.0). I am wondering what the 'official' process is for handling situations like this. Are we allowed to make such suggested code changes in our local copy of the Hadoop distribution and maintain them in our private repository until the "official" fixes are available in a released Hadoop distribution? If so, is this approach OK from the Apache licensing standpoint?

I am sure this problem is not new and some of you might have resolved this. I would really appreciate your comments/advice.

Best regards,
Sunil