You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mahout.apache.org by "Roman Shaposhnik (Created) (JIRA)" <ji...@apache.org> on 2012/03/21 17:57:42 UTC

[jira] [Created] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
--------------------------------------------------------------------------------------------------

                 Key: MAHOUT-994
                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
             Project: Mahout
          Issue Type: Bug
          Components: Integration
    Affects Versions: 0.6
            Reporter: Roman Shaposhnik


Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240042#comment-13240042 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

PS review request = i mean at reviews.apache.org .
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "tom pierce (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

tom pierce updated MAHOUT-994:
------------------------------

    Attachment: MAHOUT-994.patch
    
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>         Attachments: MAHOUT-994.patch, MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Lance Norskog (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236249#comment-13236249 ] 

Lance Norskog commented on MAHOUT-994:
--------------------------------------

Don't the job jars pack up Hadoop libs like all the other dependencies? 
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234521#comment-13234521 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

What it should be relied on in new Hadoop branches to find the hadoop client libraries and settings?


                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234521#comment-13234521 ] 

Dmitriy Lyubimov edited comment on MAHOUT-994 at 3/21/12 5:13 PM:
------------------------------------------------------------------

What it should be relied on in new Hadoop branches to find the hadoop client libraries and settings?

Could you please describe the solution?

                
      was (Author: dlyubimov):
    What it should be relied on in new Hadoop branches to find the hadoop client libraries and settings?


                  
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264166#comment-13264166 ] 

Roman Shaposhnik commented on MAHOUT-994:
-----------------------------------------

Tom, thanks a million for taking care of this! I'd say the patch looks good (although I don't have a binding +1 mojo ;-)).
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>            Assignee: tom pierce
>         Attachments: MAHOUT-994.patch, MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "tom pierce (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240066#comment-13240066 ] 

tom pierce commented on MAHOUT-994:
-----------------------------------

You bet - happy to review.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235879#comment-13235879 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

But you are right, it looks like we are not guessing hadoop libraries for mahout local cases. i guess if what i've described, really happens, then it will fail.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Roman Shaposhnik (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated MAHOUT-994:
------------------------------------

    Attachment: MAHOUT-994.patch.txt

Sorry took me a bit longer than anticipated.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>         Attachments: MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "tom pierce (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

tom pierce reassigned MAHOUT-994:
---------------------------------

    Assignee: tom pierce

Anyone object the most recent patch?  It's a tiny edit off Roman's submission (as described on Reviewboard).  Will plan to commit early next week if nobody pipes up.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>            Assignee: tom pierce
>         Attachments: MAHOUT-994.patch, MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Roman Shaposhnik (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235684#comment-13235684 ] 

Roman Shaposhnik commented on MAHOUT-994:
-----------------------------------------

Sure. In fact, let me use Pig as a good example of how to handle these cases (scroll to line #162):
   http://svn.apache.org/viewvc/pig/trunk/bin/pig?revision=1228547&view=markup

Given that Pig, just like Mahout supports a local mode where it doesn't need Hadoop at all, I think
if we can agree on the approach they are taking I can submit a patch for you guys, essenntially
replacing lines #202-243 of:
   http://svn.apache.org/viewvc/mahout/trunk/bin/mahout?revision=1186718&view=markup
with the code similar to what Pig has.

What do you think?
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235796#comment-13235796 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

looking at the pig code and your references in terms of lines, i am not sure i can see how they add $HADOOP_HOME/lib files to the classpath. I also see that they still use HADOOP_CONF_DIR to add to classpath in order to add default hadoop configuration resources.

And finally your reference points to initialization of HADOOP_BIN which again i am not sure how they use it further along as i don't see that it is being used to form hadoop classpath or configuration anywhere further along. I am not sure 'which hadoop' can be relied upon so much to deduce libraries and conf reliably as different distros tend to place binaries and configurations into different actual locations (such as /usr/bin/hadoop) which are linked up into actual hadoop home layout somewhere else. That's why i asked you to clarify how you want to do it since i am still dubious "which" can be reliably used to deduce HADOOP_HOME.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265026#comment-13265026 ] 

Roman Shaposhnik commented on MAHOUT-994:
-----------------------------------------

Dmitriy, the bundled hadoop dependencies are only supposed to be used in local mode.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>            Assignee: tom pierce
>         Attachments: MAHOUT-994.patch, MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265104#comment-13265104 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

sounds good then.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>            Assignee: tom pierce
>         Attachments: MAHOUT-994.patch, MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Roman Shaposhnik (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235814#comment-13235814 ] 

Roman Shaposhnik commented on MAHOUT-994:
-----------------------------------------

There's no need to add $HADOOP_HOME/lib to the classpath as long as you're using hadoop itself as a launcher script. In fact, it is dangerous to assume that there's $HADOOP_HOME/lib to begin with. This is definitely not the case with Hadoop 0.23 and might not be the case with Hadoop 1.X

Explicit setting of HADOOP_CONF_DIR needs to be dropped from Pig script and I will not recommend keeping it in Mahout. Once again -- using hadoop launcher script itself is the best way to add these things to the proper places in th CP. If you try to do it explicitly you're essentially second guessing hadoop implementation and your scripts become brittle.

'which' is definitely quite reliable. In fact, if you look at Pig, Hive and HBase they have all transitioned to using this technique to get the Hadoop user wants. The only difference is defaults (some projects prefer hadoop from the PATH to override anything else, some prefer it the other way). Here's another example of how HBase does it (scroll to line #219):
  http://svn.apache.org/viewvc/hbase/trunk/bin/hbase?revision=1296661&view=markup

One again -- if you try to deduce the libraries yourself -- your scripts become brittle. You can *only* rely on hadoop implementation to set these things up for you. This is an uncomfortable truth of at least 3 different incompatible Hadoop implementations being actively deployed in the wild.

Does it make sense?

                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Roman Shaposhnik (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240031#comment-13240031 ] 

Roman Shaposhnik commented on MAHOUT-994:
-----------------------------------------

@Dmitriy

bq. I am not sure we use hadoop executable to launch Mahout stuff. 

Seems like you do: 
{noformat}
    (*)
      echo "MAHOUT-JOB: $MAHOUT_JOB"
      export HADOOP_CLASSPATH=$MAHOUT_CONF_DIR:${HADOOP_CLASSPATH}
      exec "$HADOOP_HOME/bin/hadoop" --config $HADOOP_CONF_DIR jar $MAHOUT_JOB $CLASS "$@"
{noformat}

bq. I mean, we are not bound by having to launch any MR stuff.

Understood. Pig also has local mode where it doesn't have to have any access to Hadoop at all

bq. I think it's a little bit disconcerting as we have a lot of utility classes that may be using hadoop classes even locally (e.g. reading or writing local sequence files). Can you suggest a strategy for Mahout local mode and still be able to bind to hadoop I/O classes?

I believe there are 2 issue at play here:
  # what to do for the local mode
  # what to do for the MapReduce mode

For both cases you'll have to make sure that you segregate bundled hadoop
dependencies into a separate location in your installation tree so that
you can easily add/remove those from your CLASSPATH. Lets call it $HADOOP_DEPS_DIR
(it must be different from your ususal lib subdir)

1. Local mode:

The very fact that you're running in local mode will be governed by $MAHOUT_LOCAL
(regardless of whether HADOOP_HOME is empty or not). 

When $MAHOUT_LOCAL is set you'll try to see whether you can run 'hadoop classpath'
(see how HADOOP_HOME/HADOOP_PREFIX factors into this below). If you can you'll add 
the value it returns to the CLASSPATH. If it is not available you'll add $HADOOP_DEPS_DIR 
to the classpath.

2. MapReduce mode

For the MapReduce mode, I would argue that the best option is to follow the suit of Pig/Hive/HBase, etc and rely on HADOOP_HOME *only* as a fall back value if hadoop is not resolved from the PATH by default. E.g.
{noformat}
HADOOP_LAUNCHER=$(PATH="${HADOOP_HOME:-${HADOOP_PREFIX}}/bin:$PATH" which hadoop 2>/dev/null)
{noformat}

then you'll need to replace all the explicit calls to $HADOOP_HOME/bin/... with the $HADOOP_LAUNCHER. I will also strongly recommend against guessing the value of HADOOP_CONF_DIR, since I don't think it buys you anything at all (the hadoop launcher script regardless of whether it is coming from honors HADOOP_CONF_DIR as much as it does --config).

That will take care of the MapReduce case.

If you agree with the proposal I can supply a patch shortly.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240040#comment-13240040 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

Sounds good to me. 

The only concern i have is to verify that local stuff is not broken when run from command line. I don't think this is verified by build unit tests. The history of local mode in Mahout is a little bit out of my scope, i will need another person to assertain the change for the local mode.

But what you are saying sounds very benign to me. Thank you for doing this. 
When you submit a patch, do you think you could also create a review request for Mahout? Hopefully this will encourage another pair of eyes to look at it. if nobody responds, let me know, i will try to locate another pair of eyes for it. 

Again, thank you for doing this, Hadoop best practices may sometimes be hard to figure.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235891#comment-13235891 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

I think it's a little bit disconcerting as we have a lot of utility classes that may be using hadoop classes even locally (e.g. reading or writing local sequence files). Can you suggest a strategy for Mahout local mode and still be able to bind to hadoop I/O classes?
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235871#comment-13235871 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

I am not sure we use hadoop executable to launch Mahout stuff. there are definitely cases where we don't use Hadoop at all (but the code generally does assume hadoop library avaiability on classpath, otherwise some classes and strategies may break with ClassNotFoundException during load). 

What is your suggestion to handle "hadoopless" cases and avoid ClassNotFound?
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262230#comment-13262230 ] 

jiraposter@reviews.apache.org commented on MAHOUT-994:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4632/#review7247
-----------------------------------------------------------


Looks good aside from the one issue below.  


trunk/bin/mahout
<https://reviews.apache.org/r/4632/#comment16017>

    Think we need to omit HADOOP_BINARY_CLASSPATH here; otherwise I get an error due to my cluster and Mahout HDFS libs having different API versions when I run with MAHOUT_LOCAL set.  


- tom


On 2012-04-04 00:26:36, Roman Shaposhnik wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4632/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-04-04 00:26:36)
bq.  
bq.  
bq.  Review request for mahout and tom pierce.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Patch according to the JIRA discussion.
bq.  
bq.  
bq.  This addresses bug MAHOUT-994.
bq.      https://issues.apache.org/jira/browse/MAHOUT-994
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/bin/mahout 1309189 
bq.    trunk/distribution/src/main/assembly/bin.xml 1309189 
bq.  
bq.  Diff: https://reviews.apache.org/r/4632/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  build and execution in various modes
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Roman
bq.  
bq.


                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>         Attachments: MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266938#comment-13266938 ] 

Hudson commented on MAHOUT-994:
-------------------------------

Integrated in Mahout-Quality #1456 (See [https://builds.apache.org/job/Mahout-Quality/1456/])
    MAHOUT-994 - bin/mahout should not rely on HADOOP_HOME (Revision 1333154)

     Result = SUCCESS
tcp : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1333154
Files : 
* /mahout/trunk/bin/mahout
* /mahout/trunk/distribution/src/main/assembly/bin.xml

                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>            Assignee: tom pierce
>             Fix For: 0.7
>
>         Attachments: MAHOUT-994.patch, MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264177#comment-13264177 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

not having tested it , it looks good. One question though it looks like hadoop dependency is included in assembly? It is only used when in local mode only? or it also makes its way into classpath even in distributed mode?

I am just worried if it may interfere with running Mahout with a third-party hadoop distribution.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>            Assignee: tom pierce
>         Attachments: MAHOUT-994.patch, MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235871#comment-13235871 ] 

Dmitriy Lyubimov edited comment on MAHOUT-994 at 3/22/12 6:45 PM:
------------------------------------------------------------------

I am not sure we use hadoop executable to launch Mahout stuff. there are definitely cases where we don't use Hadoop at all (but the code generally does assume hadoop library avaiability on classpath, otherwise some classes and strategies may break with ClassNotFoundException during load). 

I mean, we are not bound by having to launch any MR stuff.

What is your suggestion to handle "hadoopless" cases and avoid ClassNotFound?
                
      was (Author: dlyubimov):
    I am not sure we use hadoop executable to launch Mahout stuff. there are definitely cases where we don't use Hadoop at all (but the code generally does assume hadoop library avaiability on classpath, otherwise some classes and strategies may break with ClassNotFoundException during load). 

What is your suggestion to handle "hadoopless" cases and avoid ClassNotFound?
                  
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245925#comment-13245925 ] 

jiraposter@reviews.apache.org commented on MAHOUT-994:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4632/
-----------------------------------------------------------

Review request for mahout and tom pierce.


Summary
-------

Patch according to the JIRA discussion.


This addresses bug MAHOUT-994.
    https://issues.apache.org/jira/browse/MAHOUT-994


Diffs
-----

  trunk/bin/mahout 1309189 
  trunk/distribution/src/main/assembly/bin.xml 1309189 

Diff: https://reviews.apache.org/r/4632/diff


Testing
-------

build and execution in various modes


Thanks,

Roman


                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>         Attachments: MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Roman Shaposhnik (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik updated MAHOUT-994:
------------------------------------

    Status: Patch Available  (was: Open)
    
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>         Attachments: MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "tom pierce (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

tom pierce updated MAHOUT-994:
------------------------------

       Resolution: Fixed
    Fix Version/s: 0.7
           Status: Resolved  (was: Patch Available)
    
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>            Assignee: tom pierce
>             Fix For: 0.7
>
>         Attachments: MAHOUT-994.patch, MAHOUT-994.patch.txt
>
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "tom pierce (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240015#comment-13240015 ] 

tom pierce commented on MAHOUT-994:
-----------------------------------

It would make a lot of sense to me to unify with other Hadoop projects and start using the hadoop launcher script whenever we can - no need to maintain a custom reinvented wheel there.

As far as the job jar is concerned, I don't think it's customary to include Hadoop libs - seems redundant, and potentially dangerous when there are mismatched versions of classes in the classpath.  I know there are ways to specify/determine which versions of things to use and make this kind of thing work, but it has been my experience that it always becomes overcomplicated after a while - it is just bad juju.

Looks like the Hadoop classes in examples/target/dependency/hadoop-core-0.20.204.0.jar are tacked onto the classpath inside bin/mahout, which is why standalone mode currently 'just works' on a box with no hadoop install.  This seems less than ideal, though I can see why it is done.
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-994) mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches

Posted by "Dmitriy Lyubimov (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240045#comment-13240045 ] 

Dmitriy Lyubimov commented on MAHOUT-994:
-----------------------------------------

@Tom, 

perhaps you could be a such second review person on this issue? thanks. 

I agree attaching hadoop libraries to job file is dangerous. This should be compile-time dependency only, but runtime policy should be "provided".
                
> mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-994
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-994
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>            Reporter: Roman Shaposhnik
>
> Mahout should follow the Pig and Hive example and not rely explicitly on HADOOP_HOME and HADOOP_CONF_DIR

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira