Posted to dev@mahout.apache.org by "Pat Ferrel (JIRA)" <ji...@apache.org> on 2017/06/19 17:44:00 UTC

[jira] [Updated] (MAHOUT-1951) Drivers don't run with remote Spark

     [ https://issues.apache.org/jira/browse/MAHOUT-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pat Ferrel updated MAHOUT-1951:
-------------------------------

The jar isn't supposed to contain all deps, only the ones not provided by the environment. In fact, it is supposed to contain the minimum.

So it appears that some of the classes provided by previous platform versions (Spark etc.) have changed in newer versions? We then need to add them to the dependency-reduced jar, but first we should check whether a newer version of some provided dep will fill the bill; otherwise the dependency-reduced jar will bloat needlessly.

What specifically is the error, and what is missing?
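
One rough way to narrow this down is to take the class names from the error and check whether each one has an entry in the dependency-reduced jar. The sketch below is only an illustration: the helper name `missing_classes` and the command-line usage are made up here, and it relies only on a jar being a zip archive in which a class `a.b.C` is stored as `a/b/C.class`.

```python
import sys
import zipfile


def missing_classes(jar_path, class_names):
    """Return the names in class_names that have no .class entry in the jar.

    Hypothetical helper, not part of Mahout or Spark. Names are expected in
    fully qualified binary form, e.g. "a.b.Outer$Inner" for an inner class.
    """
    with zipfile.ZipFile(jar_path) as jar:
        entries = set(jar.namelist())
    return [
        name
        for name in class_names
        if name.replace(".", "/") + ".class" not in entries
    ]


if __name__ == "__main__":
    # Assumed usage: python check_jar.py <path-to-jar> <class> [<class> ...]
    if len(sys.argv) < 3:
        sys.exit("usage: check_jar.py <path-to-jar> <class> [<class> ...]")
    for name in missing_classes(sys.argv[1], sys.argv[2:]):
        print("missing:", name)
```

A class reported as missing is not necessarily a bug in the jar, since it may be one the environment is expected to provide. That is the case to rule out before adding anything to the dependency-reduced jar.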

> Drivers don't run with remote Spark
> -----------------------------------
>
>                 Key: MAHOUT-1951
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1951
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification, CLI, Collaborative Filtering
>    Affects Versions: 0.13.0
>         Environment: The command line drivers spark-itemsimilarity and spark-naivebayes using a remote or pseudo-clustered Spark
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>            Priority: Blocker
>             Fix For: 0.13.0
>
>
> Missing classes when running these jobs because the dependencies-reduced jar, passed to Spark for serialization purposes, does not contain all needed classes.
> Found by a user. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)