Posted to user@mahout.apache.org by Brian Dolan <bu...@gmail.com> on 2014/10/10 20:50:21 UTC

Mahout Scala DSL

May be double posting here, apologies…

I am very interested in the Mahout Scala DSL.  There doesn't seem to be much out there at the moment and I have two primary questions.

1. What are the relative advantages of using the Mahout DSL over Spark's MLlib?
2. How do I bundle / deploy a Mahout DSL script?  Should I compile it to a jar?  I haven't tried just yet.

Thanks!
b

~~~~~~
May All Your Sequences Converge




Re: Mahout Scala DSL

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On your first question: in short, the main difference in philosophy is
probably the same as between, say, Julia and Weka, or R and an R package:
a computing environment vs. a library.

There are a few other, less important distinctions.

One of the smaller ideas was that the environment can adopt various backends
without any need to change the higher-level computational code.

And, of course, there is scripting as a deployment aspect (think of the
Rscript tool).

Also, as a collection of methods, MLlib doesn't attempt to standardize
input types (say, in terms of data frames, matrices, and vectors) the way
R does. Depending on where you stand, this may be both good and bad.

Either way, the Spark part of Mahout allows inline MLlib and Spark SQL, so
the hope is that the whole is bigger than the sum of its parts, as well.


On Fri, Oct 10, 2014 at 7:19 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> Brian,
>
> Check out the talk page, in particular the talk about the Spark bindings.
>
> As far as scripting goes (I assume you are talking about the Mahout shell
> here), check out the "playing with the shell" page on Mahout; it explains
> how to start, etc.
>
> You don't have to script, of course; in that case you'd just compile your
> work and run it as a Java application (or a Spark application, if there is
> any difference at all).
>
> You'd still need to have Spark and Scala installed, along with the
> environment variables (SCALA_HOME, JAVA_HOME, SPARK_HOME), in either case.
> The shell will also need MAHOUT_HOME.

Re: Mahout Scala DSL

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Brian,

Check out the talk page, in particular the talk about the Spark bindings.

As far as scripting goes (I assume you are talking about the Mahout shell
here), check out the "playing with the shell" page on Mahout; it explains
how to start, etc.

You don't have to script, of course; in that case you'd just compile your
work and run it as a Java application (or a Spark application, if there is
any difference at all).

You'd still need to have Spark and Scala installed, along with the
environment variables (SCALA_HOME, JAVA_HOME, SPARK_HOME), in either case.
The shell will also need MAHOUT_HOME.
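
As a rough illustration of the environment requirement above, here is a
minimal sketch of a pre-flight check, assuming a POSIX shell. The function
name missing_vars is made up for this example and is not part of Mahout or
Spark; only the four variable names come from the message itself.

```shell
#!/bin/sh
# Print the name of every listed variable that is unset or empty.
# Usage: missing_vars VAR1 VAR2 ...
# (hypothetical helper for illustration, not a Mahout or Spark tool)
missing_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
        fi
    done
}

# Variables named in the message; MAHOUT_HOME matters only for the shell.
missing=$(missing_vars SCALA_HOME JAVA_HOME SPARK_HOME MAHOUT_HOME)

if [ -n "$missing" ]; then
    # Unquoted on purpose so the names are joined onto one line.
    echo "Not set:" $missing
else
    echo "All required variables are set."
fi
```

If you only run compiled applications and never the shell, MAHOUT_HOME can
be dropped from the list.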
