Posted to dev@mahout.apache.org by Dmitriy Lyubimov <dl...@gmail.com> on 2013/07/26 09:07:45 UTC

Proposal: scala DSL module for Mahout linear algebra.

Hello,

I would like to put up for discussion a proposal to add a module,
mahout-math-scala, to Mahout, containing various Scala DSLs for the Mahout
project.

Here is what I have so far:
http://weatheringthrutechdays.blogspot.com/2013/07/scala-dsl-for-mahout-in-core-linear.html

For now it covers in-core operations only, but it can also be used to script
driver pipelines for Mahout DRMs and solvers. (Some code, the tests in
particular, may look ugly at the moment.)

By proposing it as a part of Mahout, I am of course pursuing a selfish goal:
since the code covers a lot of Mahout matrix APIs, if I kept it outside
Mahout I would have a hard time keeping it in sync as the project changes
its APIs. So I want to make sure that committers run my tests too before
committing new changes.

(I am actually using this for Spark-based solvers based on Mahout DRMs, and
to make them more accessible to our data scientists. At some point I hope to
contribute Spark ports of some Mahout work too.)

Respectfully,
-Dmitriy
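
As a rough illustration of the kind of DSL layer being proposed, the sketch
below adds an R-like `%*%` operator on top of a Java-style `times` method
without changing the underlying class. It is not code from the proposal or
from Mahout: `SimpleMatrix`, `MatrixDsl`, `MatrixOps` and `DslDemo` are
hypothetical names, and it assumes Scala 2.10 or later for `implicit class`
(on the 2.9.x line used in this thread, an `implicit def` conversion would
be needed instead).

```scala
// Hypothetical stand-in for a Java-style matrix API; NOT Mahout's classes.
final class SimpleMatrix(val rows: Array[Array[Double]]) {
  def numRows: Int = rows.length
  def numCols: Int = if (rows.isEmpty) 0 else rows(0).length

  // Plain method call, as a Java developer would write it: a.times(b)
  def times(that: SimpleMatrix): SimpleMatrix = {
    require(numCols == that.numRows, "inner dimensions must match")
    val out = Array.ofDim[Double](numRows, that.numCols)
    for (i <- 0 until numRows; j <- 0 until that.numCols; k <- 0 until numCols)
      out(i)(j) += rows(i)(k) * that.rows(k)(j)
    new SimpleMatrix(out)
  }
}

object MatrixDsl {
  // Wrapper that adds operator syntax; SimpleMatrix itself is untouched.
  implicit class MatrixOps(val m: SimpleMatrix) {
    def %*%(that: SimpleMatrix): SimpleMatrix = m.times(that)
  }
}

object DslDemo {
  import MatrixDsl._

  val a = new SimpleMatrix(Array(Array(1.0, 2.0), Array(3.0, 4.0)))
  val b = new SimpleMatrix(Array(Array(0.0, 1.0), Array(1.0, 0.0)))

  val viaOperator: SimpleMatrix = a %*% b   // DSL form
  val viaMethod: SimpleMatrix = a.times(b)  // original API, still available

  def main(args: Array[String]): Unit =
    println(viaOperator.rows.map(_.mkString(" ")).mkString("\n"))
}
```

Because the operator lives in a separate wrapper that is only active where
`MatrixDsl._` is imported, callers who prefer `a.times(b)` can keep using
it unchanged.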

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
awesome, working now, test results popping up!


On Fri, Jul 26, 2013 at 12:47 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> yes
>
> On Fri, Jul 26, 2013 at 12:39 PM, Jake Mannix <ja...@gmail.com> wrote:
>
> > pushed on your branch to github?
-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
yes


On Fri, Jul 26, 2013 at 12:39 PM, Jake Mannix <ja...@gmail.com> wrote:

> pushed on your branch to github?

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
pushed on your branch to github?


-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Fri, Jul 26, 2013 at 8:40 AM, Jake Mannix <ja...@gmail.com> wrote:

> Yep, that fixed it.  Are there any real tests?
>
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Running mahout.math.MatrixOpsTest
> Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec - in mahout.math.MatrixOpsTest
> Running mahout.math.VectorOpsTest
> Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec - in mahout.math.VectorOpsTest
>
> Results :
>
> Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
>

Added the scalatest plugin. The ScalaTest project says this plugin is still
in beta, so they haven't published a final release; I had to add their
plugin repository to the module, but it seems to work.

Run completed in 257 milliseconds.
Total number of tests run: 16
Suites: completed 3, aborted 0
Tests: succeeded 16, failed 0, ignored 0, pending 0
All tests passed.

-d

>
>
> On Fri, Jul 26, 2013 at 8:35 AM, Jake Mannix <ja...@gmail.com> wrote:
>
> > I'm on your branch (dev-0.9.x-scala) but only doing a "mvn install" inside
> > of the new module - maybe I need to do it from the top level?
> >
> > On Fri, Jul 26, 2013 at 7:23 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> >
> >> On Jul 26, 2013 12:57 AM, "Jake Mannix" <ja...@gmail.com> wrote:
> >> >
> >> > Woohoo!  Awesome, I've forked you, and I'll start digging in soon.  At a
> >> > high level, this looks great.  Not so sure about so many operators - I
> >> > don't know that we really need to have such a weighty syntax (a %*% b),
> >> > java devs are going to be much more familiar with simply doing
> >> > a.times(b), and I don't think we should keep them from that.
> >> >
> >> > Quick question: I had a build error on your branch:
> >> >
> >> > [INFO] --- maven-scala-plugin:2.15.2:compile (default) @ mahout-math-scala ---
> >> > [INFO] Checking for multiple versions of scala
> >> > [WARNING]  Expected all dependencies to require Scala version: 2.9.3
> >> > [WARNING]  org.apache.mahout:mahout-math-scala:0.9-SNAPSHOT requires scala version: 2.9.3
> >> > [WARNING]  org.scalatest:scalatest_2.9.2:1.9.1 requires scala version: 2.9.2
> >> > [WARNING] Multiple versions of scala libraries detected!
> >> > [INFO] includes = [**/*.scala,**/*.java,]
> >> > [INFO] excludes = []
> >> > [INFO] /Users/jake/open_src/gitrepo/mahout-twitter/math-scala/src/main/scala:-1: info: compiling
> >> > [INFO] Compiling 5 source files to /Users/jake/open_src/gitrepo/mahout-twitter/math-scala/target/classes at 1374825106823
> >> > Downloading: http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.jar
> >> > Downloaded: http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.jar (11260 KB at 216.2 KB/sec)
> >> > Downloading: http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.pom
> >> > Downloaded: http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.pom (2 KB at 1.6 KB/sec)
> >> > [ERROR] /Users/jake/open_src/gitrepo/mahout-twitter/math-scala/src/main/scala/mahout/math/DiagonalOps.scala:14: error: value rightMult is not a member of org.apache.mahout.math.DiagonalMatrix
> >>
> >> That's a bit strange. Are you recompiling the whole Mahout fork, or just
> >> the Scala module? The optimized multiplication has been added in this
> >> branch for sure; I may not have committed it to Mahout trunk yet. I need
> >> to check.
> >>
> >> > [INFO]   def :%*%(that: Matrix) = m.rightMult(that)
> >> > [INFO]                              ^
> >> > [ERROR] /Users/jake/open_src/gitrepo/mahout-twitter/math-scala/src/main/scala/mahout/math/DiagonalOps.scala:16: error: value leftMult is not a member of org.apache.mahout.math.DiagonalMatrix
> >> > [INFO]   def %*%:(that: Matrix) = m.leftMult(that)
> >> > [INFO]                              ^
> >> > [ERROR] two errors found
> >> > [INFO] ------------------------------------------------------------------------
> >> > [INFO] BUILD FAILURE
> >> > [INFO] ------------------------------------------------------------------------
> >> >

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I think Maven doesn't detect Scala tests, unlike sbt. Or I haven't figured
out how to make it do so yet.
On Jul 26, 2013 8:41 AM, "Jake Mannix" <ja...@gmail.com> wrote:

> Yep, that fixed it.  Are there any real tests?
>
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Running mahout.math.MatrixOpsTest
> Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec - in mahout.math.MatrixOpsTest
> Running mahout.math.VectorOpsTest
> Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec - in mahout.math.VectorOpsTest
>
> Results :
>
> Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
>
>
> >> > On Fri, Jul 26, 2013 at 12:07 AM, Dmitriy Lyubimov <dlieu.7@gmail.com
> >> >wrote:
> >> >
> >> > > Hello,
> >> > >
> >> > > i would like to put for discussion a proposal of adding a module
> >> > > mathout-math-scala to Mahout containing various scala DSLs for
> Mahout
> >> > > project.
> >> > >
> >> > > Here is what i have got so far :
> >> > >
> >> > >
> >>
> >>
> http://weatheringthrutechdays.blogspot.com/2013/07/scala-dsl-for-mahout-in-core-linear.html
> >> > >
> >> > > for now it is in-core stuff only, but it can also be used to script
> >> out
> >> > > driver pipelines for Mahout DRM and solvers. (Some code, in
> >> particular,
> >> > > tests may look ugly at the moment).
> >> > >
> >> > > By proposing it as a part of Mahout, I of course pursue some selfish
> >> goals:
> >> > > since the stuff covers a lot of Mahout matrix APIs, if I have it
> away
> >> from
> >> > > Mahout, i would be having hard time maintaining it in sync with
> Mahout
> >> as
> >> > > the project morphs its apis. So I want to make sure that committers
> >> run
> >> my
> >> > > tests too before committing new changes.
> >> > >
> >> > > (I am actually using this for spark-based solvers bsed on Mahout
> DRMs
> >> and
> >> > > to make it more accessible to our data scientists to work with -- at
> >> some
> >> > > point I hope to contribute spark ports of some Mahout work too).
> >> > >
> >> > > Respectfully,
> >> > > -Dmitriy
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> >
> >> >   -jake
> >>
> >
> >
> >
> > --
> >
> >   -jake
> >
>
>
>
> --
>
>   -jake
>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
Yep, that fixed it.  Are there any real tests?

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running mahout.math.MatrixOpsTest
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec -
in mahout.math.MatrixOpsTest
Running mahout.math.VectorOpsTest
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec -
in mahout.math.VectorOpsTest

Results :

Tests run: 0, Failures: 0, Errors: 0, Skipped: 0


On Fri, Jul 26, 2013 at 8:35 AM, Jake Mannix <ja...@gmail.com> wrote:

> I'm on your branch (dev-0.9.x-scala) but only doing a "mvn install" inside
> of the new module - maybe I need to do it from the top level?



-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
I'm on your branch (dev-0.9.x-scala) but only doing a "mvn install" inside
of the new module - maybe I need to do it from the top level?


On Fri, Jul 26, 2013 at 7:23 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> Thats  bit strange. Are you recompiling the whole mahout fork? Or just the
> scala  module? The oprimized multiplication has been added in this branch
> for sure; i may have not yet committed it yet to Mahout trunk. I need to
> check.



-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Jul 26, 2013 12:57 AM, "Jake Mannix" <ja...@gmail.com> wrote:
>
> Woohoo!  Awesome, I've forked you, and I'll start digging in soon.  At a
> high level, this looks great.  Not so sure about so many operators - I
> don't know that we really need to have such a weighty syntax (a %*% b),
> java devs are going to be much more familiar with simply doing a.times(b),
> and I don't think we should keep them from that.
>
> Quick question: I had a build error on your branch:
>
> [INFO] --- maven-scala-plugin:2.15.2:compile (default) @ mahout-math-scala
> ---
> [INFO] Checking for multiple versions of scala
> [WARNING]  Expected all dependencies to require Scala version: 2.9.3
> [WARNING]  org.apache.mahout:mahout-math-scala:0.9-SNAPSHOT requires scala
> version: 2.9.3
> [WARNING]  org.scalatest:scalatest_2.9.2:1.9.1 requires scala version: 2.9.2
> [WARNING] Multiple versions of scala libraries detected!
> [INFO] includes = [**/*.scala,**/*.java,]
> [INFO] excludes = []
> [INFO]
> /Users/jake/open_src/gitrepo/mahout-twitter/math-scala/src/main/scala:-1:
> info: compiling
> [INFO] Compiling 5 source files to
> /Users/jake/open_src/gitrepo/mahout-twitter/math-scala/target/classes at
> 1374825106823
> Downloading:
> http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.jar
> Downloaded:
> http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.jar(11260
> KB at 216.2 KB/sec)
> Downloading:
> http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.pom
> Downloaded:
> http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.pom(2
> KB at 1.6 KB/sec)
> [ERROR]
> /Users/jake/open_src/gitrepo/mahout-twitter/math-scala/src/main/scala/mahout/math/DiagonalOps.scala:14:
> error: value rightMult is not a member of
> org.apache.mahout.math.DiagonalMatrix

That's a bit strange. Are you recompiling the whole Mahout fork, or just the
Scala module? The optimized multiplication has definitely been added in this
branch; I may not have committed it to Mahout trunk yet. I need to check.

> [INFO]   def :%*%(that: Matrix) = m.rightMult(that)
> [INFO]                              ^
> [ERROR]
> /Users/jake/open_src/gitrepo/mahout-twitter/math-scala/src/main/scala/mahout/math/DiagonalOps.scala:16:
> error: value leftMult is not a member of
> org.apache.mahout.math.DiagonalMatrix
> [INFO]   def %*%:(that: Matrix) = m.leftMult(that)
> [INFO]                              ^
> [ERROR] two errors found
> [INFO]
> ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO]
> ------------------------------------------------------------------------

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
Woohoo!  Awesome, I've forked you, and I'll start digging in soon.  At a
high level, this looks great.  Not so sure about so many operators - I
don't know that we really need to have such a weighty syntax (a %*% b),
java devs are going to be much more familiar with simply doing a.times(b),
and I don't think we should keep them from that.

Quick question: I had a build error on your branch:

[INFO] --- maven-scala-plugin:2.15.2:compile (default) @ mahout-math-scala
---
[INFO] Checking for multiple versions of scala
[WARNING]  Expected all dependencies to require Scala version: 2.9.3
[WARNING]  org.apache.mahout:mahout-math-scala:0.9-SNAPSHOT requires scala
version: 2.9.3
[WARNING]  org.scalatest:scalatest_2.9.2:1.9.1 requires scala version: 2.9.2
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO]
/Users/jake/open_src/gitrepo/mahout-twitter/math-scala/src/main/scala:-1:
info: compiling
[INFO] Compiling 5 source files to
/Users/jake/open_src/gitrepo/mahout-twitter/math-scala/target/classes at
1374825106823
Downloading:
http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.jar
Downloaded:
http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.jar(11260
KB at 216.2 KB/sec)
Downloading:
http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.pom
Downloaded:
http://artifactory.local.twitter.com/repo/org/scala-lang/scala-compiler/2.9.3/scala-compiler-2.9.3.pom(2
KB at 1.6 KB/sec)
[ERROR]
/Users/jake/open_src/gitrepo/mahout-twitter/math-scala/src/main/scala/mahout/math/DiagonalOps.scala:14:
error: value rightMult is not a member of
org.apache.mahout.math.DiagonalMatrix
[INFO]   def :%*%(that: Matrix) = m.rightMult(that)
[INFO]                              ^
[ERROR]
/Users/jake/open_src/gitrepo/mahout-twitter/math-scala/src/main/scala/mahout/math/DiagonalOps.scala:16:
error: value leftMult is not a member of
org.apache.mahout.math.DiagonalMatrix
[INFO]   def %*%:(that: Matrix) = m.leftMult(that)
[INFO]                              ^
[ERROR] two errors found
[INFO]
------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO]
------------------------------------------------------------------------
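For readers puzzled by the two method names in the log above, here is a minimal sketch of how Scala treats operator names that end in a colon. The types Dense and Diag are hypothetical stand-ins, not Mahout's Matrix or DiagonalMatrix, and the choice of which side the diagonal matrix sits on for each operator is an assumption made for illustration only.

```scala
object ColonOperatorSketch {
  // Hypothetical stand-ins; not Mahout's Matrix or DiagonalMatrix.
  final case class Dense(rows: Vector[Vector[Double]])

  final case class Diag(d: Vector[Double]) {
    // diag * that: scale row i of `that` by d(i).
    def :%*%(that: Dense): Dense =
      Dense(that.rows.zip(d).map { case (row, s) => row.map(_ * s) })

    // that * diag: scale column j of `that` by d(j).
    // The name ends in ':', so `that %*%: diag` is invoked on `diag`.
    def %*%:(that: Dense): Dense =
      Dense(that.rows.map(row => row.zip(d).map { case (v, s) => v * s }))
  }

  val diag = Diag(Vector(2.0, 3.0))
  val dense = Dense(Vector(Vector(1.0, 1.0), Vector(1.0, 1.0)))

  val left = diag :%*% dense   // same as diag.:%*%(dense)
  val right = dense %*%: diag  // same as diag.%*%:(dense)
}
```

In Scala, an infix operator whose name ends in ':' binds to its right-hand operand, which is what lets a single wrapper class handle the case where the special matrix appears on either side of the product.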





-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Ted Dunning <te...@gmail.com>.
Let's stick with one convention or another, not invent a new one.

The good news is that a*b will often be the wrong shape so subsequent operations will fail.  I know that I still make this error even after using R for 10 years.  

Sent from my iPhone

On Jul 26, 2013, at 11:52, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> %*%, which incidentally gives us correct intuitive precedence of
> multiplication, but perhaps one might want to take this opportunity and
> make it a little simple, such as '*%" for example. or "**" even. However,
> just like your point said, i kept it in R-familiar form, just to keep it
> familiar for R users.

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Ted Dunning <te...@gmail.com>.
It is tempting to define a method x so that a x b would work.  

(Don't do this).  
(I am just making noise here). 

Sent from my iPhone

On Jul 26, 2013, at 11:52, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> BTW "a times b" is valid too since Mahout already implements it that way.
> All Mahout's Matrix's methods are obviously inherited, this is no more than
> a syntactic sugar.

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Yes, it looks very R-like.

Of course I had to make some deviations due to the nature of Scala, e.g. t(m)
becomes

m.t

or

 m t

and nrow(m) becomes m.nrow, etc.
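
As a rough illustration of how such R-flavoured members can be layered on top of an existing matrix type, here is a minimal sketch. Mat is a hypothetical toy type rather than Mahout's Matrix, and MatOps is an invented name. The sketch uses an implicit class, which assumes Scala 2.10 or later; on the 2.9.x line used in this thread the same effect would presumably need an implicit conversion method instead.

```scala
object RLikeSyntaxSketch {
  // Toy dense matrix, a hypothetical stand-in for Mahout's Matrix type.
  final case class Mat(rows: Vector[Vector[Double]]) {
    def numRows: Int = rows.length
    def numCols: Int = if (rows.isEmpty) 0 else rows.head.length
    def transpose: Mat = Mat(rows.transpose)
  }

  // Enrichment adding R-flavoured members to Mat without modifying it.
  implicit class MatOps(val self: Mat) {
    def t: Mat = self.transpose   // R: t(m)
    def nrow: Int = self.numRows  // R: nrow(m)
    def ncol: Int = self.numCols  // R: ncol(m)
  }

  val m = Mat(Vector(Vector(1.0, 2.0, 3.0), Vector(4.0, 5.0, 6.0)))
  val mt = m.t
}
```

The postfix form "m t" relies on Scala's postfix operator notation, which newer Scala versions only accept with scala.language.postfixOps enabled, so the dotted form m.t is the safer one to copy.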



On Fri, Jul 26, 2013 at 9:58 AM, Sebastian Schelter <ss...@apache.org> wrote:

> Just as a sidenote,  SystemML [1] offers a similar language and also choose
> the R Syntax.
>
>
> [1] http://people.cs.uchicago.edu/~vikass/*SystemML*.pdf

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Sebastian Schelter <ss...@apache.org>.
Just as a side note, SystemML [1] offers a similar language and also chose
the R syntax.


[1] http://people.cs.uchicago.edu/~vikass/*SystemML*.pdf


Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Fri, Jul 26, 2013 at 5:12 AM, Jake Mannix <ja...@gmail.com> wrote:

> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <te...@gmail.com>
> wrote:
>
> > This sounds great in principle.  I haven't seen any details yet (haven't
> > had time to look).
> >
> > Is there a strong reason to go with the R syntax for multiplication
> instead
> > of the matlab convention that a*b means a.times(b)?
> >
>
> +1
>
> a * b being pointwise products will confuse any mathematician in the
> audience.
>

I think consensus is that it is really a matter of religion and highly
depends on whether you come from R side or Matlab side. %*% ascends to R/S3
mythology and people coming from there  actually find it very familiar. So
it is really a matter of an individual opinion whether you believe Matlab
or R are more "popular".

I personally consider %*% quite quirky myself, but having spent a lot of
time with R, i know how powerful a habit is.

Another consideration here is that if we think of all element-wise *,/,+,-
as same class of "primitive operators"  and matrix multiplication as an
"advanced" operator, and noting that there's no such thing as "advanced +,
-, /" then it seems intuitive to reserve *,/,+,- exclusively for
"primitive" operations and having "advanced *" as a special case. In that
sense, my personal opinion is that Matlab approach with .* and ./ and * is
a bit counter-intuitive and I like R approach a little better. Besides, I
am not sure if scala would support ".*" operator, and even if it did, it
would screw its precedence.

However, i did have a thought that maybe there's a case for simplifying
%*%, which incidentally gives us the correct intuitive precedence of
multiplication, but perhaps one might want to take this opportunity and
make it a little simpler, such as "*%" for example, or even "**". However,
in line with your point, i kept it in the R-familiar form, just to keep it
familiar for R users.
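
A minimal sketch of the precedence point, assuming a hypothetical toy type `E` that only records how an expression was parsed (it is not part of Mahout or of the proposed DSL). Scala takes an infix operator's precedence from its first character, so an operator starting with '%' or '*' binds tighter than '+' and '-':

```scala
// Hypothetical toy type: each operator just records the parse as a string.
final case class E(repr: String) {
  def -(that: E): E = E("(" + repr + " - " + that.repr + ")")
  def *(that: E): E = E("(" + repr + " * " + that.repr + ")")
  def %*%(that: E): E = E("(" + repr + " %*% " + that.repr + ")")
}

object PrecedenceDemo {
  val a = E("a")
  val b = E("b")
  val c = E("c")

  // '%' sits on the same precedence level as '*' and '/', above '-'.
  val mixed = a - b %*% c
  // Same precedence level, both left-associative, so grouping runs left to right.
  val chained = a %*% b * c

  def main(args: Array[String]): Unit = {
    println(mixed.repr)    // (a - (b %*% c))
    println(chained.repr)  // ((a %*% b) * c)
  }
}
```

This is why an expression like a - u %*% v needs no extra parentheses around the product.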

BTW "a times b" is valid too since Mahout already implements it that way.
All Mahout's Matrix's methods are obviously inherited, this is no more than
a syntactic sugar.


>
> >
> >
> > On Fri, Jul 26, 2013 at 12:07 AM, Dmitriy Lyubimov <dlieu.7@gmail.com
> > >wrote:
> >
> > > Hello,
> > >
> > > i would like to put for discussion a proposal of adding a module
> > > mahout-math-scala to Mahout containing various scala DSLs for Mahout
> > > project.
> > >
> > > Here is what i have got so far :
> > >
> > >
> >
> http://weatheringthrutechdays.blogspot.com/2013/07/scala-dsl-for-mahout-in-core-linear.html
> > >
> > > for now it is in-core stuff only, but it can also be used to script out
> > > driver pipelines for Mahout DRM and solvers. (Some code, in particular,
> > > tests may look ugly at the moment).
> > >
> > > By proposing it as a part of Mahout, I of course pursue some selfish
> > goals:
> > > since the stuff covers a lot of Mahout matrix APIs, if I have it away
> > from
> > > Mahout, i would be having hard time maintaining it in sync with Mahout
> as
> > > the project morphs its apis. So I want to make sure that committers run
> > my
> > > tests too before committing new changes.
> > >
> > > (I am actually using this for spark-based solvers based on Mahout DRMs
> and
> > > to make it more accessible to our data scientists to work with -- at
> some
> > > point I hope to contribute spark ports of some Mahout work too).
> > >
> > > Respectfully,
> > > -Dmitriy
> > >
> >
>
>
>
> --
>
>   -jake
>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <te...@gmail.com> wrote:

> This sounds great in principle.  I haven't seen any details yet (haven't
> had time to look).
>
> Is there a strong reason to go with the R syntax for multiplication instead
> of the matlab convention that a*b means a.times(b)?
>

+1

a * b being pointwise products will confuse any mathematician in the
audience.


>
>
> On Fri, Jul 26, 2013 at 12:07 AM, Dmitriy Lyubimov <dlieu.7@gmail.com
> >wrote:
>
> > Hello,
> >
> > i would like to put for discussion a proposal of adding a module
> > mahout-math-scala to Mahout containing various scala DSLs for Mahout
> > project.
> >
> > Here is what i have got so far :
> >
> >
> http://weatheringthrutechdays.blogspot.com/2013/07/scala-dsl-for-mahout-in-core-linear.html
> >
> > for now it is in-core stuff only, but it can also be used to script out
> > driver pipelines for Mahout DRM and solvers. (Some code, in particular,
> > tests may look ugly at the moment).
> >
> > By proposing it as a part of Mahout, I of course pursue some selfish
> goals:
> > since the stuff covers a lot of Mahout matrix APIs, if I have it away
> from
> > Mahout, i would be having hard time maintaining it in sync with Mahout as
> > the project morphs its apis. So I want to make sure that committers run
> my
> > tests too before committing new changes.
> >
> > (I am actually using this for spark-based solvers based on Mahout DRMs and
> > to make it more accessible to our data scientists to work with -- at some
> > point I hope to contribute spark ports of some Mahout work too).
> >
> > Respectfully,
> > -Dmitriy
> >
>



-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Robin East <ro...@xense.co.uk>.
> Aside from that, it seems lapack backend is running up to 5x slower on amd
> hardware that our company unfortunately chose to invest in... argh!..
Is Lapack properly tuned for the hardware? Can make a big difference.
Sent from my iPhone

On 27 Jul 2013, at 14:10, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> On Jul 26, 2013 11:56 PM, "Nick Pentreath" <ni...@gmail.com> wrote:
>> 
>> Thanks for the update on that PR I will definitely take a look.
>> 
>> 
>> I wonder if they will run into the exact same Colt issues as mahout did?!
> 
> Yes i wondered that too since the day i saw spark als example.
> 
> Jblas is far better choice but as Sebastian has demonstrated bona fide
> improvements are hard to achieve due to high jni costs, so i would actually
> have a specific type of matrix to solve specific problems when needed rather
> than sweepingly generalize it as a dense vector or matrix support.
> 
> Aside from that, it seems lapack backend is running up to 5x slower on amd
> hardware that our company unfortunately chose to invest in... argh!..
> 
>> 
>> 
>> This DSL looks great, I'm gonna play around with it as soon as I get a
> chance.
>> 
>> 
>> 
>> One question - breeze has quite a similar syntax that is a bit simpler in
> some ways - basically * for matrix multiply and :* for elementwise. Would
> something similar work here?
> 
> As i commented before, it just caters to R syntax, along with bunch of
> other things. If we believe that there is a reason to inherit syntax vs
> devising something new, then there are really few candidates, and i dont
> think Breeze is going to cut it based on adoption level.
> 
> In particular, in my company it is hard to convince R users to start using
> scala or java as it is, so I am just scoring points here by making it look
> familiar to them.
> 
> Also i want to reserve the colon to command associativity of operation, as
> scala means it, which is important for optimizing non commutative
> operations such as elementwise division or matrix multiplication. E.g.
> there are significant performance differences between saying
> 
> A %*% diagonal === A.times(diagonal)
> 
> And
> 
> A %*%: diagonal === diagonal.timesLeft(A).
> 
> Obviously the latter is n flops and the former is n squared.
> 
> I dont think breeze made a wise decision by putting a special functional
> meaning into :  . It is reserved for associativity in scala.
> 
>> 
>> 
>> Would be quite nice to have same syntax but different backends that are
> swappable ;)
>> —
>> Sent from Mailbox for iPhone
>> 
>> On Sat, Jul 27, 2013 at 2:42 AM, Dmitriy Lyubimov <dl...@gmail.com>
>> wrote:
>> 
>>> coincidentally, spark mlib just posted a pull request intended to add
>>> support for dense and sparse vectors, looks quite similar.
>>> https://github.com/mesos/spark/pull/736. They seem to choose JBlas
> backing
>>> for dense stuff (although at a vector level there's probably not much
>>> reason to) and as-is Colt for sparse stuff.
>>> On Fri, Jul 26, 2013 at 5:20 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
>>>> 
>>>> 
>>>> 
>>>> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <ted.dunning@gmail.com
>> wrote:
>>>> 
>>>>> This sounds great in principle.  I haven't seen any details yet
> (haven't
>>>>> had time to look).
>>>>> 
>>>>> Is there a strong reason to go with the R syntax for multiplication
>>>>> instead
>>>>> of the matlab convention that a*b means a.times(b)?
>>>> 
>>>> As discussed, but also because matlab style elementwise operators are
>>>> impossible to keep at proper precedence level in scala. It kind of has
> to
>>>> start with either '*' or '%' to keep proper precedence, '.*' will not
> work
>>>> unfortunately. And mix along the lines "some of Matlab, some of perhaps
>>>> completely something else' does not seem appealing at all.
>>>> 
>>>> 

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Ted Dunning <te...@gmail.com>.
That isn't at all obvious.  Matrix.times() can and should check for a sparse through the normal vector ops.  If it doesn't, that really is a bug.  

Sent from my iPhone

On Jul 27, 2013, at 6:10, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> A %*% diagonal === A.times(diagonal)
> 
> And
> 
> A %*%: diagonal === diagonal.timesLeft(A).
> 
> Obviously the latter is n flops and the former is n squared.
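
A minimal sketch of the dispatch being argued about, assuming hypothetical toy types (`Dense`, `Diag`) rather than Mahout's own classes. A `times` that notices a diagonal right-hand operand can simply scale column j by d(j); for a dense n x n left operand that is n^2 multiplications, against n^3 for the generic dense product:

```scala
// Hypothetical toy matrix types; these are not Mahout classes.
sealed trait Mat
final case class Dense(rows: Vector[Vector[Double]]) extends Mat
final case class Diag(d: Vector[Double]) extends Mat

object DiagTimes {
  /** A times B, with a fast path when B is diagonal. */
  def times(a: Dense, b: Mat): Dense = b match {
    case Diag(d) =>
      // Column j of A is scaled by d(j): one multiplication per entry of A.
      Dense(a.rows.map(r => r.zip(d).map { case (x, s) => x * s }))
    case Dense(bRows) =>
      // Generic dense product: one dot product per output entry.
      val cols = bRows.transpose
      Dense(a.rows.map(r => cols.map(c => r.zip(c).map { case (x, y) => x * y }.sum)))
  }

  /** Expands a diagonal matrix into dense form, for comparison only. */
  def toDense(m: Diag): Dense = {
    val n = m.d.length
    Dense(Vector.tabulate(n, n)((i, j) => if (i == j) m.d(i) else 0.0))
  }
}
```

With the check inside `times`, both spellings of the product can take the cheap path, so the operator choice would not have to carry the optimization.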

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Ted Dunning <te...@gmail.com>.
So getting much better map reduce count probably requires that distributed ops be implemented lazily and then reorganized when they actually run.  Is there any appetite for that?
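
A minimal sketch of what "implemented lazily and then reorganized" could look like, assuming a hypothetical logical-plan tree (`Expr`, `Leaf`, `T`, `Times`, `AtA` are illustrative names, not an existing Mahout API). Operators only build the tree; a rewrite pass runs just before execution and can replace a transpose followed by a product with a single fused node that an executor could, in principle, run in fewer passes over the data:

```scala
// Hypothetical logical-plan nodes; nothing is computed when they are built.
sealed trait Expr
final case class Leaf(name: String) extends Expr
final case class T(a: Expr) extends Expr               // transpose
final case class Times(a: Expr, b: Expr) extends Expr  // matrix product
final case class AtA(a: Expr) extends Expr             // fused A' * A

object LazyPlan {
  /** Bottom-up rewrite applied just before the plan is executed. */
  def optimize(e: Expr): Expr = e match {
    case Times(l, r) =>
      (optimize(l), optimize(r)) match {
        case (T(x), y) if x == y => AtA(x)      // A' * A becomes one fused operator
        case (ol, or)            => Times(ol, or)
      }
    case T(a) =>
      optimize(a) match {
        case T(x) => x                          // (A')' is just A
        case oa   => T(oa)
      }
    case AtA(a)     => AtA(optimize(a))
    case leaf: Leaf => leaf
  }
}
```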

Sent from my iPhone

On Jul 27, 2013, at 8:00, Jake Mannix <ja...@gmail.com> wrote:

> But yeah, maybe we'll just be looking at two different focuses on this: I
> really care more about writing nicer MR pipelines for our jobs (I've
> already played around with a nice replacement for seq2sparse in a single
> small scalding job with modular components, it's about 1/10th the number of
> lines of our current one, with most of the functionality), and getting a
> nice integrated REPL for playing with the results.

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Ted Dunning <te...@gmail.com>.
On Mon, Jul 29, 2013 at 12:39 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:

> so i guess my action item would be to separate support for syntaxes but i
> will leave out support for elementwise * and / until this is settled...
>

This suddenly sounds very wise.  Let the issue ferment a bit and get more
people to have experience with the code.

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
ok, thank you, Ted.

It looks like it makes sense to provide options here and i will do that.

i am still not clear what is wanted for elementwise products on the
matlab side though. If it is a method then it can be just made part of
the Matrix interface.

Note that there are three types of element-wise problems though that are
handled by operators:

1) simple elementwise, e.g. a * b, which produces a new product object;
2) in-place elementwise, e.g. a /= b (currently handled by a.assign(b,...));
3) right-associative in-place elementwise, such as 1 /=: x (assign 1/x to x
and return x).

suppose we take a different elementwise signature, say, *@ to keep correct
scala precedence, then it will look like
1) a *@ b
2) a *@= b
3) a *@=: b (or 1 /@=: x)

even if 1..3 were replaced by methods, I'd need 3 method names, not one. More
than that, we already have assign() to handle that, albeit in a long form.
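
A minimal sketch of the three forms, assuming a hypothetical toy `Vec` with mutable storage (not Mahout's Vector). The operator names are the ones used in the list above; the point is only that each form needs its own method because they differ in what gets mutated and which operand receives the call:

```scala
// Hypothetical toy vector with mutable storage; not Mahout's Vector.
final class Vec(val data: Array[Double]) {
  // 1) Elementwise product returning a new object; operands are untouched.
  def *(that: Vec): Vec =
    new Vec(data.zip(that.data).map { case (x, y) => x * y })

  // 2) In-place elementwise division: overwrites this vector and returns it.
  def /=(that: Vec): this.type = {
    for (i <- data.indices) data(i) = data(i) / that.data(i)
    this
  }

  // 3) Ends in ':', so it is right-associative: the scalar is written on the
  //    left, the vector on the right receives the call and is overwritten.
  def /=:(s: Double): this.type = {
    for (i <- data.indices) data(i) = s / data(i)
    this
  }
}

object ElementwiseForms {
  val a = new Vec(Array(2.0, 4.0))
  val b = new Vec(Array(4.0, 8.0))

  val product = a * b  // new vector (8.0, 32.0)
  b /= a               // b is now (2.0, 2.0)
  1.0 /=: b            // b is now (0.5, 0.5)
}
```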

so i guess my action item would be to separate support for syntaxes but i
will leave out support for elementwise * and / until this is settled...



On Sun, Jul 28, 2013 at 6:17 PM, Ted Dunning <te...@gmail.com> wrote:

> After letting this soak for a bit, I would tend to prefer either full-on R
> (less preferred) or Matlab with defects (more preferred).  Matlab with
> defects would use * for matrix multiplication and have a method name for
> element by element product.
>
> It is fine to have special syntax modules, but I think that this isn't all
> that big a deal.
>
>
> On Sun, Jul 28, 2013 at 11:17 AM, Dmitriy Lyubimov <dlieu.7@gmail.com
> >wrote:
>
> > yeah. we are out of luck with matlab syntax.
> >
> > *&, *|, *^, *%, *#, *@, *~, *?, *!, *>, *<, *\ all work . '*.' or '*,'
> will
> > not work. "*:" or ":*" have special meaning.
> >
> >
> > On Sun, Jul 28, 2013 at 10:58 AM, Dmitriy Lyubimov <dlieu.7@gmail.com
> > >wrote:
> >
> > > FWIW,
> > >
> > > one approach might be to separate DSL into several. E.g. RLikeOps and
> > > MatlabLikeOps or WhateverOps, none of which is imported by default. and
> > > then the code would have to say "import RLikeOps._" to enable R-like
> DSL,
> > > and vice versa.
> > >
> > > But matlab style '*.' symbol unfortunately doesn't seem to work in
> scala
> > > without backquotes. apparently scala treats '.' as a keyword and can't
> > > reduce it as a part of anything else.
> > >
> > >
> > > On Sat, Jul 27, 2013 at 6:43 PM, Dmitriy Lyubimov <dlieu.7@gmail.com
> > >wrote:
> > >
> > >>
> > >>
> > >>
> > >> On Sat, Jul 27, 2013 at 6:31 PM, Dmitriy Lyubimov <dlieu.7@gmail.com
> > >wrote:
> > >>
> > >>>
> > >>>
> > >>>
> > >>> >
> > >>>> > diagv(1 /: s)
> > >>>> >
> > >>>>
> > >>>> But since this is just the inverse of the matrix, and I imagine it's
> > >>>> actually
> > >>>> clearer to do just diagv(s).inverse instead of diagv(1 /: s)
> > >>>>
> > >>>>
> > >>> Well. DSL is just the icing. Nobody's taking the cake away.
> > >>>
> > >>> in a sense that, once/if/when Mahout supports inverse(), it would be
> > >>> exactly how one might use it. DSL is not about implementation, it is
> > about
> > >>> semantic sugar only. It only maps to what exists.
> > >>>
> > >>> On a side note, it never actually occurred to me to call pinv() or
> > >>> solve() on a diagonal matrix. Or orthonormal for that matter. Their
> > >>> identities are so appealing it kind of becomes second nature after
> some
> > >>> time. the only use for solve() i had is actually for solving linear
> > >>> equations. In my R prototype for SSVD [1] one will find exactly the
> > same
> > >>> style code, i.e.  diag(1/e$values) .
> > >>>
> > >>> pardon, this should read "non-singular" of course, an honest typo.
> > >>
> > >>
> > >>> Even then you probably actually want leftInverse() and
> rightInverse(),
> > >>> not just inverse, which is only defined for *non *singular square
> > >>> matrices and would be equal right and left inverses in that case.
> Which
> > >>> oddly enough brings us back to left-associative and right-associative
> > >>> operations.
> > >>>
> > >>> [1]
> > >>>
> >
> https://cwiki.apache.org/confluence/download/attachments/27832158/ssvd.R?version=1&modificationDate=1323358453000
> > >>>
> > >>>
> > >>
> > >
> >
>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Ted Dunning <te...@gmail.com>.
After letting this soak for a bit, I would tend to prefer either full-on R
(less preferred) or Matlab with defects (more preferred).  Matlab with
defects would use * for matrix multiplication and have a method name for
element by element product.

It is fine to have special syntax modules, but I think that this isn't all
that big a deal.


On Sun, Jul 28, 2013 at 11:17 AM, Dmitriy Lyubimov <dl...@gmail.com>wrote:

> yeah. we are out of luck with matlab syntax.
>
> *&, *|, *^, *%, *#, *@, *~, *?, *!, *>, *<, *\ all work . '*.' or '*,' will
> not work. "*:" or ":*" have special meaning.
>
>
> On Sun, Jul 28, 2013 at 10:58 AM, Dmitriy Lyubimov <dlieu.7@gmail.com
> >wrote:
>
> > FWIW,
> >
> > one approach might be to separate DSL into several. E.g. RLikeOps and
> > MatlabLikeOps or WhateverOps, none of which is imported by default. and
> > then the code would have to say "import RLikeOps._" to enable R-like DSL,
> > and vice versa.
> >
> > But matlab style '*.' symbol unfortunately doesn't seem to work in scala
> > without backquotes. apparently scala treats '.' as a keyword and can't
> > reduce it as a part of anything else.
> >
> >
> > On Sat, Jul 27, 2013 at 6:43 PM, Dmitriy Lyubimov <dlieu.7@gmail.com
> >wrote:
> >
> >>
> >>
> >>
> >> On Sat, Jul 27, 2013 at 6:31 PM, Dmitriy Lyubimov <dlieu.7@gmail.com
> >wrote:
> >>
> >>>
> >>>
> >>>
> >>> >
> >>>> > diagv(1 /: s)
> >>>> >
> >>>>
> >>>> But since this is just the inverse of the matrix, and I imagine it's
> >>>> actually
> >>>> clearer to do just diagv(s).inverse instead of diagv(1 /: s)
> >>>>
> >>>>
> >>> Well. DSL is just the icing. Nobody's taking the cake away.
> >>>
> >>> in a sense that, once/if/when Mahout supports inverse(), it would be
> >>> exactly how one might use it. DSL is not about implementation, it is
> about
> >>> semantic sugar only. It only maps to what exists.
> >>>
> >>> On a side note, it never actually occurred to me to call pinv() or
> >>> solve() on a diagonal matrix. Or orthonormal for that matter. Their
> >>> identities are so appealing it kind of becomes second nature after some
> >>> time. the only use for solve() i had is actually for solving linear
> >>> equations. In my R prototype for SSVD [1] one will find exactly the
> same
> >>> style code, i.e.  diag(1/e$values) .
> >>>
> >>> pardon, this should read "non-singular" of course, an honest typo.
> >>
> >>
> >>> Even then you probably actually want leftInverse() and rightInverse(),
> >>> not just inverse, which is only defined for *non *singular square
> >>> matrices and would be equal right and left inverses in that case. Which
> >>> oddly enough brings us back to left-associative and right-associative
> >>> operations.
> >>>
> >>> [1]
> >>>
> https://cwiki.apache.org/confluence/download/attachments/27832158/ssvd.R?version=1&modificationDate=1323358453000
> >>>
> >>>
> >>
> >
>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
yeah. we are out of luck with matlab syntax.

*&, *|, *^, *%, *#, *@, *~, *?, *!, *>, *<, *\ all work. '*.' or '*,' will
not work. "*:" or ":*" have special meaning.


On Sun, Jul 28, 2013 at 10:58 AM, Dmitriy Lyubimov <dl...@gmail.com>wrote:

> FWIW,
>
> one approach might be to separate DSL into several. E.g. RLikeOps and
> MatlabLikeOps or WhateverOps, none of which is imported by default. and
> then the code would have to say "import RLikeOps._" to enable R-like DSL,
> and vice versa.
>
> But matlab style '*.' symbol unfortunately doesn't seem to work in scala
> without backquotes. apparently scala treats '.' as a keyword and can't
> reduce it as a part of anything else.
>
>
> On Sat, Jul 27, 2013 at 6:43 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>
>>
>>
>>
>> On Sat, Jul 27, 2013 at 6:31 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>>
>>>
>>>
>>>
>>> >
>>>> > diagv(1 /: s)
>>>> >
>>>>
>>>> But since this is just the inverse of the matrix, and I imagine it's
>>>> actually
>>>> clearer to do just diagv(s).inverse instead of diagv(1 /: s)
>>>>
>>>>
>>> Well. DSL is just the icing. Nobody's taking the cake away.
>>>
>>> in a sense that, once/if/when Mahout supports inverse(), it would be
>>> exactly how one might use it. DSL is not about implementation, it is about
>>> semantic sugar only. It only maps to what exists.
>>>
>>> On a side note, it never actually occurred to me to call pinv() or
>>> solve() on a diagonal matrix. Or orthonormal for that matter. Their
>>> identities are so appealing it kind of becomes second nature after some
>>> time. the only use for solve() i had is actually for solving linear
>>> equations. In my R prototype for SSVD [1] one will find exactly the same
>>> style code, i.e.  diag(1/e$values) .
>>>
>>> pardon, this should read "non-singular" of course, an honest typo.
>>
>>
>>> Even then you probably actually want leftInverse() and rightInverse(),
>>> not just inverse, which is only defined for *non *singular square
>>> matrices and would be equal right and left inverses in that case. Which
>>> oddly enough brings us back to left-associative and right-associative
>>> operations.
>>>
>>> [1]
>>> https://cwiki.apache.org/confluence/download/attachments/27832158/ssvd.R?version=1&modificationDate=1323358453000
>>>
>>>
>>
>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
FWIW,

one approach might be to separate the DSL into several. E.g. RLikeOps and
MatlabLikeOps or WhateverOps, none of which is imported by default, and
then the code would have to say "import RLikeOps._" to enable the R-like DSL,
and vice versa.
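
A minimal sketch of that arrangement, assuming a hypothetical `ToyMatrix` stand-in that defines no symbolic operators of its own. `RLikeOps` and `MatlabLikeOps` are the names floated above; the implicit classes inside them are illustrative. Each syntax only becomes available in the scope where its object is imported:

```scala
// Hypothetical toy matrix; it deliberately has named methods only.
final case class ToyMatrix(rows: Vector[Vector[Double]]) {
  def times(that: ToyMatrix): ToyMatrix = {
    val cols = that.rows.transpose
    ToyMatrix(rows.map(r => cols.map(c => r.zip(c).map { case (x, y) => x * y }.sum)))
  }
  def hadamard(that: ToyMatrix): ToyMatrix =
    ToyMatrix(rows.zip(that.rows).map { case (r, s) => r.zip(s).map { case (x, y) => x * y } })
}

object RLikeOps {
  implicit class RLikeMatrixOps(m: ToyMatrix) {
    def %*%(that: ToyMatrix): ToyMatrix = m.times(that)   // matrix product
    def *(that: ToyMatrix): ToyMatrix = m.hadamard(that)  // elementwise product
  }
}

object MatlabLikeOps {
  implicit class MatlabLikeMatrixOps(m: ToyMatrix) {
    def *(that: ToyMatrix): ToyMatrix = m.times(that)     // matrix product
  }
}

object SyntaxDemo {
  val a = ToyMatrix(Vector(Vector(1.0, 2.0), Vector(3.0, 4.0)))
  val b = ToyMatrix(Vector(Vector(0.0, 1.0), Vector(1.0, 0.0)))

  val rProduct = { import RLikeOps._; a %*% b }
  val rElementwise = { import RLikeOps._; a * b }
  val matlabProduct = { import MatlabLikeOps._; a * b }
}
```

Importing both objects into the same scope would make `a * b` ambiguous, so the imports are kept in separate blocks here.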

But the matlab-style '*.' symbol unfortunately doesn't seem to work in scala
without backquotes. apparently scala treats '.' as a keyword and can't
reduce it as a part of anything else.


On Sat, Jul 27, 2013 at 6:43 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

>
>
>
> On Sat, Jul 27, 2013 at 6:31 PM, Dmitriy Lyubimov <dl...@gmail.com>wrote:
>
>>
>>
>>
>> >
>>> > diagv(1 /: s)
>>> >
>>>
>>> But since this is just the inverse of the matrix, and I imagine it's
>>> actually
>>> clearer to do just diagv(s).inverse instead of diagv(1 /: s)
>>>
>>>
>> Well. DSL is just the icing. Nobody's taking the cake away.
>>
>> in a sense that, once/if/when Mahout supports inverse(), it would be
>> exactly how one might use it. DSL is not about implementation, it is about
>> semantic sugar only. It only maps to what exists.
>>
>> On a side note, it never actually occurred to me to call pinv() or
>> solve() on a diagonal matrix. Or orthonormal for that matter. Their
>> identities are so appealing it kind of becomes second nature after some
>> time. the only use for solve() i had is actually for solving linear
>> equations. In my R prototype for SSVD [1] one will find exactly the same
>> style code, i.e.  diag(1/e$values) .
>>
>> pardon, this should read "non-singular" of course, an honest typo.
>
>
>> Even then you probably actually want leftInverse() and rightInverse(),
>> not just inverse, which is only defined for *non *singular square
>> matrices and would be equal right and left inverses in that case. Which
>> oddly enough brings us back to left-associative and right-associative
>> operations.
>>
>> [1]
>> https://cwiki.apache.org/confluence/download/attachments/27832158/ssvd.R?version=1&modificationDate=1323358453000
>>
>>
>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Sat, Jul 27, 2013 at 6:31 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

>
>
>
> >
>> > diagv(1 /: s)
>> >
>>
>> But since this is just the inverse of the matrix, and I imagine it's
>> actually
>> clearer to do just diagv(s).inverse instead of diagv(1 /: s)
>>
>>
> Well. DSL is just the icing. Nobody's taking the cake away.
>
> in a sense that, once/if/when Mahout supports inverse(), it would be
> exactly how one might use it. DSL is not about implementation, it is about
> semantic sugar only. It only maps to what exists.
>
> On a side note, it never actually occurred to me to call pinv() or solve()
> on a diagonal matrix. Or orthonormal for that matter. Their identities are
> so appealing it kind of becomes second nature after some time. the only use
> for solve() i had is actually for solving linear equations. In my R
> prototype for SSVD [1] one will find exactly the same style code, i.e.
>  diag(1/e$values) .
>
> pardon, this should read "non-singular" of course, an honest typo.


> Even then you probably actually want leftInverse() and rightInverse(), not
> just inverse, which is only defined for *non *singular square matrices
> and would be equal right and left inverses in that case. Which oddly enough
> brings us back to left-associative and right-associative operations.
>
> [1]
> https://cwiki.apache.org/confluence/download/attachments/27832158/ssvd.R?version=1&modificationDate=1323358453000
>
>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
>
> > diagv(1 /: s)
> >
>
> But since this is just the inverse of the matrix, and I imagine it's
> actually
> clearer to do just diagv(s).inverse instead of diagv(1 /: s)
>
>
Well. DSL is just the icing. Nobody's taking the cake away.

in a sense that, once/if/when Mahout supports inverse(), it would be
exactly how one might use it. DSL is not about implementation, it is about
semantic sugar only. It only maps to what exists.

On a side note, it never actually occurred to me to call pinv() or solve()
on a diagonal matrix. Or orthonormal for that matter. Their identities are
so appealing it kind of becomes second nature after some time. the only use
for solve() i had is actually for solving linear equations. In my R
prototype for SSVD [1] one will find exactly the same style code, i.e.
 diag(1/e$values) .
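
A minimal sketch of the identity in question, assuming hypothetical toy helpers (`Values`, and a `diagv` that only mimics the DSL function named in the thread). The inverse of a diagonal matrix with non-zero entries is the diagonal matrix of elementwise reciprocals, which is what the right-associative `1 /: s` expresses:

```scala
// Hypothetical toy wrapper for a vector of singular values.
final case class Values(v: Vector[Double]) {
  // Ends in ':', so `num /: s` calls this method on s with num as the argument.
  def /:(num: Double): Values = Values(v.map(x => num / x))
}

object DiagInverse {
  type Rows = Vector[Vector[Double]]

  /** Builds a dense diagonal matrix from a vector (illustrative only). */
  def diagv(d: Values): Rows =
    Vector.tabulate(d.v.length, d.v.length)((i, j) => if (i == j) d.v(i) else 0.0)

  /** Plain dense product, used here only to check the identity. */
  def times(a: Rows, b: Rows): Rows = {
    val cols = b.transpose
    a.map(r => cols.map(c => r.zip(c).map { case (x, y) => x * y }.sum))
  }

  val s = Values(Vector(4.0, 2.0, 0.5))
  val sigmaInv = diagv(1.0 /: s)            // diag(0.25, 0.5, 2.0)
  val product = times(diagv(s), sigmaInv)   // the 3 x 3 identity
}
```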

Even then you probably actually want leftInverse() and rightInverse(), not
just inverse, which is only defined for non-singular square matrices and would
be equal to both the right and left inverses in that case. Which oddly enough
brings us back to left-associative and right-associative operations.

[1]
https://cwiki.apache.org/confluence/download/attachments/27832158/ssvd.R?version=1&modificationDate=1323358453000

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
On Sat, Jul 27, 2013 at 1:53 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> Can you show me some examples of where I'd *want* to do the "wrong thing"
> > from an associativity standpoint?  "5 - x" where x is a vector, is kinda
> > weird.
> > But maybe you're subtracting off a mean or something, but then I'd
> probably
> > write this as "- (x - 5)", because I always associate left to right. :)
> >
>
> I actually haven't written support for unary_- , that's something i really
> haven't encountered (yet) until you said it :)
>

Heh, yeah, that's something I can actually imagine using. :)


> While there's a unary negation operator, there's no unary inversion
> operator (1/x). This is actually an example of a fairly frequent operation
> over vectors.
>
> In my in-core ssvd example, i use it to compute inverse of a diagonal
> matrix (Sigma)^-1 (or any inverse of a diagonal matrix for that matter):
>
> diagv(1 /: s)
>

But this is just the inverse of the matrix, and I imagine it's actually
clearer to just do diagv(s).inverse instead of diagv(1 /: s).


> where s is the vector of singular values.
>
> elementwise subtraction is fairly frequent when comparing matrices and
> making approximation asserts, e.g. in svd test something like
>
> assert  (a - u %*% diagv(s) %*% v.t ) norm <= 1e-10
>

Like I was saying: elementwise addition and subtraction *are* matrix-wise
operations, so there's no ambiguity.  If a and s are matrices, (a - s) is
completely clear.


> I actually wouldn't know how to encode elementwise inversion if i cannot
> use colon.
>
> Use of colon with operations in scala actually IMO is quite intuitive in a
> sense that the side that has it, points to "this" and the other side points
> to "that".
>
> The assumption is that use of the colon must not change the result of
> operation (be functionally equivalent), just like in many cases foldLeft is
> equivalent to foldRight, i.e.
>
> a * b === a :* b === a *: b,
>
> it follows that :* means "timesRight" or "solveRight", in Mahout speak,
>  and *: means "timesLeft", or "solveLeft".
>
> FWIW I guess it does take a few minutes to settle (it took a few with me at
> least), but after that it seems pretty intuitive.
>

Yeah, I can imagine getting used to it.  Haven't yet, but I can imagine it.
:)


>  >
> >
> > > and putting completely different functional meaning into :* and * will
> > > confuse scala users to no end who got used to things like :/ and /: .
> > This
> > > all needs striking a subtle balance unfortunately.
> > >
> >
> > Ok, then like I said, maybe I'll just defer to your judgement on the
> > operator
> > syntax, as I've *never* gotten used to the scala :/ and /: uses.   I
> prefer
> > method calls to method calls masquerading as native operators.  Maybe
> > I should Stop Being Afraid and Learn to Love the DSL, but I'm not quite
> > there yet: Too Much Magic. :)
> >
> >
> > >
> > > as i said before, i am not hung on %*% syntax, but i don't think doing
> :*
> > > or .* for elementwise would work on scala.
> > >
> >
> > How often do we really do elementwise matrix operations?  Is this really
> > a thing we often want to worry about?  addition and subtraction, sure,
> but
> >
>
> Use of Hadamard product is indeed rare (i had only one case with the PCA
> pipeline), but elementwise - and / seems to be popular with me. Regardless,
> it is done for completeness; and if it is done at all, imo it better be
> done consistently with the rest of the elementwise pack.
>

Consistency I'm down with.  Completeness, on the other hand, I'm with Godel
on that one. ;)

-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
> Can you show me some examples of where I'd *want* to do the "wrong thing"
> from an associativity standpoint?  "5 - x" where x is a vector, is kinda
> weird.
> But maybe you're subtracting off a mean or something, but then I'd probably
> write this as "- (x - 5)", because I always associate left to right. :)
>

I actually haven't written support for unary_- , that's something i really
haven't encountered (yet) until you said it :)

While there's a unary negation operator, there's no unary inversion
operator (1/x). This is actually an example of a fairly frequent operation
over vectors.

In my in-core ssvd example, i use it to compute inverse of a diagonal
matrix (Sigma)^-1 (or any inverse of a diagonal matrix for that matter):

diagv(1 /: s)

where s is the vector of singular values.

elementwise subtraction is fairly frequent when comparing matrices and
making approximation asserts, e.g. in svd test something like

assert  (a - u %*% diagv(s) %*% v.t ) norm <= 1e-10

I actually wouldn't know how to encode elementwise inversion if i cannot
use colon.

Use of colon with operations in scala actually IMO is quite intuitive in a
sense that the side that has it, points to "this" and the other side points
to "that".

The assumption is that use of the colon must not change the result of
operation (be functionally equivalent), just like in many cases foldLeft is
equivalent to foldRight, i.e.

a * b === a :* b === a *: b,

it follows that :* means "timesRight" or "solveRight", in Mahout speak,
 and *: means "timesLeft", or "solveLeft".

FWIW I guess it does take a few minutes to settle (it took a few with me at
least), but after that it seems pretty intuitive.
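
A minimal sketch of the colon rule, assuming a hypothetical toy vector `V` (not Mahout's Vector). In Scala an operator whose name ends in ':' is invoked on its right-hand operand, which is what lets a plain Double sit on the left of a non-commutative operation:

```scala
// Hypothetical toy vector used only to show which operand receives the call.
final case class V(xs: Vector[Double]) {
  // Ordinary left-associative operator: `x - 5.0` calls this on x.
  def -(s: Double): V = V(xs.map(x => x - s))

  // Name ends in ':', so `5.0 -: x` also calls this on x, with 5.0 as the
  // argument; the result is 5 - x(i) for each element.
  def -:(s: Double): V = V(xs.map(x => s - x))
}

object ColonDemo {
  val x = V(Vector(1.0, 2.0, 7.0))

  val vectorMinusScalar = x - 5.0   // (-4.0, -3.0, 2.0)
  val scalarMinusVector = 5.0 -: x  // (4.0, 3.0, -2.0)
}
```

Without the colon form, `5.0 - x` would have to be a method on Double, which user code cannot add directly.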




>
>
> > and putting completely different functional meaning into :* and * will
> > confuse scala users to no end who got used to things like :/ and /: .
> This
> > all needs striking a subtle balance unfortunately.
> >
>
> Ok, then like I said, maybe I'll just defer to your judgement on the
> operator
> syntax, as I've *never* gotten used to the scala :/ and /: uses.   I prefer
> method calls to method calls masquerading as native operators.  Maybe
> I should Stop Being Afraid and Learn to Love the DSL, but I'm not quite
> there yet: Too Much Magic. :)
>
>
> >
> > as i said before, i am not hung on %*% syntax, but i don't think doing :*
> > or .* for elementwise would work on scala.
> >
>
> How often do we really do elementwise matrix operations?  Is this really
> a thing we often want to worry about?  addition and subtraction, sure, but
>

Use of Hadamard product is indeed rare (i had only one case with the PCA
pipeline), but elementwise - and / seems to be popular with me. Regardless,
it is done for completeness; and if it is done at all, imo it better be
done consistently with the rest of the elementwise pack.

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
On Sat, Jul 27, 2013 at 9:40 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> Jake, this is in-core. I work on similar expressiveness for spark backed
> DRMs and there are indeed different set of algorithms there and naive
> combinations are not necessarily producing the best outcome. There's no
> doubt MR stuff will need amended set of operations and primitives.
>
> As far as associativity is concerned, this is just scala . One cannot
> implement elementwise 5-x as 5:-x or 5-x on a sealed left-hand argument,
>  such are the language rules and no amount of discussion on our side can
> change that. You can only do 5 -: x .
>

Can you show me some examples of where I'd *want* to do the "wrong thing"
from an associativity standpoint?  "5 - x" where x is a vector, is kinda
weird.
But maybe you're subtracting off a mean or something, but then I'd probably
write this as "- (x - 5)", because I always associate left to right. :)


> and putting completely different functional meaning into :* and * will
> confuse scala users to no end who got used to things like :/ and /: . This
> all needs striking a subtle balance unfortunately.
>

Ok, then like I said, maybe I'll just defer to your judgement on the operator
syntax, as I've *never* gotten used to the scala :/ and /: uses.  I prefer
method calls to method calls masquerading as native operators.  Maybe
I should Stop Being Afraid and Learn to Love the DSL, but I'm not quite
there yet: Too Much Magic. :)


>
> as i said before, i am not hung on %*% syntax, but i don't think doing :*
> or .* for elementwise would work on scala.
>

How often do we really do elementwise matrix operations?  Is this really
a thing we often want to worry about?  addition and subtraction, sure, but
that's the full matrix operation too.  Ditto for multiplication or division
*by scalars*, but Hadamard products on matrices?  I guess it _happens_,
but I'm not sure I've ever done it, or if I have, it's pretty darn rare.


>
>
>
> On Sat, Jul 27, 2013 at 8:00 AM, Jake Mannix <ja...@gmail.com>
> wrote:
>
> > I think my main concern is one of readability and hidden information: I
> > really _don't_ like having to know _anything_ about associativity rules,
> > and I'm not sure that catering to R users (*or* matlab users) is what we
> > want to do.  Maybe I'm thinking in a different direction with my scala
> > (+scalding) interop work, but I really am not aiming for some totally
> > fluent API for non-programmer analysts.  I'm not one, and guessing their
> > needs will be really hard for me.  I just want more concise syntax,
> better
> > types, access to a nice REPL, and access to a much more sophisticated yet
> > compact MR pipelining DSL.  For this, scala + scalding serves admirably.
> >
> >
> > On Sat, Jul 27, 2013 at 6:10 AM, Dmitriy Lyubimov <dl...@gmail.com>
> > wrote:
> >
> > > On Jul 26, 2013 11:56 PM, "Nick Pentreath" <ni...@gmail.com>
> > > wrote:
> > > >
> > > > Thanks for the update on that PR I will definitely take a look.
> > > >
> > > >
> > > > I wonder if they will run into the exact same Colt issues as mahout
> > did?!
> > >
> > > Yes i wondered that too since the day i saw spark als example.
> > >
> > > Jblas is far better choice but as Sebastian has demonstrated bona fide
> > > improvements are hard to achieve due to high jni costs, so i would
> > actually
> > > have a specific type of matrix to solve specific probems when needed
> > rather
> > > than sweepingly generalize it as a dense vector or matrix support.
> > >
> > > Aside from that, it seems lapack backend is running up to 5x slower on
> > amd
> > > hardware that our company unfortunately chose to invest in... argh!..
> > >
> > > >
> > > >
> > > > This DSL looks great, I'm gonna play around with it as soon as I get
> a
> > > chance.
> > > >
> > > >
> > > >
> > > > One question - breeze has quite a similar syntax that is a bit
> simpler
> > in
> > > some ways - basically * for matrix multiply and :* for elementwise.
> Would
> > > something similar work here?
> > >
> > > As i commented before, it just caters to R syntax, along with bunch of
> > > other things. If we beleive that there is a reason to inherit syntax vs
> > > devising something new, then there are really few candidates, and i
> dont
> > > think Breeze is going to cut it based on adoption level.
> > >
> > > In particular, in my company it is hard to convince R users to start
> > using
> > > scala or java as it is, so I am just scoring points here by making it
> > look
> > > familiar to them.
> > >
> > > Also i want to reserve the colon to command associativity of operation,
> > as
> > > scala means it, which is important for optimizing non commutative
> > > operations such as elementwise division or matrix multiplication. E.g.
> > > there are significant peroformance differences between saying
> > >
> > >
> > Maybe I should step out of the discussion where it dives into what
> > operators we use, because frankly, I probably won't use them much,
> > *especially* if there is too much magical associativity rules I have to
> > remember - I *hate* stuff like:
> >
> >
> >
> > > A %*% diagonal === A.times(diagonal)
> > >
> > > And
> > >
> > > A %*%: diagonal === diagonal.timesLeft(A).
> > >
> >
> > In particular, pretty much whenever we're going to be doing a map-reduce
> > job in a method call (for the distributed case), being terribly clever in
> > our syntax is going to bite us, because people (esp. typical R users, who
> > aren't super performance focused) will be doing stuff like "(A.t %*%
> B).t -
> > (A.t %*% A)" without thinking whether this can be reorganized at all to
> > reduce the number of map-reduce passes.  Maybe that's ok, but they're
> going
> > to super-complain on the list all the time if we give them too much rope
> to
> > hang themselves with.
> >
> > But yeah, maybe we'll just be looking at two different focuses on this: I
> > really care more about writing nicer MR pipelines for our jobs (I've
> > already played around with a nice replacement for seq2sparse in a single
> > small scalding job with modular components, it's about 1/10th the number
> of
> > lines of our current one, with most of the functionality), and getting a
> > nice integrated REPL for playing with the results.
> >
> > And maybe getting R (and matlab) users to use our stuff is a good thing,
> > even if it means them hanging themselves a bit.  Heh.
> >
> >
> > > Obviously the latter is n flops and the former is n squared.
> > >
> > > I dont think breeze made a wise decision by putting a special
> functional
> > > meaning into :  . It is reserved for associativity in scala.
> > >
> > > >
> > > >
> > > > Would be quite nice to have same syntax but different backends that
> are
> > > swappable ;)
> > > > —
> > > > Sent from Mailbox for iPhone
> > > >
> > > > On Sat, Jul 27, 2013 at 2:42 AM, Dmitriy Lyubimov <dlieu.7@gmail.com
> >
> > > > wrote:
> > > >
> > > > > coincidentally, spark mlib just posted a pull request intended to
> add
> > > > > support for dense and sparse vectors, looks quite similar.
> > > > > https://github.com/mesos/spark/pull/736. They seem to choose JBlas
> > > backing
> > > > > for dense stuff (although at a vector level there's probably not
> much
> > > > > reason to) and as-is Colt for sparse stuff.
> > > > > On Fri, Jul 26, 2013 at 5:20 PM, Dmitriy Lyubimov <
> dlieu.7@gmail.com
> > >
> > > wrote:
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <
> ted.dunning@gmail.com
> > > >wrote:
> > > > >>
> > > > >>> This sounds great in principle.  I haven't seen any details yet
> > > (haven't
> > > > >>> had time to look).
> > > > >>>
> > > > >>> Is there a strong reason to go with the R syntax for
> multiplication
> > > > >>> instead
> > > > >>> of the matlab convention that a*b means a.times(b)?
> > > > >>>
> > > > >>
> > > > >> As discussed, but also because matlab style elementwise operators
> > are
> > > > >> impossible to keep at proper precedence level in scala. It kind of
> > has
> > > to
> > > > >> start with either '*' or '%' to keep proper precedence, '.*' will
> > not
> > > work
> > > > >> unfortunately. And mix along the lines "some of Matlab, some of
> > > perhaps
> > > > >> completely something else' does not seem appealing at all.
> > > > >>
> > > > >>
> > >
> >
> >
> >
> > --
> >
> >   -jake
> >
>



-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Jake, this is in-core. I am working on similar expressiveness for spark-backed
DRMs, and there is indeed a different set of algorithms there; naive
combinations do not necessarily produce the best outcome. There's no
doubt the MR stuff will need an amended set of operations and primitives.

As far as associativity is concerned, this is just scala. One cannot
implement elementwise 5-x as 5:-x or 5-x on a sealed left-hand argument;
such are the language rules, and no amount of discussion on our side can
change that. You can only do 5 -: x.
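A minimal sketch of the `5 -: x` form, assuming a toy `Vec` wrapper (a hypothetical type for illustration only, not Mahout's Vector API). Because the method name ends in a colon, scala makes the right-hand operand the receiver, so the scalar can sit on the left without touching the scalar's own type:

```scala
object ScalarLeftDemo {
  // Hypothetical minimal vector wrapper; Mahout's Vector API is not used here.
  final case class Vec(values: Vector[Double]) {
    // x - 5: ordinary left-associative method, Vec is the receiver.
    def -(s: Double): Vec = Vec(values.map(_ - s))

    // 5 -: x: the trailing ':' makes the right operand the receiver,
    // so this is invoked as x.-:(5) and computes 5 - x(i) for each element.
    def -:(s: Double): Vec = Vec(values.map(s - _))
  }
}
```

With `import ScalarLeftDemo._` and `val x = Vec(Vector(1.0, 2.0, 3.0))`, the expression `5.0 -: x` is resolved as a method call on `x`, while `x - 5.0` is the usual left-to-right form.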

And putting a completely different functional meaning into :* and * will
endlessly confuse scala users who are used to things like :/ and /: . This
all requires striking a subtle balance, unfortunately.

as i said before, i am not hung up on the %*% syntax, but i don't think doing
:* or .* for elementwise would work in scala.



On Sat, Jul 27, 2013 at 8:00 AM, Jake Mannix <ja...@gmail.com> wrote:

> I think my main concern is one of readability and hidden information: I
> really _don't_ like having to know _anything_ about associativity rules,
> and I'm not sure that catering to R users (*or* matlab users) is what we
> want to do.  Maybe I'm thinking in a different direction with my scala
> (+scalding) interop work, but I really am not aiming for some totally
> fluent API for non-programmer analysts.  I'm not one, and guessing their
> needs will be really hard for me.  I just want more concise syntax, better
> types, access to a nice REPL, and access to a much more sophisticated yet
> compact MR pipelining DSL.  For this, scala + scalding serves admirably.
>
>
> On Sat, Jul 27, 2013 at 6:10 AM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
>
> > On Jul 26, 2013 11:56 PM, "Nick Pentreath" <ni...@gmail.com>
> > wrote:
> > >
> > > Thanks for the update on that PR I will definitely take a look.
> > >
> > >
> > > I wonder if they will run into the exact same Colt issues as mahout
> did?!
> >
> > Yes i wondered that too since the day i saw spark als example.
> >
> > Jblas is far better choice but as Sebastian has demonstrated bona fide
> > improvements are hard to achieve due to high jni costs, so i would
> actually
> > have a specific type of matrix to solve specific probems when needed
> rather
> > than sweepingly generalize it as a dense vector or matrix support.
> >
> > Aside from that, it seems lapack backend is running up to 5x slower on
> amd
> > hardware that our company unfortunately chose to invest in... argh!..
> >
> > >
> > >
> > > This DSL looks great, I'm gonna play around with it as soon as I get a
> > chance.
> > >
> > >
> > >
> > > One question - breeze has quite a similar syntax that is a bit simpler
> in
> > some ways - basically * for matrix multiply and :* for elementwise. Would
> > something similar work here?
> >
> > As i commented before, it just caters to R syntax, along with bunch of
> > other things. If we beleive that there is a reason to inherit syntax vs
> > devising something new, then there are really few candidates, and i dont
> > think Breeze is going to cut it based on adoption level.
> >
> > In particular, in my company it is hard to convince R users to start
> using
> > scala or java as it is, so I am just scoring points here by making it
> look
> > familiar to them.
> >
> > Also i want to reserve the colon to command associativity of operation,
> as
> > scala means it, which is important for optimizing non commutative
> > operations such as elementwise division or matrix multiplication. E.g.
> > there are significant peroformance differences between saying
> >
> >
> Maybe I should step out of the discussion where it dives into what
> operators we use, because frankly, I probably won't use them much,
> *especially* if there is too much magical associativity rules I have to
> remember - I *hate* stuff like:
>
>
>
> > A %*% diagonal === A.times(diagonal)
> >
> > And
> >
> > A %*%: diagonal === diagonal.timesLeft(A).
> >
>
> In particular, pretty much whenever we're going to be doing a map-reduce
> job in a method call (for the distributed case), being terribly clever in
> our syntax is going to bite us, because people (esp. typical R users, who
> aren't super performance focused) will be doing stuff like "(A.t %*% B).t -
> (A.t %*% A)" without thinking whether this can be reorganized at all to
> reduce the number of map-reduce passes.  Maybe that's ok, but they're going
> to super-complain on the list all the time if we give them too much rope to
> hang themselves with.
>
> But yeah, maybe we'll just be looking at two different focuses on this: I
> really care more about writing nicer MR pipelines for our jobs (I've
> already played around with a nice replacement for seq2sparse in a single
> small scalding job with modular components, it's about 1/10th the number of
> lines of our current one, with most of the functionality), and getting a
> nice integrated REPL for playing with the results.
>
> And maybe getting R (and matlab) users to use our stuff is a good thing,
> even if it means them hanging themselves a bit.  Heh.
>
>
> > Obviously the latter is n flops and the former is n squared.
> >
> > I dont think breeze made a wise decision by putting a special functional
> > meaning into :  . It is reserved for associativity in scala.
> >
> > >
> > >
> > > Would be quite nice to have same syntax but different backends that are
> > swappable ;)
> > > —
> > > Sent from Mailbox for iPhone
> > >
> > > On Sat, Jul 27, 2013 at 2:42 AM, Dmitriy Lyubimov <dl...@gmail.com>
> > > wrote:
> > >
> > > > coincidentally, spark mlib just posted a pull request intended to add
> > > > support for dense and sparse vectors, looks quite similar.
> > > > https://github.com/mesos/spark/pull/736. They seem to choose JBlas
> > backing
> > > > for dense stuff (although at a vector level there's probably not much
> > > > reason to) and as-is Colt for sparse stuff.
> > > > On Fri, Jul 26, 2013 at 5:20 PM, Dmitriy Lyubimov <dlieu.7@gmail.com
> >
> > wrote:
> > > >>
> > > >>
> > > >>
> > > >> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <ted.dunning@gmail.com
> > >wrote:
> > > >>
> > > >>> This sounds great in principle.  I haven't seen any details yet
> > (haven't
> > > >>> had time to look).
> > > >>>
> > > >>> Is there a strong reason to go with the R syntax for multiplication
> > > >>> instead
> > > >>> of the matlab convention that a*b means a.times(b)?
> > > >>>
> > > >>
> > > >> As discussed, but also because matlab style elementwise operators
> are
> > > >> impossible to keep at proper precedence level in scala. It kind of
> has
> > to
> > > >> start with either '*' or '%' to keep proper precedence, '.*' will
> not
> > work
> > > >> unfortunately. And mix along the lines "some of Matlab, some of
> > perhaps
> > > >> completely something else' does not seem appealing at all.
> > > >>
> > > >>
> >
>
>
>
> --
>
>   -jake
>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
I think my main concern is one of readability and hidden information: I
really _don't_ like having to know _anything_ about associativity rules,
and I'm not sure that catering to R users (*or* matlab users) is what we
want to do.  Maybe I'm thinking in a different direction with my scala
(+scalding) interop work, but I really am not aiming for some totally
fluent API for non-programmer analysts.  I'm not one, and guessing their
needs will be really hard for me.  I just want more concise syntax, better
types, access to a nice REPL, and access to a much more sophisticated yet
compact MR pipelining DSL.  For this, scala + scalding serves admirably.


On Sat, Jul 27, 2013 at 6:10 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> On Jul 26, 2013 11:56 PM, "Nick Pentreath" <ni...@gmail.com>
> wrote:
> >
> > Thanks for the update on that PR I will definitely take a look.
> >
> >
> > I wonder if they will run into the exact same Colt issues as mahout did?!
>
> Yes i wondered that too since the day i saw spark als example.
>
> Jblas is far better choice but as Sebastian has demonstrated bona fide
> improvements are hard to achieve due to high jni costs, so i would actually
> have a specific type of matrix to solve specific probems when needed rather
> than sweepingly generalize it as a dense vector or matrix support.
>
> Aside from that, it seems lapack backend is running up to 5x slower on amd
> hardware that our company unfortunately chose to invest in... argh!..
>
> >
> >
> > This DSL looks great, I'm gonna play around with it as soon as I get a
> chance.
> >
> >
> >
> > One question - breeze has quite a similar syntax that is a bit simpler in
> some ways - basically * for matrix multiply and :* for elementwise. Would
> something similar work here?
>
> As i commented before, it just caters to R syntax, along with bunch of
> other things. If we beleive that there is a reason to inherit syntax vs
> devising something new, then there are really few candidates, and i dont
> think Breeze is going to cut it based on adoption level.
>
> In particular, in my company it is hard to convince R users to start using
> scala or java as it is, so I am just scoring points here by making it look
> familiar to them.
>
> Also i want to reserve the colon to command associativity of operation, as
> scala means it, which is important for optimizing non commutative
> operations such as elementwise division or matrix multiplication. E.g.
> there are significant peroformance differences between saying
>
>
Maybe I should step out of the discussion where it dives into what
operators we use, because frankly, I probably won't use them much,
*especially* if there is too much magical associativity rules I have to
remember - I *hate* stuff like:



> A %*% diagonal === A.times(diagonal)
>
> And
>
> A %*%: diagonal === diagonal.timesLeft(A).
>

In particular, pretty much whenever we're going to be doing a map-reduce
job in a method call (for the distributed case), being terribly clever in
our syntax is going to bite us, because people (esp. typical R users, who
aren't super performance focused) will be doing stuff like "(A.t %*% B).t -
(A.t %*% A)" without thinking whether this can be reorganized at all to
reduce the number of map-reduce passes.  Maybe that's ok, but they're going
to super-complain on the list all the time if we give them too much rope to
hang themselves with.
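As a worked illustration of the kind of reorganization meant here, and assuming A and B have the same shape so that the expression is well defined, the quoted example simplifies by ordinary transpose algebra:

```latex
(A^\top B)^\top - A^\top A \;=\; B^\top A - A^\top A \;=\; (B - A)^\top A
```

The right-hand side needs one matrix product instead of two. Whether that actually saves map-reduce passes depends on how the distributed operators are implemented, which this thread does not specify.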

But yeah, maybe we'll just be looking at two different focuses on this: I
really care more about writing nicer MR pipelines for our jobs (I've
already played around with a nice replacement for seq2sparse in a single
small scalding job with modular components, it's about 1/10th the number of
lines of our current one, with most of the functionality), and getting a
nice integrated REPL for playing with the results.

And maybe getting R (and matlab) users to use our stuff is a good thing,
even if it means them hanging themselves a bit.  Heh.


> Obviously the latter is n flops and the former is n squared.
>
> I dont think breeze made a wise decision by putting a special functional
> meaning into :  . It is reserved for associativity in scala.
>
> >
> >
> > Would be quite nice to have same syntax but different backends that are
> swappable ;)
> > —
> > Sent from Mailbox for iPhone
> >
> > On Sat, Jul 27, 2013 at 2:42 AM, Dmitriy Lyubimov <dl...@gmail.com>
> > wrote:
> >
> > > coincidentally, spark mlib just posted a pull request intended to add
> > > support for dense and sparse vectors, looks quite similar.
> > > https://github.com/mesos/spark/pull/736. They seem to choose JBlas
> backing
> > > for dense stuff (although at a vector level there's probably not much
> > > reason to) and as-is Colt for sparse stuff.
> > > On Fri, Jul 26, 2013 at 5:20 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> > >>
> > >>
> > >>
> > >> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <ted.dunning@gmail.com
> >wrote:
> > >>
> > >>> This sounds great in principle.  I haven't seen any details yet
> (haven't
> > >>> had time to look).
> > >>>
> > >>> Is there a strong reason to go with the R syntax for multiplication
> > >>> instead
> > >>> of the matlab convention that a*b means a.times(b)?
> > >>>
> > >>
> > >> As discussed, but also because matlab style elementwise operators are
> > >> impossible to keep at proper precedence level in scala. It kind of has
> to
> > >> start with either '*' or '%' to keep proper precedence, '.*' will not
> work
> > >> unfortunately. And mix along the lines "some of Matlab, some of
> perhaps
> > >> completely something else' does not seem appealing at all.
> > >>
> > >>
>



-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Jul 26, 2013 11:56 PM, "Nick Pentreath" <ni...@gmail.com> wrote:
>
> Thanks for the update on that PR I will definitely take a look.
>
>
> I wonder if they will run into the exact same Colt issues as mahout did?!

Yes i wondered that too since the day i saw spark als example.

Jblas is a far better choice, but as Sebastian has demonstrated, bona fide
improvements are hard to achieve due to high jni costs, so i would actually
have a specific type of matrix to solve specific problems when needed rather
than sweepingly generalize it as dense vector or matrix support.

Aside from that, it seems lapack backend is running up to 5x slower on amd
hardware that our company unfortunately chose to invest in... argh!..

>
>
> This DSL looks great, I'm gonna play around with it as soon as I get a
chance.
>
>
>
> One question - breeze has quite a similar syntax that is a bit simpler in
some ways - basically * for matrix multiply and :* for elementwise. Would
something similar work here?

As i commented before, it just caters to R syntax, along with a bunch of
other things. If we believe that there is a reason to inherit syntax vs
devising something new, then there are really few candidates, and i don't
think Breeze is going to cut it based on adoption level.

In particular, in my company it is hard to convince R users to start using
scala or java as it is, so I am just scoring points here by making it look
familiar to them.

Also i want to reserve the colon to command associativity of operation, as
scala means it, which is important for optimizing non-commutative
operations such as elementwise division or matrix multiplication. E.g.
there are significant performance differences between saying

A %*% diagonal === A.times(diagonal)

And

A %*%: diagonal === diagonal.timesLeft(A).

Obviously the latter is n flops and the former is n squared.

I don't think breeze made a wise decision by putting a special functional
meaning into ':'. It is reserved for associativity in scala.
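To illustrate the colon rule, here is a minimal sketch with toy types. `Dense`, `Diag`, `scaleColumns` and the returned tag strings are hypothetical names for illustration, not Mahout's API. It only shows which operand receives the call; it does not measure cost:

```scala
object AssocDemo {
  final case class Dense(rows: Vector[Vector[Double]]) {
    // `a %*% d` parses as a.%*%(d): the dense matrix is the receiver,
    // so a general-purpose multiply would be chosen here.
    def %*%(d: Diag): (String, Dense) = ("Dense.%*%", d.scaleColumns(this))
  }

  final case class Diag(diag: Vector[Double]) {
    // Right-multiplying by a diagonal matrix scales column j by diag(j).
    def scaleColumns(a: Dense): Dense =
      Dense(a.rows.map(row => row.zip(diag).map { case (x, s) => x * s }))

    // The name ends in ':', so `a %*%: d` parses as d.%*%:(a):
    // the diagonal operand is the receiver and can apply its cheap path.
    def %*%:(a: Dense): (String, Dense) = ("Diag.%*%:", scaleColumns(a))
  }
}
```

Both forms return the same product in this sketch; the tag in the first tuple element records which class handled the call, which is the dispatch difference the two spellings are meant to expose.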

>
>
> Would be quite nice to have same syntax but different backends that are
swappable ;)
> —
> Sent from Mailbox for iPhone
>
> On Sat, Jul 27, 2013 at 2:42 AM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
>
> > coincidentally, spark mlib just posted a pull request intended to add
> > support for dense and sparse vectors, looks quite similar.
> > https://github.com/mesos/spark/pull/736. They seem to choose JBlas
backing
> > for dense stuff (although at a vector level there's probably not much
> > reason to) and as-is Colt for sparse stuff.
> > On Fri, Jul 26, 2013 at 5:20 PM, Dmitriy Lyubimov <dl...@gmail.com>
wrote:
> >>
> >>
> >>
> >> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <ted.dunning@gmail.com
>wrote:
> >>
> >>> This sounds great in principle.  I haven't seen any details yet
(haven't
> >>> had time to look).
> >>>
> >>> Is there a strong reason to go with the R syntax for multiplication
> >>> instead
> >>> of the matlab convention that a*b means a.times(b)?
> >>>
> >>
> >> As discussed, but also because matlab style elementwise operators are
> >> impossible to keep at proper precedence level in scala. It kind of has
to
> >> start with either '*' or '%' to keep proper precedence, '.*' will not
work
> >> unfortunately. And mix along the lines "some of Matlab, some of perhaps
> >> completely something else' does not seem appealing at all.
> >>
> >>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Jake Mannix <ja...@gmail.com>.
On Fri, Jul 26, 2013 at 11:56 PM, Nick Pentreath
<ni...@gmail.com>wrote:

> Thanks for the update on that PR I will definitely take a look.
>
>
> I wonder if they will run into the exact same Colt issues as mahout did?!
>

Yeah, that's pretty strange, Colt is totally abandoned, and had lots of
little bugs that we've fixed, and performance that we've improved.


> This DSL looks great, I'm gonna play around with it as soon as I get a
> chance.
>
> One question - breeze has quite a similar syntax that is a bit simpler in
> some ways - basically * for matrix multiply and :* for elementwise. Would
> something similar work here?
>

+1


>
>
> Would be quite nice to have same syntax but different backends that are
> swappable ;)
> —
> Sent from Mailbox for iPhone
>
> On Sat, Jul 27, 2013 at 2:42 AM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
>
> > coincidentally, spark mlib just posted a pull request intended to add
> > support for dense and sparse vectors, looks quite similar.
> > https://github.com/mesos/spark/pull/736. They seem to choose JBlas
> backing
> > for dense stuff (although at a vector level there's probably not much
> > reason to) and as-is Colt for sparse stuff.
> > On Fri, Jul 26, 2013 at 5:20 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
> >>
> >>
> >>
> >> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <ted.dunning@gmail.com
> >wrote:
> >>
> >>> This sounds great in principle.  I haven't seen any details yet
> (haven't
> >>> had time to look).
> >>>
> >>> Is there a strong reason to go with the R syntax for multiplication
> >>> instead
> >>> of the matlab convention that a*b means a.times(b)?
> >>>
> >>
> >> As discussed, but also because matlab style elementwise operators are
> >> impossible to keep at proper precedence level in scala. It kind of has
> to
> >> start with either '*' or '%' to keep proper precedence, '.*' will not
> work
> >> unfortunately. And mix along the lines "some of Matlab, some of perhaps
> >> completely something else' does not seem appealing at all.
> >>
> >>
>



-- 

  -jake

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Nick Pentreath <ni...@gmail.com>.
Thanks for the update on that PR I will definitely take a look.


I wonder if they will run into the exact same Colt issues as mahout did?!


This DSL looks great, I'm gonna play around with it as soon as I get a chance.



One question - breeze has quite a similar syntax that is a bit simpler in some ways - basically * for matrix multiply and :* for elementwise. Would something similar work here? 


Would be quite nice to have same syntax but different backends that are swappable ;)
—
Sent from Mailbox for iPhone

On Sat, Jul 27, 2013 at 2:42 AM, Dmitriy Lyubimov <dl...@gmail.com>
wrote:

> coincidentally, spark mlib just posted a pull request intended to add
> support for dense and sparse vectors, looks quite similar.
> https://github.com/mesos/spark/pull/736. They seem to choose JBlas backing
> for dense stuff (although at a vector level there's probably not much
> reason to) and as-is Colt for sparse stuff.
> On Fri, Jul 26, 2013 at 5:20 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>
>>
>>
>> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <te...@gmail.com>wrote:
>>
>>> This sounds great in principle.  I haven't seen any details yet (haven't
>>> had time to look).
>>>
>>> Is there a strong reason to go with the R syntax for multiplication
>>> instead
>>> of the matlab convention that a*b means a.times(b)?
>>>
>>
>> As discussed, but also because matlab style elementwise operators are
>> impossible to keep at proper precedence level in scala. It kind of has to
>> start with either '*' or '%' to keep proper precedence, '.*' will not work
>> unfortunately. And mix along the lines "some of Matlab, some of perhaps
>> completely something else' does not seem appealing at all.
>>
>>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
coincidentally, spark mllib just posted a pull request intended to add
support for dense and sparse vectors; it looks quite similar:
https://github.com/mesos/spark/pull/736. They seem to have chosen JBlas backing
for dense stuff (although at a vector level there's probably not much
reason to) and as-is Colt for sparse stuff.


On Fri, Jul 26, 2013 at 5:20 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

>
>
>
> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <te...@gmail.com>wrote:
>
>> This sounds great in principle.  I haven't seen any details yet (haven't
>> had time to look).
>>
>> Is there a strong reason to go with the R syntax for multiplication
>> instead
>> of the matlab convention that a*b means a.times(b)?
>>
>
> As discussed, but also because matlab style elementwise operators are
> impossible to keep at proper precedence level in scala. It kind of has to
> start with either '*' or '%' to keep proper precedence, '.*' will not work
> unfortunately. And mix along the lines "some of Matlab, some of perhaps
> completely something else' does not seem appealing at all.
>
>

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <te...@gmail.com> wrote:

> This sounds great in principle.  I haven't seen any details yet (haven't
> had time to look).
>
> Is there a strong reason to go with the R syntax for multiplication instead
> of the matlab convention that a*b means a.times(b)?
>

As discussed, but also because matlab-style elementwise operators are
impossible to keep at the proper precedence level in scala. An operator kind
of has to start with either '*' or '%' to keep proper precedence; '.*' will
not work, unfortunately. And a mix along the lines of "some of Matlab, some of
perhaps completely something else" does not seem appealing at all.
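A small sketch of the precedence point, using a hypothetical expression tree (`Expr`, `Sym`, `Op`, `show` are illustration-only names) that merely records how the compiler grouped the operands. Scala assigns infix precedence by the first character of the operator, so `%*%` binds like `*`, while `:*` binds more loosely than `+`:

```scala
object PrecedenceDemo {
  // Hypothetical expression tree that records how operands were grouped.
  sealed trait Expr {
    def +(that: Expr): Expr = Op("+", this, that)
    def *(that: Expr): Expr = Op("*", this, that)
    // Starts with '%': same precedence class as * and /.
    def %*%(that: Expr): Expr = Op("%*%", this, that)
    // Starts with ':': lower precedence than + and -.
    def :*(that: Expr): Expr = Op(":*", this, that)
  }
  final case class Sym(name: String) extends Expr
  final case class Op(op: String, l: Expr, r: Expr) extends Expr

  // Render the tree with explicit parentheses so the grouping is visible.
  def show(e: Expr): String = e match {
    case Sym(n)      => n
    case Op(o, l, r) => s"(${show(l)} $o ${show(r)})"
  }
}
```

With `a`, `b`, `c` as `Sym` values, `a + b %*% c` groups as a sum whose right side is the product, whereas `a + b :* c` groups the sum first, which is rarely what a matrix expression intends. A leading `.` is not an operator character in scala, so `.*` is not available as an operator name in the first place.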

Re: Proposal: scala DSL module for Mahout linear algebra.

Posted by Ted Dunning <te...@gmail.com>.
This sounds great in principle.  I haven't seen any details yet (haven't
had time to look).

Is there a strong reason to go with the R syntax for multiplication instead
of the matlab convention that a*b means a.times(b)?


On Fri, Jul 26, 2013 at 12:07 AM, Dmitriy Lyubimov <dl...@gmail.com>wrote:

> Hello,
>
> i would like to put for discussion a proposal of adding a module
> mathout-math-scala to Mahout containing various scala DSLs for Mahout
> project.
>
> Here is what i have got so far :
>
> http://weatheringthrutechdays.blogspot.com/2013/07/scala-dsl-for-mahout-in-core-linear.html
>
> for now it is in-core stuff only, but it can also be used to script out
> driver pipelines for Mahout DRM and solvers. (Some code, in particular,
> tests may look ugly at the moment).
>
> By proposing it as a part of Mahout, I of course pursue some selfish goals:
> since the stuff covers a lot of Mahout matrix APIs, if I have it away from
> Mahout, i would be having hard time maintaining it in sync with Mahout as
> the project morphs its apis. So I want to make sure that committers run my
> tests too before committing new changes.
>
> (I am actually using this for spark-based solvers bsed on Mahout DRMs and
> to make it more accessible to our data scientists to work with -- at some
> point I hope to contribute spark ports of some Mahout work too).
>
> Respectfully,
> -Dmitriy
>