Posted to dev@mahout.apache.org by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org> on 2014/07/22 23:24:38 UTC
[jira] [Updated] (MAHOUT-1597) A + 1.0 (element-wise scala
operation) gives wrong result if rdd is missing rows, Spark side
[ https://issues.apache.org/jira/browse/MAHOUT-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy Lyubimov updated MAHOUT-1597:
-------------------------------------
Status: Patch Available (was: Open)
> A + 1.0 (element-wise scala operation) gives wrong result if rdd is missing rows, Spark side
> --------------------------------------------------------------------------------------------
>
> Key: MAHOUT-1597
> URL: https://issues.apache.org/jira/browse/MAHOUT-1597
> Project: Mahout
> Issue Type: Bug
> Affects Versions: 0.9
> Reporter: Dmitriy Lyubimov
> Assignee: Dmitriy Lyubimov
> Fix For: 1.0
>
>
> {code}
> // Concoct an rdd with missing rows
> val aRdd: DrmRdd[Int] = sc.parallelize(
>   0 -> dvec(1, 2, 3) ::
>   3 -> dvec(3, 4, 5) :: Nil
> ).map { case (key, vec) => key -> (vec: Vector) }
> val drmA = drmWrap(rdd = aRdd)
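> // Assumption: inCoreA is not defined in the original snippet; it is taken
> // here to be the collected in-core copy of drmA.
> val inCoreA = drmA.collect
> val controlB = inCoreA + 1.0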
> val drmB = drmA + 1.0
> (drmB -: controlB).norm should be < 1e-10
> {code}
> The test above should not fail.
> It was failing because the elementwise scalar operator only evaluates rows that are actually present in the dataset.
> In the case of Int-keyed row matrices, there are implied rows that may not actually be present in the RDD.
> Our goal is to detect this condition and evaluate the missing rows before any physical operator that does not work with missing implied rows.
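
The failure mode and the intended fix can be illustrated without Spark or Mahout. The following is only a minimal sketch in plain Scala, not the actual patch: a `Map[Int, Vector[Double]]` stands in for the Int-keyed RDD, and every name in it (`MissingRowsSketch`, `plusScalarNaive`, `impliedNrow`, `hasMissingRows`, `fixMissingRows`, `plusScalar`) is hypothetical. It also assumes that the implied row count is one more than the largest key and that missing rows are all zeros.

```scala
// Hypothetical stand-in for an Int-keyed distributed row matrix:
// only some of the implied rows 0 until nrow are physically stored.
object MissingRowsSketch {
  type Row = Vector[Double]

  // Naive elementwise scalar add: touches only the rows physically present.
  def plusScalarNaive(rows: Map[Int, Row], x: Double): Map[Int, Row] =
    rows.map { case (k, v) => k -> v.map(_ + x) }

  // Assumption: the implied row count is the largest key plus one.
  def impliedNrow(rows: Map[Int, Row]): Int =
    if (rows.isEmpty) 0 else rows.keys.max + 1

  // Detect the condition: fewer stored rows than implied rows.
  def hasMissingRows(rows: Map[Int, Row]): Boolean =
    rows.size < impliedNrow(rows)

  // Fill in every missing implied row with an all-zero row of width ncol.
  def fixMissingRows(rows: Map[Int, Row], ncol: Int): Map[Int, Row] =
    (0 until impliedNrow(rows))
      .map(k => k -> rows.getOrElse(k, Vector.fill(ncol)(0.0)))
      .toMap

  // Scalar add that fixes missing rows first, then applies the naive operator.
  def plusScalar(rows: Map[Int, Row], ncol: Int, x: Double): Map[Int, Row] = {
    val complete = if (hasMissingRows(rows)) fixMissingRows(rows, ncol) else rows
    plusScalarNaive(complete, x)
  }
}
```

With rows 0 and 3 stored, as in the test above, `plusScalarNaive` returns two rows, whereas `plusScalar` returns four, with rows 1 and 2 equal to `Vector(1.0, 1.0, 1.0)`. That difference is the discrepancy between `drmB` and `controlB` that the test checks for.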