Posted to dev@systemml.apache.org by Matthias Boehm <mb...@gmail.com> on 2018/03/10 09:42:43 UTC

Re: [DISCUSS] integrated testing for MLContext, SPARK, codegen.

Hi Janardhan,

in general, we prefer to compare against R because it helps detect
issues that are common across different optimizers and execution modes. So
for small scripts like PCA, I would recommend simply creating an R script,
which should be very similar to the dml script.

However, for more complex scripts with lots of table, removeEmpty, and
matrix-vector operations, creating the R scripts can be tedious and
error-prone because it requires additional operations such as vector
replications. For such cases (including some existing tests), we could
indeed compare the different modes (e.g., w/ and w/o codegen). Let's decide
that on a case-by-case basis.
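
To illustrate what such a mode-vs-mode comparison amounts to, here is a
minimal sketch in plain Python. It is not SystemML's test API: the function
name compare_results, the (row, col) -> value cell layout, and the eps
tolerance are all assumptions made for illustration. The idea is just to run
the same script in two modes, collect the output cells, and flag any cell
that differs by more than a tolerance.

```python
def compare_results(expected, actual, eps=1e-9):
    """Return the cells whose values differ by more than eps.

    Hypothetical helper, not part of SystemML. Both inputs map
    (row, col) -> value; a cell missing from one side is treated as 0.0,
    as it would be in a sparse representation.
    """
    mismatches = []
    for cell in sorted(set(expected) | set(actual)):
        e = expected.get(cell, 0.0)
        a = actual.get(cell, 0.0)
        # Written as "not <=" so that a NaN on either side is reported
        # as a mismatch instead of silently passing the check.
        if not abs(e - a) <= eps:
            mismatches.append((cell, e, a))
    return mismatches


# Made-up values standing in for the output of one script run
# without codegen (baseline) and with codegen enabled.
baseline = {(1, 1): 0.5, (1, 2): 1.25}
codegen = {(1, 1): 0.5 + 1e-12, (1, 2): 1.25}
print(compare_results(baseline, codegen))
```

Note that such a check only shows that the two modes agree with each other;
unlike the comparison against R, it cannot catch an error that both modes
share.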

Regarding MLContext, yes, it would be good to extend the algorithm test
coverage by simply reusing the dml and R scripts from the existing
application or codegen tests.

Regards,
Matthias

On Fri, Mar 9, 2018 at 9:52 PM, Janardhan Pulivarthi <
janardhan.pulivarthi@gmail.com> wrote:

> Hi,
>
> Matthias -
> 1. We are checking the values of codegen algorithms with the R script, but
> can we compare
>
> ` pca script with [codegen_disabled] = pca script with [codegen_enabled]`
>
>
> Deron -
> 1. In the same way, can we compare the result of an MLContext invocation
> with the other modes of running the script?
>
>
> Thanks,
> Janardhan
>