Posted to dev@systemml.apache.org by Matthias Boehm <mb...@us.ibm.com> on 2016/04/03 07:08:34 UTC
Re: Guides about running SystemML by spark cluster
Thanks again for catching
https://issues.apache.org/jira/browse/SYSTEMML-609. Yes, the change is in
SystemML head now, so please rebuild SystemML or use one of our nightly
builds (https://sparktc.ibmcloud.com/repo/latest/).
For running SystemML on Spark, you have multiple options
(http://apache.github.io/incubator-systemml/#running-systemml): either use
MLContext or spark-submit. Since our documentation does not show many
examples for spark-submit yet, here is a typical command line invocation:
../spark/bin/spark-submit \
--class org.apache.sysml.api.DMLScript \
--master yarn-client \
--num-executors 10 \
--driver-memory 20g \
--executor-memory 60g \
--executor-cores 24 \
--queue default \
--conf spark.driver.maxResultSize=0 \
./SystemML.jar \
-f test.dml -stats -exec hybrid_spark -nvargs ...
Everything else is similar to the Hadoop invocation. We also provide a
script that simplifies this configuration:
https://github.com/apache/incubator-systemml/blob/master/scripts/sparkDML.sh
Keep in mind that if you want to run in yarn-cluster mode, you should put the
DML script and potentially the SystemML-config into HDFS too.
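
As a rough illustration of the yarn-client versus yarn-cluster difference, here is a minimal dry-run sketch. It only prints the commands it would run and executes nothing. The HDFS directory, the helper name build_commands, and the use of an hdfs: prefix for the -f argument are assumptions for illustration, not something sparkDML.sh is known to do; the resource settings are copied from the invocation above and need adjusting for your cluster.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands instead of executing them.
# Paths below are hypothetical placeholders; adjust them for your setup.

SPARK_SUBMIT="../spark/bin/spark-submit"   # same relative path as in the command above
SYSTEMML_JAR="./SystemML.jar"
HDFS_DIR="/user/systemml/scripts"          # hypothetical HDFS staging directory

# build_commands <master> <local-dml-script> [extra SystemML args...]
build_commands() {
  local master="$1"
  local script="$2"
  local script_arg="$2"
  shift 2

  if [ "$master" = "yarn-cluster" ]; then
    # In yarn-cluster mode the driver runs inside the cluster, so the
    # DML script is staged in HDFS first (assumption: -f accepts an hdfs: path).
    echo "hdfs dfs -put -f $script $HDFS_DIR/"
    script_arg="hdfs:$HDFS_DIR/$(basename "$script")"
  fi

  # Same options as the typical invocation; remaining arguments
  # (for example -nvargs ...) are passed through unchanged.
  echo "$SPARK_SUBMIT" \
    --class org.apache.sysml.api.DMLScript \
    --master "$master" \
    --num-executors 10 \
    --driver-memory 20g \
    --executor-memory 60g \
    --executor-cores 24 \
    --queue default \
    --conf spark.driver.maxResultSize=0 \
    "$SYSTEMML_JAR" \
    -f "$script_arg" -stats -exec hybrid_spark "$@"
}

build_commands yarn-client test.dml
build_commands yarn-cluster test.dml
```

The second call prints an extra hdfs dfs -put line and points -f at the staged copy; a SystemML-config file would need the same kind of staging. Once the printed commands look right, they can be run by hand.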
Regards,
Matthias
From: Wenjie Zhuang <ka...@vt.edu>
To: dev@systemml.incubator.apache.org
Cc: Matthias Boehm/Almaden/IBM@IBMUS
Date: 04/02/2016 07:50 PM
Subject: Re: Guides about running SystemML by spark cluster
Hi,
I tried to run StepLinearRegDS.dml in Spark YARN mode today and got the
following result. Is it correct?
Thanks.
BEGIN STEPWISE LINEAR REGRESSION SCRIPT
Reading X and Y...
Best AIC without any features: 4123.134539784949
Best AIC 4068.2916533784332 achieved with feature: 22
Running linear regression with selected features...
Computing the statistics...
Writing the output matrix...
On Sat, Apr 2, 2016 at 8:37 AM, Wenjie Zhuang <ka...@vt.edu> wrote:
Hi,
I am now trying to run experiments with SystemML on a Spark cluster. Could
you please share some guides on how to run StepLinearRegDS.dml on a
Spark cluster? The official guide I found is mostly about Hadoop.
Thanks & Have a good weekend!