Posted to issues@systemml.apache.org by "Imran Younus (JIRA)" <ji...@apache.org> on 2016/12/16 20:02:59 UTC
[jira] [Created] (SYSTEMML-1156) problem with MLContext and QR
Imran Younus created SYSTEMML-1156:
--------------------------------------
Summary: problem with MLContext and QR
Key: SYSTEMML-1156
URL: https://issues.apache.org/jira/browse/SYSTEMML-1156
Project: SystemML
Issue Type: Bug
Components: Runtime
Environment: spark 1.6.2
centOS7
Reporter: Imran Younus
I'm trying to run this simple script to compute a QR decomposition:
{code}
X = rand(rows=4, cols=2)
[H, R] = qr(X)
print(toString(H))
print ("X is of size : " + nrow(X) + "," + ncol(X))
print ("H is of size : " + nrow(H) + "," + ncol(H))
print ("R is of size : " + nrow(R) + "," + ncol(R))
n = ncol(H)
for( j in n:1 ) {
  print(j);
  V = H[,j];
  print ("V is of size : " + nrow(V) + "," + ncol(V))
  VTV = t(V) %*% V
  print(toString(VTV))
}
{code}
I ran this in CP mode and in hybrid Spark mode.
In CP mode it works perfectly fine.
But when I run it with Spark, the behavior is strange.
The problem is that inside the for loop, when I assign {{H\[,j\]}} to {{V}}, {{V}} becomes all of {{H}} instead of just one column of {{H}}. As a result, {{VTV}} becomes a matrix instead of the single number I want. This only happens inside the for loop; without the loop there is no problem. It also occurs only for the matrix {{H}}: if I replace {{H}} with {{X}}, there is no problem. Here is the output of the code when I run it with Spark:
{code}
16/12/16 11:53:27 INFO api.DMLScript: BEGIN DML run 12/16/2016 11:53:27
16/12/16 11:53:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
X is of size : 4,2
H is of size : 4,2
R is of size : 4,2
1.526 0.000
0.459 1.905
0.280 -0.202
0.659 0.373
2
V is of size : 4,1
3.051 1.064
1.064 3.811
1
V is of size : 4,1
3.051 1.064
1.064 3.811
16/12/16 11:53:27 INFO api.DMLScript: SystemML Statistics:
Total execution time: 0.624 sec.
Number of executed Spark inst: 0.
16/12/16 11:53:27 INFO api.DMLScript: END DML run 12/16/2016 11:53:27
{code}
As you can see from the output, the reported size of {{V}} is correct: it's supposed to be a column vector. But {{VTV}} is a 2x2 matrix instead of a number, because {{V}} is actually just {{H}}. Printing {{V}} confirms this.
Here is the correct output from CP mode:
{code}
================================================================================
================================================================================
16/12/16 11:54:56 INFO api.DMLScript: BEGIN DML run 12/16/2016 11:54:56
16/12/16 11:54:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
X is of size : 4,2
H is of size : 4,2
R is of size : 4,2
1.575 0.000
0.476 1.591
0.296 -0.772
0.596 0.233
2
V is of size : 4,1
3.182
1
V is of size : 4,1
3.151
16/12/16 11:54:57 INFO api.DMLScript: SystemML Statistics:
Total execution time: 0.199 sec.
Number of executed MR Jobs: 0.
16/12/16 11:54:57 INFO api.DMLScript: END DML run 12/16/2016 11:54:57
{code}
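The two outputs above can be cross-checked by hand. The following is a minimal sketch in plain Python (not DML, and not part of SystemML); the helper names {{transpose}}, {{matmul}}, {{column}} and {{gram}} are hypothetical, and the matrices are copied from the rounded {{H}} values printed in the logs, so the results only agree with the logs to about two decimal places. It suggests that the 2x2 matrix printed in Spark mode is {{t(H) %*% H}} for the full {{H}}, while the CP-mode values are the 1x1 products for the individual columns.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def column(m, j):
    """Return column j (1-based, as in DML's H[,j]) as an n x 1 matrix."""
    return [[row[j - 1]] for row in m]

def gram(m):
    """Compute t(m) %*% m."""
    return matmul(transpose(m), m)

# H as printed (rounded to 3 decimals) in the Spark-mode log.
H_SPARK = [[1.526, 0.000],
           [0.459, 1.905],
           [0.280, -0.202],
           [0.659, 0.373]]

# H as printed (rounded to 3 decimals) in the CP-mode log.
H_CP = [[1.575, 0.000],
        [0.476, 1.591],
        [0.296, -0.772],
        [0.596, 0.233]]

for name, H in (("spark", H_SPARK), ("cp", H_CP)):
    print(name, "t(H) %*% H =", gram(H))
    for j in (2, 1):  # same order as the DML loop: j in n:1
        print(name, "j =", j, "t(V) %*% V =", gram(column(H, j)))
```

For a true column vector the product is always 1x1, so a 2x2 result for {{VTV}} is only possible if {{V}} holds more than one column at the time of the multiplication, even though {{nrow(V)}} and {{ncol(V)}} report 4,1.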
[~mboehm7] [~niketanpansare]