Posted to issues@systemml.apache.org by "Imran Younus (JIRA)" <ji...@apache.org> on 2016/12/16 20:02:59 UTC

[jira] [Created] (SYSTEMML-1156) problem with MLContext and QR

Imran Younus created SYSTEMML-1156:
--------------------------------------

             Summary: problem with MLContext and QR
                 Key: SYSTEMML-1156
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1156
             Project: SystemML
          Issue Type: Bug
          Components: Runtime
         Environment: spark 1.6.2
centOS7

            Reporter: Imran Younus


I'm trying to run this simple script to compute a QR decomposition:
{code}
X = rand(rows=4, cols=2)

[H, R] = qr(X)

print(toString(H))
print ("X is of size : " + nrow(X) + "," + ncol(X))
print ("H is of size : " + nrow(H) + "," + ncol(H))
print ("R is of size : " + nrow(R) + "," + ncol(R))

n = ncol(H)

for( j in n:1 ) {
    print(j);
    V = H[,j];
    print ("V is of size : " + nrow(V) + "," + ncol(V))
    VTV = t(V) %*% V
    print(toString(VTV))
}
{code}

I ran this in CP mode and in hybrid Spark mode.

In CP mode this works perfectly fine.

But when I run it with Spark, the behavior is strange.
The problem is that inside the for loop, when I assign {{H\[,j\]}} to {{V}}, {{V}} becomes {{H}} instead of just a column of {{H}}. So {{VTV}} becomes a matrix instead of the single number I want. This only happens inside the for loop; if I do the same thing without the for loop, there is no problem. Also, this occurs only for the matrix {{H}}. If I replace {{H}} with {{X}}, there is no problem. Here is the output of the code when I run it with Spark:

{code}
16/12/16 11:53:27 INFO api.DMLScript: BEGIN DML run 12/16/2016 11:53:27
16/12/16 11:53:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
X is of size : 4,2
H is of size : 4,2
R is of size : 4,2
1.526 0.000
0.459 1.905
0.280 -0.202
0.659 0.373

2
V is of size : 4,1
3.051 1.064
1.064 3.811

1
V is of size : 4,1
3.051 1.064
1.064 3.811

16/12/16 11:53:27 INFO api.DMLScript: SystemML Statistics:
Total execution time:		0.624 sec.
Number of executed Spark inst:	0.

16/12/16 11:53:27 INFO api.DMLScript: END DML run 12/16/2016 11:53:27
{code}

As you can see from the output, the reported size of {{V}} is correct: it's supposed to be a column vector. But {{VTV}} is a 2x2 matrix instead of a number because {{V}} is just {{H}}. We can print {{V}} and see that.
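
As a cross-check on that claim, here is a small sketch in plain Python (not DML, and not part of the original script). It copies the values of {{H}} from the Spark-mode log above, which are rounded to three decimals, and computes both {{t(H) %*% H}} and the per-column products that {{t(V) %*% V}} should give. The helper names {{transpose}}, {{matmul}} and {{column}} are made up for this illustration.

```python
# Values of H copied from the Spark-mode log above (rounded to 3 decimals).
H = [
    [1.526, 0.000],
    [0.459, 1.905],
    [0.280, -0.202],
    [0.659, 0.373],
]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def column(m, j):
    """Return column j (0-based) of m as an n x 1 matrix."""
    return [[row[j]] for row in m]


# Product over the whole matrix: this is what the loop would compute
# if V were bound to H instead of to a single column.
HTH = matmul(transpose(H), H)

# Product per column: what t(V) %*% V should be for V = H[,j].
expected = [matmul(transpose(column(H, j)), column(H, j)) for j in range(2)]

print("t(H) %*% H   :", HTH)
print("per-column   :", expected)
```

Up to the rounding of the logged values, {{t(H) %*% H}} agrees with the 2x2 matrix printed in both loop iterations, which is consistent with {{V}} holding all of {{H}}. The per-column products are 1x1 and equal the diagonal entries of that matrix, which is the shape the loop should have printed.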

Here is the correct output from CP mode:

{code}
================================================================================
================================================================================
16/12/16 11:54:56 INFO api.DMLScript: BEGIN DML run 12/16/2016 11:54:56
16/12/16 11:54:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
X is of size : 4,2
H is of size : 4,2
R is of size : 4,2
1.575 0.000
0.476 1.591
0.296 -0.772
0.596 0.233

2
V is of size : 4,1
3.182

1
V is of size : 4,1
3.151

16/12/16 11:54:57 INFO api.DMLScript: SystemML Statistics:
Total execution time:		0.199 sec.
Number of executed MR Jobs:	0.

16/12/16 11:54:57 INFO api.DMLScript: END DML run 12/16/2016 11:54:57
{code}

[~mboehm7] [~niketanpansare]


