Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2016/03/24 23:05:25 UTC

[jira] [Created] (MADLIB-983) SVD minor messaging improvements

Frank McQuillan created MADLIB-983:
--------------------------------------

             Summary: SVD minor messaging improvements
                 Key: MADLIB-983
                 URL: https://issues.apache.org/jira/browse/MADLIB-983
             Project: Apache MADlib
          Issue Type: Improvement
          Components: Module: Matrix Factorisation
            Reporter: Frank McQuillan


1) Output of singular values adds a NULL row to the bottom of the table:

madlib=# SELECT * FROM svd_s ORDER BY row_id;
 row_id | col_id |      value       
--------+--------+------------------
      1 |      1 | 6475.67225281804
      2 |      2 | 1875.18065580415
      3 |      3 | 1483.25228429636
      4 |      4 | 1159.72262897427
      5 |      5 | 1033.86092570574
      6 |      6 | 948.437358703966
      7 |      7 | 795.379572772455
      8 |      8 | 709.086240684469
      9 |      9 | 462.473775959371
     10 |     10 | 365.875217945698
     10 |     10 |                 
(11 rows)

The NULL row was required in the past, when it was used to identify the matrix dimensions.  It can be removed now.  Since PCA uses SVD, we need to be sure that removing it does not break anything in PCA.
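
To illustrate why the row should be safe to drop for this output, here is a minimal sketch. The helper names `drop_null_rows` and `matrix_dims` are hypothetical and are not part of MADlib. The sample data is abbreviated to the first and last rows of the svd_s output above. It assumes the dimensions can be read from the largest row_id and col_id, which only holds when the last singular value is actually stored in the table.

```python
def drop_null_rows(rows):
    """Return the sparse entries whose value is not NULL (None)."""
    return [r for r in rows if r[2] is not None]


def matrix_dims(rows):
    """Infer (row_dim, col_dim) as the largest row_id and col_id seen."""
    return (max(r[0] for r in rows), max(r[1] for r in rows))


# (row_id, col_id, value) tuples; abbreviated from the svd_s output above.
svd_s = [
    (1, 1, 6475.67225281804),
    (10, 10, 365.875217945698),
    (10, 10, None),  # the trailing NULL row
]

# The inferred dimensions are the same with and without the NULL row.
dims_with_null = matrix_dims(svd_s)
dims_without_null = matrix_dims(drop_null_rows(svd_s))
```

Any consumer (such as PCA) that still relies on the NULL row to get the dimensions would need a check along these lines before the row is removed.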

2) Error message is cryptic:

ERROR: plpy.SPIError: plpy.Error: SVD error: Number of Lanczos iterations should be in the range of [10, 10] (plpython.c:4648)
SQL state: XX000
Context: Traceback (most recent call last):
  PL/Python function "svd", line 25, in <module>
    row_id, k, n_iterations, result_summary_table)
  PL/Python function "svd", line 84, in svd
  PL/Python function "svd", line 536, in _svd_upper_wrap
  PL/Python function "svd", line 598, in _svd_upper
PL/Python function "svd"

The error message should be clearer and state the condition that failed:
nIterations < k or nIterations > col_dim

Code snippet is:

elif nIterations < k or nIterations > col_dim:
    plpy.error("SVD error: Number of Lanczos iterations should be"
               " in the range of [{0}, {1}]".format(k, col_dim))
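
One possible rewording, as a minimal sketch only. The helper name `check_lanczos_iterations` is hypothetical, and `ValueError` stands in for `plpy.error` so that the snippet runs outside PL/Python. It names the offending parameter and its value, and handles the case k == col_dim, where the range collapses to a single value as in the "[10, 10]" message above.

```python
def check_lanczos_iterations(n_iterations, k, col_dim):
    """Raise a descriptive error if n_iterations is outside [k, col_dim].

    Hypothetical helper; ValueError stands in for plpy.error here.
    """
    if k <= n_iterations <= col_dim:
        return
    if k == col_dim:
        # The allowed range is a single value, so say so directly.
        allowed = "exactly {0}".format(k)
    else:
        allowed = "between {0} and {1}".format(k, col_dim)
    raise ValueError(
        "SVD error: n_iterations = {0} is invalid. It must be {1}, i.e. at "
        "least k ({2}) and at most the number of columns ({3}).".format(
            n_iterations, allowed, k, col_dim))
```

For example, `check_lanczos_iterations(5, 10, 10)` raises an error saying that n_iterations must be exactly 10, while `check_lanczos_iterations(10, 10, 10)` returns without error.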




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)