Posted to issues@spark.apache.org by "Liang-Chi Hsieh (JIRA)" <ji...@apache.org> on 2018/08/29 12:12:00 UTC

[jira] [Commented] (SPARK-25217) Error thrown when creating BlockMatrix

    [ https://issues.apache.org/jira/browse/SPARK-25217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596250#comment-16596250 ] 

Liang-Chi Hsieh commented on SPARK-25217:
-----------------------------------------

I think you are mixing {{Matrix}} and {{Matrices}} from {{pyspark.ml.linalg}} with {{pyspark.mllib.linalg.distributed.BlockMatrix}}.

If you use {{Matrix}} and {{Matrices}} from {{pyspark.mllib.linalg}}, it works without the error.
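
For example, here is a minimal sketch of the working version, reusing the reporter's values and assuming an existing SparkContext {{sc}} as in the snippet below:

{code}
from pyspark.mllib.linalg import Matrices
from pyspark.mllib.linalg.distributed import BlockMatrix

# Local blocks built with the mllib Matrices factory (not pyspark.ml.linalg),
# so they pass the type check when BlockMatrix converts the RDD entries.
dm1 = Matrices.dense(3, 2, [1, 2, 3, 4, 5, 6])
dm2 = Matrices.dense(3, 2, [7, 8, 9, 10, 11, 12])
sm = Matrices.sparse(3, 2, [0, 1, 3], [0, 1, 2], [7, 11, 12])

# Distribute ((blockRowIndex, blockColIndex), sub-matrix) tuples.
blocks2 = sc.parallelize([((0, 0), sm), ((1, 0), sm)])
blocks3 = sc.parallelize([((0, 0), sm), ((1, 0), dm2)])

# 3 rows and 2 columns per block, matching the sub-matrix dimensions above.
mat2 = BlockMatrix(blocks2, 3, 2)
mat3 = BlockMatrix(blocks3, 3, 2)
{code}

The traceback below points at {{_convert_to_matrix_block_tuple}} in {{pyspark/mllib/linalg/distributed.py}}, which only accepts a block whose second element is a {{pyspark.mllib.linalg.Matrix}}; matrices built via {{pyspark.ml.linalg.Matrices}} fail that isinstance check, so the whole tuple is rejected with the {{TypeError}} seen in the report.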

> Error thrown when creating BlockMatrix
> --------------------------------------
>
>                 Key: SPARK-25217
>                 URL: https://issues.apache.org/jira/browse/SPARK-25217
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.3.1
>            Reporter: cs5090237
>            Priority: Major
>
> dm1 = Matrices.dense(3, 2, [1, 2, 3, 4, 5, 6])
> dm2 = Matrices.dense(3, 2, [7, 8, 9, 10, 11, 12])
> sm = Matrices.sparse(3, 2, [0, 1, 3], [0, 1, 2], [7, 11, 12])
> blocks1 = sc.parallelize([((0, 0), dm1)])
> sm_ = Matrix(3,2,sm)
> blocks2 = sc.parallelize([((0, 0), sm), ((1, 0), sm)])
> blocks3 = sc.parallelize([((0, 0), sm), ((1, 0), dm2)])
> mat2 = BlockMatrix(blocks2, 3, 2)
> mat3 = BlockMatrix(blocks3, 3, 2)
>  
> *Running the above sample code from the PySpark documentation raises the following error:*
>  
> An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. :
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 14 in stage 53.0 failed 4 times, most recent failure: Lost task 14.3 in stage 53.0 (TID 1081, , executor 15): org.apache.spark.api.python.PythonException:
> Traceback (most recent call last):
>   File "/mnt/yarn/usercache/livy/appcache//pyspark.zip/pyspark/worker.py", line 230, in main
>     process()
>   File "/mnt/yarn/usercache/livy/appcache//pyspark.zip/pyspark/worker.py", line 225, in process
>     serializer.dump_stream(func(split_index, iterator), outfile)
>   File "/mnt/yarn/usercache/livy/appcache/application_1535051034290_0001/container_1535051034290_0001_01_000023/pyspark.zip/pyspark/serializers.py", line 372, in dump_stream
>     vs = list(itertools.islice(iterator, batch))
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1371, in takeUpToNumLeft
>   File "/mnt/yarn/usercache/livy/appcache//pyspark.zip/pyspark/util.py", line 55, in wrapper
>     return f(*args, **kwargs)
>   File "/mnt/yarn/usercache/livy/appcache//pyspark.zip/pyspark/mllib/linalg/distributed.py", line 975, in _convert_to_matrix_block_tuple
>     raise TypeError("Cannot convert type %s into a sub-matrix block tuple" % type(block))
> TypeError: Cannot convert type <type 'tuple'> into a sub-matrix block tuple
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org