Posted to user@spark.apache.org by Simone Robutti <si...@gmail.com> on 2017/07/05 14:52:13 UTC

Collecting matrix's entries raises an error only when run inside a test

Hello, I have run into a problem and Google is not helping. It looks like
an unreported bug, and I have found no hints at possible workarounds.

The error is the following:

Traceback (most recent call last):
  File "/home/simone/motionlogic/trip-labeler/test/trip_labeler_test/model_test.py", line 43, in test_make_trip_matrix
    entries = trip_matrix.entries.map(lambda entry: (entry.i, entry.j, entry.value)).collect()
  File "/opt/spark-1.6.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 770, in collect
    with SCCallSiteSync(self.context) as css:
  File "/opt/spark-1.6.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/traceback_utils.py", line 72, in __enter__
    self._context._jsc.setCallSite(self._call_site)
AttributeError: 'NoneType' object has no attribute 'setCallSite'

It is raised when I call .collect() on the entries of a
pyspark.mllib.linalg.distributed.CoordinateMatrix, and only when the test
runs in a test suite with more than one test class. So it is probably
related to the creation and destruction of SparkContexts, but I cannot
understand how.

Spark version is 1.6.2

I saw multiple references to this error for other classes in the pyspark
ml library, but none of them contained hints toward a solution.

I'm running the tests through nosetests when it breaks. Running a single
TestCase in IntelliJ works fine.

Is there a known solution? Is it a known problem?

Thank you,

Simone

Re: Collecting matrix's entries raises an error only when run inside a test

Posted by Yanbo Liang <yb...@gmail.com>.
Hi Simone,

Would you mind sharing a minimal code example that reproduces this issue?

Yanbo
