Posted to dev@zeppelin.apache.org by Leemoonsoo <gi...@git.apache.org> on 2015/06/29 22:26:57 UTC
[GitHub] incubator-zeppelin pull request: [ZEPPELIN-97][ZEPPELIN-134] pyspa...
GitHub user Leemoonsoo opened a pull request:
https://github.com/apache/incubator-zeppelin/pull/129
[ZEPPELIN-97][ZEPPELIN-134] pyspark issue with mllib api
There was an issue, [ZEPPELIN-97](https://issues.apache.org/jira/browse/ZEPPELIN-97), with pyspark 1.4. The cause is that, starting with pyspark 1.4, the Java gateway is created with the `auto_convert = True` option. This PR fixes the problem.
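To illustrate why the option matters, here is a toy model in plain Python. It does not use py4j, and every name in it (`JavaRef`, `to_command_part`, `REFERENCE_PREFIX`) is hypothetical. It only mimics the failing line from the traceback quoted in this description, assuming that automatic conversion turns a Python list into a JVM-side reference before the call is encoded.

```python
# Toy model only: these names are hypothetical and are NOT py4j's API.
REFERENCE_PREFIX = "r"  # assumed marker for "this argument is a reference"


class JavaRef(object):
    """Stands in for a proxy to an object that lives in the JVM."""

    def __init__(self, object_id):
        self._object_id = object_id

    def _get_object_id(self):
        return self._object_id


def to_command_part(arg, auto_convert=False):
    """Encode one call argument as a reference string."""
    if auto_convert and isinstance(arg, list):
        # Pretend the list was first copied into a JVM-side collection.
        arg = JavaRef("converted-list")
    # A plain Python list has no _get_object_id, so this raises
    # AttributeError when the list was not converted beforehand.
    return REFERENCE_PREFIX + arg._get_object_id()


try:
    to_command_part(["a", "b"])
except AttributeError as err:
    print("without conversion:", err)

print("with conversion:", to_command_part(["a", "b"], auto_convert=True))
```

In this model, code that passes a plain list only works when conversion is switched on, which is one reading of why a gateway created without the option would fail against pyspark 1.4.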
This PR also handles [ZEPPELIN-134](https://issues.apache.org/jira/browse/ZEPPELIN-134) by injecting sqlContext.
Finally, it prints a more verbose stack trace message, for example
from
```
(<type 'exceptions.AttributeError'>, AttributeError("'list' object has no attribute '_get_object_id'",), <traceback object at 0x392b638>)
```
to
```
Traceback (most recent call last):
  File "/var/folders/zt/nd4j13y14jjg7_5pc4xgy7t80000gn/T//zeppelin_pyspark.py", line 110, in <module>
    eval(compiledCode)
  File "<string>", line 3, in <module>
  File "/Users/moon/Projects/zeppelin/spark-1.4.0-bin-hadoop2.3/python/pyspark/sql/dataframe.py", line 1200, in withColumn
    return self.select('*', col.alias(colName))
  File "/Users/moon/Projects/zeppelin/spark-1.4.0-bin-hadoop2.3/python/pyspark/sql/dataframe.py", line 738, in select
    jdf = self._jdf.select(self._jcols(*cols))
  File "/Users/moon/Projects/zeppelin/spark-1.4.0-bin-hadoop2.3/python/pyspark/sql/dataframe.py", line 630, in _jcols
    return self._jseq(cols, _to_java_column)
  File "/Users/moon/Projects/zeppelin/spark-1.4.0-bin-hadoop2.3/python/pyspark/sql/dataframe.py", line 617, in _jseq
    return _to_seq(self.sql_ctx._sc, cols, converter)
  File "/Users/moon/Projects/zeppelin/spark-1.4.0-bin-hadoop2.3/python/pyspark/sql/column.py", line 60, in _to_seq
    return sc._jvm.PythonUtils.toSeq(cols)
  File "/Users/moon/Projects/zeppelin/spark-1.4.0-bin-hadoop2.3/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 529, in __call__
    [get_command_part(arg, self.pool) for arg in new_args])
  File "/Users/moon/Projects/zeppelin/spark-1.4.0-bin-hadoop2.3/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 265, in get_command_part
    command_part = REFERENCE_TYPE + parameter._get_object_id()
AttributeError: 'list' object has no attribute '_get_object_id'
```
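The difference between the two outputs above can be reproduced with the Python standard library alone. The following is a minimal sketch, not the code changed in this PR: `run_and_report` is a hypothetical helper, and the assumption is that the terse form comes from printing `sys.exc_info()` directly while the verbose form comes from `traceback.format_exc()`.

```python
import sys
import traceback


def run_and_report(source):
    """Run a code string; return (terse, verbose) error text, or (None, None)."""
    try:
        exec(compile(source, "<string>", "exec"), {})
    except Exception:
        # Terse: the raw (type, value, traceback) tuple rendered as text.
        terse = str(sys.exc_info())
        # Verbose: the full formatted stack trace, one frame per entry.
        verbose = traceback.format_exc()
        return terse, verbose
    return None, None


# Trigger the same kind of error as in the report: a list has no such method.
terse, verbose = run_and_report("[]._get_object_id()")
print(terse)
print(verbose)
```

The formatted version names the file and line of every frame, which is what makes the failing py4j call visible in the longer output.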
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Leemoonsoo/incubator-zeppelin ZEPPELIN-97
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-zeppelin/pull/129.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #129
----
commit bce3c1d33e5ab48146c2d70e81935e361fcff9c2
Author: Lee moon soo <mo...@apache.org>
Date: 2015-06-29T19:53:10Z
Print more stacktrace
commit ab01a665781a9b1399eb000ec480ed1ed4d9b715
Author: Lee moon soo <mo...@apache.org>
Date: 2015-06-29T20:20:36Z
Add testcase for auto_convert option
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] incubator-zeppelin pull request: [ZEPPELIN-97][ZEPPELIN-134] pyspa...
Posted by Leemoonsoo <gi...@git.apache.org>.
GitHub user Leemoonsoo commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/129#issuecomment-116857501
Ready to merge.
---
[GitHub] incubator-zeppelin pull request: [ZEPPELIN-97][ZEPPELIN-134] pyspa...
Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:
https://github.com/apache/incubator-zeppelin/pull/129
---
[GitHub] incubator-zeppelin pull request: [ZEPPELIN-97][ZEPPELIN-134] pyspa...
Posted by felixcheung <gi...@git.apache.org>.
GitHub user felixcheung commented on the pull request:
https://github.com/apache/incubator-zeppelin/pull/129#issuecomment-117051643
LGTM