Posted to dev@zeppelin.apache.org by "Lucas Tittmann (JIRA)" <ji...@apache.org> on 2016/02/26 17:51:18 UTC

[jira] [Created] (ZEPPELIN-703) PySpark - max() does not work as expected

Lucas Tittmann created ZEPPELIN-703:
---------------------------------------

             Summary: PySpark - max() does not work as expected
                 Key: ZEPPELIN-703
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-703
             Project: Zeppelin
          Issue Type: Bug
          Components: documentation, zeppelin-interpreter
         Environment: Using PySpark in Zeppelin on Server with Spark 1.6 
            Reporter: Lucas Tittmann
            Priority: Minor


Apologies if this is not a bug in Zeppelin or PySpark but a result of my inexperience with the platform, or something I missed in the documentation. I tried to use max() as I would in plain Python:

%pyspark
max(1,2)
# expected: 
# 2
# output: 
# Traceback (most recent call last):
#   File "/tmp/zeppelin_pyspark.py", line 222, in <module>
#     eval(compiledCode)
#   File "<string>", line 1, in <module>
# TypeError: _() takes exactly 1 argument (2 given)

# or like
max([1,2])
# expected: 
# 2
# output: 
# Traceback (most recent call last):
#   File "/tmp/zeppelin_pyspark.py", line 222, in <module>
#     eval(compiledCode)
#   File "<string>", line 1, in <module>
#   File "/opt/zeppelin/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/functions.py", line 39, in _
#     jc = getattr(sc._jvm.functions, name)(col._jc if isinstance(col, Column) else col)
#   File "/opt/zeppelin/spark-1.5.0-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
#     self.target_id, self.name)
#   File "/opt/zeppelin/spark-1.5.0-bin-hadoop2.6/python/pyspark/sql/utils.py", line 36, in deco
#     return f(*a, **kw)
#   File "/opt/zeppelin/spark-1.5.0-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 304, in get_return_value
#     format(target_id, '.', name, value))
# Py4JError: An error occurred while calling z:org.apache.spark.sql.functions.max. Trace:
# py4j.Py4JException: Method max([class java.util.ArrayList]) does not exist
# 	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
# 	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:358)
# 	at py4j.Gateway.invoke(Gateway.java:254)
# 	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
# 	at py4j.commands.CallCommand.execute(CallCommand.java:79)
# 	at py4j.GatewayConnection.run(GatewayConnection.java:207)
# 	at java.lang.Thread.run(Thread.java:745)
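
Both tracebacks end up in a one-argument function named _ (the second one explicitly inside pyspark/sql/functions.py), which suggests that the name max in the interpreter session refers to Spark SQL's column function rather than Python's built-in. Why the name is bound that way (for example a wildcard import somewhere in the interpreter setup) is an assumption here, not something the traceback confirms. A minimal sketch of a possible workaround, with a hypothetical stand-in function in place of the Spark one so that it runs without Spark:

```python
# Sketch only: simulates the name shadowing without needing Spark.
try:
    import builtins                  # Python 3
except ImportError:
    import __builtin__ as builtins   # Python 2


def max(col):
    # Hypothetical stand-in for pyspark.sql.functions.max, which takes
    # a single column argument instead of plain Python values.
    raise TypeError("stand-in for the Spark SQL column function")


# The module-level name max now hides the built-in, so reach the
# built-in explicitly through the builtins module.
largest_of_args = builtins.max(1, 2)
largest_of_list = builtins.max([1, 2])
print(largest_of_args, largest_of_list)
```

If the assumption holds, calling builtins.max (or __builtin__.max on Python 2) in a %pyspark paragraph should behave like plain Python, while the bare name max would still resolve to the Spark function.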


