Posted to issues@spark.apache.org by "Nikolay Kashtanov (JIRA)" <ji...@apache.org> on 2019/03/19 09:45:00 UTC

[jira] [Commented] (SPARK-24668) PySpark crashes when getting the webui url if the webui is disabled

    [ https://issues.apache.org/jira/browse/SPARK-24668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795890#comment-16795890 ] 

Nikolay Kashtanov commented on SPARK-24668:
-------------------------------------------

Hi [~Karthik Palaniappan], are you working on this issue? If not, I'd like to take this one.

> PySpark crashes when getting the webui url if the webui is disabled
> -------------------------------------------------------------------
>
>                 Key: SPARK-24668
>                 URL: https://issues.apache.org/jira/browse/SPARK-24668
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.3.0, 2.4.0
>         Environment: * Spark 2.3.0
>  * Spark-on-YARN
>  * Java 8
>  * Python 3.6.5
>  * Jupyter 4.4.0
>            Reporter: Karthik Palaniappan
>            Priority: Minor
>
> Repro:
>  
> Evaluate `sc` in a Jupyter notebook:
>  
>  
> {{---------------------------------------------------------------------------}}
> {{Py4JJavaError                             Traceback (most recent call last)}}
> {{/opt/conda/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)}}
> {{    343             method = get_real_method(obj, self.print_method)}}
> {{    344             if method is not None:}}
> {{--> 345                 return method()}}
> {{    346             return None}}
> {{    347         else:}}
> {{/usr/lib/spark/python/pyspark/context.py in _repr_html_(self)}}
> {{    261         </div>}}
> {{    262         """.format(}}
> {{--> 263             sc=self}}
> {{    264         )}}
> {{    265 }}
> {{/usr/lib/spark/python/pyspark/context.py in uiWebUrl(self)}}
> {{    373     def uiWebUrl(self):}}
> {{    374         """Return the URL of the SparkUI instance started by this SparkContext"""}}
> {{--> 375         return self._jsc.sc().uiWebUrl().get()}}
> {{    376 }}
> {{    377     @property}}
> {{/usr/lib/spark/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py in __call__(self, *args)}}
> {{   1158         answer = self.gateway_client.send_command(command)}}
> {{   1159         return_value = get_return_value(}}
> {{-> 1160             answer, self.gateway_client, self.target_id, self.name)}}
> {{   1161 }}
> {{   1162         for temp_arg in temp_args:}}
> {{/usr/lib/spark/python/pyspark/sql/utils.py in deco(*a, **kw)}}
> {{     61     def deco(*a, **kw):}}
> {{     62         try:}}
> {{---> 63             return f(*a, **kw)}}
> {{     64         except py4j.protocol.Py4JJavaError as e:}}
> {{     65             s = e.java_exception.toString()}}
> {{/usr/lib/spark/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)}}
> {{    318                 raise Py4JJavaError(}}
> {{    319                     "An error occurred while calling {0}{1}{2}.\n".}}
> {{--> 320                     format(target_id, ".", name), value)}}
> {{    321             else:}}
> {{    322                 raise Py4JError(}}
> {{Py4JJavaError: An error occurred while calling o80.get.}}
> {{: java.util.NoSuchElementException: None.get}}
> {{        at scala.None$.get(Option.scala:347)}}
> {{        at scala.None$.get(Option.scala:345)}}
> {{        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)}}
> {{        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)}}
> {{        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}}
> {{        at java.lang.reflect.Method.invoke(Method.java:498)}}
> {{        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)}}
> {{        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)}}
> {{        at py4j.Gateway.invoke(Gateway.java:282)}}
> {{        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)}}
> {{        at py4j.commands.CallCommand.execute(CallCommand.java:79)}}
> {{        at py4j.GatewayConnection.run(GatewayConnection.java:214)}}
> {{        at java.lang.Thread.run(Thread.java:748)}}
>  
> PySpark only prints out the web ui url in `_repr_html_`, not `__repr__`, so this only happens in notebooks that render HTML, not in the pyspark shell. [https://github.com/apache/spark/commit/f654b39a63d4f9b118733733c7ed2a1b58649e3d]
>  
> Disabling Spark's UI with `spark.ui.enabled` *is* valuable outside of tests. A couple of reasons that come to mind:
> 1) If you run multiple Spark applications from one machine, each one irritatingly starts by picking the same port (4040) as the first application, then increments (4041, 4042, etc.) until it finds an open port. If you are already running 10 Spark apps, the 11th prints out 10 warnings about ports being taken before it finally finds one.
> 2) You can serve the Spark web ui from a dedicated Spark history server instead of per-driver. This is documented here, at least for Spark-on-YARN: [https://spark.apache.org/docs/latest/running-on-yarn.html#using-the-spark-history-server-to-replace-the-spark-web-ui].
>  
> PySpark should not crash if the web ui is disabled. There are a couple of options:
> 1) SparkContext#uiWebUrl() in Scala should return the driver web ui url or the history server url, depending on which one is being used.
> 2) PySpark should call getOrElse(None) rather than get().
>  
> I strongly prefer option 1), but I can't figure out how to do it in a non-hacky way. In SparkContext.scala, uiWebUrl() comes from `_ui.map(_.webUrl)`, where `_ui` contains the actual SparkUI if spark.ui.enabled=true.
> 1) I could set `_ui` to SparkUI.createHistoryUI(), and then just avoid calling `bind()` on the UI server. I'm not sure what the implications would be for classes outside of SparkContext that use SparkContext#ui.
> 2) I could make `_ui` and `uiWebUrl()` inconsistent. `_ui` only contains the in-driver UI and `uiWebUrl()` returns the in-driver or history URL.
>  
> I would appreciate some help figuring out how to proceed.
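
For reference, the second option in the quoted description (do not call get() on an empty Option) could be sketched roughly as below. This is only an illustration, not the actual PySpark patch: `StubOption` and `safe_ui_web_url` are hypothetical names, and `StubOption` merely stands in for the Scala Option that Py4J hands back from `uiWebUrl()`. It assumes the Option's `isDefined()` and `get()` methods are reachable from Python the same way `get()` is in the traceback.

```python
from typing import Optional


class StubOption:
    """Hypothetical stand-in for a Scala Option[String] reached through Py4J."""

    def __init__(self, value: Optional[str] = None) -> None:
        self._value = value

    def isDefined(self) -> bool:
        # True only when the Option wraps a value (Scala's Some).
        return self._value is not None

    def get(self) -> str:
        if self._value is None:
            # Mimics scala.None$.get, which throws NoSuchElementException.
            raise LookupError("None.get")
        return self._value


def safe_ui_web_url(option) -> Optional[str]:
    """Return the wrapped URL, or None when the web ui is disabled."""
    return option.get() if option.isDefined() else None


# UI enabled: the URL is passed through unchanged.
print(safe_ui_web_url(StubOption("http://driver-host:4040")))
# UI disabled: None comes back instead of an exception.
print(safe_ui_web_url(StubOption()))
```

With a guard like this, `_repr_html_` would still need to handle a `None` URL when building the HTML, for example by omitting the link.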


