Posted to issues@spark.apache.org by "jugosag (Jira)" <ji...@apache.org> on 2019/12/04 07:03:00 UTC

[jira] [Commented] (SPARK-28921) Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10, 1.12.10, 1.11.10)

    [ https://issues.apache.org/jira/browse/SPARK-28921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987593#comment-16987593 ] 

jugosag commented on SPARK-28921:
---------------------------------

We are also observing this on two of our clusters, both set up with Rancher: one on Kubernetes version 1.15.5, the other on 1.14.6. Interestingly, it works on Minikube with version 1.15.4.
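To check whether a given cluster is hitting this same failure rather than some other 403, one option is to scan the driver log for the exception message from the stack trace quoted in the issue description. The following is only a minimal sketch: the function name `has_watch_403` and the sample text are illustrative, and how the driver log is obtained is left out.

```python
# Signature copied from the stack trace quoted in the issue description.
WATCH_403 = "Expected HTTP 101 response but was '403 Forbidden'"


def has_watch_403(log_text: str) -> bool:
    """Return True if the log text contains the WebSocket 403 signature."""
    return WATCH_403 in log_text


# Sample lines taken from the log excerpt in the issue description.
sample = (
    "19/08/30 01:29:09 INFO ExecutorPodsAllocator: "
    "Going to request 2 executors from Kubernetes.\n"
    "19/08/30 01:29:09 WARN WatchConnectionManager: "
    "Exec Failure: HTTP 403, Status: 403 - \n"
    "java.net.ProtocolException: "
    "Expected HTTP 101 response but was '403 Forbidden'\n"
)

if __name__ == "__main__":
    print(has_watch_403(sample))
```

A plain substring match is used because the message is a fixed string; anything stricter (for example, also requiring the WatchConnectionManager line) would be a design choice beyond what the issue states.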

The problem is that the Spark version containing the fix (2.4.5) has not been released yet, and the 3.0.0 preview from Nov 6th has the same problem.

When will 2.4.5 be released?

 

> Spark jobs failing on latest versions of Kubernetes (1.15.3, 1.14.6, 1.13.10, 1.12.10, 1.11.10)
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28921
>                 URL: https://issues.apache.org/jira/browse/SPARK-28921
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.3.0, 2.3.1, 2.3.3, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4
>            Reporter: Paul Schweigert
>            Assignee: Andy Grove
>            Priority: Major
>             Fix For: 2.4.5, 3.0.0
>
>
> Spark jobs are failing on the latest versions of Kubernetes when they attempt to provision executor pods (jobs like Spark-Pi that do not launch executors run without a problem).
>  
> Here's an example error message:
>  
> {code:java}
> 19/08/30 01:29:09 INFO ExecutorPodsAllocator: Going to request 2 executors from Kubernetes.
> 19/08/30 01:29:09 WARN WatchConnectionManager: Exec Failure: HTTP 403, Status: 403 - 
> java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' 
>     at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216) 
>     at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183) 
>     at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) 
>     at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) 
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> It looks like the issue is caused by fixes for a recent CVE:
> CVE: [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14809]
> Fix: [https://github.com/fabric8io/kubernetes-client/pull/1669]
>  
> It looks like upgrading kubernetes-client to 4.4.2 would solve this issue.
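
Since the issue points at the bundled kubernetes-client version, a quick way to see whether a given Spark distribution is likely affected is to compare the version of its client jar against 4.4.2. The sketch below is hedged: it assumes the jars live under `$SPARK_HOME/jars` and follow the usual `kubernetes-client-<major>.<minor>.<patch>.jar` naming, and the helper names (`client_version`, `has_fix`, `jars_in`) are hypothetical.

```python
import re
from pathlib import Path

# Client version named in the issue as resolving the 403.
FIXED = (4, 4, 2)

# Assumed jar naming: kubernetes-client-<major>.<minor>.<patch>.jar
JAR_RE = re.compile(r"^kubernetes-client-(\d+)\.(\d+)\.(\d+)\.jar$")


def client_version(jar_names):
    """Return (major, minor, patch) of the first matching jar, or None."""
    for name in jar_names:
        match = JAR_RE.match(name)
        if match:
            return tuple(int(group) for group in match.groups())
    return None


def has_fix(jar_names):
    """Return True if the bundled client is at least version 4.4.2."""
    version = client_version(jar_names)
    return version is not None and version >= FIXED


def jars_in(spark_home):
    """List jar file names under <spark_home>/jars (assumed layout)."""
    return [path.name for path in Path(spark_home, "jars").glob("*.jar")]
```

With a real installation this might be called as `has_fix(jars_in(os.environ["SPARK_HOME"]))` after `import os`. A client jar older than 4.4.2 suggests the distribution may still hit the error above, although the issue itself only says that the upgrade "would" solve it.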


