Posted to notifications@kyuubi.apache.org by "hanna-liashchuk (via GitHub)" <gi...@apache.org> on 2023/01/26 11:56:50 UTC

[GitHub] [kyuubi] hanna-liashchuk opened a new issue, #4203: [Bug] 1.6.1 doesn't work with Spark on k8s

hanna-liashchuk opened a new issue, #4203:
URL: https://github.com/apache/kyuubi/issues/4203

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the [issues](https://github.com/apache/kyuubi/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Describe the bug
   
   I'm testing the 1.6.1-incubating release on Kubernetes and found that it does not work: executors start and then fail after a couple of seconds.
   
   The executor log is below:
   
   ```
   ++ id -u
   + myuid=185
   ++ id -g
   + mygid=1000
   + set +e
   ++ getent passwd 185
   + uidentry=spark:x:185:1000::/home/spark:/bin/sh
   + set -e
   + '[' -z spark:x:185:1000::/home/spark:/bin/sh ']'
   + '[' -z /usr/java/default ']'
   + SPARK_CLASSPATH=':/opt/spark/jars/*'
   + env
   + grep SPARK_JAVA_OPT_
   + sort -t_ -k4 -n
   + sed 's/[^=]*=\(.*\)/\1/g'
   + readarray -t SPARK_EXECUTOR_JAVA_OPTS
   + '[' -n '' ']'
   + '[' -z ']'
   + '[' -z ']'
   + '[' -n '' ']'
   + '[' -z ']'
   + '[' -z x ']'
   + SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*'
   + case "$1" in
   + shift 1
   + CMD=(${JAVA_HOME}/bin/java "${SPARK_EXECUTOR_JAVA_OPTS[@]}" -Xms$SPARK_EXECUTOR_MEMORY -Xmx$SPARK_EXECUTOR_MEMORY -cp "$SPARK_CLASSPATH:$SPARK_DIST_CLASSPATH" org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend --driver-url $SPARK_DRIVER_URL --executor-id $SPARK_EXECUTOR_ID --cores $SPARK_EXECUTOR_CORES --app-id $SPARK_APPLICATION_ID --hostname $SPARK_EXECUTOR_POD_IP --resourceProfileId $SPARK_RESOURCE_PROFILE_ID --podName $SPARK_EXECUTOR_POD_NAME)
   
   + exec /usr/bin/tini -s -- /usr/java/default/bin/java -Dspark.driver.port=38081 -Dspark.kyuubi.metrics.prometheus.port=10019 -Dspark.ui.port=0 -Xms1024m -Xmx1024m -cp '/opt/spark/conf::/opt/spark/jars/*:' org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend --driver-url spark://CoarseGrainedScheduler@172.17.16.142:38081 --executor-id 45 --cores 3 --app-id spark-application-1674731059727 --hostname 172.17.21.78 --resourceProfileId 0 --podName
   Unrecognized options: --podName
   
   Usage: org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend [options]
   
    Options are:
      --driver-url <driverUrl>
      --executor-id <executorId>
      --bind-address <bindAddress>
      --hostname <hostname>
      --cores <cores>
      --resourcesFile <fileWithJSONResourceInformation>
      --app-id <appid>
      --worker-url <workerUrl>
      --resourceProfileId <id>
      --podName <podName>
   
   
   ```
   
   The last version that worked was 1.6.0-incubating, where the SPARK_EXECUTOR_POD_NAME variable was set from the value of spark.kubernetes.executor.podNamePrefix.
   NB: 1.6.0 doesn't work without this podNamePrefix parameter as the name generated by the Kyuubi server doesn't conform to the k8s naming conventions. 1.6.1 doesn't work even with this parameter.
   
   ### Affects Version(s)
   
   1.6.1
   
   ### Kyuubi Server Log Output
   
   _No response_
   
   ### Kyuubi Engine Log Output
   
   _No response_
   
   ### Kyuubi Server Configurations
   
   ```yaml
   ## Kyuubi authentication
   kyuubi.authentication              NONE
   
   # Kyuubi Metrics
   # https://kyuubi.readthedocs.io/en/latest/monitor/metrics.html
   # https://kyuubi.apache.org/docs/latest/deployment/settings.html#metrics
   kyuubi.metrics.enabled          true
   kyuubi.metrics.reporters        PROMETHEUS,JSON
   kyuubi.metrics.prometheus.path  /metrics
   kyuubi.metrics.prometheus.port  10019
   kyuubi.metrics.json.interval    PT10S
   kyuubi.metrics.json.location    ${KYUUBI_METRICS_JSON_LOCATION}
   
   # Kyuubi frontend
   kyuubi.frontend.login.timeout               PT40S
   kyuubi.frontend.thrift.binary.bind.host     ${POD_IP}
   kyuubi.frontend.thrift.binary.bind.port     10009
   kyuubi.session.idle.timeout                 PT30M
   
   kyuubi.ha.zookeeper.quorum          ${ZOOKEEPER_CONNECT}
   kyuubi.ha.zookeeper.namespace           ${POD_NAMESPACE}
   kyuubi.engine.connection.url.use.hostname         false
   kyuubi.engine.share.level              CONNECTION
   
   # ============ Spark Config ============
   
   spark.master                                        k8s://https://kubernetes.default.svc
   spark.driver.host                                   ${POD_IP}
   spark.driver.cores                                  1
   spark.executor.cores                                3
   spark.kubernetes.executor.limit.cores               3
   spark.kubernetes.executor.request.cores             3
   spark.driver.memory                                 2g
   spark.driver.maxResultSize                          1g
   spark.kubernetes.driver.pod.name                    ${POD_NAME}
   spark.kubernetes.executor.podNamePrefix             kyuubi-sql
   spark.kubernetes.container.image                    <REDACTED>/spark:spark3.3.0-hadoop3-delta2.1.0-scala2.12
   spark.kubernetes.container.image.pullPolicy         Always
   spark.kubernetes.container.image.pullSecrets        <REDACTED>
   spark.kubernetes.namespace                          ${POD_NAMESPACE}
   spark.kubernetes.authenticate.serviceAccountName    kyuubi-spark
   spark.kubernetes.driver.label.spark-component       spark-job
   spark.kubernetes.executor.label.spark-component     spark-job
   spark.kubernetes.memoryOverheadFactor               0.4
   spark.ui.prometheus.enabled                         true
   spark.decommission.enabled                          true
   
   #Dynamic allocation
   spark.dynamicAllocation.enabled                     true
   spark.dynamicAllocation.shuffleTracking.enabled     true
   spark.dynamicAllocation.schedulerBacklogTimeout     2s
   spark.dynamicAllocation.minExecutors                1
   spark.dynamicAllocation.maxExecutors                3
   spark.cleaner.periodicGC.interval                   10min
   spark.dynamicAllocation.executorAllocationRatio     0.75
   spark.kubernetes.dynamicAllocation.deleteGracePeriod    20s
   spark.kubernetes.allocation.maxPendingPods          1
   
   # Delta
   spark.sql.extensions                                io.delta.sql.DeltaSparkSessionExtension
   spark.sql.catalog.spark_catalog                     org.apache.spark.sql.delta.catalog.DeltaCatalog
   
   # TPC-DS
   #spark.sql.catalog.tpcds                             org.apache.kyuubi.spark.connector.tpcds.TPCDSCatalog
   #spark.jars                                          /opt/kyuubi/jars/kyuubi-spark-connector-tpcds_2.12-1.6.1-incubating.jar
   
   # Log into Spark History Server
   spark.eventLog.enabled                            true
   spark.eventLog.dir                                file://${SPARK_EVENT_LOG_DIR}
   spark.eventLog.compress                           true
   spark.eventLog.compression.codec                  snappy
   spark.eventLog.rolling.enabled                    true
   spark.ui.enabled                                  false
   ```
   
   
   ### Kyuubi Engine Configurations
   
   _No response_
   
   ### Additional context
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
   - [X] No. I cannot submit a PR at this time.
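
A note on the executor log in the report above: the usage text lists `--podName <podName>` as a valid option, yet the final `exec` line ends with `--podName` and no value, which suggests `SPARK_EXECUTOR_POD_NAME` was empty in the executor pod. A parser that consumes arguments as flag/value pairs would then be left with a lone flag and could report it as unrecognized. The Python sketch below only illustrates that behaviour; `parse_flag_pairs` and `KNOWN_FLAGS` are hypothetical names, and this is not Spark's actual implementation.

```python
# Toy illustration only: hypothetical names, not Spark's real parser.
KNOWN_FLAGS = {
    "--driver-url", "--executor-id", "--bind-address", "--hostname",
    "--cores", "--resourcesFile", "--app-id", "--worker-url",
    "--resourceProfileId", "--podName",
}


def parse_flag_pairs(argv):
    """Consume argv as (flag, value) pairs.

    Returns (parsed, leftover). leftover is non-empty when parsing stops
    early, either at an unknown flag or at a known flag with no value.
    """
    parsed = {}
    i = 0
    while i < len(argv):
        flag = argv[i]
        has_value = i + 1 < len(argv)
        if flag in KNOWN_FLAGS and has_value:
            parsed[flag] = argv[i + 1]
            i += 2
        else:
            return parsed, argv[i:]
    return parsed, []


# Tail of the command from the log: an unquoted empty variable expands to
# nothing in the shell, so "--podName" arrives without a value.
parsed, leftover = parse_flag_pairs(
    ["--hostname", "172.17.21.78", "--resourceProfileId", "0", "--podName"]
)
if leftover:
    print("Unrecognized options: " + " ".join(leftover))
```

Under this reading, the flag itself is supported and the missing value is the actual problem, which fits a driver and an executor image that disagree on which environment variables are passed.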


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [kyuubi] hanna-liashchuk commented on issue #4203: [Bug] 1.6.1 doesn't work with Spark on k8s

Posted by "hanna-liashchuk (via GitHub)" <gi...@apache.org>.
hanna-liashchuk commented on issue #4203:
URL: https://github.com/apache/kyuubi/issues/4203#issuecomment-1412536724

   So I missed the fact that my Kyuubi image has a different Spark version from the Spark driver image. This was the root cause.
   Closing this issue.
   P.S. 1.6.1 seems to be working OK with k8s; even the executor name issue is gone. Good work! :)
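
Since the root cause was a Spark version mismatch between images, a quick pre-flight comparison can catch it early. This is a minimal sketch, assuming the two version strings have already been collected (for example from `spark-submit --version` in each image). `major_minor` and `same_spark_line` are hypothetical helpers, comparing only major.minor is an assumption about what counts as compatible, and `3.2.2` is an invented example value because the thread does not say which Spark version the Kyuubi image bundled.

```python
def major_minor(version):
    """Return (major, minor) as ints from a string such as '3.3.0'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def same_spark_line(server_version, image_version):
    """True when both versions share the same major.minor release line."""
    return major_minor(server_version) == major_minor(image_version)


# "3.3.0" comes from the container image tag in the configuration above;
# "3.2.2" is a made-up value for the Spark bundled in the Kyuubi image.
if not same_spark_line("3.2.2", "3.3.0"):
    print("Spark versions differ between the Kyuubi image and the Spark image")
```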


[GitHub] [kyuubi] pan3793 commented on issue #4203: [Bug] 1.6.1 doesn't work with Spark on k8s

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on issue #4203:
URL: https://github.com/apache/kyuubi/issues/4203#issuecomment-1409764685

   > 1.6.0 doesn't work without this podNamePrefix parameter as the name generated by the Kyuubi server doesn't conform to the k8s naming conventions. 1.6.1 doesn't work even with this parameter.
   
   I see you pasted the executor error logs, but could you please provide more details, e.g. the Kyuubi-generated spark-submit command and the driver logs?


[GitHub] [kyuubi] pan3793 commented on issue #4203: [Bug] 1.6.1 doesn't work with Spark on k8s

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on issue #4203:
URL: https://github.com/apache/kyuubi/issues/4203#issuecomment-1406004008

   > 1.6.0 doesn't work without this podNamePrefix parameter as the name generated by the Kyuubi server doesn't conform to the k8s naming conventions. 1.6.1 doesn't work even with this parameter.
   
   What's the exact error? Is it https://issues.apache.org/jira/browse/SPARK-40869?


[GitHub] [kyuubi] hanna-liashchuk commented on issue #4203: [Bug] 1.6.1 doesn't work with Spark on k8s

Posted by "hanna-liashchuk (via GitHub)" <gi...@apache.org>.
hanna-liashchuk commented on issue #4203:
URL: https://github.com/apache/kyuubi/issues/4203#issuecomment-1412406863

   Worth mentioning that we run not only Spark in Kubernetes, but also the Kyuubi server itself.


[GitHub] [kyuubi] hanna-liashchuk closed issue #4203: [Bug] 1.6.1 doesn't work with Spark on k8s

Posted by "hanna-liashchuk (via GitHub)" <gi...@apache.org>.
hanna-liashchuk closed issue #4203: [Bug] 1.6.1 doesn't work with Spark on k8s
URL: https://github.com/apache/kyuubi/issues/4203


[GitHub] [kyuubi] pan3793 commented on issue #4203: [Bug] 1.6.1 doesn't work with Spark on k8s

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on issue #4203:
URL: https://github.com/apache/kyuubi/issues/4203#issuecomment-1413034416

   Good to know your problem is resolved :)


[GitHub] [kyuubi] hanna-liashchuk commented on issue #4203: [Bug] 1.6.1 doesn't work with Spark on k8s

Posted by "hanna-liashchuk (via GitHub)" <gi...@apache.org>.
hanna-liashchuk commented on issue #4203:
URL: https://github.com/apache/kyuubi/issues/4203#issuecomment-1406142206

   @pan3793 nope, it's `Unrecognized options: --podName`
   I've attached the log in the description 


[GitHub] [kyuubi] hanna-liashchuk commented on issue #4203: [Bug] 1.6.1 doesn't work with Spark on k8s

Posted by "hanna-liashchuk (via GitHub)" <gi...@apache.org>.
hanna-liashchuk commented on issue #4203:
URL: https://github.com/apache/kyuubi/issues/4203#issuecomment-1412378578

   @pan3793 sorry, I've updated the ticket with the server and engine logs.
   Since we are running in k8s, the Spark driver pod == the Kyuubi server pod. If you know of any other option, please share :)

