Posted to notifications@kyuubi.apache.org by "dnskr (via GitHub)" <gi...@apache.org> on 2023/04/26 22:39:31 UTC
[GitHub] [kyuubi] dnskr opened a new pull request, #4776: [K8S][HELM] Add Spark engine configuration support
dnskr opened a new pull request, #4776:
URL: https://github.com/apache/kyuubi/pull/4776
### _Why are the changes needed?_
This PR adds support for configuring the default properties used by Apache Spark as the query engine.
It also changes the `values.yaml` file structure:
```yaml
# APIs for connectivity and interoperation between supported clients and Kyuubi server
api:
# Thrift Binary protocol (HiveServer2 compatible)
thriftBinary:
...
# Kyuubi server configuration
server:
replicas: 2
...
# Query engines
engine:
# Apache Spark default configuration
spark:
...
```
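For illustration, a user-supplied override file under this proposed layout could look like the following (a hypothetical sketch; the key names are taken from the structure above and the chart templates, but the final schema may differ):

```yaml
# my-values.yaml (hypothetical override file for the proposed layout)
server:
  replicas: 2

engine:
  spark:
    master: k8s://https://kubernetes.default.svc:443
    image:
      repository: apache/spark
      tag: 3.4.0
      pullPolicy: IfNotPresent
```

It would then be applied with something like `helm install kyuubi charts/kyuubi -f my-values.yaml`.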
### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before making a pull request
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
Re: [PR] [K8S][HELM] Add Spark engine configuration support [kyuubi]
Posted by "dnskr (via GitHub)" <gi...@apache.org>.
dnskr closed pull request #4776: [K8S][HELM] Add Spark engine configuration support
URL: https://github.com/apache/kyuubi/pull/4776
[GitHub] [kyuubi] pan3793 commented on pull request #4776: [K8S][HELM] Add Spark engine configuration support
Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on PR #4776:
URL: https://github.com/apache/kyuubi/pull/4776#issuecomment-1525580145
It seems we had such an idea about the structure of `values.yaml`, but decided to reject it.
In practice, if Spark uses HDFS as storage and HMS as the metastore, the user typically has to provide `hive-site.xml`, `core-site.xml`, `hdfs-site.xml`, etc. under `HADOOP_CONF_DIR`, which would be shared by both the Kyuubi server and the Spark engine (other engines may require it too).
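As a hypothetical sketch of that setup, the shared Hadoop client configuration could live in a single ConfigMap that both the Kyuubi server pod and the Spark driver pods mount at the directory exported as `HADOOP_CONF_DIR` (all names and endpoints below are placeholders, not values from this PR):

```yaml
# Hypothetical ConfigMap holding the Hadoop client configs shared
# by the Kyuubi server and the Spark engine
apiVersion: v1
kind: ConfigMap
metadata:
  name: kyuubi-hadoop-conf
data:
  core-site.xml: |
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://namenode:8020</value>
      </property>
    </configuration>
  hive-site.xml: |
    <configuration>
      <property>
        <name>hive.metastore.uris</name>
        <value>thrift://hive-metastore:9083</value>
      </property>
    </configuration>
```

Each pod would mount this ConfigMap as a volume and point `HADOOP_CONF_DIR` at the mount path.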
[GitHub] [kyuubi] codecov-commenter commented on pull request #4776: [K8S][HELM] Add Spark engine configuration support
Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #4776:
URL: https://github.com/apache/kyuubi/pull/4776#issuecomment-1524228529
## [Codecov](https://codecov.io/gh/apache/kyuubi/pull/4776) Report
> Merging [#4776](https://codecov.io/gh/apache/kyuubi/pull/4776) (be227cd) into [master](https://codecov.io/gh/apache/kyuubi/commit/b7012aa206eb836073987cb6c8aa59ec4421c174) (b7012aa) will **decrease** coverage by `0.03%`.
> The diff coverage is `n/a`.
```diff
@@            Coverage Diff             @@
##           master    #4776     +/-   ##
============================================
- Coverage   57.99%   57.96%   -0.03%
  Complexity     13       13
============================================
  Files         581      581
  Lines       32431    32431
  Branches     4309     4309
============================================
- Hits        18807    18799       -8
- Misses      11820    11827       +7
- Partials     1804     1805       +1
```
[see 6 files with indirect coverage changes](https://codecov.io/gh/apache/kyuubi/pull/4776/indirect-changes)
[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4776: [K8S][HELM] Add Spark engine configuration support
Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4776:
URL: https://github.com/apache/kyuubi/pull/4776#discussion_r1179063733
##########
charts/kyuubi/templates/spark-configmap.yaml:
##########
@@ -0,0 +1,91 @@
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: {{ .Release.Name }}-spark
+  labels: {{- include "kyuubi.labels" . | nindent 4 }}
+data:
+  {{- with .Values.engine.spark.conf.sparkEnv }}
+  spark-env.sh: |
+    #!/usr/bin/env bash
+    {{- tpl . $ | nindent 4 }}
+  {{- end }}
+
+  spark-defaults.conf: |
+    ## Helm chart provided Spark configurations
+    spark.submit.deployMode=cluster
+
+    spark.master={{ .Values.engine.spark.master }}
+    spark.kubernetes.namespace={{ .Values.engine.spark.namespace | default .Release.Namespace }}
+
+    spark.kubernetes.authenticate.driver.serviceAccountName={{ .Values.engine.spark.serviceAccount.name | default (printf "%s-spark" .Release.Name) }}
+    spark.kubernetes.authenticate.executor.serviceAccountName={{ .Values.engine.spark.serviceAccount.name | default (printf "%s-spark" .Release.Name) }}
+
+    ### Image properties
+    spark.kubernetes.container.image={{ .Values.engine.spark.image.repository }}:{{ .Values.engine.spark.image.tag }}
+    spark.kubernetes.container.image.pullPolicy={{ .Values.engine.spark.image.pullPolicy }}
+    spark.kubernetes.container.image.pullSecrets={{ range .Values.imagePullSecrets }}{{ print .name "," }}{{ end }}
+
+    ### Driver resources
Review Comment:
IMO this is kind of over-engineering.
One advantage of Kyuubi is that it supports almost all Spark features transparently; users who are familiar with Spark should find it easy to understand how Kyuubi works and how to configure the Spark engine.
[GitHub] [kyuubi] dnskr commented on a diff in pull request #4776: [K8S][HELM] Add Spark engine configuration support
Posted by "dnskr (via GitHub)" <gi...@apache.org>.
dnskr commented on code in PR #4776:
URL: https://github.com/apache/kyuubi/pull/4776#discussion_r1190424073
##########
charts/kyuubi/templates/spark-configmap.yaml:
##########
Review Comment:
You are right, it is an over-engineered implementation, and it might confuse users about what to configure and where.
I would like to find a balance between convenient basic configuration and flexibility, but obviously this is not the best implementation.
[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4776: [K8S][HELM] Add Spark engine configuration support
Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4776:
URL: https://github.com/apache/kyuubi/pull/4776#discussion_r1179063733
##########
charts/kyuubi/templates/spark-configmap.yaml:
##########
Review Comment:
IMO this is kind of over-engineering; I would suggest we leave "Driver resources" and "Executor resources" to the user.
One advantage of Kyuubi is that it supports almost all Spark features transparently; users who are familiar with Spark should quickly understand how Kyuubi works and how to configure the Spark engine.
[GitHub] [kyuubi] dnskr commented on pull request #4776: [K8S][HELM] Add Spark engine configuration support
Posted by "dnskr (via GitHub)" <gi...@apache.org>.
dnskr commented on PR #4776:
URL: https://github.com/apache/kyuubi/pull/4776#issuecomment-1542881873
Thanks for the comments!
These are experimental changes and they are not fully working, so I created the PR as a draft. Apologies for the confusion and for the delayed response.
> Seems we had such an idea about the structure of `value.yaml`, but decided to reject it.
Right, we discussed it [here](https://github.com/apache/kyuubi/pull/4147#discussion_r1068996020). I'll continue with the flat structure in a separate PR. As mentioned above, this is more of an experimental PR to track my attempts and demo a different approach.
> In practice, if Spark uses HDFS as storage and HMS as metastore, typically, the user should provide `hive-site.xml` `core-site.xml` `hdfs-site.xml` etc. under HADOOP_CONF_DIR, which would be shared by both Kyuubi server and Spark engine (other engines may require it too)
Got it! I'll add these files as well. Am I right that there is no default `HADOOP_CONF_DIR` path in the Kyuubi server or the Kyuubi Docker image? If not, could you please suggest how to set it in the chart (add an env variable, a property, etc.)?
[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4776: [K8S][HELM] Add Spark engine configuration support
Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4776:
URL: https://github.com/apache/kyuubi/pull/4776#discussion_r1179066998
##########
charts/kyuubi/templates/spark-configmap.yaml:
##########
Review Comment:
It seems we can put everything from `spark-defaults.conf` into a `sparkDefaults` block in `values.yaml`, using native Spark configuration keys.
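As a rough sketch of that suggestion (hypothetical key names, not the final chart API), `values.yaml` could carry the raw `spark-defaults.conf` content in one block:

```yaml
# values.yaml (hypothetical)
sparkDefaults: |
  spark.submit.deployMode=cluster
  spark.kubernetes.container.image=apache/spark:3.4.0
  spark.executor.memory=4g
```

and the template would simply render that block verbatim into the ConfigMap:

```yaml
# templates/spark-configmap.yaml (hypothetical excerpt)
data:
  spark-defaults.conf: |
    {{- .Values.sparkDefaults | nindent 4 }}
```

This keeps the chart thin: users write native Spark properties instead of learning chart-specific keys for each one.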
Re: [PR] [K8S][HELM] Add Spark engine configuration support [kyuubi]
Posted by "dnskr (via GitHub)" <gi...@apache.org>.
dnskr commented on PR #4776:
URL: https://github.com/apache/kyuubi/pull/4776#issuecomment-1883774422
Closed in favor of https://github.com/apache/kyuubi/pull/5934