Posted to reviews@spark.apache.org by "hiboyang (via GitHub)" <gi...@apache.org> on 2023/05/03 16:04:15 UTC

[GitHub] [spark] hiboyang opened a new pull request, #41036: [Connect] Add Spark Connect Go prototype code and example

hiboyang opened a new pull request, #41036:
URL: https://github.com/apache/spark/pull/41036

   ### What changes were proposed in this pull request?
   
   This pull request adds a small Spark Connect Go client prototype and example.
   
   ### Why are the changes needed?
   
   Spark Connect was released in Spark 3.4.0, but there is no Go client yet. A Go client would let Go programmers use Spark Connect.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Users will be able to write Spark Connect applications in Go.
   
   ### How was this patch tested?
   
   Manually tested by running the example Go code.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hiboyang commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1546009193

   Hi @grundprinzip, I see an error like the following in the PR check (https://github.com/hiboyang/spark/actions/runs/4955561542/jobs/8865088163). I have already run `./dev/connect-gen-protos.sh`, but I still see this error. Do you have any suggestions on how to deal with it?
   
   ```
   Different files: ['base_pb2.pyi', 'catalog_pb2.pyi', 'commands_pb2.pyi', 'common_pb2.pyi', 'example_plugins_pb2.pyi', 'expressions_pb2.pyi', 'relations_pb2.pyi', 'types_pb2.pyi']
   Generated files for pyspark-connect are out of sync! If you have touched files under connector/connect/common/src/main/protobuf/, please run ./dev/connect-gen-protos.sh. If you haven't touched any file above, please rebase your PR against main branch.
   Error: Process completed with exit code 255.
   ```




[GitHub] [spark] github-actions[bot] commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1714782166

   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!




[GitHub] [spark] hiboyang commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1533402592

   > Thanks. Please file a JIRA and update the PR title, @hiboyang .
   > 
   > cc @HyukjinKwon , @cloud-fan , @hvanhovell , @LuciferYang , @grundprinzip
   
   Thanks for the suggestion! Added JIRA.




[GitHub] [spark] github-actions[bot] closed pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example
URL: https://github.com/apache/spark/pull/41036




[GitHub] [spark] hiboyang commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1184493980


##########
connector/connect/client/go/README.md:
##########
@@ -0,0 +1,35 @@
+- Prepare your environment to generate proto Go files:

Review Comment:
   Cool, thanks for the info!





[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1537337003

   If you don't mind, my first suggestion is that we make a quick prototype of the build system so that we don't have to check in the generated code. 
   
   I can help with that as well. 




[GitHub] [spark] amaliujia commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.
amaliujia commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1184443145


##########
connector/connect/client/go/README.md:
##########
@@ -0,0 +1,92 @@
+## Summary

Review Comment:
   Not a Go expert, so just a question: how does such code get distributed in the Go world? Does it need to be released in some way?





[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1537503907

   
   > By the way, we need to discuss where to publish those generated Go proto files. People will need to reference those files as a Go module/package when they write Spark Connect Go applications. Any thoughts on this?
   
   I think we probably need to figure out how `go get` will work as well. Unfortunately, the release distribution will require us to check in the generated code.
   
   
   
   






[GitHub] [spark] hiboyang commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1538720811

   > It would be great if you could just merge the change from my PR (just add the remote tree to your branch and merge the changes) and then push to your PR again. Right now, the Makefile is semi-broken because you didn't move the examples from `examples` to `cmd`.
   > 
   > I took the project layout from: https://github.com/golang-standards/project-layout
   > 
   > Now, for the package name we will have to refactor this a bit to make `go get` work. I think it should become: `https://github.com/apache/spark/connector/connect/client/go`
   > 
   > We can use a versioning scheme similar to what arrow does.
   
   Got it. I am not very familiar with the git commands for merging between different PRs, so sorry for breaking things here. I think the Makefile is good now.
   
   I renamed the package to `https://github.com/apache/spark/connector/connect/client/go` in the code.
   
   




[GitHub] [spark] HyukjinKwon commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1542142103

   Build link: https://github.com/hiboyang/spark/actions/runs/4933558573/jobs/8817627400




[GitHub] [spark] hiboyang commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1537477043

   > I created a draft PR that contains the necessary build code and moves the files around to fit the approach, feel free to just use the code in your PR.
   > 
   > #41080
   
   This is great, thanks @grundprinzip! I copied the code from your PR into this PR. Those `make` commands are very convenient!
   
   By the way, we need to discuss where to publish those generated Go proto files. People will need to reference those files as a Go module/package when they write Spark Connect Go applications. Any thoughts on this?
   




[GitHub] [spark] amaliujia commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.
amaliujia commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1184440923


##########
connector/connect/client/go/README.md:
##########
@@ -0,0 +1,35 @@
+- Prepare your environment to generate proto Go files:

Review Comment:
   We have been using this script to generate proto code for Python: https://github.com/apache/spark/blob/master/dev/connect-gen-protos.sh
   





[GitHub] [spark] hiboyang commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1533984760

   Need to add unit tests / integration tests.




[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1571432439

   Hi @hiboyang,
   
   We've prepared the repository at https://github.com/apache/spark-connect-go so that you mostly just need to drop your files in there. I've already addressed the problem of referencing the proto files and set up the code generation as part of the Makefile. The next step would really just be adding your code.
   
   Looking forward to your contribution!
   Thanks
   




[GitHub] [spark] cloud-fan commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1537797356

   Do we have a dev guideline for this Go client? And will GitHub Actions run tests for it? I think we can merge this PR once the infra is ready; the API coverage can be improved later.




[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1538727584

   @hiboyang I will need some time to integrate a build into the CI (GitHub workflows). I hope I can get it done quickly.
   




[GitHub] [spark] LuciferYang commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1189391849


##########
connector/connect/client/go/go.mod:
##########
@@ -0,0 +1,53 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+module github.com/apache/spark/connector/connect/client/go/v_3_4
+
+go 1.19

Review Comment:
   We generally just consider the LTS versions for Java. Spark supports three LTS Java versions (8/11/17; 17 is currently the latest LTS) and always builds and tests using the latest minor version of each LTS, so I think Spark uses the latest version of Java.
   
   However, I'm not familiar with Go's versioning mechanism yet, so I can't offer a better suggestion for choosing the test version at the moment. @hiboyang, can you provide more background information? For example, which versions do mainstream Go projects tend to support now?
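
The `go 1.19` line under review is the go.mod `go` directive, which declares the Go language version the module expects. As a minimal sketch, the two directives in question can be read out of go.mod text with the standard library alone. The helper `parseGoMod` is hypothetical and handles only single-line directives; a real tool would more likely use the `golang.org/x/mod/modfile` package.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseGoMod extracts the module path and the `go` directive from go.mod
// text. Hypothetical, simplified helper: it only recognizes lines of the
// exact form "module <path>" or "go <version>" and ignores everything else
// (comments, require blocks, replace directives, and so on).
func parseGoMod(src string) (modulePath, goVersion string) {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		switch fields[0] {
		case "module":
			modulePath = fields[1]
		case "go":
			goVersion = fields[1]
		}
	}
	return modulePath, goVersion
}

func main() {
	// The two directives are copied from the go.mod under review; the
	// license header is abbreviated to a single comment line.
	const goMod = `// license header omitted
module github.com/apache/spark/connector/connect/client/go/v_3_4

go 1.19
`
	mod, ver := parseGoMod(goMod)
	fmt.Println("module:", mod)
	fmt.Println("go directive:", ver)
}
```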
   
   





[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1537505103

   It would be great if you could just merge the change from my PR (just add the remote tree to your branch and merge the changes) and then push to your PR again. Right now, the Makefile is semi-broken because you didn't move the examples from `examples` to `cmd`.
   
   I took the project layout from: https://github.com/golang-standards/project-layout
   
   Now, for the package name we will have to refactor this a bit to make `go get` work. I think it should become: `https://github.com/apache/spark/connector/connect/client/go`
   
   We can use a versioning scheme similar to what arrow does.
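
For context, Arrow's Go module carries its major version as a path suffix (`github.com/apache/arrow/go/v12`), which is the Go modules convention for major versions 2 and above. The following is a minimal sketch of that suffix rule, assuming a hypothetical helper `splitMajorSuffix`; it is not the Go toolchain's implementation and ignores the gopkg.in `.vN` form. Under this rule, an element such as `v_3_4` is an ordinary path element rather than a major-version suffix.

```go
package main

import "fmt"

// splitMajorSuffix splits a module path into its prefix and a trailing
// major-version element such as "v12". ok is false when the last element is
// not of the form vN with N >= 2 and no leading zero.
// Hypothetical helper for illustration only.
func splitMajorSuffix(path string) (prefix, major string, ok bool) {
	// Find the last path element.
	i := len(path) - 1
	for i >= 0 && path[i] != '/' {
		i--
	}
	if i < 0 {
		return path, "", false
	}
	last := path[i+1:]
	if len(last) < 2 || last[0] != 'v' || last[1] == '0' {
		return path, "", false
	}
	// Everything after the "v" must be decimal digits.
	for _, c := range last[1:] {
		if c < '0' || c > '9' {
			return path, "", false
		}
	}
	// "v1" is never written as a suffix; only v2 and above are.
	if last == "v1" {
		return path, "", false
	}
	return path[:i], last, true
}

func main() {
	for _, p := range []string{
		"github.com/apache/arrow/go/v12",
		"github.com/apache/spark/connector/connect/client/go/v_3_4",
	} {
		prefix, major, ok := splitMajorSuffix(p)
		fmt.Printf("%s -> prefix=%s major=%q ok=%v\n", p, prefix, major, ok)
	}
}
```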




[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1537507895

   @hvanhovell @HyukjinKwon @cloud-fan @gatorsmile How do you propose we make progress here?
   
   I'm worried that we will accumulate a large number of changes, in particular the generated code that needs to be checked in and changed to get the prototype working. In the worst case, that will make the PR harder to review.
   
   What would be a good skeleton to submit?




[GitHub] [spark] hiboyang commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1544807716

   Thanks @grundprinzip for adding the GitHub workflow! I merged your whole PR branch into my PR just now.




[GitHub] [spark] grundprinzip commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1183970744


##########
connector/connect/client/go/README.md:
##########
@@ -0,0 +1,35 @@
+- Prepare your environment to generate proto Go files:

Review Comment:
   The current buf build should already support generating the Go bindings. I think we can find an easy way to make it work.





[GitHub] [spark] zhengruifeng commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1188047028


##########
connector/connect/client/go/go.mod:
##########
@@ -0,0 +1,53 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+module github.com/apache/spark/connector/connect/client/go/v_3_4
+
+go 1.19

Review Comment:
   It seems the latest Go version is 1.20?





[GitHub] [spark] hiboyang commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1533705017

   > This is awesome! Thanks for starting the work. I think the next step would be to have a quick discussion in a readme or the PR on the rough design of the objects and methods so that we get an idea of what might work or not.
   
   Thanks @grundprinzip for the feedback :) Yes, will add more details in the PR to have more discussion!
   




[GitHub] [spark] hiboyang commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1184242830


##########
connector/connect/client/go/README.md:
##########
@@ -0,0 +1,35 @@
+- Prepare your environment to generate proto Go files:

Review Comment:
   Do you mean the `protobuf-maven-plugin` plugin? Unfortunately, it does not support generating Go bindings. I also searched around and did not find a suitable Maven plugin for generating protobuf Go code.





[GitHub] [spark] hiboyang commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1184476110


##########
connector/connect/client/go/README.md:
##########
@@ -0,0 +1,92 @@
+## Summary

Review Comment:
   For Go, people normally put the code in a GitHub repo and reference that repo as a Go library (package). For example, the following code references the `github.com/apache/arrow/go/v12/arrow/ipc` library, which lives in the repo `https://github.com/apache/arrow`:
   
   ```
   import "github.com/apache/arrow/go/v12/arrow/ipc"
   
   func foo() {
   	arrowReader, err := ipc.NewReader(...)
   }
   ```
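   
   To make the mapping from import path to repository concrete, here is a minimal sketch. It assumes a `github.com`-hosted path, where the first three path elements name the repository and the rest is the package directory inside it. `splitGitHubImportPath` is a hypothetical helper written for illustration, not part of the Go toolchain:
   
   ```go
   package main
   
   import (
   	"fmt"
   	"strings"
   )
   
   // splitGitHubImportPath splits a github.com import path into the repository
   // root (host/owner/repo) and the package directory inside that repository.
   // Hypothetical helper for illustration only.
   func splitGitHubImportPath(importPath string) (repoRoot, subDir string, err error) {
   	parts := strings.Split(importPath, "/")
   	if len(parts) < 3 || parts[0] != "github.com" {
   		return "", "", fmt.Errorf("not a github.com import path: %q", importPath)
   	}
   	// host/owner/repo identifies the repository.
   	repoRoot = strings.Join(parts[:3], "/")
   	// Everything after the repository is the package directory (may be empty).
   	subDir = strings.Join(parts[3:], "/")
   	return repoRoot, subDir, nil
   }
   
   func main() {
   	root, dir, err := splitGitHubImportPath("github.com/apache/arrow/go/v12/arrow/ipc")
   	if err != nil {
   		fmt.Println("error:", err)
   		return
   	}
   	fmt.Println("repository:", "https://"+root) // repository: https://github.com/apache/arrow
   	fmt.Println("package directory:", dir)      // package directory: go/v12/arrow/ipc
   }
   ```
   
   The same reasoning would apply to a Spark Connect Go client: whatever repository hosts the code determines the prefix of the import path that applications use.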



##########
connector/connect/client/go/README.md:
##########
@@ -0,0 +1,92 @@
+## Summary

Review Comment:
   Add Quick Start guide, explaining how to start a Go Spark Connect application.





[GitHub] [spark] hiboyang commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1189220734


##########
connector/connect/client/go/go.mod:
##########
@@ -0,0 +1,53 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+module github.com/apache/spark/connector/connect/client/go/v_3_4
+
+go 1.19

Review Comment:
   Right, the latest Go version is 1.20. It was released in February this year, so I am afraid it is too new to be used by the majority of Go developers. Spark does not use the latest version of Java, so it may be better not to use the latest Go version either?





[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1568313333

   Hey @hiboyang,
   
   we made some progress on the repository and would like to ask you to move your PR to https://github.com/apache/spark-connect-go
   
   For the verification, we can add this later, but it would be great to get the first version of the PR submitted so that we can make progress incrementally. I would also suggest committing the proto source files; we will later build a verification that compares the two branches between the main Spark and Golang client repos.
   
   




[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1543522121

   @hiboyang I've updated the github workflows file to build the Go tests as well. Please pick the changes from my test PR https://github.com/apache/spark/pull/41117
   
   * .github/workflows/build_and_test.yml
   
   In addition, please make sure to pick up the changes to `.gitignore`, and please remove the `internal/generated.out` file from the git history, as it's needed to trigger the code generation.




[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1537416130

   I created a draft PR that contains the necessary build code and moves the files around to fit the approach, feel free to just use the code in your PR.
   
   https://github.com/apache/spark/pull/41080




[GitHub] [spark] grundprinzip commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1537823312

   I can help run the tests and the build as part of the CI; that shouldn't be too hard once the make build runs locally.
   
   @hiboyang let me know when `make` and `make fulltest` pass locally, and then we can try to get a first version merged before we add a lot more coverage.




[GitHub] [spark] hiboyang commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1537339507

   > If you don't mind, my first suggestion is that we make a quick prototype of the build system so that we don't have to check in the generated code.
   > 
   > I can help with that as well.
   
   Yes, one thing to consider is how a Spark Connect Go application could reference that code as a Go library. Please go ahead with the prototype, and thanks for helping here!
   




[GitHub] [spark] hiboyang commented on pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on PR #41036:
URL: https://github.com/apache/spark/pull/41036#issuecomment-1538707652

   > I can help run the tests and build as part of the CI that shouldn't be too hard when the make build runs locally.
   > 
   > @hiboyang let me know when make and make fulltest pass locally and then we can try to get a first version merged before we add a lot more coverage.
   
   Yes, thanks @grundprinzip for helping with the CI here!
   
   `make fulltest` is passing now. You need to run `make internal/generated.out` first to generate the Go proto files.
   
   ```
   make internal/generated.out
   
   make fulltest
   >> TEST, "coverage"
        ?   	github.com/apache/spark/go/v_3_4/examples/spark-connect-example-raw-grpc-client	[no test files]
        ?   	github.com/apache/spark/go/v_3_4/examples/spark-connect-example-spark-session	[no test files]
        ?   	github.com/apache/spark/go/v_3_4/internal/generated	[no test files]
        ok  	github.com/apache/spark/go/v_3_4/spark/sql	0.358s	coverage: 22.6% of statements
   ```
   




[GitHub] [spark] hiboyang commented on a diff in pull request #41036: [SPARK-43351] [CONNECT] Add Spark Connect Go prototype code and example

Posted by "hiboyang (via GitHub)" <gi...@apache.org>.
hiboyang commented on code in PR #41036:
URL: https://github.com/apache/spark/pull/41036#discussion_r1190021545


##########
connector/connect/client/go/go.mod:
##########
@@ -0,0 +1,53 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+module github.com/apache/spark/connector/connect/client/go/v_3_4
+
+go 1.19

Review Comment:
   Got it, thanks @LuciferYang for explaining! There seems to be no single Go version used across the mainstream Go community. I checked several popular Go projects, and they use different versions, as follows:
   
   ```
   grpc/grpc-go go 1.17
   etcd-io/etcd go 1.19
   kubernetes/kubernetes go 1.20
   kubernetes/client-go go 1.20
   helm/helm go 1.19
   ```
   
   Maybe we should make the Spark Connect Go client support, and be tested with, multiple Go versions. Since grundprinzip is working on the CI, let's revisit this after that?
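   
   As a rough sketch of how a version floor for such testing could be picked, the snippet below finds the oldest `go` directive among the projects listed above. The helper names `parseGoVersion` and `lowestGoVersion` are hypothetical, and treating the lowest surveyed version as the floor is only an assumption for discussion, not a decision:
   
   ```go
   package main
   
   import (
   	"errors"
   	"fmt"
   	"strconv"
   	"strings"
   )
   
   // parseGoVersion parses a "major.minor" string such as "1.19".
   // Hypothetical helper for illustration only.
   func parseGoVersion(v string) (major, minor int, err error) {
   	parts := strings.Split(v, ".")
   	if len(parts) < 2 {
   		return 0, 0, fmt.Errorf("invalid Go version %q", v)
   	}
   	if major, err = strconv.Atoi(parts[0]); err != nil {
   		return 0, 0, fmt.Errorf("invalid Go version %q: %w", v, err)
   	}
   	if minor, err = strconv.Atoi(parts[1]); err != nil {
   		return 0, 0, fmt.Errorf("invalid Go version %q: %w", v, err)
   	}
   	return major, minor, nil
   }
   
   // lowestGoVersion returns the numerically smallest version in the list,
   // so "1.9" sorts before "1.10".
   func lowestGoVersion(versions []string) (string, error) {
   	if len(versions) == 0 {
   		return "", errors.New("no versions given")
   	}
   	lowest := versions[0]
   	lowMajor, lowMinor, err := parseGoVersion(lowest)
   	if err != nil {
   		return "", err
   	}
   	for _, v := range versions[1:] {
   		major, minor, err := parseGoVersion(v)
   		if err != nil {
   			return "", err
   		}
   		if major < lowMajor || (major == lowMajor && minor < lowMinor) {
   			lowest, lowMajor, lowMinor = v, major, minor
   		}
   	}
   	return lowest, nil
   }
   
   func main() {
   	// go directives of the projects listed in the comment above.
   	surveyed := []string{"1.17", "1.19", "1.20", "1.20", "1.19"}
   	lowest, err := lowestGoVersion(surveyed)
   	if err != nil {
   		fmt.Println("error:", err)
   		return
   	}
   	fmt.Println("lowest surveyed go directive:", lowest) // lowest surveyed go directive: 1.17
   }
   ```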
   


