You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "rangadi (via GitHub)" <gi...@apache.org> on 2023/05/30 06:00:56 UTC

[GitHub] [spark] rangadi opened a new pull request, #41377: [DRAFT] Generate Protobuf descriptor files at build time.

rangadi opened a new pull request, #41377:
URL: https://github.com/apache/spark/pull/41377

   [ This is a draft PR and is not ready to merge yet]
   
   This updates build to generate descriptor files for Protobuf files at build time. This removes `.desc` from test-resources.
   
   This solution is currently a bit hacky. Maven & SBT plugins generate descriptor files differently:
     - Maven builds one descriptor file for each proto file (`catalyst_types.desc`, `functions_suite.desc` etc).
     - SBT plugin build single descriptor file for all the proto files.
   It is not clear how we can get consistent out from these. We can use either option, but need both builds to generate same files. 
   
   
   ### How was this patch tested?
    - Existing unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #41377: [DRAFT] Generate Protobuf descriptor files at build time.

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #41377:
URL: https://github.com/apache/spark/pull/41377#discussion_r1211817475


##########
project/SparkBuild.scala:
##########
@@ -929,7 +929,12 @@ object SparkProtobuf {
     (Test / PB.protoSources) += (Test / sourceDirectory).value / "resources" / "protobuf",
 
     (Test / PB.targets) := Seq(
-      PB.gens.java -> target.value / "generated-test-sources"
+      PB.gens.java -> target.value / "generated-test-sources",
+      PB.gens.descriptorSet -> target.value / "generated-test-sources/descriptor-set-sbt.desc",

Review Comment:
   Seems the two plugins use different commands, I don't have a better way either, unless patch `sbt-protoc` to use the same generation command as the `protoc-jar-maven-plugin`
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang closed pull request #41377: [SPARK-43921][PROTOBUF] Generate Protobuf descriptor files at build time

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang closed pull request #41377: [SPARK-43921][PROTOBUF] Generate Protobuf descriptor files at build time
URL: https://github.com/apache/spark/pull/41377


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] rangadi commented on a diff in pull request #41377: [DRAFT] Generate Protobuf descriptor files at build time.

Posted by "rangadi (via GitHub)" <gi...@apache.org>.
rangadi commented on code in PR #41377:
URL: https://github.com/apache/spark/pull/41377#discussion_r1212240116


##########
project/SparkBuild.scala:
##########
@@ -929,7 +929,12 @@ object SparkProtobuf {
     (Test / PB.protoSources) += (Test / sourceDirectory).value / "resources" / "protobuf",
 
     (Test / PB.targets) := Seq(
-      PB.gens.java -> target.value / "generated-test-sources"
+      PB.gens.java -> target.value / "generated-test-sources",
+      PB.gens.descriptorSet -> target.value / "generated-test-sources/descriptor-set-sbt.desc",

Review Comment:
   Shall we go this approach then? I will rename `testFile()` to something else. 
   I does test the full functionality. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] rangadi commented on a diff in pull request #41377: [DRAFT] Generate Protobuf descriptor files at build time.

Posted by "rangadi (via GitHub)" <gi...@apache.org>.
rangadi commented on code in PR #41377:
URL: https://github.com/apache/spark/pull/41377#discussion_r1213687793


##########
connector/protobuf/src/test/scala/org/apache/spark/sql/protobuf/ProtobufFunctionsSuite.scala:
##########
@@ -40,11 +40,11 @@ class ProtobufFunctionsSuite extends QueryTest with SharedSparkSession with Prot
 
   import testImplicits._
 
-  val testFileDescFile = testFile("functions_suite.desc", "protobuf/functions_suite.desc")
+  val testFileDescFile = protobufDescriptorFile("functions_suite.desc")

Review Comment:
   Renamed this utility method not to conflict with `testFile()` in super class. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #41377: [SPARK-43921][PROTOBUF] Generate Protobuf descriptor files at build time

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on PR #41377:
URL: https://github.com/apache/spark/pull/41377#issuecomment-1580769465

   I checked this one with maven and Scala 2.13, all test passed.
   Merged to master. Thanks @rangadi 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] rangadi commented on pull request #41377: [DRAFT] Generate Protobuf descriptor files at build time.

Posted by "rangadi (via GitHub)" <gi...@apache.org>.
rangadi commented on PR #41377:
URL: https://github.com/apache/spark/pull/41377#issuecomment-1572792964

   This is now ready for review. cc: @LuciferYang, @gengliangwang, @SandishKumarHN 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #41377: [DRAFT] Generate Protobuf descriptor files at build time.

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #41377:
URL: https://github.com/apache/spark/pull/41377#discussion_r1210401764


##########
project/SparkBuild.scala:
##########
@@ -929,7 +929,12 @@ object SparkProtobuf {
     (Test / PB.protoSources) += (Test / sourceDirectory).value / "resources" / "protobuf",
 
     (Test / PB.targets) := Seq(
-      PB.gens.java -> target.value / "generated-test-sources"
+      PB.gens.java -> target.value / "generated-test-sources",
+      PB.gens.descriptorSet -> target.value / "generated-test-sources/descriptor-set-sbt.desc",

Review Comment:
   I'll take a look at this one tomorrow
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] rangadi commented on pull request #41377: [SPARK-43921] Generate Protobuf descriptor files at build time.

Posted by "rangadi (via GitHub)" <gi...@apache.org>.
rangadi commented on PR #41377:
URL: https://github.com/apache/spark/pull/41377#issuecomment-1579192292

   @LuciferYang could you merge this? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] rangadi commented on pull request #41377: [SPARK-43921][PROTOBUF] Generate Protobuf descriptor files at build time

Posted by "rangadi (via GitHub)" <gi...@apache.org>.
rangadi commented on PR #41377:
URL: https://github.com/apache/spark/pull/41377#issuecomment-1581100223

   Thanks @LuciferYang!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] rangadi commented on pull request #41377: [SPARK-43921] Generate Protobuf descriptor files at build time.

Posted by "rangadi (via GitHub)" <gi...@apache.org>.
rangadi commented on PR #41377:
URL: https://github.com/apache/spark/pull/41377#issuecomment-1579830437

   Fixed. The conflict was because of descriptor file updated in #41481


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #41377: [SPARK-43921] Generate Protobuf descriptor files at build time.

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on PR #41377:
URL: https://github.com/apache/spark/pull/41377#issuecomment-1579803072

   @rangadi Could you resolve the conflict?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] rangadi commented on a diff in pull request #41377: [DRAFT] Generate Protobuf descriptor files at build time.

Posted by "rangadi (via GitHub)" <gi...@apache.org>.
rangadi commented on code in PR #41377:
URL: https://github.com/apache/spark/pull/41377#discussion_r1209742583


##########
project/SparkBuild.scala:
##########
@@ -929,7 +929,12 @@ object SparkProtobuf {
     (Test / PB.protoSources) += (Test / sourceDirectory).value / "resources" / "protobuf",
 
     (Test / PB.targets) := Seq(
-      PB.gens.java -> target.value / "generated-test-sources"
+      PB.gens.java -> target.value / "generated-test-sources",
+      PB.gens.descriptorSet -> target.value / "generated-test-sources/descriptor-set-sbt.desc",

Review Comment:
   @LuciferYang this PR is an attempt at generating descriptor files at build time. it is not clear if there is a way to make SBT create one descriptor file for each proto file. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #41377: [DRAFT] Generate Protobuf descriptor files at build time.

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #41377:
URL: https://github.com/apache/spark/pull/41377#discussion_r1212716911


##########
project/SparkBuild.scala:
##########
@@ -929,7 +929,12 @@ object SparkProtobuf {
     (Test / PB.protoSources) += (Test / sourceDirectory).value / "resources" / "protobuf",
 
     (Test / PB.targets) := Seq(
-      PB.gens.java -> target.value / "generated-test-sources"
+      PB.gens.java -> target.value / "generated-test-sources",
+      PB.gens.descriptorSet -> target.value / "generated-test-sources/descriptor-set-sbt.desc",

Review Comment:
   I think it is ok, Whether they are in multiple files or all in one



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org