You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/11 02:24:58 UTC

[GitHub] [spark] LuciferYang opened a new pull request, #38609: [SPARK-40593][CONNECT] Add profile to make user can specify custom `protocExecutable` and `pluginExecutable` when building connect module

LuciferYang opened a new pull request, #38609:
URL: https://github.com/apache/spark/pull/38609

   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] grundprinzip commented on a diff in pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
grundprinzip commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1023252592


##########
connector/connect/README.md:
##########
@@ -24,7 +24,31 @@ or
 ```bash
 ./build/sbt -Phive clean package
 ```
-   
+
+### Build with user-defined `protoc` and `protoc-gen-grpc-java`
+
+When the user cannot use the official `protoc` and `protoc-gen-grpc-java` binary files to build the `connect` module in the compilation environment,
+for example, compiling `connect` module on CentOS 6 or CentOS 7 which the default `glibc` version is less than 2.14, we can try to compile and test by 
+specifying the user-defined `protoc` and `protoc-gen-grpc-java` binary files as follows:
+
+```bash
+export CONNECT_PROTOC_EXEC_PATH=/path-to-protoc-exe
+export CONNECT_PLUGIN_EXEC_PATH=/path-to-protoc-gen-grpc-java-exe
+./build/mvn -Phive -Puser-defined-pb clean package

Review Comment:
   From a consistency perspective I'm suggesting to actually call the profile `user-defined-protoc` because it points to the `protoc` compiler.
   
   ```suggestion
   ./build/mvn -Phive -Puser-defined-protoc clean package
   ```



##########
connector/connect/pom.xml:
##########
@@ -371,4 +350,68 @@
       </plugin>
     </plugins>
   </build>
+  <profiles>
+    <profile>
+      <id>official-pb</id>

Review Comment:
   ```suggestion
         <id>default-protoc</id>
   ```



##########
connector/connect/README.md:
##########
@@ -24,7 +24,31 @@ or
 ```bash
 ./build/sbt -Phive clean package
 ```
-   
+
+### Build with user-defined `protoc` and `protoc-gen-grpc-java`
+
+When the user cannot use the official `protoc` and `protoc-gen-grpc-java` binary files to build the `connect` module in the compilation environment,
+for example, compiling `connect` module on CentOS 6 or CentOS 7 which the default `glibc` version is less than 2.14, we can try to compile and test by 
+specifying the user-defined `protoc` and `protoc-gen-grpc-java` binary files as follows:
+
+```bash
+export CONNECT_PROTOC_EXEC_PATH=/path-to-protoc-exe
+export CONNECT_PLUGIN_EXEC_PATH=/path-to-protoc-gen-grpc-java-exe
+./build/mvn -Phive -Puser-defined-pb clean package
+```
+
+or
+
+```bash
+export CONNECT_PROTOC_EXEC_PATH=/path-to-protoc-exe
+export CONNECT_PLUGIN_EXEC_PATH=/path-to-protoc-gen-grpc-java-exe
+./build/sbt -Puser-defined-pb clean package

Review Comment:
   ```suggestion
   ./build/sbt -Puser-defined-protoc clean package
   ```



##########
connector/connect/pom.xml:
##########
@@ -371,4 +350,68 @@
       </plugin>
     </plugins>
   </build>
+  <profiles>
+    <profile>
+      <id>official-pb</id>
+      <activation>
+        <activeByDefault>true</activeByDefault>
+      </activation>
+      <build>
+        <plugins>
+          <!-- Add protobuf-maven-plugin and provide ScalaPB as a code generation plugin -->
+          <plugin>
+            <groupId>org.xolstice.maven.plugins</groupId>
+            <artifactId>protobuf-maven-plugin</artifactId>
+            <version>0.6.1</version>
+            <configuration>
+              <protocArtifact>com.google.protobuf:protoc:${protobuf.version}:exe:${os.detected.classifier}</protocArtifact>
+              <pluginId>grpc-java</pluginId>
+              <pluginArtifact>io.grpc:protoc-gen-grpc-java:${io.grpc.version}:exe:${os.detected.classifier}</pluginArtifact>
+              <protoSourceRoot>src/main/protobuf</protoSourceRoot>
+            </configuration>
+            <executions>
+              <execution>
+                <goals>
+                  <goal>compile</goal>
+                  <goal>compile-custom</goal>
+                  <goal>test-compile</goal>
+                </goals>
+              </execution>
+            </executions>
+          </plugin>
+        </plugins>
+      </build>
+    </profile>
+    <profile>
+      <id>user-defined-pb</id>

Review Comment:
   ```suggestion
         <id>user-defined-protoc</id>
   ```



##########
project/SparkBuild.scala:
##########
@@ -109,6 +109,14 @@ object SparkBuild extends PomBuild {
     if (profiles.contains("jdwp-test-debug")) {
       sys.props.put("test.jdwp.enabled", "true")
     }
+    if (profiles.contains("user-defined-pb")) {

Review Comment:
   ```suggestion
       if (profiles.contains("user-defined-protoc")) {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Support user configurable `protoc` and `protoc-gen-grpc-java` executables when building Spark Connect.

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1694589092

   What's your compile command? I think it's not related to this PR. I'm using M2 Max, and I can build successfully with the default profile.I think M2 doesn't need to use `user-defined-protoc` and `user-defined-protoc` is for older versions of Linux, such as CentOS 6.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38609: [WIP][SPARK-40593][CONNECT] Add profile to make user can specify custom `protocExecutable` and `pluginExecutable` when building connect module

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1311156538

   @HyukjinKwon 
   
   As I mentioned in SPARK-40593
   
   The default glibc version in CentOs6 and CentOs7 is 2.12, but glibc 2.14 is required at least when build `connect` module, so the following compilation errors will occur when build `connect`  module on CentOs6&CentOs7:
   
   ```
   [ERROR] PROTOC FAILED: /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe)
   /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.18' not found (required by /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe)
   /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.14' not found (required by /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe)
   /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe: /usr/lib64/libstdc++.so.6: version `CXXABI_1.3.5' not found (required by /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe) 
   ```
   
   In the current draft, I added a new profile to let the user specify user-defined `protocExecutable` and `pluginExecutable` for workaourd, for exmaple 
   
   ```
   export CONNECT_PROTOC_EXEC_PATH = pathToProtocExecutable
   export CONNECT_PLUGIN_EXEC_PATH = pathToPluginExecutable
   mvn clean install -DskipTests connector/connect -am 
   ```
   
   
   Currently, only Maven related parts are implemented in draft. Do we accept this way to workaround?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1317964560

   Any other changes? @HyukjinKwon @grundprinzip @amaliujia Thanks ~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38609: [WIP][SPARK-40593][CONNECT] Add profile to make user can specify custom `protocExecutable` and `pluginExecutable` when building connect module

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1019875427


##########
project/SparkBuild.scala:
##########
@@ -109,6 +109,16 @@ object SparkBuild extends PomBuild {
     if (profiles.contains("jdwp-test-debug")) {
       sys.props.put("test.jdwp.enabled", "true")
     }
+    if (profiles.contains("user-defined-pb")) {

Review Comment:
   also cc @19855134604



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amaliujia commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
amaliujia commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1314615345

   LGTM assuming this is tested manually (seems to be hard have a UT)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] grundprinzip commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
grundprinzip commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1318486206

   Since I cannot directly edit the PR description or title, I would kindly ask you to do the following changes:
   
   Title: [SPARK-40593][BUILD][CONNECT] Support user configurable `protoc` and `protoc-gen-grpc-java` executables when building Spark Connect.
   
   > ### What changes were proposed in this pull request?
   > This PR adds a new profile named `-Puser-defined-protoc` to support that users can build and test `connect` module by specifying custom `protoc ` and `protoc-gen-grpc-java` executables.
   > 
   > ### Why are the changes needed?
   > As described in [SPARK-40593](https://issues.apache.org/jira/browse/SPARK-40593), the latest versions of `protoc` and `protoc-gen-grpc-java` have minimum version requirements for basic libraries such as `glibc` and `glibcxx`. Because of that it is not possible to compile the `connect` module out of the box on CentOS 6 or CentOS 7. Instead the following error messages is shown:
   > 
   > ```
   > [ERROR] PROTOC FAILED: /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe)
   > /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.18' not found (required by /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe)
   > /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.14' not found (required by /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe)
   > /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe: /usr/lib64/libstdc++.so.6: version `CXXABI_1.3.5' not found (required by /home/disk0/spark-source/connect/target/protoc-plugins/protoc-3.21.1-linux-x86_64.exe) 
   > ```
   > 
   > ### Does this PR introduce _any_ user-facing change?
   > No, the way to using official pre-release `protoc` and `protoc-gen-grpc-java` binary files is activated by default.
   > 
   > ### How was this patch tested?
   > * Pass GitHub Actions
   > * Manual test on CentOS6u3 and CentOS7u4
   > 
   > ```shell
   > export CONNECT_PROTOC_EXEC_PATH=/path-to-protoc-exe
   > export CONNECT_PLUGIN_EXEC_PATH=/path-to-protoc-gen-grpc-java-exe
   > ./build/mvn clean install -pl connector/connect -Puser-defined-protoc -am -DskipTests
   > ./build/mvn clean test -pl connector/connect -Puser-defined-protoc 
   > ```
   > 
   > and
   > 
   > ```shell
   > export CONNECT_PROTOC_EXEC_PATH=/path-to-protoc-exe
   > export CONNECT_PLUGIN_EXEC_PATH=/path-to-protoc-gen-grpc-java-exe
   > ./build/sbt clean "connect/compile" -Puser-defined-protoc
   > ./build/sbt  "connect/test" -Puser-defined-protoc
   > ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Support user configurable `protoc` and `protoc-gen-grpc-java` executables when building Spark Connect.

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1318498330

   Both pr title and description have been updated @grundprinzip Thanks ~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38609: [WIP][SPARK-40593][CONNECT] Add profile to make user can specify custom `protocExecutable` and `pluginExecutable` when building connect module

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1019875427


##########
project/SparkBuild.scala:
##########
@@ -109,6 +109,16 @@ object SparkBuild extends PomBuild {
     if (profiles.contains("jdwp-test-debug")) {
       sys.props.put("test.jdwp.enabled", "true")
     }
+    if (profiles.contains("user-defined-pb")) {

Review Comment:
   also cc @JiangHaonan



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1022325876


##########
connector/connect/README.md:
##########
@@ -24,7 +24,30 @@ or
 ```bash
 ./build/sbt -Phive clean package
 ```
-   
+
+### Build with user-defined `protoc` and `protoc-gen-grpc-java`

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38609: [WIP][SPARK-40593][CONNECT] Add profile to make user can specify custom `protocExecutable` and `pluginExecutable` when building connect module

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1311194812

   User need to manually compile `protoc-xxx-linux-x86_64.exe` and `protoc-gen-grpc-java-1.47.0-linux-x86_64.exe` can executable on `CentOs6&CentOs7`.
   
   Or pre install the library to a custom directory on the machine, and change the `rpath` and `interpreter` of the official
   `protoc-xxx-linux-x86_64.exe` and `protoc-gen-grpc-java-1.47.0-linux-x86_64.exe` to the library path
   
   I choose the latter way and have build pass, this may be complex, but it is easier to perform than upgrading the OS base library.
   
   Yes, documentation is required to tell users how to use this way. Do we also need to tell users how to build `protoc-xxx-linux-x86_64.exe` and `protoc-gen-grpc-java-1.47.0-linux-x86_64.exe` in the document?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38609: [WIP][SPARK-40593][CONNECT] Add profile to make user can specify custom `protocExecutable` and `pluginExecutable` when building connect module

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1311196281

   Let me finish the sbt part first
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Support user configurable `protoc` and `protoc-gen-grpc-java` executables when building Spark Connect.

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1319466822

   Thansks @HyukjinKwon @grundprinzip @amaliujia  ~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #38609: [WIP][SPARK-40593][CONNECT] Add profile to make user can specify custom `protocExecutable` and `pluginExecutable` when building connect module

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1311181590

   How do we get the user-defined protobuf executables for `CONNECT_PROTOC_EXEC_PATH` and `CONNECT_PLUGIN_EXEC_PATH` in CentOS 6 and 7? If this is the only way, I am fine but we should probably document it how to build/use.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #38609: [WIP][SPARK-40593][CONNECT] Add profile to make user can specify custom `protocExecutable` and `pluginExecutable` when building connect module

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1311239503

   cc @grundprinzip @amaliujia FYI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amaliujia commented on a diff in pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
amaliujia commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1021995966


##########
project/SparkBuild.scala:
##########
@@ -109,6 +109,16 @@ object SparkBuild extends PomBuild {
     if (profiles.contains("jdwp-test-debug")) {
       sys.props.put("test.jdwp.enabled", "true")
     }
+    if (profiles.contains("user-defined-pb")) {

Review Comment:
   Is this from the pom file?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] jinhai-cloud commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Support user configurable `protoc` and `protoc-gen-grpc-java` executables when building Spark Connect.

Posted by "jinhai-cloud (via GitHub)" <gi...@apache.org>.
jinhai-cloud commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1694577194

   @LuciferYang ,When I was compiling master, I encountered the following problem. The model is Mac M2. Could you tell me how to mutate successfully?
   
   ```
   [ERROR] /Users/cs/GitHub/spark/connector/connect/common/src/main/protobuf/spark/connect/expressions.proto [0:0]: /Users/cs/GitHub/spark/connector/connect/common/target/protoc-plugins/protoc-gen-grpc-java-1.56.0-osx-aarch_64.exe: program not found or is not executable
   Please specify a program using absolute path or make sure the program is available in your PATH system variable
   --grpc-java_out: protoc-gen-grpc-java: Plugin failed with status code 1.
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amaliujia commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
amaliujia commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1318054981

   Checking with @grundprinzip to see if there are more comments?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] grundprinzip commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Support user configurable `protoc` and `protoc-gen-grpc-java` executables when building Spark Connect.

Posted by GitBox <gi...@apache.org>.
grundprinzip commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1318500167

   Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1022256460


##########
project/SparkBuild.scala:
##########
@@ -109,6 +109,16 @@ object SparkBuild extends PomBuild {
     if (profiles.contains("jdwp-test-debug")) {
       sys.props.put("test.jdwp.enabled", "true")
     }
+    if (profiles.contains("user-defined-pb")) {

Review Comment:
   It is customary that the profile name should be consistent with that in `pom.xml`.
   
   The `profiles` from env `SBT_MAVEN_PROFILES`,
   
   https://github.com/apache/spark/blob/84fabc27d688601feabb42abfc7356cd743b3c38/build/sbt-launch-lib.bash#L98-L102
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1023446576


##########
project/SparkBuild.scala:
##########
@@ -109,6 +109,14 @@ object SparkBuild extends PomBuild {
     if (profiles.contains("jdwp-test-debug")) {
       sys.props.put("test.jdwp.enabled", "true")
     }
+    if (profiles.contains("user-defined-pb")) {

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] amaliujia commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
amaliujia commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1314293906

   Looks easy to follow!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1316865115

   <img width="1220" alt="image" src="https://user-images.githubusercontent.com/1475305/202171441-cc303a48-fc6b-475c-a68b-bf997603cad1.png">
    
   GA passed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1022213304


##########
connector/connect/README.md:
##########
@@ -24,7 +24,30 @@ or
 ```bash
 ./build/sbt -Phive clean package
 ```
-   
+
+### Build with user-defined `protoc` and `protoc-gen-grpc-java`

Review Comment:
   I would add an example in which we need this custom compilation. For example, CentOS 5 and 6.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1023446300


##########
connector/connect/README.md:
##########
@@ -24,7 +24,31 @@ or
 ```bash
 ./build/sbt -Phive clean package
 ```
-   
+
+### Build with user-defined `protoc` and `protoc-gen-grpc-java`
+
+When the user cannot use the official `protoc` and `protoc-gen-grpc-java` binary files to build the `connect` module in the compilation environment,
+for example, compiling `connect` module on CentOS 6 or CentOS 7 which the default `glibc` version is less than 2.14, we can try to compile and test by 
+specifying the user-defined `protoc` and `protoc-gen-grpc-java` binary files as follows:
+
+```bash
+export CONNECT_PROTOC_EXEC_PATH=/path-to-protoc-exe
+export CONNECT_PLUGIN_EXEC_PATH=/path-to-protoc-gen-grpc-java-exe
+./build/mvn -Phive -Puser-defined-pb clean package

Review Comment:
   done



##########
connector/connect/README.md:
##########
@@ -24,7 +24,31 @@ or
 ```bash
 ./build/sbt -Phive clean package
 ```
-   
+
+### Build with user-defined `protoc` and `protoc-gen-grpc-java`
+
+When the user cannot use the official `protoc` and `protoc-gen-grpc-java` binary files to build the `connect` module in the compilation environment,
+for example, compiling `connect` module on CentOS 6 or CentOS 7 which the default `glibc` version is less than 2.14, we can try to compile and test by 
+specifying the user-defined `protoc` and `protoc-gen-grpc-java` binary files as follows:
+
+```bash
+export CONNECT_PROTOC_EXEC_PATH=/path-to-protoc-exe
+export CONNECT_PLUGIN_EXEC_PATH=/path-to-protoc-gen-grpc-java-exe
+./build/mvn -Phive -Puser-defined-pb clean package
+```
+
+or
+
+```bash
+export CONNECT_PROTOC_EXEC_PATH=/path-to-protoc-exe
+export CONNECT_PLUGIN_EXEC_PATH=/path-to-protoc-gen-grpc-java-exe
+./build/sbt -Puser-defined-pb clean package

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38609: [SPARK-40593][BUILD][CONNECT] Make user can build and test `connect` module by specifying the user-defined `protoc` and `protoc-gen-grpc-java`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38609:
URL: https://github.com/apache/spark/pull/38609#discussion_r1023446458


##########
connector/connect/pom.xml:
##########
@@ -371,4 +350,68 @@
       </plugin>
     </plugins>
   </build>
+  <profiles>
+    <profile>
+      <id>official-pb</id>

Review Comment:
   done



##########
connector/connect/pom.xml:
##########
@@ -371,4 +350,68 @@
       </plugin>
     </plugins>
   </build>
+  <profiles>
+    <profile>
+      <id>official-pb</id>
+      <activation>
+        <activeByDefault>true</activeByDefault>
+      </activation>
+      <build>
+        <plugins>
+          <!-- Add protobuf-maven-plugin and provide ScalaPB as a code generation plugin -->
+          <plugin>
+            <groupId>org.xolstice.maven.plugins</groupId>
+            <artifactId>protobuf-maven-plugin</artifactId>
+            <version>0.6.1</version>
+            <configuration>
+              <protocArtifact>com.google.protobuf:protoc:${protobuf.version}:exe:${os.detected.classifier}</protocArtifact>
+              <pluginId>grpc-java</pluginId>
+              <pluginArtifact>io.grpc:protoc-gen-grpc-java:${io.grpc.version}:exe:${os.detected.classifier}</pluginArtifact>
+              <protoSourceRoot>src/main/protobuf</protoSourceRoot>
+            </configuration>
+            <executions>
+              <execution>
+                <goals>
+                  <goal>compile</goal>
+                  <goal>compile-custom</goal>
+                  <goal>test-compile</goal>
+                </goals>
+              </execution>
+            </executions>
+          </plugin>
+        </plugins>
+      </build>
+    </profile>
+    <profile>
+      <id>user-defined-pb</id>

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon closed pull request #38609: [SPARK-40593][BUILD][CONNECT] Support user configurable `protoc` and `protoc-gen-grpc-java` executables when building Spark Connect.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #38609: [SPARK-40593][BUILD][CONNECT] Support user configurable `protoc` and `protoc-gen-grpc-java` executables when building Spark Connect.
URL: https://github.com/apache/spark/pull/38609


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #38609: [SPARK-40593][BUILD][CONNECT] Support user configurable `protoc` and `protoc-gen-grpc-java` executables when building Spark Connect.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #38609:
URL: https://github.com/apache/spark/pull/38609#issuecomment-1319422162

   Merged to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org