Posted to reviews@spark.apache.org by "juliuszsompolski (via GitHub)" <gi...@apache.org> on 2023/05/02 14:43:11 UTC

[GitHub] [spark] juliuszsompolski opened a new pull request, #41015: [CONNECT][DOC] Add information on how to regenerate proto for python client

juliuszsompolski opened a new pull request, #41015:
URL: https://github.com/apache/spark/pull/41015

   ### What changes were proposed in this pull request?
   
   Working out how to generate the Connect gRPC proto code for Python was surprisingly hard for me (admittedly, I don't know much about Python development), so I am adding it to the README.
   
   ### Why are the changes needed?
   
   Improving internal documentation.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Not applicable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41015: [MINOR][CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #41015:
URL: https://github.com/apache/spark/pull/41015#discussion_r1183513372


##########
connector/connect/README.md:
##########
@@ -21,6 +21,28 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 
+To generate the Python client code from the proto files:
+
+First, make sure to have a Python environment with the installed dependencies.
+Specifically, install `black` and dependencies from the "Spark Connect python proto generation plugin (optional)" section.

Review Comment:
   No biggie, but I suspect we don't necessarily need to mention this.





[GitHub] [spark] nija-at commented on a diff in pull request #41015: [CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "nija-at (via GitHub)" <gi...@apache.org>.
nija-at commented on code in PR #41015:
URL: https://github.com/apache/spark/pull/41015#discussion_r1182659878


##########
connector/connect/README.md:
##########
@@ -21,6 +21,17 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 
+When adding new proto messages, the python proto code need to be regenerated. To do it, use `dev/connect-gen-protos.sh` script.
+It depends on
+```
+brew install bufbuild/buf/buf
+```
+and python dependencies from
+```
+pip install -r dev/requirements.txt
+```
+(specifically, install `black` and dependencies from "Spark Connect python proto generation plugin (optional)" section)
+

Review Comment:
   I would've written it differently.
   
   ````suggestion
   To generate the Python client code from the proto files,
   
   First, make sure to have a Python environment with the installed dependencies:
   
   ```
   pip install -r dev/requirements.txt
   ```
   
   Install [buf](https://github.com/bufbuild/buf)
   
   ```
   brew install bufbuild/buf/buf
   ```
   
   Generate the Python files by running:
   
   ```
   dev/connect-gen-protos.sh
   ```
   
   ````
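The steps in the suggestion above can be sketched as a small pre-flight check. This is only an illustration, not part of the PR: the helper names `missing_prerequisites` and `regenerate_protos` are hypothetical, and the sketch assumes that `buf` must be on the `PATH`, that `black` must be importable, and that the script is run from the repository root.

```python
import importlib.util
import shutil
import subprocess
from pathlib import Path


def missing_prerequisites(tools=("buf",), modules=("black",)):
    """Return the command-line tools and Python modules that cannot be found."""
    # shutil.which returns None when an executable is not on the PATH.
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # find_spec returns None when a top-level module is not installed.
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


def regenerate_protos(spark_home="."):
    """Run dev/connect-gen-protos.sh after checking the assumed prerequisites."""
    missing = missing_prerequisites()
    if missing:
        raise RuntimeError("Missing prerequisites: " + ", ".join(missing))
    # Resolve to an absolute path so the script is found regardless of cwd.
    root = Path(spark_home).resolve()
    script = root / "dev" / "connect-gen-protos.sh"
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run([str(script)], cwd=root, check=True)
```

Calling `regenerate_protos("/path/to/spark")` would either report what is missing or run the script; the path is a placeholder.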



##########
connector/connect/README.md:
##########
@@ -21,6 +21,17 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 

Review Comment:
   Would it be useful to add a README in this folder and link to this README?





[GitHub] [spark] nija-at commented on a diff in pull request #41015: [MINOR][CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "nija-at (via GitHub)" <gi...@apache.org>.
nija-at commented on code in PR #41015:
URL: https://github.com/apache/spark/pull/41015#discussion_r1183621047


##########
connector/connect/README.md:
##########
@@ -21,6 +21,28 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 
+To generate the Python client code from the proto files:
+
+First, make sure to have a Python environment with the installed dependencies.
+Specifically, install `black` and dependencies from the "Spark Connect python proto generation plugin (optional)" section.
+
+
+```
+pip install -r dev/requirements.txt
+```
+
+Install [buf](https://github.com/bufbuild/buf)
+
+```
+brew install bufbuild/buf/buf
+```
+
+Generate the Python files by running:
+
+```
+dev/connect-gen-protos.sh

Review Comment:
   @juliuszsompolski was not able to find them. So I suspect they are not in a very discoverable spot.
   
   Dedupe in a follow up sounds good. We can also change these to be links to the correct place.





[GitHub] [spark] nija-at commented on a diff in pull request #41015: [CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "nija-at (via GitHub)" <gi...@apache.org>.
nija-at commented on code in PR #41015:
URL: https://github.com/apache/spark/pull/41015#discussion_r1182659878


##########
connector/connect/README.md:
##########
@@ -21,6 +21,17 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 
+When adding new proto messages, the python proto code need to be regenerated. To do it, use `dev/connect-gen-protos.sh` script.
+It depends on
+```
+brew install bufbuild/buf/buf
+```
+and python dependencies from
+```
+pip install -r dev/requirements.txt
+```
+(specifically, install `black` and dependencies from "Spark Connect python proto generation plugin (optional)" section)
+

Review Comment:
   I would've written it differently.
   
   ````suggestion
   To generate the Python client code from the proto files,
   
   First, make sure to have a Python environment with the installed dependencies.
   Specifically, install `black` and dependencies from the "Spark Connect python proto generation plugin (optional)" section.
   
   
   ```
   pip install -r dev/requirements.txt
   ```
   
   Install [buf](https://github.com/bufbuild/buf)
   
   ```
   brew install bufbuild/buf/buf
   ```
   
   Generate the Python files by running:
   
   ```
   dev/connect-gen-protos.sh
   ```
   
   ````





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #41015: [MINOR][CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #41015:
URL: https://github.com/apache/spark/pull/41015#discussion_r1183511891


##########
connector/connect/README.md:
##########
@@ -21,6 +21,28 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 
+To generate the Python client code from the proto files:
+
+First, make sure to have a Python environment with the installed dependencies.
+Specifically, install `black` and dependencies from the "Spark Connect python proto generation plugin (optional)" section.
+
+
+```
+pip install -r dev/requirements.txt
+```
+
+Install [buf](https://github.com/bufbuild/buf)
+
+```
+brew install bufbuild/buf/buf

Review Comment:
   and at https://spark.apache.org/docs/latest/api/python/development/contributing.html#prerequisite



##########
connector/connect/README.md:
##########
@@ -21,6 +21,28 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 
+To generate the Python client code from the proto files:
+
+First, make sure to have a Python environment with the installed dependencies.
+Specifically, install `black` and dependencies from the "Spark Connect python proto generation plugin (optional)" section.
+
+
+```
+pip install -r dev/requirements.txt
+```
+
+Install [buf](https://github.com/bufbuild/buf)
+
+```
+brew install bufbuild/buf/buf
+```
+
+Generate the Python files by running:
+
+```
+dev/connect-gen-protos.sh

Review Comment:
   This is actually documented at https://spark.apache.org/docs/latest/api/python/development/testing.html#running-tests-for-python-client





[GitHub] [spark] MaxGekk closed pull request #41015: [MINOR][CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk closed pull request #41015: [MINOR][CONNECT][DOC] Add information on how to regenerate proto for python client
URL: https://github.com/apache/spark/pull/41015




[GitHub] [spark] MaxGekk commented on pull request #41015: [CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on PR #41015:
URL: https://github.com/apache/spark/pull/41015#issuecomment-1532727011

   +1, LGTM. Merging to master.
   Thank you, @juliuszsompolski, and @grundprinzip and @nija-at for the review.




[GitHub] [spark] nija-at commented on a diff in pull request #41015: [CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "nija-at (via GitHub)" <gi...@apache.org>.
nija-at commented on code in PR #41015:
URL: https://github.com/apache/spark/pull/41015#discussion_r1182662216


##########
connector/connect/README.md:
##########
@@ -21,6 +21,17 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 

Review Comment:
   Would it be useful to add a README in this folder and link to this README?
   
   [non blocking]





[GitHub] [spark] juliuszsompolski commented on a diff in pull request #41015: [CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on code in PR #41015:
URL: https://github.com/apache/spark/pull/41015#discussion_r1182675151


##########
connector/connect/README.md:
##########
@@ -21,6 +21,17 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 

Review Comment:
   I think that folder is `rst`-based and published to the web documentation rather than Markdown-based, so I don't think a README.md fits there.
   But I was surprised that the documentation there doesn't mention getting dependencies from `dev/requirements.txt`. I'm not really familiar with Python development or these docs, though; maybe someone more familiar has a better idea of the best place for it? @HyukjinKwon ?
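To see which packages a named section of a requirements file would pull in, something like the sketch below could help. It is an assumption that sections in the file are introduced by `#` comment lines; the function name `packages_in_section` and the sample contents (including the package `example-proto-plugin`) are hypothetical and are not copied from `dev/requirements.txt`.

```python
def packages_in_section(requirements_text, section_title):
    """Return requirement lines listed under the comment heading containing section_title."""
    packages = []
    in_section = False
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#"):
            # A comment line either opens the wanted section or closes it.
            in_section = section_title in line
        elif line and in_section:
            packages.append(line)
    return packages


# Hypothetical sample; the real file's layout and contents may differ.
SAMPLE = """\
# Linter
black

# Spark Connect python proto generation plugin (optional)
example-proto-plugin==1.0
"""

if __name__ == "__main__":
    print(packages_in_section(SAMPLE, "proto generation plugin"))
```

With the real file, the text could be read via `pathlib.Path("dev/requirements.txt").read_text()` and passed in the same way.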





[GitHub] [spark] MaxGekk commented on a diff in pull request #41015: [CONNECT][DOC] Add information on how to regenerate proto for python client

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #41015:
URL: https://github.com/apache/spark/pull/41015#discussion_r1183457223


##########
connector/connect/README.md:
##########
@@ -21,6 +21,28 @@ user experience across all languages. Please follow the below guidelines:
 
 Python-specific development guidelines are located in [python/docs/source/development/testing.rst](https://github.com/apache/spark/blob/master/python/docs/source/development/testing.rst) that is published at [Development tab](https://spark.apache.org/docs/latest/api/python/development/index.html) in PySpark documentation.
 
+To generate the Python client code from the proto files:
+
+First, make sure to have a Python environment with the installed dependencies.
+Specifically, install `black` and dependencies from the "Spark Connect python proto generation plugin (optional)" section.
+
+
+```
+pip install -r dev/requirements.txt
+```
+
+Install [buf](https://github.com/bufbuild/buf)
+
+```
+brew install bufbuild/buf/buf
+```
+
+Generate the Python files by running:
+
+```
+dev/connect-gen-protos.sh

Review Comment:
   I don't know whether it is specific to me, but this command fails on my side with the error:
   ```
   + buf generate --debug -vvv
   ...
   Failure: 403 Forbidden
   ```
   
   A workaround is to fork https://github.com/HyukjinKwon/my-github-actions and run the workflow:
   ![Screen Shot 2023-03-24 at 8 38 18 PM](https://user-images.githubusercontent.com/1580697/235881720-aedf17a5-b14e-4a62-ab21-576a2221fedf.png)
   and download the results:
   ![Screen Shot 2023-03-24 at 8 38 45 PM](https://user-images.githubusercontent.com/1580697/235881807-1a2a8c34-44bd-44cc-9b5c-1ff11d1eebd0.png)
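A wrapper along the following lines could make this failure easier to recognize. It is a sketch only: `diagnose` and `run_and_diagnose` are hypothetical helpers, the only log text relied on is the `403 Forbidden` line quoted above, and the sketch makes no claim about what causes the error.

```python
import subprocess


def diagnose(returncode, output):
    """Map an exit code and captured output to a short, human-readable hint."""
    if returncode == 0:
        return "ok"
    if "403 Forbidden" in output:
        # Matches the failure quoted in this thread; the cause is not known here.
        return ("403 Forbidden reported; the workaround mentioned in this thread "
                "is to generate the files in a GitHub Actions workflow")
    return "failed for another reason; inspect the captured output"


def run_and_diagnose(command, cwd=None):
    """Run a command, capture its output, and return (exit_code, hint)."""
    result = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
    return result.returncode, diagnose(result.returncode, result.stdout + result.stderr)
```

For example, `run_and_diagnose(["dev/connect-gen-protos.sh"])` run from the repository root would return the exit code together with one of the hints above.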
   


