Posted to reviews@spark.apache.org by "panbingkun (via GitHub)" <gi...@apache.org> on 2023/08/23 02:34:45 UTC

[GitHub] [spark] panbingkun opened a new pull request, #42622: [SPARK-44923][PYTHON][DOCS] Some directories should be cleared when regenerating files

panbingkun opened a new pull request, #42622:
URL: https://github.com/apache/spark/pull/42622

   ### What changes were proposed in this pull request?
   This PR fixes a bug that occurs when regenerating the PySpark docs in certain scenarios.
   
   ### Why are the changes needed?
   - The following error occurred while I was regenerating the PySpark docs.
      <img width="1001" alt="image" src="https://github.com/apache/spark/assets/15246973/548abd63-4349-4267-b1fe-a293bd1e7f3e">
   
   - The problem can be reproduced as follows:
     1. `git reset --hard 3f380b9ecc8b27f6965b554061572e0990f0513`
        <img width="1416" alt="image" src="https://github.com/apache/spark/assets/15246973/5ab9c8fc-5835-4ced-8d92-9d5e020b262a">
     2. `make clean html`. At this point, the build succeeds.
        <img width="1000" alt="image" src="https://github.com/apache/spark/assets/15246973/5c3ce07f-cbe8-4177-ae22-b16c3fc62e01">
     3. `git pull`
     4. `make clean html`. At this point, the build fails.
        <img width="1001" alt="image" src="https://github.com/apache/spark/assets/15246973/548abd63-4349-4267-b1fe-a293bd1e7f3e">
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   1. Pass GA.
   2. Manually tested.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on pull request #42622: [SPARK-44923][PYTHON][DOCS] Some directories should be cleared when regenerating files

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #42622:
URL: https://github.com/apache/spark/pull/42622#issuecomment-1689197410

   > in what case will this be a problem?
   > I don't see similar doc build failures in CI of master and branch-3.5
   
   1. The docs files have already been generated locally.
   2. The PySpark code then changes, for example by deleting the function `chr`.
   3. Running the command to generate the docs locally again now fails.
   
   This doesn't happen on GA because GA always generates the docs from scratch.
   
   `conf.py` already cleans the directories `source\reference\api` and `reference\pyspark.pandas\api`; for consistency, it should also clean `reference\pyspark.sql\api` and `reference\pyspark.ss\api`.
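
The scenario described in this comment can be reproduced in miniature. The sketch below is a hypothetical stand-in for the docs build, not Sphinx itself: `generate_stubs`, `demo`, and the file names are assumptions made for illustration. It shows why a stub for a deleted function survives regeneration unless the output directory is cleared first.

```python
import os
import shutil
import tempfile


def generate_stubs(out_dir, function_names, clean_first):
    """Write one .rst stub per function (hypothetical stand-in for the real generator)."""
    if clean_first:
        # Drop everything left over from the previous run.
        shutil.rmtree(out_dir, ignore_errors=True)
    os.makedirs(out_dir, exist_ok=True)
    for name in function_names:
        with open(os.path.join(out_dir, name + ".rst"), "w") as f:
            f.write(name + "\n")
    return sorted(os.listdir(out_dir))


def demo(clean_first):
    root = tempfile.mkdtemp()
    out_dir = os.path.join(root, "api")
    try:
        # First build: `chr` still exists in the code base.
        generate_stubs(out_dir, ["abs", "chr"], clean_first)
        # Second build: `chr` has since been deleted.
        return generate_stubs(out_dir, ["abs"], clean_first)
    finally:
        shutil.rmtree(root, ignore_errors=True)


stale = demo(clean_first=False)  # the stub for `chr` is left behind
fresh = demo(clean_first=True)   # only the stub for `abs` remains
```

Without cleaning, the leftover stub still refers to a function that no longer exists, which is the kind of stale input that can break a later build.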


[GitHub] [spark] panbingkun commented on a diff in pull request #42622: [SPARK-44923][PYTHON][BUILD] Some directories should be cleared when regenerating files

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #42622:
URL: https://github.com/apache/spark/pull/42622#discussion_r1302518317


##########
python/docs/source/conf.py:
##########
@@ -33,22 +33,16 @@
 
 # Remove previously generated rst files. Ignore errors just in case it stops
 # generating whole docs.
-shutil.rmtree(
-    "%s/reference/api" % os.path.dirname(os.path.abspath(__file__)), ignore_errors=True)
-shutil.rmtree(
-    "%s/reference/pyspark.pandas/api" % os.path.dirname(os.path.abspath(__file__)),
-    ignore_errors=True)
-try:
-    os.mkdir("%s/reference/api" % os.path.dirname(os.path.abspath(__file__)))
-except OSError as e:
-    if e.errno != errno.EEXIST:
-        raise
-try:
-    os.mkdir("%s/reference/pyspark.pandas/api" % os.path.dirname(
-        os.path.abspath(__file__)))
-except OSError as e:
-    if e.errno != errno.EEXIST:
-        raise
+gen_rst_dirs = ["reference/api", "reference/pyspark.pandas/api",

Review Comment:
   Make the code more concise.
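
For readers following along, here is a minimal sketch of what a loop-based cleanup along these lines could look like. The diff above is cut off after the first two entries of `gen_rst_dirs`, so the last two entries and the helper name `reset_generated_dirs` are assumptions (the extra entries are taken from the directories named elsewhere in this thread), not the code that was merged.

```python
import os
import shutil

# The first two entries appear in the diff; the last two are assumed from the
# directories mentioned in the PR discussion.
gen_rst_dirs = [
    "reference/api",
    "reference/pyspark.pandas/api",
    "reference/pyspark.sql/api",
    "reference/pyspark.ss/api",
]


def reset_generated_dirs(base_dir, rel_dirs=gen_rst_dirs):
    """Remove each generated directory under base_dir and recreate it empty."""
    for rel in rel_dirs:
        path = os.path.join(base_dir, rel)
        # Ignore errors so that a missing directory does not stop the build.
        shutil.rmtree(path, ignore_errors=True)
        os.makedirs(path, exist_ok=True)
```

In a `conf.py`, such a helper would presumably be called with `os.path.dirname(os.path.abspath(__file__))` as `base_dir`, matching the paths used in the removed lines.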



[GitHub] [spark] zhengruifeng commented on pull request #42622: [SPARK-44923][PYTHON][DOCS] Some directories should be cleared when regenerating files

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #42622:
URL: https://github.com/apache/spark/pull/42622#issuecomment-1689179825

   @panbingkun in what case will this be a problem?
   I don't see similar doc build failures in CI of master and branch-3.5


[GitHub] [spark] panbingkun commented on pull request #42622: [SPARK-44923][PYTHON][DOCS] Some directories should be cleared when regenerating files

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #42622:
URL: https://github.com/apache/spark/pull/42622#issuecomment-1689173498

   cc @zhengruifeng @HyukjinKwon 


[GitHub] [spark] panbingkun commented on pull request #42622: [SPARK-44923][PYTHON][BUILD] Some directories should be cleared when regenerating files

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #42622:
URL: https://github.com/apache/spark/pull/42622#issuecomment-1689328814

   Compared to the 1st version, the 2nd version makes the code more concise.


[GitHub] [spark] zhengruifeng closed pull request #42622: [SPARK-44923][PYTHON][BUILD] Some directories should be cleared when regenerating files

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng closed pull request #42622: [SPARK-44923][PYTHON][BUILD] Some directories should be cleared when regenerating files
URL: https://github.com/apache/spark/pull/42622


[GitHub] [spark] panbingkun commented on pull request #42622: [SPARK-44923][PYTHON][DOCS] Some directories should be cleared when regenerating files

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #42622:
URL: https://github.com/apache/spark/pull/42622#issuecomment-1689173078

   As shown in the following figures, `sphinx-build` automatically generates some directories and files while building the docs:
   <img width="302" alt="image" src="https://github.com/apache/spark/assets/15246973/c679c9d3-a010-4884-8f96-f56bc5fcde4c">
   <img width="238" alt="image" src="https://github.com/apache/spark/assets/15246973/d04b4330-8784-46a3-9a58-cb27b75dcc97">


[GitHub] [spark] zhengruifeng commented on pull request #42622: [SPARK-44923][PYTHON][BUILD] Some directories should be cleared when regenerating files

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #42622:
URL: https://github.com/apache/spark/pull/42622#issuecomment-1689585458

   merged to master

