Posted to reviews@spark.apache.org by "nchammas (via GitHub)" <gi...@apache.org> on 2024/01/11 04:15:23 UTC

[PR] [SPARK-46668][DOCS] Parallelize Sphinx build of Python API docs [spark]

nchammas opened a new pull request, #44680:
URL: https://github.com/apache/spark/pull/44680

   ### What changes were proposed in this pull request?
   
   Upgrade to Sphinx 4.5.0, which is the [latest in the 4.x line][1] and includes the [fix for parallel builds on macOS][2].
   
   Enable parallel Sphinx workers to build the Python API docs.
   
   I experimented with a few different values, and `auto` seems to work best. Configuring 4 workers yields about the same improvement as `auto`, which suggests that parallelization beyond 4 workers is limited by some form of resource contention. I left it as `auto` since that adapts to the number of available cores and may work better on CI.
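   
   A comparison along these lines could be scripted roughly as follows. This is only a sketch, not the exact procedure behind the numbers in this PR: it assumes the docs Makefile forwards `SPHINXOPTS` to `sphinx-build` (as the Makefile generated by `sphinx-quickstart` does), and the `DRY_RUN` and `planned` variables are hypothetical names introduced for illustration.
   
   ```sh
   # Sketch only: compare `make html` wall-clock time across Sphinx worker counts.
   # Assumption: the docs Makefile forwards SPHINXOPTS to sphinx-build.
   # DRY_RUN (hypothetical) defaults to 1, so commands are printed, not executed.
   DRY_RUN="${DRY_RUN:-1}"
   planned=0
   for workers in 1 4 auto; do
     # `-j N` sets the number of parallel workers; `-j auto` uses the CPU count.
     opts="-j ${workers}"
     planned=$((planned + 1))
     if [ "$DRY_RUN" = "1" ]; then
       echo "would run: time make html SPHINXOPTS=\"${opts}\""
     else
       # Note: stale build output from an earlier run would skew this timing.
       time make html SPHINXOPTS="${opts}"
     fi
   done
   ```
   
   Running it with `DRY_RUN=0` from the docs directory would perform the actual builds; the default only prints the commands it would run.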
   
   On my 16-core Intel workstation, the runtime of `make html` was cut by ~60%.
   
   ```sh
   # `make html` @ master
   real    43m51.167s
   user    41m43.526s
   sys     0m39.651s
   
   # `make html` with parallel workers
   real    17m8.424s
   user    174m42.051s
   sys     5m8.824s
   ```
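   
   As a quick check on the figures above, the reduction can be recomputed from the raw `time` output. The sketch below is illustrative only; the `to_seconds` helper and the variable names are hypothetical and not part of this PR.
   
   ```sh
   # Convert a `time`-style duration such as 43m51.167s into seconds.
   to_seconds() {
     echo "$1" | awk -F'[ms]' '{ printf "%.3f", $1 * 60 + $2 }'
   }
   
   serial_real=$(to_seconds 43m51.167s)
   parallel_real=$(to_seconds 17m8.424s)
   serial_user=$(to_seconds 41m43.526s)
   parallel_user=$(to_seconds 174m42.051s)
   
   # Wall-clock reduction in percent, and the ratio of user CPU time consumed.
   reduction=$(awk -v a="$serial_real" -v b="$parallel_real" \
     'BEGIN { printf "%.1f", (1 - b / a) * 100 }')
   cpu_ratio=$(awk -v a="$serial_user" -v b="$parallel_user" \
     'BEGIN { printf "%.1f", b / a }')
   
   echo "wall-clock reduction: ${reduction}%"
   echo "user CPU time ratio: ${cpu_ratio}x"
   ```
   
   This works out to about 61% less wall-clock time, at the cost of roughly 4.2x the user CPU time of the serial build.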
   
   [1]: https://www.sphinx-doc.org/en/master/changes.html#release-4-5-0-released-mar-28-2022
   [2]: https://github.com/sphinx-doc/sphinx/pull/9793
   
   ### Why are the changes needed?
   
   This saves developer time (and may also save CI time).
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   I manually built and reviewed the docs using:
   
   ```sh
   SKIP_SCALADOC=1 SKIP_SQLDOC=1 SKIP_RDOC=1 time bundle exec jekyll build
   ```
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46668][DOCS] Parallelize Sphinx build of Python API docs [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on PR #44680:
URL: https://github.com/apache/spark/pull/44680#issuecomment-1886254077

   cc @itholic and @HyukjinKwon. This PR builds on #44012.




Re: [PR] [SPARK-46668][DOCS] Parallelize Sphinx build of Python API docs [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #44680: [SPARK-46668][DOCS] Parallelize Sphinx build of Python API docs
URL: https://github.com/apache/spark/pull/44680




Re: [PR] [SPARK-46668][DOCS] Parallelize Sphinx build of Python API docs [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44680:
URL: https://github.com/apache/spark/pull/44680#issuecomment-1886565032

   Merged to master.

