You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "nchammas (via GitHub)" <gi...@apache.org> on 2024/01/19 02:05:58 UTC

[PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

nchammas opened a new pull request, #44791:
URL: https://github.com/apache/spark/pull/44791

   ### What changes were proposed in this pull request?
   
   We are abusing Jekyll's plugin system to set flags for what API docs to build. This change maintains this overall status quo but makes the following improvements:
   
   - Rename the plugin file to be in line with its actual purpose.
   - Organize the code into functions so it's easier to follow and understand.
   - Print section headers that are easier to find in the output when building the docs.
   
   ### Why are the changes needed?
   
   This should make maintaining this part of the doc building workflow easier.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   I built the docs with various skip flags set and confirmed the build succeeds. I also manually browsed through some of the Scala, Java, Python, and SQL API docs.
   
   The only docs I didn't test building were the R docs, because I do not have R installed locally.
   
   Here is the build output on my machine for `SKIP_RDOC=1`: [build-docs.log.zip](https://github.com/apache/spark/files/13983386/build-docs.log.zip)
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44791:
URL: https://github.com/apache/spark/pull/44791#issuecomment-1907113959

   Merged to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #44791:
URL: https://github.com/apache/spark/pull/44791#discussion_r1461246226


##########
docs/_plugins/build_api_docs.rb:
##########
@@ -0,0 +1,205 @@
+#

Review Comment:
   We should probably change:
   
   ```
   dev/change-scala-version.sh:echo "$BASEDIR/docs/_plugins/copy_api_dirs.rb"
   dev/change-scala-version.sh:  sed_i '/\-Pscala-'$TO_VERSION'/!s:build/sbt:build/sbt \-Pscala\-'$TO_VERSION':' "$BASEDIR/docs/_plugins/copy_api_dirs.rb"
   dev/change-scala-version.sh:  sed_i 's:build/sbt \-Pscala\-'$FROM_VERSION':build/sbt:' "$BASEDIR/docs/_plugins/copy_api_dirs.rb"
   dev/change-scala-version.sh:sed_i 's/scala\-'$FROM_VERSION'/scala\-'$TO_VERSION'/' "$BASEDIR/docs/_plugins/copy_api_dirs.rb"
   ```
   
   together .. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on PR #44791:
URL: https://github.com/apache/spark/pull/44791#issuecomment-1899526921

   In the future, I think it would be appropriate to convert this script into Python, as Ruby is not a focus of our community. (I don't use Ruby much myself, either.) But organizing the code into clear functions is something we'd want to do regardless of the language.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on code in PR #44791:
URL: https://github.com/apache/spark/pull/44791#discussion_r1461249378


##########
docs/_plugins/build_api_docs.rb:
##########
@@ -0,0 +1,205 @@
+#

Review Comment:
   Oh, good catch. Will update.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on PR #44791:
URL: https://github.com/apache/spark/pull/44791#issuecomment-1906467635

   I diffed the built `_site/` directory across `master` and this branch. They are identical except for some minor differences in some generated files. I've updated the PR description accordingly with the details.
   
   @HyukjinKwon - I think this PR is good to go. Let me know if there is anything more you'd like me to do here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on code in PR #44791:
URL: https://github.com/apache/spark/pull/44791#discussion_r1461253952


##########
docs/_plugins/build_api_docs.rb:
##########
@@ -0,0 +1,205 @@
+#

Review Comment:
   Fixed and updated the PR description.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on code in PR #44791:
URL: https://github.com/apache/spark/pull/44791#discussion_r1459233153


##########
docs/_plugins/build_api_docs.rb:
##########
@@ -0,0 +1,205 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This include enables functions like `cd` and `cp_r`.
+require 'fileutils'
+include FileUtils
+
+THIS_DIR = File.dirname(__FILE__)
+SPARK_PROJECT_ROOT = File.expand_path(THIS_DIR + "/../..")
+$spark_package_is_built = false
+
+def print_header(text)
+  banner = "* #{text} *"
+  banner_bar = "*" * banner.size
+
+  puts ""
+  puts banner_bar
+  puts banner
+  puts banner_bar
+end
+
+def build_scala_and_java_docs
+  print_header "Building Scala and Java API docs."
+  cd(SPARK_PROJECT_ROOT)
+
+  command = "build/sbt -Pkinesis-asl clean compile unidoc"

Review Comment:
   When we build both the Scala docs as well as either the Python or SQL docs, we end up building Spark twice. In a follow-up improvement, I think it would make sense to build Spark just once if any of the Scala, Python, or SQL docs are requested:
   
   ```sh
   ./build/sbt -Pkinesis-asl -Phive clean package
   ```
   
   Then, if Scala docs are specifically also requested, we just build the `unidoc`:
   
   ```sh
   ./build/sbt -Pkinesis-asl -Phive unidoc
   ```
   
   This should save us around 2 minutes on the complete documentation build.
   
   For now, I'm leaving it like this since that's the current behavior.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #44791:
URL: https://github.com/apache/spark/pull/44791#discussion_r1461246396


##########
docs/_plugins/build_api_docs.rb:
##########
@@ -0,0 +1,205 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more

Review Comment:
   So is the change basically refactoring?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on code in PR #44791:
URL: https://github.com/apache/spark/pull/44791#discussion_r1459289329


##########
docs/_plugins/build_api_docs.rb:
##########
@@ -0,0 +1,205 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This include enables functions like `cd` and `cp_r`.
+require 'fileutils'
+include FileUtils
+
+THIS_DIR = File.dirname(__FILE__)
+SPARK_PROJECT_ROOT = File.expand_path(THIS_DIR + "/../..")
+$spark_package_is_built = false
+
+def print_header(text)
+  banner = "* #{text} *"
+  banner_bar = "*" * banner.size
+
+  puts ""
+  puts banner_bar
+  puts banner
+  puts banner_bar
+end
+
+def build_scala_and_java_docs
+  print_header "Building Scala and Java API docs."
+  cd(SPARK_PROJECT_ROOT)
+
+  command = "build/sbt -Pkinesis-asl clean compile unidoc"

Review Comment:
   I'm also not sure we need the `kinesis-asl` profile here, as it seems the Kinesis docs are built with or without it. But I'd leave that as-is for now since it's not that much noise and I'm not 100% sure it's useless.
   
   It was added in #2239.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on code in PR #44791:
URL: https://github.com/apache/spark/pull/44791#discussion_r1461249578


##########
docs/_plugins/build_api_docs.rb:
##########
@@ -0,0 +1,205 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more

Review Comment:
   Yes. The goal is to preserve current behavior while making the code easier to maintain.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46764][DOCS] Reorganize script to build API docs [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #44791: [SPARK-46764][DOCS] Reorganize script to build API docs
URL: https://github.com/apache/spark/pull/44791


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org