Posted to reviews@spark.apache.org by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/11/16 23:56:37 UTC

[PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

HyukjinKwon opened a new pull request, #43855:
URL: https://github.com/apache/spark/pull/43855

   ### What changes were proposed in this pull request?
   
   This PR restores the DSv2 documentation. https://github.com/apache/spark/pull/38392 mistakenly added `org/apache/spark/sql/connect` to the private (excluded) paths; because the exclusion is a substring match, it also excluded `org/apache/spark/sql/connector`.
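
A minimal sketch of why the substring exclusion also hid `connector`, assuming the filter behaves like the `filterNot(_.getCanonicalPath.contains(...))` calls in `project/SparkBuild.scala`. The class name, method name, and file paths below are hypothetical and chosen only for illustration; this is not the Spark build code.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the Unidoc source filter; not part of Spark.
public class UnidocFilterSketch {

  // Keeps every path that does NOT contain the given private marker,
  // mirroring filterNot(path.contains(marker)).
  static List<String> keepPublic(List<String> paths, String privateMarker) {
    return paths.stream()
        .filter(p -> !p.contains(privateMarker))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Illustrative paths only (assumed, not taken from the repository layout).
    List<String> paths = List.of(
        "/src/org/apache/spark/sql/connect/client/Example.scala",
        "/src/org/apache/spark/sql/connector/catalog/Table.java");

    // Old marker: "connector" starts with "connect", so both paths are dropped.
    System.out.println(keepPublic(paths, "org/apache/spark/sql/connect"));

    // New marker with a trailing slash: only the connect package is dropped.
    System.out.println(keepPublic(paths, "org/apache/spark/sql/connect/"));
  }
}
```

The trailing slash turns the marker into a directory boundary, so a sibling package whose name merely starts with `connect` no longer matches.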
   
   ### Why are the changes needed?
   
   For end users to read DSv2 documentation.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, it restores the DSv2 API documentation that used to be available, e.g. https://spark.apache.org/docs/3.3.0/api/scala/org/apache/spark/sql/connector/catalog/index.html
   
   ### How was this patch tested?
   
   Manually tested via:
   
   ```
   SKIP_PYTHONDOC=1 SKIP_RDOC=1 SKIP_SQLDOC=1 bundle exec jekyll build
   ```
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

Posted by "allisonwang-db (via GitHub)" <gi...@apache.org>.
allisonwang-db commented on code in PR #43855:
URL: https://github.com/apache/spark/pull/43855#discussion_r1396686102


##########
project/SparkBuild.scala:
##########
@@ -1361,7 +1361,7 @@ object Unidoc {
       .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/util/io")))
       .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/util/kvstore")))
       .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/sql/catalyst")))
-      .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/sql/connect")))
+      .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/sql/connect/")))

Review Comment:
   Although it may not be possible, is there a way to prevent documentation regressions? :)





Re: [PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #43855:
URL: https://github.com/apache/spark/pull/43855#discussion_r1396517875


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsMetadataColumns.java:
##########
@@ -58,7 +58,7 @@ public interface SupportsMetadataColumns extends Table {
    * Determines how this data source handles name conflicts between metadata and data columns.
    * <p>
    * If true, spark will automatically rename the metadata column to resolve the conflict. End users
-   * can reliably select metadata columns (renamed or not) with {@link Dataset.metadataColumn}, and
+   * can reliably select metadata columns (renamed or not) with `Dataset.metadataColumn`, and

Review Comment:
   It fails as below:
   
   ```
   [error] /.../spark/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsMetadataColumns.java:61:1:  error: reference not found
   [error]    * can reliably select metadata columns (renamed or not) with {@link Dataset.metadataColumn}, and
   [error]                                                                        ^
   ```





Re: [PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #43855:
URL: https://github.com/apache/spark/pull/43855#issuecomment-1816268352

   @HyukjinKwon it seems the `Docker integration tests` keep failing after this PR.




Re: [PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #43855:
URL: https://github.com/apache/spark/pull/43855#issuecomment-1815831044

   Merged to master. 
   
   Could you make a backport to branch-3.5 and branch-3.4, @HyukjinKwon?




Re: [PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun closed pull request #43855: [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API
URL: https://github.com/apache/spark/pull/43855




Re: [PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43855:
URL: https://github.com/apache/spark/pull/43855#issuecomment-1815877688

   Sure




Re: [PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #43855:
URL: https://github.com/apache/spark/pull/43855#discussion_r1396686425


##########
project/SparkBuild.scala:
##########
@@ -1361,7 +1361,7 @@ object Unidoc {
       .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/util/io")))
       .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/util/kvstore")))
       .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/sql/catalyst")))
-      .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/sql/connect")))
+      .map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/sql/connect/")))

Review Comment:
   I think it'd be difficult :-). At least, I have no idea how.





Re: [PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #43855:
URL: https://github.com/apache/spark/pull/43855#discussion_r1396517875


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsMetadataColumns.java:
##########
@@ -58,7 +58,7 @@ public interface SupportsMetadataColumns extends Table {
    * Determines how this data source handles name conflicts between metadata and data columns.
    * <p>
    * If true, spark will automatically rename the metadata column to resolve the conflict. End users
-   * can reliably select metadata columns (renamed or not) with {@link Dataset.metadataColumn}, and
+   * can reliably select metadata columns (renamed or not) with {@code Dataset.metadataColumn}, and

Review Comment:
   It fails as below:
   
   ```
   [error] /.../spark/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/SupportsMetadataColumns.java:61:1:  error: reference not found
   [error]    * can reliably select metadata columns (renamed or not) with {@link Dataset.metadataColumn}, and
   [error]                                                                        ^
   ```
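
A minimal sketch of the difference between the two Javadoc tags involved in the change above, using a hypothetical class that is not part of Spark. It relies only on standard Javadoc behavior: `{@link}` must resolve to a program element and uses `#` between a type and its member, while `{@code}` only renders its text in code font and resolves nothing. Why exactly `Dataset.metadataColumn` could not be resolved is not stated in the thread, so the sketch does not assume a cause.

```java
import java.util.List;

/** Hypothetical stand-in class for illustration; not part of Spark. */
public class JavadocReferenceSketch {

  /**
   * Returns the number of names given.
   * <p>
   * Resolved reference: {@link List#size()} points at a member that Javadoc can find;
   * note the {@code #} between the type and the member.
   * <p>
   * Unresolved mention: {@code Dataset.metadataColumn} is rendered in code font only,
   * so Javadoc does not try to look the name up.
   */
  public static int countNames(List<String> names) {
    return names.size();
  }

  public static void main(String[] args) {
    System.out.println(countNames(List.of("_metadata", "index")));
  }
}
```

Using `{@code}` trades the hyperlink for a build that does not depend on the referenced type being visible to the Javadoc run.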





Re: [PR] [SPARK-45963][SQL][DOCS] Restore documentation for DSv2 API [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #43855:
URL: https://github.com/apache/spark/pull/43855#issuecomment-1815532865

   There are a couple of Javadoc errors; let me fix them here as well.

