Posted to commits@spark.apache.org by ge...@apache.org on 2022/03/22 05:26:57 UTC

[spark] branch branch-3.3 updated: [SPARK-38574][DOCS] Enrich the documentation of option avroSchema

This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 8b90205  [SPARK-38574][DOCS] Enrich the documentation of option avroSchema
8b90205 is described below

commit 8b90205ae971eb0ef6e79d849abb14243bb7dc0f
Author: tianhanhu <ad...@gmail.com>
AuthorDate: Tue Mar 22 13:23:39 2022 +0800

    [SPARK-38574][DOCS] Enrich the documentation of option avroSchema
    
    ### What changes were proposed in this pull request?
    Enrich the Avro data source documentation to emphasize the difference between
    `avroSchema`, which is an option, and `jsonFormatSchema`, which is a parameter to the function `from_avro`.
    
    When using `from_avro`, the `avroSchema` option can be set to a compatible, evolved schema, while `jsonFormatSchema` has to be the actual schema. Otherwise, the behavior is undefined.
    
    ### Why are the changes needed?
    Reduce confusion caused by the option and the parameter having similar names.
    
    ### Does this PR introduce _any_ user-facing change?
    Yes, Avro data source documentation is enriched a bit.
    
    ### How was this patch tested?
    No testing required; this is only a documentation change.
    
    Closes #35880 from tianhanhu/SPARK-38574.
    
    Authored-by: tianhanhu <ad...@gmail.com>
    Signed-off-by: Gengliang Wang <ge...@apache.org>
    (cherry picked from commit ee5121a56e10ba2c65ae67159da472713cc5edd4)
    Signed-off-by: Gengliang Wang <ge...@apache.org>
---
 docs/sql-data-sources-avro.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
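
A minimal PySpark sketch of the distinction described in the commit message, not taken from the patch itself. It assumes the external `spark-avro` package is available to the Spark session, and the `User` record, its `age` field, the `value` column name, and the `decode_users` helper are hypothetical names chosen for illustration. The actual schema goes in as the `jsonFormatSchema` parameter of `from_avro`, while the evolved schema goes in through the `avroSchema` option.

```python
import json

# Hypothetical actual schema: the one the Avro bytes were really written with.
ACTUAL_SCHEMA = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
})

# Hypothetical evolved schema: compatible with the actual schema,
# plus one additional field that carries a default value.
EVOLVED_SCHEMA = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int", "default": -1},
    ],
})


def decode_users(df, column="value"):
    """Decode an Avro binary column so the result follows the evolved schema."""
    # Imported lazily so the schema definitions can be inspected without Spark.
    from pyspark.sql.avro.functions import from_avro

    return df.select(
        from_avro(
            df[column],
            ACTUAL_SCHEMA,                   # jsonFormatSchema: must be the actual schema
            {"avroSchema": EVOLVED_SCHEMA},  # option: may be a compatible, evolved schema
        ).alias("user")
    )
```

Under these assumptions, the decoded `user` struct would be expected to include the additional `age` field filled with its default, in line with the documentation text changed below.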

diff --git a/docs/sql-data-sources-avro.md b/docs/sql-data-sources-avro.md
index a26d56f..db3e03c 100644
--- a/docs/sql-data-sources-avro.md
+++ b/docs/sql-data-sources-avro.md
@@ -231,10 +231,11 @@ Data source options of Avro can be set via:
     <td>Optional schema provided by a user in JSON format.
       <ul>
         <li>
-          When reading Avro, this option can be set to an evolved schema, which is compatible but different with
+          When reading Avro files or calling function <code>from_avro</code>, this option can be set to an evolved schema, which is compatible but different with
           the actual Avro schema. The deserialization schema will be consistent with the evolved schema.
           For example, if we set an evolved schema containing one additional column with a default value,
-          the reading result in Spark will contain the new column too.
+          the reading result in Spark will contain the new column too. Note that when using this option with 
+          <code>from_avro</code>, you still need to pass the actual Avro schema as a parameter to the function.
         </li>
         <li>
           When writing Avro, this option can be set if the expected output Avro schema doesn't match the
