Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/03/20 23:25:08 UTC

[GitHub] [spark] tianhanhu commented on a change in pull request #35880: [SPARK-38574][CORE] Enrich Avro data source documentation

tianhanhu commented on a change in pull request #35880:
URL: https://github.com/apache/spark/pull/35880#discussion_r830690187



##########
File path: docs/sql-data-sources-avro.md
##########
@@ -234,7 +234,9 @@ Data source options of Avro can be set via:
           When reading Avro, this option can be set to an evolved schema, which is compatible but different with
           the actual Avro schema. The deserialization schema will be consistent with the evolved schema.
           For example, if we set an evolved schema containing one additional column with a default value,
-          the reading result in Spark will contain the new column too.
+          the reading result in Spark will contain the new column too. Note that when using this option with 
+          <code>from_avro</code>, you still need to pass the actual schema as a parameter to the function. 
+          Otherwise, the behavior is undefined: it may fail or return arbitrary result.

Review comment:
       Makes sense. Removed the "otherwise" clause.
   I think emphasizing the difference between "actual" and "evolved" should be enough for this case? Further mentioning input vs. output seems a bit wordy to me... @gengliangwang
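
   For reference, the "actual" vs. "evolved" distinction in the hunk above could be illustrated roughly as follows. This is only a sketch, assuming the option being documented is `avroSchema` and using PySpark's `from_avro(data, jsonFormatSchema, options)`; the `User` record, the `age` column, the `value` column name, and the helper `decode_with_evolved_schema` are made up for illustration and are not part of the PR.

```python
import json

# "Actual" schema: the schema the Avro bytes were written with (hypothetical record).
actual_schema = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
})

# "Evolved" schema: compatible with the actual one, plus one column with a default.
evolved_schema = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int", "default": -1},
    ],
})


def decode_with_evolved_schema(df, column="value"):
    """Decode Avro bytes in `column`, yielding rows shaped like the evolved schema.

    The actual schema is still passed as the function parameter; the evolved
    schema goes in through the options map (assumed option name: avroSchema).
    """
    # Imported lazily so the schemas above can be inspected without Spark installed.
    # Running this also requires the external spark-avro module on the classpath.
    from pyspark.sql.avro.functions import from_avro

    return df.select(
        from_avro(df[column], actual_schema, {"avroSchema": evolved_schema}).alias("user")
    )
```

   With a DataFrame `df` holding Avro-encoded bytes in `value`, `decode_with_evolved_schema(df)` would be expected to produce a `user` struct that also carries the defaulted `age` column.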




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


