Posted to commits@hudi.apache.org by "huangxiaopingRD (via GitHub)" <gi...@apache.org> on 2023/03/06 02:27:05 UTC

[GitHub] [hudi] huangxiaopingRD commented on issue #8036: [SUPPORT] Migration of hudi tables encountered issue related to metadata column

huangxiaopingRD commented on issue #8036:
URL: https://github.com/apache/hudi/issues/8036#issuecomment-1455335723

   > good question.
   > 
   > Depending on which SQL tool you use, you can explore how to select all columns except a few. Then you can explicitly exclude the Hudi meta columns in your INSERT INTO statement.
   > 
   > For example, in Spark SQL you can do the following:
   > 
   > spark.sql("SET spark.sql.parser.quotedRegexColumnNames=true")
   > 
   > # select all columns except a, b
   > sql("select `(a|b)?+.+` from tmp").show()
   > # +---+---+
   > # | id|  c|
   > # +---+---+
   > # |  1|  4|
   > # +---+---+
   > 
   > Ref: https://stackoverflow.com/questions/63127263/how-to-select-all-columns-except-2-of-them-from-a-large-table-on-pyspark-sql
   > 
   > Hive: https://stackoverflow.com/questions/51227890/hive-how-to-select-all-but-one-column
   
   Thanks @nsivabalan, this is a better way. However, we want the fix to be transparent to users, so we finally decided to inject a custom Spark rule that handles this case at the system level.
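
For anyone who prefers the quoted regex approach over a rule injection, here is a minimal sketch of how it might be applied to Hudi's five standard meta columns. The helper names (`hoodie_exclusion_pattern`, `build_insert_sql`) and the table names are hypothetical and only for illustration. The sketch builds SQL strings only; it assumes `spark.sql.parser.quotedRegexColumnNames=true` has been set before the statement runs, and it is not the method we adopted.

```python
# Hudi's standard meta columns, which should not be copied during migration.
HOODIE_META_COLUMNS = (
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
)


def hoodie_exclusion_pattern(meta_columns=HOODIE_META_COLUMNS):
    """Build the backtick-quoted regex `(a|b)?+.+` from the quoted comment,
    with the given meta columns in place of a and b (hypothetical helper)."""
    return "`(" + "|".join(meta_columns) + ")?+.+`"


def build_insert_sql(source_table, target_table, meta_columns=HOODIE_META_COLUMNS):
    """Build an INSERT INTO ... SELECT statement that skips the meta columns.

    Table names are inserted as-is, so pass trusted identifiers only.
    """
    pattern = hoodie_exclusion_pattern(meta_columns)
    return f"INSERT INTO {target_table} SELECT {pattern} FROM {source_table}"


# Hypothetical table names, for illustration only.
statement = build_insert_sql("old_hudi_tbl", "new_hudi_tbl")
print(statement)

# With an active SparkSession named `spark`, this could then be run as:
#   spark.sql("SET spark.sql.parser.quotedRegexColumnNames=true")
#   spark.sql(statement)
```

Whether the regex column syntax is accepted in every INSERT INTO variant may depend on the Spark version, so it is worth checking the generated statement against a small test table first.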


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org