Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/19 15:47:12 UTC

[GitHub] [hudi] parisni opened a new issue, #5363: [SUPPORT] Hudi don't propagate column comments into hive metastore / parquet files

parisni opened a new issue, #5363:
URL: https://github.com/apache/hudi/issues/5363

   When a Spark schema carries a comment field in a column's metadata, the Spark writer propagates that comment into the metastore.
   
   Other metastore clients (Hive, Presto) can then describe the table and see the comments.
   
   It turns out Hudi does not support this: when such a comment is added to the schema, the resulting table does not get the comment.
   
   Digging through the source code, the schema comes either from the Hudi commit metadata (in Avro format) or from reading the latest Parquet file.
   However, the original comment is present in neither.
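
To make the gap concrete, here is a minimal sketch using only the Python standard library. It shows where a column comment lives in the JSON form of a Spark schema (under the field's metadata, key "comment") and how it could be carried into the "doc" attribute of an Avro record field. The helper name spark_json_to_avro, the record name, and the sample columns are hypothetical, and only a few primitive types are mapped; this is an illustration, not Hudi's actual conversion code.

```python
import json

# JSON form of a Spark schema; a column comment sits in metadata["comment"].
# The columns below are made up for illustration.
SPARK_SCHEMA_JSON = """
{
  "type": "struct",
  "fields": [
    {"name": "id", "type": "long", "nullable": false,
     "metadata": {"comment": "primary key"}},
    {"name": "email", "type": "string", "nullable": true,
     "metadata": {}}
  ]
}
"""

# Only a handful of primitive types are mapped; enough for the sketch.
SPARK_TO_AVRO = {
    "long": "long",
    "integer": "int",
    "string": "string",
    "double": "double",
    "boolean": "boolean",
}


def spark_json_to_avro(schema_json, record_name="example_record"):
    """Hypothetical helper: build an Avro record schema as a dict,
    copying each Spark column comment into the Avro field's 'doc'."""
    spark_schema = json.loads(schema_json)
    avro_fields = []
    for field in spark_schema["fields"]:
        avro_type = SPARK_TO_AVRO[field["type"]]
        # Nullable Spark columns become a union with "null" in Avro.
        if field.get("nullable", True):
            avro_type = ["null", avro_type]
        avro_field = {"name": field["name"], "type": avro_type}
        comment = field.get("metadata", {}).get("comment")
        if comment:
            avro_field["doc"] = comment
        avro_fields.append(avro_field)
    return {"type": "record", "name": record_name, "fields": avro_fields}


avro_schema = spark_json_to_avro(SPARK_SCHEMA_JSON)
print(json.dumps(avro_schema, indent=2))
```

If the comment were preserved this way in the Avro schema, a later sync step would at least have something to read; whether that matches how Hudi should do it is an open question here.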


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] parisni commented on issue #5363: [SUPPORT] Hudi don't propagate column comments into hive metastore / parquet files

Posted by GitBox <gi...@apache.org>.
parisni commented on issue #5363:
URL: https://github.com/apache/hudi/issues/5363#issuecomment-1112522324

   Indeed, this is exactly what I am looking for!
   Thanks.
   
   On Tue, 2022-04-26 at 19:33 -0700, Sivabalan Narayanan wrote:
   > would this work for you https://github.com/apache/hudi/pull/4960 or
   > are you looking for something else?
   > 
   > 
   




[GitHub] [hudi] yihua closed issue #5363: [SUPPORT] Hudi don't propagate column comments into hive metastore / parquet files

Posted by GitBox <gi...@apache.org>.
yihua closed issue #5363: [SUPPORT] Hudi don't propagate column comments into hive metastore / parquet files
URL: https://github.com/apache/hudi/issues/5363




[GitHub] [hudi] yihua commented on issue #5363: [SUPPORT] Hudi don't propagate column comments into hive metastore / parquet files

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5363:
URL: https://github.com/apache/hudi/issues/5363#issuecomment-1113663521

   @parisni Glad to know that.  Closing this issue.  Let us know if you have additional questions.




[GitHub] [hudi] nsivabalan commented on issue #5363: [SUPPORT] Hudi don't propagate column comments into hive metastore / parquet files

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5363:
URL: https://github.com/apache/hudi/issues/5363#issuecomment-1110467234

   would this work for you https://github.com/apache/hudi/pull/4960 or are you looking for something else?
   




[GitHub] [hudi] codope commented on issue #5363: [SUPPORT] Hudi don't propagate column comments into hive metastore / parquet files

Posted by GitBox <gi...@apache.org>.
codope commented on issue #5363:
URL: https://github.com/apache/hudi/issues/5363#issuecomment-1103903971

   @parisni This is a known issue. We did not see a strong use case for adding comments. May I know your use case? Perhaps we can take it up in a future release.




[GitHub] [hudi] parisni commented on issue #5363: [SUPPORT] Hudi don't propagate column comments into hive metastore / parquet files

Posted by GitBox <gi...@apache.org>.
parisni commented on issue #5363:
URL: https://github.com/apache/hudi/issues/5363#issuecomment-1103945800

   Our use case is to improve the quality of our lakehouse. Hudi tables are often accessible to end users (they make it possible to apply GDPR processing), and column/table comments are a neat way to improve the quality of data analysts' work and the user experience. Also, our upstream data sources sometimes do have comments (Parquet metadata / regular Hive metastore comments), and when the data is transformed into Hudi, that information is lost.

