Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/25 13:05:15 UTC

[GitHub] [hudi] cdmikechen opened a new issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro

cdmikechen opened a new issue #2034:
URL: https://github.com/apache/hudi/issues/2034


   **Describe the problem you faced**
   
   When DeltaStreamer ingests Kafka Avro data into Hudi, `DateType` values are not converted to the correct date (for example `2020-08-24`). Every `DateType` column shows `1970-01-01` instead.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   I use Debezium to capture several MySQL tables into Kafka and then use DeltaStreamer to write them to Hudi. When I checked the columns, every date-typed column showed `1970-01-01`.
   In `org.apache.hudi.AvroConversionHelper`, Hudi uses the following code to convert an int to a date:
   ```scala
   case (DateType, INT) =>
     (item: AnyRef) =>
       if (item == null) {
         null
       } else {
         if (item.isInstanceOf[Integer]) {
           new Date(item.asInstanceOf[Integer].longValue())
         } else {
           new Date(item.asInstanceOf[Long])
         }
       }
   ```
   
   I wrote some code to test this:
   ```java
   System.out.println(new java.sql.Date(18498));
   System.out.println(org.apache.spark.sql.catalyst.util.DateTimeUtils.toJavaDate(18498));
   ```
   result:
   ```
   1970-01-01
   2020-08-24
   ```
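   
   For illustration, here is a minimal Java sketch of a day-based conversion that mirrors the null/Integer/Long handling of the Scala branch above. `EpochDayConverter` and `toSqlDate` are hypothetical names, not Hudi code, and this is not necessarily how a fix should be implemented. It assumes the incoming value counts days since `1970-01-01`.
   
   ```java
   import java.sql.Date;
   import java.time.LocalDate;
   
   // Hypothetical helper for illustration only; not part of Hudi.
   class EpochDayConverter {
   
       // Assumption: `item` is an Integer or Long holding days since 1970-01-01.
       static Date toSqlDate(Object item) {
           if (item == null) {
               return null;
           }
           if (!(item instanceof Number)) {
               throw new IllegalArgumentException("Expected a numeric day count, got " + item.getClass());
           }
           long days = ((Number) item).longValue();
           // LocalDate.ofEpochDay interprets the value as a day count, not as milliseconds.
           return Date.valueOf(LocalDate.ofEpochDay(days));
       }
   
       public static void main(String[] args) {
           System.out.println(toSqlDate(18498));  // 2020-08-24
           System.out.println(toSqlDate(18498L)); // 2020-08-24
       }
   }
   ```
   
   `Date.valueOf(LocalDate)` builds the date from calendar fields, so the printed value does not depend on the JVM's default time zone.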
   
   
   **Environment Description**
   
   * Hudi version : 0.6.0
   
   * Spark version : 2.4.4
   
   * Hive version : 2.3.3
   
   * Hadoop version : 2.8.5
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   I think this is a bug, but I'm not sure whether anyone else has run into it and can confirm that it is a general problem.
   If it really is a bug, I think we should open a PR to fix it.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] cdmikechen commented on issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro

Posted by GitBox <gi...@apache.org>.
cdmikechen commented on issue #2034:
URL: https://github.com/apache/hudi/issues/2034#issuecomment-680364028


   @bvaradar 
   Yes, of course. I will work on it soon.
   I also tested a case in Hive and found that Hive appears to read the date type as an int as well and display it as `yyyy-mm-dd`, computed as a day offset from `1970-01-01`.
    





[GitHub] [hudi] bvaradar commented on issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2034:
URL: https://github.com/apache/hudi/issues/2034#issuecomment-682062239


   Thanks. Closing this issue as it is tracked in Jira





[GitHub] [hudi] bvaradar commented on issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2034:
URL: https://github.com/apache/hudi/issues/2034#issuecomment-680230551


   Looking at the constructor of `java.sql.Date`:
   
   `Date(long date)`: Constructs a Date object using the given milliseconds time value.
   It expects the time value in milliseconds.
   
   But according to the Debezium documentation and the Avro specification, an INT with the DATE logical type represents the number of days since the epoch.
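   
   The unit mismatch can be checked with a few lines of Java. This is only a sketch: `EpochUnitsDemo`, `readAsMillis`, and `readAsDays` are hypothetical names, and UTC is used so the result does not depend on the local time zone.
   
   ```java
   import java.time.Instant;
   import java.time.LocalDate;
   import java.time.ZoneOffset;
   import java.util.concurrent.TimeUnit;
   
   // Hypothetical demo class; it only illustrates days vs. milliseconds since the epoch.
   class EpochUnitsDemo {
   
       // Interprets the value as milliseconds since the epoch (what Date(long) expects).
       static LocalDate readAsMillis(long value) {
           return Instant.ofEpochMilli(value).atZone(ZoneOffset.UTC).toLocalDate();
       }
   
       // Interprets the value as days since the epoch (the Avro `date` logical type).
       static LocalDate readAsDays(long value) {
           return LocalDate.ofEpochDay(value);
       }
   
       public static void main(String[] args) {
           long raw = 18498L;
           System.out.println(readAsMillis(raw)); // 1970-01-01 (about 18.5 seconds after the epoch)
           System.out.println(readAsDays(raw));   // 2020-08-24
   
           long millis = TimeUnit.DAYS.toMillis(raw);
           System.out.println(millis);               // 1598227200000
           System.out.println(readAsMillis(millis)); // 2020-08-24
       }
   }
   ```
   
   Read as milliseconds, 18498 falls within the first minute of `1970-01-01`, which matches the value reported in the issue.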
   
   Filed a Jira : https://issues.apache.org/jira/browse/HUDI-1225
   
   cc @shenh062326  
   
   @cdmikechen: Can you send a PR to fix it?
   





[GitHub] [hudi] bvaradar closed issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro

Posted by GitBox <gi...@apache.org>.
bvaradar closed issue #2034:
URL: https://github.com/apache/hudi/issues/2034


   

