Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/25 13:05:15 UTC
[GitHub] [hudi] cdmikechen opened a new issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro
cdmikechen opened a new issue #2034:
URL: https://github.com/apache/hudi/issues/2034
**Describe the problem you faced**
When DeltaStreamer ingests Kafka Avro data into Hudi, DateType values are not converted to the correct date (for example `2020-08-24`). A DateType column always shows `1970-01-01`.
**To Reproduce**
Steps to reproduce the behavior:
I use Debezium to capture data from some MySQL tables into Kafka, and then use DeltaStreamer to save it in Hudi. I checked the columns and found that every date-typed column always shows `1970-01-01`.
In `org.apache.hudi.AvroConversionHelper`, Hudi uses this code to cast an int to a date:
```scala
case (DateType, INT) =>
  (item: AnyRef) =>
    if (item == null) {
      null
    } else {
      if (item.isInstanceOf[Integer]) {
        new Date(item.asInstanceOf[Integer].longValue())
      } else {
        new Date(item.asInstanceOf[Long])
      }
    }
```
I wrote some code to test this:
```java
System.out.println(new java.sql.Date(18498));
System.out.println(org.apache.spark.sql.catalyst.util.DateTimeUtils.toJavaDate(18498));
```
result:
```
1970-01-01
2020-08-24
```
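To illustrate why the two calls above disagree, here is a minimal JDK-only sketch that reads the same raw value, `18498`, once as milliseconds since the epoch and once as days since the epoch. The class and method names are hypothetical and exist only for this illustration; they are not part of Hudi or Spark.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical demo class, not part of Hudi or Spark.
class EpochUnitsDemo {

    // Interprets the raw value as milliseconds since 1970-01-01T00:00:00Z
    // and returns the calendar date in UTC.
    static LocalDate asEpochMillis(long raw) {
        return Instant.ofEpochMilli(raw).atZone(ZoneOffset.UTC).toLocalDate();
    }

    // Interprets the raw value as a count of days since 1970-01-01.
    static LocalDate asEpochDays(long raw) {
        return LocalDate.ofEpochDay(raw);
    }

    public static void main(String[] args) {
        long raw = 18498L;
        // 18498 ms is about 18.5 seconds after the epoch, so still 1970-01-01.
        System.out.println(asEpochMillis(raw)); // prints 1970-01-01
        // 18498 days after the epoch is the expected date.
        System.out.println(asEpochDays(raw));   // prints 2020-08-24
    }
}
```

The value itself is the same in both cases; only the assumed unit differs.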
**Environment Description**
* Hudi version : 0.6.0
* Spark version : 2.4.4
* Hive version : 2.3.3
* Hadoop version : 2.8.5
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : no
**Additional context**
I think this is a bug, but I'm not sure whether anyone else has run into it and can confirm that it happens everywhere.
If this really is a bug, I think we should open a PR to fix it.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] cdmikechen commented on issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro
Posted by GitBox <gi...@apache.org>.
cdmikechen commented on issue #2034:
URL: https://github.com/apache/hudi/issues/2034#issuecomment-680364028
@bvaradar
Yes, of course. I will work on it soon.
I've also tested a case in Hive and found that Hive may likewise parse the date type as an int and display it as `yyyy-mm-dd`, computed as a day offset from `1970-01-01`.
[GitHub] [hudi] bvaradar commented on issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro
Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2034:
URL: https://github.com/apache/hudi/issues/2034#issuecomment-682062239
Thanks. Closing this issue as it is tracked in Jira
[GitHub] [hudi] bvaradar commented on issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro
Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2034:
URL: https://github.com/apache/hudi/issues/2034#issuecomment-680230551
Looking at the constructor of `java.sql.Date`:
`Date(long date)`: Constructs a Date object using the given milliseconds time value.
It expects the time value in milliseconds.
But according to the Debezium and Avro specification pages, an INT with the DATE logical type represents the number of days since the epoch.
Filed a Jira: https://issues.apache.org/jira/browse/HUDI-1225
cc @shenh062326
@cdmikechen: Can you send a PR to fix it?
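A minimal sketch of what a days-based conversion could look like in plain Java, assuming the incoming value is an Integer or Long count of days since the epoch. This is not the actual fix tracked in HUDI-1225; the class `AvroDateSketch` and the method `toSqlDate` are hypothetical names used only for illustration.

```java
import java.sql.Date;
import java.time.LocalDate;

// Hypothetical helper for illustration only; not Hudi's AvroConversionHelper.
class AvroDateSketch {

    // Converts a boxed day count (Integer or Long) into a java.sql.Date,
    // treating the number as days since 1970-01-01 rather than milliseconds.
    static Date toSqlDate(Object item) {
        if (item == null) {
            return null;
        }
        long epochDay = ((Number) item).longValue();
        return Date.valueOf(LocalDate.ofEpochDay(epochDay));
    }

    public static void main(String[] args) {
        System.out.println(toSqlDate(18498));  // prints 2020-08-24
        System.out.println(toSqlDate(18498L)); // prints 2020-08-24
        System.out.println(toSqlDate(null));   // prints null
    }
}
```

Going through `LocalDate` keeps the result a pure calendar date, so the printed value does not depend on the JVM time zone.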
[GitHub] [hudi] bvaradar closed issue #2034: [SUPPORT] DateType can't be transformed to right data by kafka avro
Posted by GitBox <gi...@apache.org>.
bvaradar closed issue #2034:
URL: https://github.com/apache/hudi/issues/2034