Posted to issues@beam.apache.org by "chie hayashida (Jira)" <ji...@apache.org> on 2020/09/20 04:11:00 UTC
[jira] [Updated] (BEAM-10934) handling Date type in HCatToRow
[ https://issues.apache.org/jira/browse/BEAM-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chie hayashida updated BEAM-10934:
----------------------------------
Summary: handling Date type in HCatToRow (was: handling Date type when convert another class to Row class)
> handling Date type in HCatToRow
> -------------------------------
>
> Key: BEAM-10934
> URL: https://issues.apache.org/jira/browse/BEAM-10934
> Project: Beam
> Issue Type: Improvement
> Components: io-java-hcatalog, sdk-java-core
> Reporter: chie hayashida
> Priority: P2
>
> When I convert an HCatRecord that includes a Date-type field to a Row, it fails with the following error.
> * the code
> ```
> PCollection<Row> p =
>     pipeline
>         /*
>          * Step #1: Read hive table rows from Hive.
>          */
>         .apply(
>             "Read from Hive source",
>             HCatToRow.fromSpec(
>                 HCatalogIO.read()
>                     .withConfigProperties(configProperties)
>                     .withDatabase(options.getHiveDatabaseName())
>                     .withTable(options.getHiveTableName())
>                     .withFilter(options.getFilterString())));
> ```
> * error log
> ```
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IllegalArgumentException: For field name submissiondate and DATETIME type got unexpected class class java.sql.Date
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:348)
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:318)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:213)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
> at com.google.cloud.teleport.v2.templates.HiveToBigQuery.run(HiveToBigQuery.java:234)
> at com.google.cloud.teleport.v2.templates.HiveToBigQuery.main(HiveToBigQuery.java:176)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: For field name submissiondate and DATETIME type got unexpected class class java.sql.Date
> at org.apache.beam.sdk.values.Row$Builder.verifyDateTime(Row.java:828)
> at org.apache.beam.sdk.values.Row$Builder.verifyPrimitiveType(Row.java:755)
> at org.apache.beam.sdk.values.Row$Builder.verify(Row.java:654)
> at org.apache.beam.sdk.values.Row$Builder.verify(Row.java:635)
> at org.apache.beam.sdk.values.Row$Builder.build(Row.java:840)
> at org.apache.beam.sdk.io.hcatalog.HCatToRow$HCatToRowFn.processElement(HCatToRow.java:84)
> ```
> This occurs because HCatalogIO reads Date-type values as java.sql.Date in HCatRecord, but the Row class does not accept java.sql.Date for a DATETIME field, and HCatToRow does not convert it.
> I see two possible solutions.
> 1. Make the Row type support Date types (java.util.Date or java.sql.Date).
> I am not familiar enough with the other IO classes, but some of them may have the same problem, and this solution could fix those as well.
> 2. Add logic to HCatToRow that converts Date values to Datetime values.
> The impact of this change would be smaller than that of option 1, because it doesn't change the Row class.
> Which would be better?
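For illustration, option 2 could look roughly like the following JDK-only sketch. It is not the actual HCatToRow code: the class name `DateNormalizer` and both method names are hypothetical, and mapping a calendar date to UTC midnight is an assumption about the desired semantics. In Beam, the resulting epoch millis would presumably be wrapped in a Joda-Time instant (for example `new org.joda.time.Instant(millis)`) before being handed to the Row builder.

```java
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam: replaces java.sql.Date values
// with a timestamp representation before a Row is built.
public class DateNormalizer {

  // Epoch millis at UTC midnight of the given calendar date.
  // java.sql.Date.toInstant() throws UnsupportedOperationException,
  // so the conversion goes through LocalDate instead.
  static long toUtcMidnightMillis(java.sql.Date date) {
    return date.toLocalDate()
        .atStartOfDay(ZoneOffset.UTC)
        .toInstant()
        .toEpochMilli();
  }

  // Returns a copy of the record values in which every java.sql.Date
  // is replaced by its epoch millis; all other values pass through unchanged.
  static List<Object> normalize(List<Object> values) {
    List<Object> out = new ArrayList<>(values.size());
    for (Object value : values) {
      if (value instanceof java.sql.Date) {
        out.add(toUtcMidnightMillis((java.sql.Date) value));
      } else {
        out.add(value);
      }
    }
    return out;
  }
}
```

Going through `toLocalDate()` rather than `getTime()` keeps the result independent of the JVM's default time zone, which matters because a DATE column carries no zone of its own.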