Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2022/04/03 17:28:00 UTC
[jira] [Commented] (BEAM-10934) handling Date type in HCatToRow
[ https://issues.apache.org/jira/browse/BEAM-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516544#comment-17516544 ]
Beam JIRA Bot commented on BEAM-10934:
--------------------------------------
This issue was marked "stale-P2" and has not received a public comment in 14 days. It is now automatically moved to P3. If you are still affected by it, you can comment and move it back to P2.
> handling Date type in HCatToRow
> -------------------------------
>
> Key: BEAM-10934
> URL: https://issues.apache.org/jira/browse/BEAM-10934
> Project: Beam
> Issue Type: Bug
> Components: io-java-hcatalog, sdk-java-core
> Reporter: chie hayashida
> Priority: P3
> Labels: Clarified, starter
>
> When I convert an HCatRecord that contains a Date type field to a Row, it fails with the following error.
> * the code
> ```
> PCollection<Row> p =
>     pipeline
>         /*
>          * Step #1: Read hive table rows from Hive.
>          */
>         .apply(
>             "Read from Hive source",
>             HCatToRow.fromSpec(
>                 HCatalogIO.read()
>                     .withConfigProperties(configProperties)
>                     .withDatabase(options.getHiveDatabaseName())
>                     .withTable(options.getHiveTableName())
>                     .withFilter(options.getFilterString())));
> ```
> * error log
> ```
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IllegalArgumentException: For field name submissiondate and DATETIME type got unexpected class class java.sql.Date
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:348)
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:318)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:213)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
> at com.google.cloud.teleport.v2.templates.HiveToBigQuery.run(HiveToBigQuery.java:234)
> at com.google.cloud.teleport.v2.templates.HiveToBigQuery.main(HiveToBigQuery.java:176)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalArgumentException: For field name submissiondate and DATETIME type got unexpected class class java.sql.Date
> at org.apache.beam.sdk.values.Row$Builder.verifyDateTime(Row.java:828)
> at org.apache.beam.sdk.values.Row$Builder.verifyPrimitiveType(Row.java:755)
> at org.apache.beam.sdk.values.Row$Builder.verify(Row.java:654)
> at org.apache.beam.sdk.values.Row$Builder.verify(Row.java:635)
> at org.apache.beam.sdk.values.Row$Builder.build(Row.java:840)
> at org.apache.beam.sdk.io.hcatalog.HCatToRow$HCatToRowFn.processElement(HCatToRow.java:84)
> ```
> This happens because HCatalogIO reads Date columns as java.sql.Date values in the HCatRecord, but the Row class doesn't support Date and HCatToRow doesn't convert it.
> I think there are two possible solutions.
> 1. Make Row support a Date type (java.util.Date or java.sql.Date)
> I don't know the other IO classes well, but some of them may have the same problem, and this solution might fix those as well.
> 2. Add logic to HCatToRow that converts Date values to the Datetime type
> The impact of this change would be smaller than that of option 1 because it doesn't change the Row class.
> Which would be better?
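As a rough illustration of option 2, the conversion step could look something like the sketch below. It is not the actual HCatToRow code: the class and method names (`HCatDateNormalizer`, `normalize`, `normalizeAll`) are hypothetical, it uses only the JDK, and it assumes that a date should be interpreted as midnight UTC. It also assumes that Beam's DATETIME field expects a Joda-Time instant, so in Beam the resulting `java.time.Instant` would still need to be wrapped (for example via its epoch milliseconds).

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam: replaces java.sql.Date values
// with instants before the values are handed to a Row builder.
class HCatDateNormalizer {

    // Converts a single field value. Only java.sql.Date is touched;
    // every other value (including null) is returned unchanged.
    static Object normalize(Object value) {
        if (value instanceof java.sql.Date) {
            // java.sql.Date.toInstant() throws UnsupportedOperationException,
            // so go through LocalDate instead.
            LocalDate day = ((java.sql.Date) value).toLocalDate();
            // Assumption: treat the date as midnight UTC.
            return day.atStartOfDay(ZoneOffset.UTC).toInstant();
        }
        return value;
    }

    // Converts all field values of one record, preserving their order.
    static List<Object> normalizeAll(List<Object> values) {
        List<Object> result = new ArrayList<>(values.size());
        for (Object value : values) {
            result.add(normalize(value));
        }
        return result;
    }
}
```

In this sketch, `normalizeAll` would be applied to the values read from the HCatRecord before they are added to the Row. Whether midnight UTC is the right interpretation for Hive dates is a design decision the sketch does not settle.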
--
This message was sent by Atlassian Jira
(v8.20.1#820001)