Posted to commits@hudi.apache.org by "Alexander Filipchik (Jira)" <ji...@apache.org> on 2020/04/21 20:34:00 UTC
[jira] [Created] (HUDI-826) Spark to avro schema in 0.6 incompatible with 0.5 for fixed types
Alexander Filipchik created HUDI-826:
----------------------------------------
Summary: Spark to avro schema in 0.6 incompatible with 0.5 for fixed types
Key: HUDI-826
URL: https://issues.apache.org/jira/browse/HUDI-826
Project: Apache Hudi (incubating)
Issue Type: Bug
Reporter: Alexander Filipchik
Fix For: 0.6.0
Let's say we had some dataset created with SQL transformer using a query:
{code:java}
// select CAST(bla AS DECIMAL(20, 9)) bla
{code}
In 0.5, the Spark-to-Avro converter (Databricks) would generate something like:
{code:java}
// {
"name": "bla",
"type": [
"string",
"null"
]
},
{code}
In 0.6, the converter (Spark) generates:
{code:java}
// {
"name": "bla",
"type": [
{
"type": "fixed",
"name": "order_subtotal",
"namespace": "",
"size": 16,
"logicalType": "decimal",
"precision": 38,
"scale": 17
}, "null"
]
},
{code}
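For reference, the "size": 16 next to "precision": 38 in the 0.6 schema is consistent with picking the smallest number of bytes whose signed two's-complement range can hold any unscaled value of that precision. The sketch below assumes that rule; the class and method names are hypothetical and are not taken from Hudi or Spark.

```java
import java.math.BigInteger;

// Hypothetical helper: smallest fixed size (in bytes) for a decimal precision,
// assuming the unscaled value is stored as a signed two's-complement integer.
final class MinFixedSize {

    static int minBytesForPrecision(int precision) {
        // Largest magnitude to represent is just under 10^precision.
        BigInteger limit = BigInteger.TEN.pow(precision);
        int numBytes = 1;
        // A signed n-byte integer covers magnitudes below 2^(8n - 1).
        while (BigInteger.ONE.shiftLeft(8 * numBytes - 1).compareTo(limit) < 0) {
            numBytes++;
        }
        return numBytes;
    }

    public static void main(String[] args) {
        System.out.println(minBytesForPrecision(38)); // precision from the 0.6 schema above
        System.out.println(minBytesForPrecision(20)); // precision from the example query
    }
}
```

Under this assumption a DECIMAL with precision 38 needs 16 bytes, which matches the fixed size in the generated schema.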
The types are very different in that case. During the merge, the reader would fail with:
{code:java}
// at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:251)
at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132)
at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136)
at org.apache.hudi.utilities.TestCss.testParquetWithSchema(TestCss.java:270)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.avro.generic.GenericData.createFixed(GenericData.java:1168)
at org.apache.parquet.avro.AvroConverters$FieldFixedConverter.convert(AvroConverters.java:310
{code}
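The trace ends in System.arraycopy called from GenericData.createFixed, which suggests the reader tried to copy a full fixed-size value out of bytes that were written as a (shorter) string. The following is only a stdlib sketch of that failure mode, assuming the copy requests exactly the fixed size; it does not use Avro or Parquet, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of copying a stored value into a fixed-size buffer.
final class FixedCopySketch {

    // Copies exactly fixedSize bytes from source, as a fixed-type reader would expect to.
    static byte[] copyIntoFixed(byte[] source, int fixedSize) {
        byte[] fixed = new byte[fixedSize];
        System.arraycopy(source, 0, fixed, 0, fixedSize);
        return fixed;
    }

    // Returns true when the copy fails because the source is shorter than fixedSize.
    static boolean failsToCopy(byte[] source, int fixedSize) {
        try {
            copyIntoFixed(source, fixedSize);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass of this exception.
            return true;
        }
    }

    public static void main(String[] args) {
        // Value written under the 0.5 schema: a string, here only 4 bytes long.
        byte[] writtenAsString = "12.5".getBytes(StandardCharsets.UTF_8);
        System.out.println(failsToCopy(writtenAsString, 16));
        // Value written under the 0.6 schema: exactly 16 bytes.
        System.out.println(failsToCopy(new byte[16], 16));
    }
}
```

If this reading is right, any file written with the 0.5 string schema whose value is shorter than 16 bytes would hit the same exception when read with the 0.6 fixed schema.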