Posted to dev@parquet.apache.org by "Niels Basjes (JIRA)" <ji...@apache.org> on 2016/09/23 14:00:29 UTC

[jira] [Comment Edited] (PARQUET-725) Parquet AVRO tests fail

    [ https://issues.apache.org/jira/browse/PARQUET-725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15516512#comment-15516512 ] 

Niels Basjes edited comment on PARQUET-725 at 9/23/16 2:00 PM:
---------------------------------------------------------------

Found the root cause, which is fixed in a not-yet-released version of Avro:
AVRO-1799:  java: GenericData.toString() mutates underlying ByteBuffer backed data

This is also why the problem did not occur in my IDE (IntelliJ).
Under the hood, the debugger calls 'toString' on the record to show it on screen during debugging.
Because this was done on both records, the 'equals' a step later succeeded under the debugger, while in a normal run it fails.
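
The effect can be reproduced with plain java.nio, without Avro. The sketch below is only an illustration: mutatingToString is a hypothetical stand-in for a 'toString' that reads a ByteBuffer with relative get() calls and never rewinds it; it is not the actual GenericData code. It relies on the documented behaviour that ByteBuffer.equals compares only the remaining bytes.

```java
import java.nio.ByteBuffer;

public class ByteBufferPositionDemo {

    // Hypothetical stand-in for a toString() that consumes the buffer.
    // This is NOT Avro's implementation, only a model of the side effect.
    static String mutatingToString(ByteBuffer buffer) {
        StringBuilder sb = new StringBuilder();
        while (buffer.hasRemaining()) {
            sb.append(buffer.get()).append(' '); // relative get() advances position
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        ByteBuffer expected = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5});
        ByteBuffer actual = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5});

        // Both buffers have the same five remaining bytes.
        System.out.println(expected.equals(actual)); // true

        // Only one side gets rendered, as in a normal test run.
        mutatingToString(actual);
        System.out.println(actual);                  // java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]
        System.out.println(expected.equals(actual)); // false: 5 remaining vs 0 remaining

        // Both sides get rendered, as when a debugger displays both records.
        mutatingToString(expected);
        System.out.println(expected.equals(actual)); // true: both have 0 remaining
    }
}
```

A drained buffer also has nothing left to print, which would be consistent with the empty "bytes" values in the failing JSON comparisons.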


was (Author: nielsbasjes):
Found the probable root cause (fixed in a yet to be released version of AVRO): 
AVRO-1799:  java: GenericData.toString() mutates underlying ByteBuffer backed data

> Parquet AVRO tests fail
> -----------------------
>
>                 Key: PARQUET-725
>                 URL: https://issues.apache.org/jira/browse/PARQUET-725
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Niels Basjes
>
> I found that on my machine some of the tests in parquet-avro fail.
> {code}
> Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.073 sec
> Running org.apache.parquet.avro.TestAvroDataSupplier
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
> Running org.apache.parquet.avro.TestReadWrite
> Tests run: 18, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 0.414 sec <<< FAILURE!
> Running org.apache.parquet.avro.TestBackwardCompatibility
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.016 sec
> Running org.apache.parquet.avro.TestReadWriteOldListBehavior
> Tests run: 16, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 0.148 sec <<< FAILURE!
> Running org.apache.parquet.avro.TestInputOutputFormat
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.29 sec
> Running org.apache.parquet.avro.TestReflectLogicalTypes
> Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.165 sec
> Running org.apache.parquet.avro.TestCircularReferences
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0 sec
> Results :
> Failed tests:   testWriteReflectReadGeneric(org.apache.parquet.avro.TestReflectReadWrite): expected:<{"myboolean": true, "mybyte": 1, "myshort": 1, "myint": 1, "mylong": 2, "myfloat": 3.1, "mydouble": 4.1, "mybytes": {"bytes": "\u0001\u0002\u0003\u0004"}, "mystring": "Hello", "myenum": "A", "mymap": {"a": "1", "b": "2"}, "myshortarray": [1, 2], "myintarray": [1, 2], "mystringarray": ["a", "b"], "mylist": ["a", "b", "c"]}> but was:<{"myboolean": true, "mybyte": 1, "myshort": 1, "myint": 1, "mylong": 2, "myfloat": 3.1, "mydouble": 4.1, "mybytes": {"bytes": ""}, "mystring": "Hello", "myenum": "A", "mymap": {"a": "1", "b": "2"}, "myshortarray": [1, 2], "myintarray": [1, 2], "mystringarray": ["a", "b"], "mylist": ["a", "b", "c"]}>
>   testWriteDecimalBytes(org.apache.parquet.avro.TestGenericLogicalTypes): Should read BigDecimals as bytes expected:<[{"dec": {"bytes": "ò\u0096"}}, {"dec": {"bytes": "\u0000²àø"}}]> but was:<[{"dec": {"bytes": ""}}, {"dec": {"bytes": ""}}]>
>   testAll[0](org.apache.parquet.avro.TestReadWrite): expected:<java.nio.HeapByteBuffer[pos=0 lim=5 cap=5]> but was:<java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]>
>   testAllUsingDefaultAvroSchema[0](org.apache.parquet.avro.TestReadWrite): expected:<java.nio.HeapByteBuffer[pos=0 lim=5 cap=5]> but was:<java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]>
>   testAll[1](org.apache.parquet.avro.TestReadWrite): expected:<java.nio.HeapByteBuffer[pos=0 lim=5 cap=5]> but was:<java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]>
>   testAllUsingDefaultAvroSchema[1](org.apache.parquet.avro.TestReadWrite): expected:<java.nio.HeapByteBuffer[pos=0 lim=5 cap=5]> but was:<java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]>
>   testAll[0](org.apache.parquet.avro.TestReadWriteOldListBehavior): expected:<java.nio.HeapByteBuffer[pos=0 lim=5 cap=5]> but was:<java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]>
>   testAllUsingDefaultAvroSchema[0](org.apache.parquet.avro.TestReadWriteOldListBehavior): expected:<java.nio.HeapByteBuffer[pos=0 lim=5 cap=5]> but was:<java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]>
>   testAll[1](org.apache.parquet.avro.TestReadWriteOldListBehavior): expected:<java.nio.HeapByteBuffer[pos=0 lim=5 cap=5]> but was:<java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]>
>   testAllUsingDefaultAvroSchema[1](org.apache.parquet.avro.TestReadWriteOldListBehavior): expected:<java.nio.HeapByteBuffer[pos=0 lim=5 cap=5]> but was:<java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]>
> {code}
> I see two classes of problems:
> # The JSON rendering of byte arrays differs.
> # Some tests compare the 'toString' of a ByteBuffer. For two ByteBuffers that both contain the SAME bytes, these tests fail simply because the position field of the ByteBuffer differs. I think these should compare the contents of the ByteBuffer instead.
> {code}
> <java.nio.HeapByteBuffer[pos=0 lim=5 cap=5]> but was:<java.nio.HeapByteBuffer[pos=5 lim=5 cap=5]>
> {code}
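
A position-independent comparison, as suggested in the issue description above, could look roughly like the following sketch. The class and method names are hypothetical and not taken from the Parquet test suite, and the sketch assumes the bytes of interest span index 0 up to the buffer's limit. It reads through a duplicate() so the buffers under test are not modified.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class ByteBufferContentCompare {

    // Copies the bytes from index 0 to limit without touching the
    // position of the original buffer (hypothetical helper).
    static byte[] contents(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // shares bytes, has its own position
        view.rewind();                        // position = 0, limit unchanged
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }

    // True when both buffers hold the same bytes, whatever their positions.
    static boolean sameContents(ByteBuffer a, ByteBuffer b) {
        return Arrays.equals(contents(a), contents(b));
    }

    public static void main(String[] args) {
        ByteBuffer expected = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5});
        ByteBuffer actual = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5});
        actual.position(actual.limit()); // simulate a buffer that was already read

        System.out.println(expected.equals(actual));        // false
        System.out.println(sameContents(expected, actual)); // true
    }
}
```

In a JUnit test the two byte arrays returned by contents() could be passed to assertArrayEquals instead of comparing the 'toString' output.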


