Posted to dev@parquet.apache.org by "Aaron Blake Niskode-Dossett (Jira)" <ji...@apache.org> on 2020/09/29 15:56:00 UTC

[jira] [Created] (PARQUET-1917) [parquet-proto] default values are stored in oneOf fields that aren't set

Aaron Blake Niskode-Dossett created PARQUET-1917:
----------------------------------------------------

             Summary: [parquet-proto] default values are stored in oneOf fields that aren't set
                 Key: PARQUET-1917
                 URL: https://issues.apache.org/jira/browse/PARQUET-1917
             Project: Parquet
          Issue Type: Bug
    Affects Versions: 1.12.0
            Reporter: Aaron Blake Niskode-Dossett


SCHEMA
--------
{noformat}
message Person {
  int32 foo = 1;
  oneof optional_bar {
    int32 bar_int = 200;
    int32 bar_int2 = 201;
    string bar_string = 300;
  }
}{noformat}
 
CODE
--------
I set values for foo and bar_string only:
 
{noformat}
for (int i = 0; i < 3; i += 1) {
    com.etsy.grpcparquet.Person message = Person.newBuilder()
            .setFoo(i)
            .setBarString("hello world")
            .build();
    message.writeDelimitedTo(out);
}{noformat}


I then write the protobuf file out to Parquet.
 
RESULT
-----------
{noformat}
$ parquet-tools show example.parquet
+-------+-----------+------------+--------------+
|   foo |   bar_int |   bar_int2 | bar_string   |
|-------+-----------+------------+--------------|
|     0 |         0 |          0 | hello world  |
|     1 |         0 |          0 | hello world  |
|     2 |         0 |          0 | hello world  |
+-------+-----------+------------+--------------+{noformat}
 
bar_int and bar_int2 should be empty for all three rows, since only bar_string is set in the oneof. 0 is the default value for int32, but a default value should not be stored for a oneof member that was never set.
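
To make the expected result concrete, here is a minimal sketch in plain Java with no protobuf or Parquet dependency. It is not parquet-proto's code. The Person class, the BarCase enum and the toRow helper are all hypothetical stand-ins, loosely modeled on the case enum that protobuf generates for a oneof. The point is only that a writer should emit a value for the oneof member that is actually set and leave the others empty.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OneofRowSketch {
    // Hypothetical stand-in for the case enum protobuf generates for a oneof.
    enum BarCase { BAR_INT, BAR_INT2, BAR_STRING, NOT_SET }

    // Hypothetical plain-Java model of Person: foo plus at most one oneof member.
    static final class Person {
        final int foo;
        final BarCase barCase;
        final Object barValue;

        Person(int foo, BarCase barCase, Object barValue) {
            this.foo = foo;
            this.barCase = barCase;
            this.barValue = barValue;
        }
    }

    // Build one output row. A oneof member gets a value only when it is the
    // member that was set; every other member stays null instead of 0 or "".
    static Map<String, Object> toRow(Person p) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("foo", p.foo);
        row.put("bar_int", p.barCase == BarCase.BAR_INT ? p.barValue : null);
        row.put("bar_int2", p.barCase == BarCase.BAR_INT2 ? p.barValue : null);
        row.put("bar_string", p.barCase == BarCase.BAR_STRING ? p.barValue : null);
        return row;
    }

    public static void main(String[] args) {
        // Same loop as in the report: only foo and bar_string are set.
        for (int i = 0; i < 3; i += 1) {
            System.out.println(toRow(new Person(i, BarCase.BAR_STRING, "hello world")));
        }
    }
}
```

In this sketch bar_int and bar_int2 come out as null for every row, which is what I would expect to see in the Parquet file instead of 0.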


