Posted to issues@arrow.apache.org by "Arthur Passos (Jira)" <ji...@apache.org> on 2022/08/18 12:25:00 UTC

[jira] [Created] (ARROW-17459) [C++] Support nested data conversions for chunked array

Arthur Passos created ARROW-17459:
-------------------------------------

             Summary: [C++] Support nested data conversions for chunked array
                 Key: ARROW-17459
                 URL: https://issues.apache.org/jira/browse/ARROW-17459
             Project: Apache Arrow
          Issue Type: New Feature
          Components: C++
            Reporter: Arthur Passos


`FileReaderImpl::ReadRowGroup` fails with the error "Nested data conversions not implemented for chunked array outputs". The error is raised in [ChunksToSingle](https://github.com/apache/arrow/blob/7f6b074b84b1ca519b7c5fc7da318e8d47d44278/cpp/src/parquet/arrow/reader.cc#L95).
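
To make the failure mode concrete, here is a minimal, self-contained model of the check, assuming the linked function behaves roughly as follows: zero chunks yield an empty array, one chunk is passed through, and anything more is rejected. `Chunk`, `ChunkedColumn`, `ModelResult` and `ChunksToSingleModel` are hypothetical stand-ins invented for this sketch, not Arrow types or APIs; only the error string is taken from the report above.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for one contiguous array (NOT arrow::Array).
struct Chunk {
  std::vector<std::string> values;
};

// Hypothetical stand-in for a column split into chunks (NOT arrow::ChunkedArray).
struct ChunkedColumn {
  std::vector<std::shared_ptr<Chunk>> chunks;
};

// Hypothetical stand-in for a status-plus-value return (NOT arrow::Result).
struct ModelResult {
  bool ok;
  std::string message;
  std::shared_ptr<Chunk> value;
};

// Assumed behavior of the check: a nested conversion needs exactly one
// contiguous child array, so a column with more than one chunk is rejected.
ModelResult ChunksToSingleModel(const ChunkedColumn& column) {
  switch (column.chunks.size()) {
    case 0:
      // No data at all: hand back an empty array.
      return {true, "", std::make_shared<Chunk>()};
    case 1:
      // Exactly one chunk: pass it through unchanged.
      return {true, "", column.chunks[0]};
    default:
      // Two or more chunks: this is the path that produces the reported error.
      return {false,
              "Nested data conversions not implemented for chunked array outputs",
              nullptr};
  }
}
```

In this model, the request in this issue amounts to replacing the `default` branch with logic that can handle more than one chunk. What that logic should look like in the real reader is the open question here.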

The schema of the affected column, followed by the sizes of its leaf columns, is:

{code:java}
  optional group fields_map (MAP) = 217 {
    repeated group key_value {
      required binary key (STRING) = 218;
      optional binary value (STRING) = 219;
    }
  }
fields_map.key_value.value-> Size In Bytes: 13243589 Size In Ratio: 0.20541047
fields_map.key_value.key-> Size In Bytes: 3008860 Size In Ratio: 0.046667963
{code}
 

Is there a way to work around this issue in the C++ library?

In any case, I am willing to implement this, but I need some guidance. I am very new to Parquet (I started reading about it yesterday).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)