Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/20 06:29:51 UTC

[GitHub] [arrow] VisBeo opened a new issue, #13201: [C++] How to define and write "Map" data (nested type)?

VisBeo opened a new issue, #13201:
URL: https://github.com/apache/arrow/issues/13201

   This is not an issue regarding the parquet lib itself but maybe with the accompanying documentation/examples:
   
   Based on the "StreamWriter" example  [stream_reader_writer](https://github.com/apache/arrow/blob/master/cpp/examples/parquet/parquet_stream_api/stream_reader_writer.cc), I'm trying to add another column to the parquet schema which I want to write "Map-Data" (`Map<Integer, String>`) to.
   
   According to [this source](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#maps), Maps are supported by the C++ implementation. However, it's not clear to me how to ...
   
   - correctly add/define another column of type "Map" to the parquet schema. According to the previous link, Maps are "Logical Types", so my best try so far is
   
   `fields.push_back(parquet::schema::PrimitiveNode::Make("Map", parquet::Repetition::REQUIRED, parquet::LogicalType::Map(), parquet::Type::INT64));`
   This compiles without error but I'm uncertain about the last parameter "primitive_type" (parquet::Type::INT64).
   
   - how to define and write actual Map values. Again, the second link provides a general description of a parquet Map:
   
   ```
   // Map<String, Integer>
   required group my_map (MAP) {
     repeated group key_value {
       required binary key (UTF8);
       optional int32 value;
     }
   }
   ```
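
   The layout above puts the MAP annotation on a group node, not on a primitive node. Below is a minimal sketch of the same three-level structure for `Map<Integer, String>`, assuming the `GroupNode::Make` and `PrimitiveNode::Make` overloads declared in `parquet/schema.h`. The function name `MakeSchemaWithMap` and the choice of an INT64 key are illustrative assumptions. The sketch only defines the schema and does not write any values.

   ```cpp
   #include <parquet/schema.h>

   #include <memory>

   // Hypothetical helper: builds a root schema with one MAP column,
   // mirroring "required group my_map (MAP) { repeated group key_value {...} }".
   std::shared_ptr<parquet::schema::GroupNode> MakeSchemaWithMap() {
     using parquet::LogicalType;
     using parquet::Repetition;
     using parquet::Type;
     using parquet::schema::GroupNode;
     using parquet::schema::NodePtr;
     using parquet::schema::NodeVector;
     using parquet::schema::PrimitiveNode;

     // Leaf nodes: a required integer key and an optional string value.
     NodePtr key = PrimitiveNode::Make("key", Repetition::REQUIRED, Type::INT64);
     NodePtr value = PrimitiveNode::Make("value", Repetition::OPTIONAL,
                                         LogicalType::String(), Type::BYTE_ARRAY);

     // Middle level: the repeated key_value group holding one map entry.
     NodeVector entry_fields = {key, value};
     NodePtr key_value =
         GroupNode::Make("key_value", Repetition::REPEATED, entry_fields);

     // Outer level: the group that carries the MAP logical type.
     NodeVector map_fields = {key_value};
     NodePtr my_map = GroupNode::Make("my_map", Repetition::REQUIRED, map_fields,
                                      LogicalType::Map());

     NodeVector root_fields = {my_map};
     return std::static_pointer_cast<GroupNode>(
         GroupNode::Make("schema", Repetition::REQUIRED, root_fields));
   }
   ```

   The resulting schema has two leaf columns, `my_map.key_value.key` and `my_map.key_value.value`, which is why a single primitive node cannot represent the map.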
   The [github repo](https://github.com/apache/arrow/tree/master/cpp) is missing a C++ example (or test case) on how to use a Map/Logical Types. 
   I'm looking for advice or a C++ example on how to define and use a Map together with the StreamWriter API.
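
   For writing actual Map values at the column level, Parquet stores a map as two leaf columns (key and value), each with repetition and definition levels. Below is a minimal, library-free sketch of that shredding for the `Map<Integer, String>` case. It assumes the three-level layout quoted above with a required int64 key and an optional string value. All type and function names are hypothetical and not part of the Parquet API.

   ```cpp
   #include <cstdint>
   #include <map>
   #include <optional>
   #include <string>
   #include <vector>

   // One leaf column after shredding: the non-null values plus one
   // (repetition, definition) level pair per slot.
   template <typename T>
   struct LeafColumn {
     std::vector<T> values;
     std::vector<int16_t> rep_levels;
     std::vector<int16_t> def_levels;
   };

   struct ShreddedMap {
     LeafColumn<int64_t> key;        // my_map.key_value.key   (max def level 1)
     LeafColumn<std::string> value;  // my_map.key_value.value (max def level 2)
   };

   // One row of the map column; a missing optional models a null value.
   using Row = std::map<int64_t, std::optional<std::string>>;

   ShreddedMap ShredMapColumn(const std::vector<Row>& rows) {
     ShreddedMap out;
     for (const Row& row : rows) {
       if (row.empty()) {
         // Empty map: one placeholder slot per leaf with definition level 0.
         out.key.rep_levels.push_back(0);
         out.key.def_levels.push_back(0);
         out.value.rep_levels.push_back(0);
         out.value.def_levels.push_back(0);
         continue;
       }
       bool first = true;
       for (const auto& [k, v] : row) {
         // Repetition level 0 starts a new row, 1 continues the same map.
         const int16_t rep = first ? 0 : 1;
         first = false;

         out.key.values.push_back(k);
         out.key.rep_levels.push_back(rep);
         out.key.def_levels.push_back(1);  // key_value entry is present

         out.value.rep_levels.push_back(rep);
         if (v.has_value()) {
           out.value.values.push_back(*v);
           out.value.def_levels.push_back(2);  // entry and value present
         } else {
           out.value.def_levels.push_back(1);  // entry present, value null
         }
       }
     }
     return out;
   }
   ```

   For the three rows `{1: "one", 2: null}`, `{}` and `{7: "seven"}`, both leaves get repetition levels 0, 1, 0, 0. The key column gets definition levels 1, 1, 0, 1 and the value column gets 2, 1, 0, 2.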


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] wjones127 commented on issue #13201: [C++] How to define and write "Map" data (nested type)?

Posted by GitBox <gi...@apache.org>.
wjones127 commented on issue #13201:
URL: https://github.com/apache/arrow/issues/13201#issuecomment-1133030881

   Looking at the header file and the tests, I don't believe the StreamWriter has any support for nested parquet types (structs, lists, and maps), so I don't think this is currently possible.
   
   Maps and other nested types are supported for writing, just not through that API. The Arrow writer supports arbitrary nested types, and there's likely a lower-level API for doing that too.
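
   The Arrow writer route could look roughly like the sketch below. It builds a one-column table of type map<int32, string> with `arrow::MapBuilder` and writes it with `parquet::arrow::WriteTable`. It assumes an Arrow C++ build with Parquet enabled. The function name `WriteMapColumn`, the column name `my_map` and the sample values are illustrative only, and the exact signatures may differ between Arrow versions.

   ```cpp
   #include <arrow/api.h>
   #include <arrow/io/api.h>
   #include <parquet/arrow/writer.h>

   #include <memory>
   #include <string>
   #include <vector>

   // Hypothetical helper: writes a single map<int32, string> column to `path`.
   arrow::Status WriteMapColumn(const std::string& path) {
     arrow::MemoryPool* pool = arrow::default_memory_pool();

     // The map builder owns the offsets; keys and items go to child builders.
     auto key_builder = std::make_shared<arrow::Int32Builder>(pool);
     auto item_builder = std::make_shared<arrow::StringBuilder>(pool);
     arrow::MapBuilder map_builder(pool, key_builder, item_builder);

     // Row 0: {1: "one", 2: "two"}
     ARROW_RETURN_NOT_OK(map_builder.Append());
     ARROW_RETURN_NOT_OK(key_builder->Append(1));
     ARROW_RETURN_NOT_OK(item_builder->Append("one"));
     ARROW_RETURN_NOT_OK(key_builder->Append(2));
     ARROW_RETURN_NOT_OK(item_builder->Append("two"));

     // Row 1: empty map (Append() with no key/item pairs).
     ARROW_RETURN_NOT_OK(map_builder.Append());

     // Row 2: {3: null}
     ARROW_RETURN_NOT_OK(map_builder.Append());
     ARROW_RETURN_NOT_OK(key_builder->Append(3));
     ARROW_RETURN_NOT_OK(item_builder->AppendNull());

     std::shared_ptr<arrow::Array> map_array;
     ARROW_RETURN_NOT_OK(map_builder.Finish(&map_array));

     // Reuse the array's own type so the schema and the data agree.
     auto schema = arrow::schema({arrow::field("my_map", map_array->type())});
     std::vector<std::shared_ptr<arrow::Array>> columns = {map_array};
     std::shared_ptr<arrow::Table> table = arrow::Table::Make(schema, columns);

     ARROW_ASSIGN_OR_RAISE(auto outfile, arrow::io::FileOutputStream::Open(path));
     ARROW_RETURN_NOT_OK(
         parquet::arrow::WriteTable(*table, pool, outfile, /*chunk_size=*/1024));
     return outfile->Close();
   }
   ```

   A caller would check the returned status, for example `arrow::Status st = WriteMapColumn("map_example.parquet");` followed by `st.ok()`.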




[GitHub] [arrow] VisBeo closed issue #13201: [C++] How to define and write "Map" data (nested type)?

Posted by GitBox <gi...@apache.org>.
VisBeo closed issue #13201: [C++]  How to define and write "Map" data (nested type)?
URL: https://github.com/apache/arrow/issues/13201




[GitHub] [arrow] VisBeo commented on issue #13201: [C++] How to define and write "Map" data (nested type)?

Posted by GitBox <gi...@apache.org>.
VisBeo commented on issue #13201:
URL: https://github.com/apache/arrow/issues/13201#issuecomment-1134661827

   Thanks for the timely feedback, I think that answers my question.

