Posted to dev@orc.apache.org by Ian Joiner <ia...@gmail.com> on 2021/12/09 13:31:49 UTC

[C++] CompressionSize vs CompressionBlockSize & StripeSize

Hi,

As the Arrow developer who wrote the Arrow->ORC adapter in C++/Python and is now working on https://github.com/apache/arrow/pull/9702, I would like to ask the ORC community two questions:

1. For a given ORC file, is the value returned by orc::Reader::getCompressionSize() (in orc/c++/include/orc/Reader.hh) always the same as the orc::WriterOptions::getCompressionBlockSize() (in orc/c++/include/orc/Writer.hh) the file was written with? I tested this by changing CompressionBlockSize, writing an ORC file, and then reading it back. The CompressionSize I read back was identical to the CompressionBlockSize I had set.

2. What is the unit of StripeSize? Is it bytes, KB, MB, or something else?

Thanks,
Ian Joiner