Posted to user@hive.apache.org by qihua wu <wu...@gmail.com> on 2013/11/09 15:34:43 UTC

ORC default stripe size of 250M, before or after compression

If the 250M limit applies to the size before compression, then the stripes
stored on disk will not be uniform in size after compression, which doesn't
look good. But if it applies to the size after compression, how does Hive
know when a stripe has reached 250M compressed? Does Hive compress some
data, check whether it has reached 250M, and if not, add more data and
compress again, repeating until it reaches 250M? That doesn't look
cost-effective. Could anyone help me understand?
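
To make the two readings concrete, here is a rough Python sketch of what I
mean. It is not Hive's or ORC's actual code: zlib just stands in for
whatever codec ORC uses, the limit is scaled down from 250M to 250 bytes,
and the function names are made up for illustration.

```python
import zlib

STRIPE_LIMIT = 250  # bytes here, standing in for the 250M default


def stripes_by_raw_size(rows, limit=STRIPE_LIMIT):
    """Reading 1: cut a stripe once the UNCOMPRESSED buffer reaches the limit.

    Returns the compressed (on-disk) size of each stripe, which may vary
    with how compressible the data is.
    """
    sizes = []
    buf = bytearray()
    for row in rows:
        buf += row
        if len(buf) >= limit:
            sizes.append(len(zlib.compress(bytes(buf))))
            buf.clear()
    if buf:  # flush the final, possibly short, stripe
        sizes.append(len(zlib.compress(bytes(buf))))
    return sizes


def stripes_by_compressed_size(rows, limit=STRIPE_LIMIT):
    """Reading 2: recompress the whole buffer after every row and cut a
    stripe once the COMPRESSED size reaches the limit.

    This is the "compress, check, add more, compress again" loop, and it
    redoes the compression work on each iteration.
    """
    sizes = []
    buf = bytearray()
    for row in rows:
        buf += row
        compressed_len = len(zlib.compress(bytes(buf)))
        if compressed_len >= limit:
            sizes.append(compressed_len)
            buf.clear()
    if buf:  # flush the final, possibly short, stripe
        sizes.append(len(zlib.compress(bytes(buf))))
    return sizes


if __name__ == "__main__":
    rows = [b"a" * 50] * 20  # 1000 bytes of very compressible data
    print(stripes_by_raw_size(rows))
    print(stripes_by_compressed_size(rows))
```

With the first reading the on-disk stripe sizes depend on the data; with the
second, every row added means compressing the buffer again, which is the
cost I am worried about.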