Posted to dev@parquet.apache.org by Ravi Tatapudi <ra...@in.ibm.com> on 2016/06/15 14:50:56 UTC
Parquet API writing a lot of log messages, impacting performance benchmarks
Hello,
While performance testing Parquet writes, I see the following issue:
when writing a 50 GB Parquet file, around 140 MB of log messages
(sample shown below) are written by the Parquet API:
=======================
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.InternalParquetRecordWriter:
Flushing mem columnStore to file. allocated memory: 116,400
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 1,312B for [L_ORDERKEY] INT32: 319 values, 1,276B raw, 1,276B
comp, 1 pages, encodings: [BIT_PACKED, PLAIN]
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 1,312B for [L_PARTKEY] INT32: 319 values, 1,276B raw, 1,276B comp,
1 pages, encodings: [BIT_PACKED, PLAIN]
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 1,312B for [L_SUPPKEY] INT32: 319 values, 1,276B raw, 1,276B comp,
1 pages, encodings: [BIT_PACKED, PLAIN]
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 1,312B for [L_LINENUMBER] INT32: 319 values, 1,276B raw, 1,276B
comp, 1 pages, encodings: [BIT_PACKED, PLAIN]
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 318B for [L_QUANTITY] FLOAT: 319 values, 282B raw, 282B comp, 1
pages, encodings: [BIT_PACKED, PLAIN_DICTIONARY], dic { 80 entries, 320B
raw, 80B comp}
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 318B for [L_EXTENDEDPRICE] FLOAT: 319 values, 282B raw, 282B comp,
1 pages, encodings: [BIT_PACKED, PLAIN_DICTIONARY], dic { 80 entries, 320B
raw, 80B comp}
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 318B for [L_DISCOUNT] FLOAT: 319 values, 282B raw, 282B comp, 1
pages, encodings: [BIT_PACKED, PLAIN_DICTIONARY], dic { 80 entries, 320B
raw, 80B comp}
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 318B for [L_TAX] FLOAT: 319 values, 282B raw, 282B comp, 1 pages,
encodings: [BIT_PACKED, PLAIN_DICTIONARY], dic { 80 entries, 320B raw, 80B
comp}
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 2,267B for [L_RETURNFLAG] BINARY: 319 values, 2,233B raw, 2,233B
comp, 1 pages, encodings: [BIT_PACKED, PLAIN]
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 2,267B for [L_LINESTATUS] BINARY: 319 values, 2,233B raw, 2,233B
comp, 1 pages, encodings: [BIT_PACKED, PLAIN]
Jun 7, 2016 1:09:51 AM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
written 9,010B for [L_SHIPDATE] BINARY: 319 values, 8,932B raw, 8,932B
comp, 1 pages, encodings: [BIT_PACKED, PLAIN]
........................................
........................................
=======================
Is there any way to suppress these messages? Please let me know.
Thanks,
Ravi
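
For reference, a minimal sketch of one possible way to quiet this output. It
assumes (this is inferred only from the message format above, not confirmed)
that the messages go through java.util.logging under loggers whose names
start with "parquet". The class name QuietParquetLogs and the method quiet()
are hypothetical names used for illustration. If the library configures its
own logger level or handler during class loading, the call may need to run
after that point, or a logging.properties file may be needed instead.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietParquetLogs {
    // Keep a strong reference: java.util.logging holds loggers weakly, so a
    // level set on an otherwise unreferenced logger can be lost after GC.
    // The name "parquet" is an assumption taken from the log output above.
    private static final Logger PARQUET_LOGGER = Logger.getLogger("parquet");

    /** Raises the threshold so that INFO (and lower) messages are dropped. */
    public static Logger quiet() {
        PARQUET_LOGGER.setLevel(Level.WARNING);
        return PARQUET_LOGGER;
    }

    public static void main(String[] args) {
        quiet();
        // Child loggers with no level of their own inherit the parent's level.
        Logger child = Logger.getLogger("parquet.hadoop.ColumnChunkPageWriteStore");
        System.out.println("INFO enabled: " + child.isLoggable(Level.INFO));
        System.out.println("WARNING enabled: " + child.isLoggable(Level.WARNING));
    }
}
```

Calling QuietParquetLogs.quiet() once, before the first write, would be the
intended usage under these assumptions.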