Posted to dev@parquet.apache.org by "Thomas Friedrich (JIRA)" <ji...@apache.org> on 2015/07/01 22:26:05 UTC
[jira] [Created] (PARQUET-324) row count incorrect if data file has more than 2^31 rows
Thomas Friedrich created PARQUET-324:
----------------------------------------
Summary: row count incorrect if data file has more than 2^31 rows
Key: PARQUET-324
URL: https://issues.apache.org/jira/browse/PARQUET-324
Project: Parquet
Issue Type: Bug
Components: parquet-mr
Affects Versions: 1.7.0, 1.8.0
Reporter: Thomas Friedrich
Priority: Minor
If a Parquet file has more than 2^31 rows, the row count written into the file metadata is incorrect.
The cause is that numRows in ParquetMetadataConverter.toParquetMetadata is declared as an int rather than a long, so the running total overflows once it exceeds Integer.MAX_VALUE (2^31 - 1):
    int numRows = 0;
    for (BlockMetaData block : blocks) {
      numRows += block.getRowCount();
      addRowGroup(parquetMetadata, rowGroups, block);
    }
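
To illustrate the overflow in isolation, here is a minimal, self-contained sketch. It does not use the Parquet classes: the class name RowCountOverflowDemo and its two helper methods are hypothetical, and the per-block row counts are made-up values standing in for what block.getRowCount() would return, assuming that method returns a long. In Java, the compound assignment `+=` applies an implicit narrowing cast to the type of the left-hand side, which would explain why the loop above compiles without a warning yet produces a wrong total.

```java
import java.util.Arrays;
import java.util.List;

public class RowCountOverflowDemo {

    // Mirrors the reported pattern: an int accumulator receiving long values.
    // "numRows += count" is treated as "numRows = (int) (numRows + count)",
    // so the total silently wraps once it exceeds Integer.MAX_VALUE.
    static int sumAsInt(List<Long> rowCounts) {
        int numRows = 0;
        for (long count : rowCounts) {
            numRows += count;
        }
        return numRows;
    }

    // Possible fix: accumulate in a long so no narrowing takes place.
    static long sumAsLong(List<Long> rowCounts) {
        long numRows = 0;
        for (long count : rowCounts) {
            numRows += count;
        }
        return numRows;
    }

    public static void main(String[] args) {
        // Two hypothetical row groups of 1.5 billion rows each (3 billion total).
        List<Long> rowCounts = Arrays.asList(1_500_000_000L, 1_500_000_000L);

        System.out.println("int total:  " + sumAsInt(rowCounts));  // -1294967296
        System.out.println("long total: " + sumAsLong(rowCounts)); // 3000000000
    }
}
```

If the same reasoning applies to toParquetMetadata, declaring numRows as a long there should be enough to keep the accumulated count correct, provided the value is not narrowed again before it is written to the metadata.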