Posted to issues-all@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2020/06/05 16:51:00 UTC
[jira] [Created] (IMPALA-9831) TestScannersFuzzing::test_fuzz_alltypes() hits DCHECK in parquet-page-reader.cc
Joe McDonnell created IMPALA-9831:
-------------------------------------
Summary: TestScannersFuzzing::test_fuzz_alltypes() hits DCHECK in parquet-page-reader.cc
Key: IMPALA-9831
URL: https://issues.apache.org/jira/browse/IMPALA-9831
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 4.0
Reporter: Joe McDonnell
In a recent precommit job, an Impalad crashed with the following DCHECK:
{noformat}
F0604 01:18:36.921769 30923 parquet-page-reader.cc:67] b64df3da7eea7c65:16f9c6e800000001] Check failed: col_end < file_desc.file_length (6820 vs. 6820)
{noformat}
The assert is checking that the end of a column is before the end of the file. This must be true, because the footer takes up space at the end of the file.
The code for this DCHECK is:
{noformat}
int64_t col_end = col_start + col_len;
// Already validated in ValidateColumnOffsets()
DCHECK_GT(col_end, 0);
DCHECK_LT(col_end, file_desc.file_length); // <--- this DCHECK fails
{noformat}
The comment says that this was already validated in ParquetMetadataUtils::ValidateColumnOffsets(). That is where the problem is:
{noformat}
int64_t col_len = col_chunk.meta_data.total_compressed_size;
int64_t col_end = col_start + col_len;
if (col_end <= 0 || col_end > file_length) {
  return Status(Substitute("Parquet file '$0': metadata is corrupt. Column $1 has "
      "invalid column offsets (offset=$2, size=$3, file_size=$4).", filename, i,
      col_start, col_len, file_length));
}
{noformat}
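The second clause of the condition should be "col_end >= file_length". As written, the validation accepts col_end == file_length (6820 vs. 6820 in the log above), which the DCHECK_LT in parquet-page-reader.cc then rejects.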
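To make the off-by-one concrete, here is a minimal standalone sketch. It is not Impala code: the three function names are hypothetical, and each one only mirrors a condition quoted in this report (the current validation, the proposed validation, and the invariant that the page reader DCHECKs).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the condition currently used in
// ValidateColumnOffsets(), as quoted in this report.
// Returns true when the column range is accepted.
bool AcceptedByCurrentCheck(int64_t col_start, int64_t col_len, int64_t file_length) {
  int64_t col_end = col_start + col_len;
  return !(col_end <= 0 || col_end > file_length);
}

// The same check with the proposed condition (col_end >= file_length).
bool AcceptedByProposedCheck(int64_t col_start, int64_t col_len, int64_t file_length) {
  int64_t col_end = col_start + col_len;
  return !(col_end <= 0 || col_end >= file_length);
}

// The invariant asserted by the two DCHECKs in parquet-page-reader.cc:
// 0 < col_end < file_length.
bool SatisfiesPageReaderInvariant(int64_t col_start, int64_t col_len, int64_t file_length) {
  int64_t col_end = col_start + col_len;
  return col_end > 0 && col_end < file_length;
}
```

With file_length = 6820 and any col_start/col_len pair that sums to 6820 (the split is not shown in the log, so for example 4 and 6816), the current check accepts the column while the page reader invariant does not hold. The proposed check rejects it, so a corrupt file would produce an error Status instead of reaching the DCHECK.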
If we knew the size of the Parquet footer at this point, the check could be stricter still, since no column chunk can extend into the footer.
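A stricter variant might look like the sketch below. It is only an illustration under stated assumptions: the helper name and the footer_metadata_len parameter are hypothetical, the file is assumed to have a standard unencrypted footer, and integer overflow of col_start + col_len is not handled. The one format fact it relies on is that a Parquet file ends with the serialized file metadata, followed by a 4-byte metadata length and the 4-byte magic "PAR1".

```cpp
#include <cassert>
#include <cstdint>

// A Parquet file ends with: <file metadata> <4-byte metadata length> <"PAR1">.
// The last two fields together occupy 8 bytes.
constexpr int64_t kFooterTrailerSize = 8;

// Hypothetical helper (not part of Impala): returns true if the column chunk
// ends at or before the first byte of the footer metadata.
// footer_metadata_len is assumed to be the metadata length read from the
// file's trailing 8 bytes.
bool ColumnEndsBeforeFooter(int64_t col_start, int64_t col_len,
    int64_t file_length, int64_t footer_metadata_len) {
  int64_t col_end = col_start + col_len;
  int64_t footer_start = file_length - kFooterTrailerSize - footer_metadata_len;
  return col_end > 0 && col_end <= footer_start;
}
```

For a 6820-byte file with a made-up footer metadata length of 500 bytes, the footer starts at offset 6312. A column ending at 6312 would pass, while one ending at 6313 or at 6820 would be rejected.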