Posted to issues-all@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2020/06/05 16:51:00 UTC

[jira] [Created] (IMPALA-9831) TestScannersFuzzing::test_fuzz_alltypes() hits DCHECK in parquet-page-reader.cc

Joe McDonnell created IMPALA-9831:
-------------------------------------

             Summary: TestScannersFuzzing::test_fuzz_alltypes() hits DCHECK in parquet-page-reader.cc
                 Key: IMPALA-9831
                 URL: https://issues.apache.org/jira/browse/IMPALA-9831
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 4.0
            Reporter: Joe McDonnell


In a recent precommit job, an Impalad crashed with the following DCHECK:
{noformat}
F0604 01:18:36.921769 30923 parquet-page-reader.cc:67] b64df3da7eea7c65:16f9c6e800000001] Check failed: col_end < file_desc.file_length (6820 vs. 6820)
{noformat}
The DCHECK asserts that a column ends strictly before the end of the file. This must hold, because the Parquet footer takes up space at the end of the file. Here col_end equals the file length (6820 vs. 6820), so the check fails.

The code for this DCHECK is:
{noformat}
  int64_t col_end = col_start + col_len;
  // Already validated in ValidateColumnOffsets()
  DCHECK_GT(col_end, 0);
  DCHECK_LT(col_end, file_desc.file_length);  // <--- this DCHECK fails
{noformat}
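The comment says this condition was already validated in ParquetMetadataUtils::ValidateColumnOffsets(). That is where the problem is: the validation rejects only col_end > file_length, so a column with col_end == file_length passes validation and later hits the DCHECK: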
{noformat}
int64_t col_len = col_chunk.meta_data.total_compressed_size;
int64_t col_end = col_start + col_len;
if (col_end <= 0 || col_end > file_length) {
  return Status(Substitute("Parquet file '$0': metadata is corrupt. Column $1 has "
      "invalid column offsets (offset=$2, size=$3, file_size=$4).", filename, i,
      col_start, col_len, file_length));
}
{noformat}
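The condition should be "col_end >= file_length", so that a column ending exactly at the end of the file is reported as corrupt metadata instead of reaching the DCHECK.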
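
To illustrate the boundary case, here is a minimal standalone sketch rather than the actual Impala code. The helper names ColumnEndIsValidOld and ColumnEndIsValid are hypothetical, and the col_start/col_len split used in the note below is made up; only the sum of 6820 comes from the log above.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the current validation: a column is rejected
// only when col_end <= 0 or col_end > file_length.
bool ColumnEndIsValidOld(int64_t col_start, int64_t col_len, int64_t file_length) {
  int64_t col_end = col_start + col_len;
  return !(col_end <= 0 || col_end > file_length);
}

// Hypothetical helper with the proposed condition: col_end == file_length is
// rejected as well, because the footer occupies the tail of the file.
bool ColumnEndIsValid(int64_t col_start, int64_t col_len, int64_t file_length) {
  int64_t col_end = col_start + col_len;
  return !(col_end <= 0 || col_end >= file_length);
}
```

With col_start = 4, col_len = 6816 and file_length = 6820, the old check accepts the column while the proposed check rejects it, which matches the "6820 vs. 6820" failure.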

If we knew the size of the parquet footer, this check could be stricter as well.
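
One possible shape for such a stricter check is sketched below. It assumes the standard Parquet layout, in which the file ends with the serialized file metadata, a 4-byte metadata length, and the 4-byte "PAR1" magic. The names kParquetTrailerSize, ColumnEndFitsBeforeFooter and metadata_len are hypothetical and are not taken from the Impala source.

```cpp
#include <cstdint>

// Fixed trailer at the end of a Parquet file:
// 4-byte metadata length + 4-byte "PAR1" magic.
constexpr int64_t kParquetTrailerSize = 8;

// Hypothetical helper: metadata_len is assumed to be the serialized file
// metadata length read from the trailer. Returns true when the column ends
// at or before the first byte of the footer.
bool ColumnEndFitsBeforeFooter(int64_t col_end, int64_t file_length,
                               int64_t metadata_len) {
  // A metadata length that does not fit in the file is itself corrupt.
  if (metadata_len < 0 || metadata_len > file_length - kParquetTrailerSize) {
    return false;
  }
  int64_t footer_start = file_length - kParquetTrailerSize - metadata_len;
  return col_end > 0 && col_end <= footer_start;
}
```

Under this assumption, a column with col_end == file_length is always rejected, since the footer is at least 8 bytes long.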



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
