Posted to issues@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2020/06/17 15:05:00 UTC
[jira] [Resolved] (IMPALA-9831) TestScannersFuzzing::test_fuzz_alltypes() hits DCHECK in parquet-page-reader.cc
[ https://issues.apache.org/jira/browse/IMPALA-9831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joe McDonnell resolved IMPALA-9831.
-----------------------------------
Fix Version/s: Impala 4.0
Target Version: Impala 4.0
Resolution: Fixed
> TestScannersFuzzing::test_fuzz_alltypes() hits DCHECK in parquet-page-reader.cc
> -------------------------------------------------------------------------------
>
> Key: IMPALA-9831
> URL: https://issues.apache.org/jira/browse/IMPALA-9831
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Critical
> Labels: flaky
> Fix For: Impala 4.0
>
>
> In a recent precommit job, an Impalad crashed with the following DCHECK:
> {noformat}
> F0604 01:18:36.921769 30923 parquet-page-reader.cc:67] b64df3da7eea7c65:16f9c6e800000001] Check failed: col_end < file_desc.file_length (6820 vs. 6820)
> {noformat}
> The assert checks that the end of a column is strictly before the end of the file. This must hold, because the footer takes up space at the end of the file.
> The code for this DCHECK is:
> {noformat}
> int64_t col_end = col_start + col_len;
> // Already validated in ValidateColumnOffsets()
> DCHECK_GT(col_end, 0);
> DCHECK_LT(col_end, file_desc.file_length); // <--------- this check failed
> {noformat}
> The comment says this was already validated in ParquetMetadataUtils::ValidateColumnOffsets(). That is where the problem is:
> {noformat}
> int64_t col_len = col_chunk.meta_data.total_compressed_size;
> int64_t col_end = col_start + col_len;
> if (col_end <= 0 || col_end > file_length) {
>   return Status(Substitute("Parquet file '$0': metadata is corrupt. Column $1 has "
>       "invalid column offsets (offset=$2, size=$3, file_size=$4).", filename, i,
>       col_start, col_len, file_length));
> }
> {noformat}
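> The validation only rejects "col_end > file_length", so it accepts col_end == file_length (6820 vs. 6820 in the log above), which the DCHECK then rejects. The condition should be "col_end >= file_length".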
> If we knew the size of the parquet footer, this check could be stricter as well.
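
The off-by-one described above can be sketched in isolation. This is a minimal standalone illustration, not Impala's actual code: the helper names ColumnEndIsValid and ColumnEndIsValidWithFooter and the footer_size parameter are hypothetical, and the real fix belongs in ParquetMetadataUtils::ValidateColumnOffsets(), which returns a Status rather than a bool.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Impala). Returns true when a column chunk
// that starts at col_start and spans col_len bytes ends strictly before the
// end of a file of file_length bytes.
bool ColumnEndIsValid(int64_t col_start, int64_t col_len, int64_t file_length) {
  int64_t col_end = col_start + col_len;
  // Reject col_end == file_length too: the footer must come after the column
  // data, so a column can never end exactly at the end of the file.
  return col_end > 0 && col_end < file_length;
}

// Hypothetical stricter variant, assuming the footer size is known by the
// time the offsets are validated: the column must end at or before the
// point where the footer begins.
bool ColumnEndIsValidWithFooter(int64_t col_start, int64_t col_len,
                                int64_t file_length, int64_t footer_size) {
  int64_t col_end = col_start + col_len;
  return col_end > 0 && col_end <= file_length - footer_size;
}
```

With the values from the log, any col_start and col_len that sum to 6820 (for example the illustrative split 4 and 6816) are rejected by ColumnEndIsValid for a 6820-byte file, so the corrupt metadata would be reported by validation instead of reaching the DCHECK.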
--
This message was sent by Atlassian Jira
(v8.3.4#803005)