Posted to issues@impala.apache.org by "Lars Volker (JIRA)" <ji...@apache.org> on 2017/05/06 18:56:04 UTC

[jira] [Created] (IMPALA-5287) Add a test for skip.header.line.count on compressed files

Lars Volker created IMPALA-5287:
-----------------------------------

             Summary: Add a test for skip.header.line.count on compressed files
                 Key: IMPALA-5287
                 URL: https://issues.apache.org/jira/browse/IMPALA-5287
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 2.9.0
            Reporter: Lars Volker
            Assignee: Lars Volker
            Priority: Critical


Before the fix for IMPALA-3905 was merged, the HDFS text scanner initialized the decompressor only after finding the first row. This was wrong, but it was not an issue for normal compressed tables: for those we issue only a single scan range, and can therefore skip searching for the first newline character.

However, this broke skipping header lines at the beginning of compressed files. We should add a test for skip.header.line.count on compressed files to prevent a regression in the future.
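
As a rough illustration of the behavior such a test would pin down, the following standalone Python sketch builds a gzip-compressed text file in memory and checks that header lines are dropped only after decompression. It does not use Impala or its test framework; the helper names make_gzip_text and scan_rows are hypothetical and exist only for this example.

```python
import gzip
import io


def make_gzip_text(lines):
    """Return gzip-compressed bytes for the given newline-terminated lines.

    Hypothetical helper for this sketch, not part of Impala.
    """
    raw = "".join(line + "\n" for line in lines).encode("utf-8")
    return gzip.compress(raw)


def scan_rows(compressed, skip_header_line_count):
    """Decompress first, then skip the header lines and return the rows.

    Hypothetical stand-in for a text scanner. The order matters: header
    lines can only be counted in the decompressed byte stream.
    """
    with gzip.open(io.BytesIO(compressed), mode="rt", encoding="utf-8") as f:
        rows = [line.rstrip("\n") for line in f]
    return rows[skip_header_line_count:]


# Two header lines followed by two data rows.
data = make_gzip_text(["id,name", "generated by export job", "1,a", "2,b"])

# Mirrors a table with 'skip.header.line.count'='2'.
rows = scan_rows(data, 2)
assert rows == ["1,a", "2,b"]
```

An actual regression test would instead load a compressed file with header lines into a table that sets skip.header.line.count and compare query results against the uncompressed equivalent.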


