Posted to dev@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2018/08/06 23:33:00 UTC

[jira] [Created] (HIVE-20327) Compactor should gracefully handle 0 length files and invalid orc files

Eugene Koifman created HIVE-20327:
-------------------------------------

             Summary: Compactor should gracefully handle 0 length files and invalid orc files
                 Key: HIVE-20327
                 URL: https://issues.apache.org/jira/browse/HIVE-20327
             Project: Hive
          Issue Type: Improvement
          Components: Transactions
    Affects Versions: 2.0.0
            Reporter: Eugene Koifman
            Assignee: Eugene Koifman


Older versions of the Streaming API did not handle interrupts well and could leave behind 0-length ORC files, which cannot be read.

The compactor should simply skip these files.

Other cases where an ORC Reader cannot be created for a file:
1. regular write (1 txn delta) where the client died and didn't properly close the file - this delta should be aborted and never read
2. streaming ingest write (delta_x_y, x < y). If the file was not closed properly, there should always be a side file (though it may still indicate that the length is 0)


If we check these cases and still can't create a reader, the compactor should not silently skip the file: the system thinks the file contains at least some committed data, but the file is corrupted (and the side file doesn't point at a valid footer). We should never be in this situation, so the compactor should throw so that the end user can try manual intervention (where the only option may be deleting the file).
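
The decision order described above could be sketched roughly as follows. This is only an illustration, not the actual compactor code: CompactorFileCheck, FileFacts, Action and all field names are hypothetical, and the facts (file length, whether a reader could be created, whether the writing transaction aborted, what the side file records) are assumed to have been gathered elsewhere. Treating a side file length of 0 as "nothing committed, skip" is likewise an assumption.

```java
import java.util.OptionalLong;

/** Hypothetical sketch of the proposed handling; none of these names are Hive APIs. */
final class CompactorFileCheck {

    enum Action { READ, SKIP, READ_TO_SIDE_FILE_LENGTH }

    /** Facts about one delta file, assumed to be collected by the caller. */
    static final class FileFacts {
        final long length;                         // physical file length in bytes
        final boolean readerOk;                    // an ORC reader could be created
        final boolean writerTxnAborted;            // 1-txn delta whose writer died
        final OptionalLong sideFileLength;         // committed length from the side file, if one exists
        final boolean footerValidAtSideFileLength; // side file points at a valid footer

        FileFacts(long length, boolean readerOk, boolean writerTxnAborted,
                  OptionalLong sideFileLength, boolean footerValidAtSideFileLength) {
            this.length = length;
            this.readerOk = readerOk;
            this.writerTxnAborted = writerTxnAborted;
            this.sideFileLength = sideFileLength;
            this.footerValidAtSideFileLength = footerValidAtSideFileLength;
        }
    }

    static Action decide(FileFacts f) {
        // 0-length files left behind by older Streaming API versions: skip.
        if (f.length == 0) {
            return Action.SKIP;
        }
        if (f.readerOk) {
            return Action.READ;
        }
        // Case 1: regular write where the client died; the delta is aborted and never read.
        if (f.writerTxnAborted) {
            return Action.SKIP;
        }
        // Case 2: streaming ingest write; rely on the side file if there is one.
        if (f.sideFileLength.isPresent()) {
            if (f.sideFileLength.getAsLong() == 0) {
                return Action.SKIP; // assumption: nothing was committed
            }
            if (f.footerValidAtSideFileLength) {
                return Action.READ_TO_SIDE_FILE_LENGTH;
            }
        }
        // Committed data is expected but unreadable: fail loudly instead of skipping.
        throw new IllegalStateException("Corrupted ORC file; manual intervention required");
    }
}
```

The point of the final throw is that only files known to hold no committed data are skipped; anything else that cannot be read is surfaced to the user.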


