Posted to dev@hive.apache.org by "Ramkumar Vadali (JIRA)" <ji...@apache.org> on 2011/08/24 21:10:29 UTC

[jira] [Updated] (HIVE-2404) Allow RCFile Reader to tolerate corruptions

     [ https://issues.apache.org/jira/browse/HIVE-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated HIVE-2404:
----------------------------------

    Attachment: toleratecorruptions.patch

This patch adds a configuration option, hive.io.rcfile.tolerate.corruptions. If the option is set to true:
 * Lazy decompression is disabled.
 * Unexpected errors are treated as corruptions, and the reader indicates that there is no more data.

This allows the read to terminate gracefully when there are corruptions.

The default value of hive.io.rcfile.tolerate.corruptions is false.

Tested this by using rcfilecat on a file with a corrupt block of data.
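
For readers who want a picture of the behaviour described above, here is a minimal, self-contained sketch of the "treat unexpected errors as end of data" policy. It is not the code from toleratecorruptions.patch and does not use the RCFile API: BlockSource, TolerantReader, sourceOf and readAll are hypothetical names made up for illustration, and the "blocks" are plain strings.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these types exist in Hive.
class TolerantReaderSketch {

    /** Stand-in for a block-oriented reader; next() returns null at end of data. */
    interface BlockSource {
        String next() throws IOException;
    }

    /** Wraps a source and applies the tolerate-corruptions policy. */
    static final class TolerantReader {
        private final BlockSource source;
        private final boolean tolerateCorruptions; // mirrors the option; false by default in the patch
        private boolean exhausted = false;

        TolerantReader(BlockSource source, boolean tolerateCorruptions) {
            this.source = source;
            this.tolerateCorruptions = tolerateCorruptions;
        }

        /** Returns the next block, or null when there is no more data. */
        String next() throws IOException {
            if (exhausted) {
                return null;
            }
            try {
                String block = source.next();
                if (block == null) {
                    exhausted = true;
                }
                return block;
            } catch (IOException | RuntimeException e) {
                if (!tolerateCorruptions) {
                    throw e; // strict mode: the error reaches the caller
                }
                // Tolerant mode: treat the error as a corruption and report end of data.
                exhausted = true;
                return null;
            }
        }
    }

    /** Builds a source that fails at corruptIndex (use -1 for a clean source). */
    static BlockSource sourceOf(List<String> blocks, int corruptIndex) {
        int[] position = {0};
        return () -> {
            int i = position[0]++;
            if (i >= blocks.size()) {
                return null;
            }
            if (i == corruptIndex) {
                throw new IOException("corrupt block " + i);
            }
            return blocks.get(i);
        };
    }

    /** Reads every block the reader yields before it reports end of data. */
    static List<String> readAll(List<String> blocks, int corruptIndex, boolean tolerate)
            throws IOException {
        TolerantReader reader = new TolerantReader(sourceOf(blocks, corruptIndex), tolerate);
        List<String> out = new ArrayList<>();
        for (String b = reader.next(); b != null; b = reader.next()) {
            out.add(b);
        }
        return out;
    }
}
```

In this sketch, reading four blocks with the third one corrupt yields the first two blocks in tolerant mode and throws an IOException in strict mode. Note that the read stops at the corruption; blocks after the corrupt one are not returned.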

> Allow RCFile Reader to tolerate corruptions
> -------------------------------------------
>
>                 Key: HIVE-2404
>                 URL: https://issues.apache.org/jira/browse/HIVE-2404
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.7.1
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali
>            Priority: Minor
>         Attachments: toleratecorruptions.patch
>
>
> Sometimes it is useful to tolerate corruptions during a query and return results based on the files that can be processed. A single corrupt block of data should not prevent reading the rest of the data.
> We need a way to gracefully ignore errors while reading an RCFile.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira