Posted to common-dev@hadoop.apache.org by "Sharad Agarwal (JIRA)" <ji...@apache.org> on 2008/08/19 16:05:44 UTC

[jira] Updated: (HADOOP-3828) Write skipped records' bytes to DFS

     [ https://issues.apache.org/jira/browse/HADOOP-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sharad Agarwal updated HADOOP-3828:
-----------------------------------

    Attachment: 3828_v1.patch

This works as follows:
Each skipped record (key, value) is written to a SequenceFile.
By default, skipped records are written to the "_skip" folder in the output directory. This is configurable using SkipBadRecords.setSkipOutputPath.

- The patch also fixes a corner case by initializing the variable "skipping" in TaskInProgress.
- It also changes SortedRanges: the class is now cloneable, and serialization of a member variable is fixed.
- It cleans up MapTask by using a separate RecordReader implementation for normal mode (skipping=false).
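
To illustrate the default location described above, here is a minimal sketch of how a skip output path could be resolved: fall back to "_skip" under the job output directory unless a path was explicitly configured. It deliberately does not call Hadoop. The class SkipPathSketch and the method resolveSkipPath are hypothetical names used only for illustration, and the patch itself may resolve the path differently.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration only; not part of the patch or of Hadoop.
class SkipPathSketch {
    // Default folder name for skipped records, as stated in the comment above.
    static final String DEFAULT_SKIP_DIR = "_skip";

    // Returns the configured skip path if one was set,
    // otherwise <outputDir>/_skip.
    static Path resolveSkipPath(Path outputDir, Path configured) {
        if (configured != null) {
            return configured;
        }
        return outputDir.resolve(DEFAULT_SKIP_DIR);
    }

    public static void main(String[] args) {
        Path out = Paths.get("/user/demo/output");
        // No explicit setting: falls back to the "_skip" folder.
        System.out.println(resolveSkipPath(out, null));
        // Explicit setting: the configured path wins.
        System.out.println(resolveSkipPath(out, Paths.get("/user/demo/bad-records")));
    }
}
```

In a real job the override would presumably go through SkipBadRecords.setSkipOutputPath, as mentioned above; the sketch only models the fallback rule.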

> Write skipped records' bytes to DFS
> -----------------------------------
>
>                 Key: HADOOP-3828
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3828
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Sharad Agarwal
>            Assignee: Sharad Agarwal
>         Attachments: 3828_v1.patch
>
>
> This is an incremental step over HADOOP-153, which provides the base skipping functionality.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.