Posted to dev@sqoop.apache.org by "Veena Basavaraj (JIRA)" <ji...@apache.org> on 2015/03/06 18:03:38 UTC

[jira] [Issue Comment Deleted] (SQOOP-1856) Sqoop2: Handling failures ( Row and Field level ) in Sqoop

     [ https://issues.apache.org/jira/browse/SQOOP-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Veena Basavaraj updated SQOOP-1856:
-----------------------------------
    Comment: was deleted

(was: Let's start somewhere! Part of plugging in a new engine also means getting the current infra right.

So let's not worry about what should be a sub-task of what, but rather tackle the prerequisites.)

> Sqoop2: Handling failures ( Row and Field level ) in Sqoop
> ----------------------------------------------------------
>
>                 Key: SQOOP-1856
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1856
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 2.0.0
>
>
> Skipping corrupted rows in Sqoop
> What is the proposed strategy for handling such scenarios in batch transfer?
> Probably one of the following:
> 1. Skip/ignore the bad record and continue with the good records.
> 2. Bail out as soon as we hit a bad record.
> 3. Have a configurable threshold for how many bad rows we can tolerate.
> From Anand Iyer
> {quote}
> Sqoop is the most obvious place for the functionality discussed in this thread. But at some point, we should start thinking about adding ... functionality such as (Policy Driven SLAs and Data Validation) ....
> {quote}
> This means we want to be able to define not just failure handling, but also more elaborate strategies for Sqoop data validation, metrics that expose the state of the transfer, etc.
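
To make the three options in the description concrete, here is a minimal sketch of how a row-level failure policy might look. It is only an illustration under stated assumptions: the names RowFailurePolicyDemo, Policy, BadRowException, transfer and maxBadRows are hypothetical and are not part of any existing Sqoop API, and the validity check is passed in by the caller rather than taken from a real connector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; none of these types exist in Sqoop.
public class RowFailurePolicyDemo {

    // The three strategies listed in the issue description.
    enum Policy { SKIP, FAIL_FAST, THRESHOLD }

    // Thrown when the chosen policy says the transfer must stop.
    static class BadRowException extends RuntimeException {
        BadRowException(String message) { super(message); }
    }

    // Returns the rows that passed validation, or throws if the policy is violated.
    // maxBadRows is only consulted for Policy.THRESHOLD (an assumed config knob).
    static <T> List<T> transfer(List<T> rows, Predicate<T> isValid,
                                Policy policy, int maxBadRows) {
        List<T> good = new ArrayList<>();
        int bad = 0;
        for (T row : rows) {
            if (isValid.test(row)) {
                good.add(row);
                continue;
            }
            bad++;
            if (policy == Policy.FAIL_FAST) {
                // Option 2: bail out on the first bad record.
                throw new BadRowException("bad row: " + row);
            }
            if (policy == Policy.THRESHOLD && bad > maxBadRows) {
                // Option 3: tolerate up to maxBadRows bad records, then stop.
                throw new BadRowException("too many bad rows: " + bad);
            }
            // Option 1 (SKIP), or THRESHOLD still under the limit: drop the row and continue.
        }
        return good;
    }

    public static void main(String[] args) {
        // Toy validity rule for the demo: a row is "good" if it contains a comma.
        List<String> rows = Arrays.asList("1,a", "oops", "2,b");
        Predicate<String> hasComma = r -> r.contains(",");

        System.out.println(transfer(rows, hasComma, Policy.SKIP, 0));
        try {
            transfer(rows, hasComma, Policy.FAIL_FAST, 0);
        } catch (BadRowException e) {
            System.out.println("stopped: " + e.getMessage());
        }
    }
}
```

A real implementation would also need to decide where skipped rows go (a log, a side file, a counter exposed as a metric), which is where the broader validation and metrics discussion above comes in.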



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)