Posted to dev@sqoop.apache.org by "Gwen Shapira (JIRA)" <ji...@apache.org> on 2014/10/23 02:10:33 UTC

[jira] [Assigned] (SQOOP-1606) Oraoop import can end up with overlapping input splits, generating duplicate data

     [ https://issues.apache.org/jira/browse/SQOOP-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gwen Shapira reassigned SQOOP-1606:
-----------------------------------

    Assignee: Gwen Shapira

> Oraoop import can end up with overlapping input splits, generating duplicate data
> ---------------------------------------------------------------------------------
>
>                 Key: SQOOP-1606
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1606
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Gwen Shapira
>            Assignee: Gwen Shapira
>
> This is rare, but it can happen (and did happen):
> Assume a table, TABLE1, with an index that is also named TABLE1.
> The segments for the TABLE1 table and the TABLE1 index are in two different tablespaces, and both have the same relative_fno.
> In that case, the input splits generated by getOracleDataChunksExtent will include block ranges that belong to the index, and those ranges can overlap with some of the "correct" block ranges of the table. Overlapping splits read the same rows more than once, which leads to duplicate data in the import.
> The fix should be to also filter on object_type, so that only tables and partitions are considered.
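
The scenario above can be illustrated with a small, self-contained sketch. This is not the Oraoop code or its query. It only models extents as in-memory records to show how matching on the segment name alone lets index block ranges overlap table block ranges, and how an object_type filter removes the overlap. The Extent type, the sample tablespace names and block numbers, and the helper functions are all hypothetical, and the set of object types treated as "tables and partitions" is an assumption.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    """Hypothetical in-memory stand-in for one extent row."""
    segment_name: str
    object_type: str   # e.g. "TABLE", "TABLE PARTITION", "INDEX"
    tablespace: str
    relative_fno: int
    block_id: int      # first block of the extent
    blocks: int        # number of blocks in the extent


# Made-up sample data: the table and the index share a name, live in
# different tablespaces, and happen to have the same relative_fno.
EXTENTS = [
    Extent("TABLE1", "TABLE", "DATA_TS", 4, 128, 64),
    Extent("TABLE1", "TABLE", "DATA_TS", 4, 192, 64),
    Extent("TABLE1", "INDEX", "INDEX_TS", 4, 160, 64),
]

# Assumption: these are the object types that count as "tables and partitions".
TABLE_TYPES = {"TABLE", "TABLE PARTITION", "TABLE SUBPARTITION"}


def block_ranges(extents, name, filter_object_type):
    """Return sorted (relative_fno, first_block, last_block) ranges for `name`."""
    ranges = []
    for e in extents:
        if e.segment_name != name:
            continue
        if filter_object_type and e.object_type not in TABLE_TYPES:
            continue  # skip index extents when the filter is enabled
        ranges.append((e.relative_fno, e.block_id, e.block_id + e.blocks - 1))
    return sorted(ranges)


def overlapping_pairs(ranges):
    """Return every pair of ranges that share a relative_fno and overlap."""
    pairs = []
    for i, a in enumerate(ranges):
        for b in ranges[i + 1:]:
            if a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]:
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    by_name_only = block_ranges(EXTENTS, "TABLE1", filter_object_type=False)
    by_name_and_type = block_ranges(EXTENTS, "TABLE1", filter_object_type=True)
    print("name only:       ", overlapping_pairs(by_name_only))
    print("name + obj. type:", overlapping_pairs(by_name_and_type))
```

With the sample data, the index extent overlaps both table extents when filtering by name only, and no overlaps remain once the object type is checked.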



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)