Posted to dev@sqoop.apache.org by "Venkat Ranganathan (JIRA)" <ji...@apache.org> on 2014/09/30 03:28:34 UTC

[jira] [Commented] (SQOOP-1558) Data Partition Awareness

    [ https://issues.apache.org/jira/browse/SQOOP-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152679#comment-14152679 ] 

Venkat Ranganathan commented on SQOOP-1558:
-------------------------------------------

We do provide partition-wise access where it makes sense, for example with Netezza. This is done as part of the "direct" mode connectors, which can take advantage of DB-specific features. We welcome additional per-DB direct mode connectors, which can be made more intelligent, for databases that do not yet have one.

> Data Partition Awareness
> ------------------------
>
>                 Key: SQOOP-1558
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1558
>             Project: Sqoop
>          Issue Type: New Feature
>          Components: connectors/generic
>    Affects Versions: 2.0.0
>            Reporter: Naveen Manivannan
>            Priority: Critical
>
> Sqoop is not aware of the partitioning of tables within an RDBMS. As a result, it creates 'where' clauses whose ranges extend across multiple partitions, forcing full table scans or unnecessary scans of partitions. The partitioning information is available within the metadata tables of most RDBMSs.
> The following classes need logic that keeps Sqoop from extending a 'where' clause outside of a range derived by a generic function. The specific SQL queries and logic to find the partition ranges of the partition columns, if they exist, can be written per database.
> sqoop/common/src/main/java/org/apache/sqoop/job/etl/PartitionerContext.java
> sqoop/connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportPartitioner.java
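
To illustrate the request in the issue description, here is a minimal sketch of how split ranges could be aligned to partition boundaries so that no single 'where' clause spans two partitions. It is not Sqoop code: the class PartitionAlignedSplits and the method whereClauses are hypothetical names, and the sketch assumes a numeric partition column, range partitioning, and that the exclusive upper bounds of the partitions have already been read from the database's metadata tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Sqoop: builds one WHERE condition per
// range partition that overlaps the imported value range [min, max].
public class PartitionAlignedSplits {

    /**
     * @param column      name of the partition column (assumed numeric)
     * @param min         smallest value to import, inclusive
     * @param max         largest value to import, inclusive
     * @param upperBounds exclusive upper bounds of the range partitions,
     *                    sorted ascending (assumed to come from DB metadata)
     */
    public static List<String> whereClauses(String column, long min, long max,
                                            List<Long> upperBounds) {
        List<String> clauses = new ArrayList<>();
        long lower = min;
        for (long bound : upperBounds) {
            if (bound <= lower) {
                continue; // partition lies entirely below the requested range
            }
            if (bound > max) {
                break; // the final clause below covers the rest of the range
            }
            // This clause stays inside a single partition.
            clauses.add(column + " >= " + lower + " AND " + column + " < " + bound);
            lower = bound;
        }
        // Last clause: from the last boundary crossed up to max, inclusive.
        clauses.add(column + " >= " + lower + " AND " + column + " <= " + max);
        return clauses;
    }

    public static void main(String[] args) {
        // Partitions assumed as: id < 100, id < 200, and everything above.
        for (String clause : whereClauses("id", 0L, 250L, Arrays.asList(100L, 200L))) {
            System.out.println(clause);
        }
    }
}
```

Each generated condition touches exactly one partition, so a database that supports partition pruning can skip the others. A real partitioner would additionally subdivide large partitions to reach the requested number of splits.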


