Posted to dev@sqoop.apache.org by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org> on 2014/11/13 05:16:31 UTC

[jira] [Commented] (SQOOP-1602) Sqoop2: Fix the current balancing across Loaders internal to Sqoop

    [ https://issues.apache.org/jira/browse/SQOOP-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209235#comment-14209235 ] 

Jarek Jarcec Cecho commented on SQOOP-1602:
-------------------------------------------

Loaders are currently executed on the same nodes as extractors (we're not doing any re-shuffling of data), so an unbalanced number of loaders means that we also had an unbalanced number of extractors, and that is the real problem. I would look into how the data were split (which partitions were created) in the JDBC connector and take it from there.
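
To illustrate how a skewed split could leave some extractors (and therefore the loaders on the same nodes) without data, here is a minimal sketch. It assumes the partition column is divided into equal-width ranges between its minimum and maximum value; the actual partitioning logic of the JDBC connector is not shown in this thread, and the class and method names below are hypothetical, not Sqoop code.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of Sqoop.
public class PartitionSkewSketch {

    // Splits [min, max] into `parts` equal-width ranges and counts how many
    // keys fall into each range (assumed partitioning strategy).
    public static int[] countPerRange(long[] keys, long min, long max, int parts) {
        int[] counts = new int[parts];
        double width = (double) (max - min + 1) / parts;
        for (long key : keys) {
            int idx = (int) ((key - min) / width);
            // Guard against rounding pushing the max key past the last range.
            counts[Math.min(idx, parts - 1)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 100k rows: ids 0..99998 clustered at the low end, plus one outlier.
        long[] keys = new long[100_000];
        for (int i = 0; i < keys.length - 1; i++) {
            keys[i] = i;
        }
        keys[keys.length - 1] = 1_000_000L;

        // Four ranges, matching the 4 extractors in the reported scenario.
        System.out.println(Arrays.toString(countPerRange(keys, 0, 1_000_000L, 4)));
    }
}
```

Under these assumptions, almost all rows land in the first range and two ranges are empty, so checking the row count per partition would be a reasonable first step before looking at the loaders themselves.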

> Sqoop2:  Fix the current balancing across Loaders internal to Sqoop 
> --------------------------------------------------------------------
>
>                 Key: SQOOP-1602
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1602
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Veena Basavaraj
>            Assignee: Qian Xu
>             Fix For: 1.99.5
>
>
> The balancing of records across the loaders is done internally in Sqoop today.
> While writing the Kite Connector, Qian noticed that this is not done fairly.
> While testing the Kite connector, I allocated 2 loaders. I expected the data to be split 50/50 between the two loaders, but the second loader actually does nothing, because its DataReader does not have any data to provide. Is this by design?
> >> About loaders not receiving data in a balanced way.
> My scenario is 4 "jdbc_mysql" extractors extracting 100k rows of data (10MB), with 2 Kite loaders reading the data.
> This must be a bug that needs to be fixed in Sqoop ([~abec] confirmed it).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)