Posted to dev@sqoop.apache.org by "Hari Shreedharan (JIRA)" <ji...@apache.org> on 2012/12/09 21:37:21 UTC

[jira] [Comment Edited] (SQOOP-738) Sqoop is not importing all data in Sqoop 2

    [ https://issues.apache.org/jira/browse/SQOOP-738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527622#comment-13527622 ] 

Hari Shreedharan edited comment on SQOOP-738 at 12/9/12 8:35 PM:
-----------------------------------------------------------------

Ah, so you mean to say that the free.release() call in the readContent method unblocks the framework to commit before the close method is called? That makes sense, and it explains why it was not noticed until now: it would show up only on larger workloads. We actually never waited on the completion condition (even before I refactored this class, since we always did the threading from within this class only, and never considered the close condition). That is pretty easy to fix. Just add a CyclicBuffer to wait on it. It would be about 2 lines of code.
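
To make the race described above concrete, here is a minimal sketch of a single-slot handoff between a producer and a consumer thread. It is not the actual Sqoop class: the class name, the write() method and the sleep that simulates slow output are all hypothetical, and only free.release(), the "readContent" step and close() are taken from the comment. The comment mentions a "CyclicBuffer"; this sketch assumes a java.util.concurrent.CountDownLatch as the completion condition instead, which may or may not match the eventual patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for the class under discussion, not Sqoop code.
class CompletionWaitSketch {
    private final Semaphore free = new Semaphore(1);    // slot is empty
    private final Semaphore filled = new Semaphore(0);  // slot holds a record
    private final CountDownLatch done = new CountDownLatch(1); // completion condition
    private final List<String> stored =
        Collections.synchronizedList(new ArrayList<>());
    private String slot;       // handed over under the semaphores
    private boolean finished;  // set by close(), read by the consumer

    CompletionWaitSketch() {
        Thread consumer = new Thread(this::consume, "consumer");
        consumer.setDaemon(true);
        consumer.start();
    }

    // Producer side: blocks until the previous record has been taken.
    void write(String record) {
        free.acquireUninterruptibly();
        slot = record;
        filled.release();
    }

    private void consume() {
        try {
            while (true) {
                filled.acquireUninterruptibly();
                if (finished) {
                    break;
                }
                String record = slot;  // the "readContent" step
                free.release();        // producer is unblocked here, before the record is stored
                Thread.sleep(20);      // simulate slow output
                stored.add(record);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            done.countDown();          // signal that the consumer has finished
        }
    }

    // Without done.await(), close() could return while the last record
    // is still being stored, so a commit right after it would miss data.
    List<String> close() {
        free.acquireUninterruptibly(); // the last record has been taken
        finished = true;
        filled.release();              // wake the consumer so it can exit
        try {
            done.await();              // the wait that was missing
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(stored);
    }

    public static void main(String[] args) {
        CompletionWaitSketch sketch = new CompletionWaitSketch();
        for (int i = 0; i < 5; i++) {
            sketch.write("row-" + i);
        }
        System.out.println(sketch.close().size() + " rows stored");
    }
}
```

In this sketch the wait itself is the done.await() call plus the countDown() in the consumer, which is in line with the "about 2 lines" estimate. Removing done.await() lets close() return during the simulated 20 ms write, which is the window in which the framework could commit too early.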
                
      was (Author: hshreedharan):
    Ah, so you mean to say that the free.release() call should not be in the readContent method? That makes sense. We actually never waited on the completion condition. That is pretty easy to do. Just add a CyclicBuffer to do it. It would be about 2 lines of code. 
                  
> Sqoop is not importing all data in Sqoop 2
> ------------------------------------------
>
>                 Key: SQOOP-738
>                 URL: https://issues.apache.org/jira/browse/SQOOP-738
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Jarek Jarcec Cecho
>            Priority: Blocker
>             Fix For: 1.99.1
>
>
> I've tried to import exactly 408,957 (nice round number, right?) rows in 10 mappers and I've noticed that not all mappers supply all the data all the time. For example, in one run I got 6 files with the expected size of 10MB, whereas the other 4 seemingly random files were completely empty. In another run I got 8 files of 10MB and just 2 empty files. I could not find any logic regarding how many and which files end up empty. We definitely need to address this.
