Posted to dev@sqoop.apache.org by "Daniel Voros (JIRA)" <ji...@apache.org> on 2018/05/04 11:11:00 UTC

[jira] [Commented] (SQOOP-3317) org.apache.sqoop.validation.RowCountValidator in live RDBMS system

    [ https://issues.apache.org/jira/browse/SQOOP-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463723#comment-16463723 ] 

Daniel Voros commented on SQOOP-3317:
-------------------------------------

Hi [~srikumaran.t], thank you for reporting this!

As far as I can tell, the only validation option currently available is an exact match on the number of records. "Percentage tolerant" validation is mentioned in the documentation but is not implemented.

In my opinion, this kind of validation (comparing the number of records) is only useful as a sanity check, since equal counts don't guarantee that the contents are equal.

However, we could improve the existing implementation by introducing another parameter (a margin or threshold) so that an exact match is not required, and we could also implement "Percentage tolerant" validation.
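
To illustrate the idea, here is a minimal sketch of what a margin-based and a percentage-based check could look like. The class and method names (TolerantRowCount, withinAbsoluteMargin, withinPercentage) are hypothetical and are not part of Sqoop's validation API today; the sketch also assumes both counts are non-negative.

```java
// Hypothetical sketch only: these names do not exist in Sqoop.
final class TolerantRowCount {

    private TolerantRowCount() {
    }

    /** Passes when the two counts differ by at most maxDiff rows. */
    static boolean withinAbsoluteMargin(long sourceCount, long targetCount, long maxDiff) {
        if (maxDiff < 0) {
            throw new IllegalArgumentException("maxDiff must be >= 0");
        }
        return Math.abs(sourceCount - targetCount) <= maxDiff;
    }

    /**
     * Passes when the difference, expressed as a percentage of the
     * source count, is at most tolerancePercent.
     */
    static boolean withinPercentage(long sourceCount, long targetCount, double tolerancePercent) {
        if (tolerancePercent < 0) {
            throw new IllegalArgumentException("tolerancePercent must be >= 0");
        }
        if (sourceCount == 0) {
            // No meaningful percentage for an empty source: require an exact match.
            return targetCount == 0;
        }
        double diffPercent = 100.0 * Math.abs(sourceCount - targetCount) / sourceCount;
        return diffPercent <= tolerancePercent;
    }
}
```

With an exact match being the special case of a zero margin (or zero percent), the existing behaviour could remain the default. Either variant is still only a sanity check on counts, not on contents.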

> org.apache.sqoop.validation.RowCountValidator in live RDBMS system
> ------------------------------------------------------------------
>
>                 Key: SQOOP-3317
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3317
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Sri Kumaran Thirupathy
>            Priority: Major
>
> org.apache.sqoop.validation.RowCountValidator retrieves the count from the source after the MR job completes. This fails when the source is a live RDBMS.
> org.apache.sqoop.validation.RowCountValidator could retrieve the count during the MR execution phase instead.
> Also, how is Percentage Tolerant validation used? Reference: [https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)