Posted to user@sqoop.apache.org by Adam Konarski <ak...@gmail.com> on 2015/10/24 14:29:40 UTC

Incremental import using sqoop2

Hello,
I would like to import data from a MySQL table into HDFS. I have everything
configured and I am able to create a simple job in sqoop-shell that copies
data. However, I would like to copy only new records each time, and I am
not sure how to achieve this. When I create a job there is a parameter
named "check column", and I have columns like ID or eventTimestamp that seem
suitable for it. However, in that case I also have to enter a "last value".
Do I have to manage this last value myself and create a new job with a new
"last value" each time? Why create a job at all if it is used only once and
then has to be recreated? Is it not possible for Sqoop to manage this by
storing the new "last value" after each run and importing only new records?
Moreover, why do I get this error message when I enter anything as the
"last value": "Size of input exceeds allowance for this input field.
Maximal allowed size is -1"?
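
To make the question concrete, the behaviour being asked for can be sketched roughly as below. This is a generic illustration of a "check column" with a remembered "last value", not how Sqoop 2 implements it; the class IncrementalReadSketch and its methods are hypothetical, and a real tool would persist the last value between runs instead of holding it in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not part of Sqoop.
class IncrementalReadSketch {
    // Highest check-column value imported so far.
    private long lastValue;

    IncrementalReadSketch(long initialLastValue) {
        this.lastValue = initialLastValue;
    }

    // Returns the check-column values greater than lastValue,
    // then advances lastValue to the largest value seen.
    List<Long> readNew(List<Long> checkColumnValues) {
        List<Long> fresh = new ArrayList<>();
        long max = lastValue;
        for (long value : checkColumnValues) {
            if (value > lastValue) {
                fresh.add(value);
                max = Math.max(max, value);
            }
        }
        lastValue = max;
        return fresh;
    }

    long lastValue() {
        return lastValue;
    }
}
```

With this shape, running the same job twice against a growing table would pick up only the rows added since the previous run, which is the bookkeeping the question hopes the tool can do on its own.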

Best regards,
Adam

Re: Incremental import using sqoop2

Posted by Abraham Fine <ab...@cloudera.com>.
A fix for the issue was committed today:
https://issues.apache.org/jira/browse/SQOOP-2640

Please let us know if everything is working on your end.

Thanks,
Abe

Re: Incremental import using sqoop2

Posted by Adam Konarski <ak...@gmail.com>.
I use the newest one, 1.99.6. This is how it looks in sqoop-shell:

sqoop:000> update job -j=5
Updating job with id 5
Please update job:
Name: dziala

From database configuration

Schema name: telcoData
Table name: SecureUser
Table SQL statement:
Table column names:
Partition column name:
Null value allowed for the partition column:
Boundary query:

Incremental read

Check column: event_timestamp
Last value: 0
Error message: Size of input exceeds allowance for this input field.
Maximal allowed size is -1
Last value: 0
Error message: Size of input exceeds allowance for this input field.
Maximal allowed size is -1
Last value: 0

Re: Incremental import using sqoop2

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
What version are you using, Adam?

The incremental import in the JDBC connector was added in 1.99.6 via SQOOP-1805. From a quick look through the code, it seems that the "last value" field [1] doesn't have a specified size, which in turn defaults to -1. It's interesting to me that the integration tests for this are succeeding whereas real-life use does not. Could you create a JIRA for it [2]? I'll take a deeper look.
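
A minimal sketch of the failure mode described above, assuming the shell compares the input length against the field's declared size. The class InputSizeCheck and its methods are hypothetical and are not taken from the Sqoop source; the lenient variant is only one possible way to handle an unspecified size, not necessarily what the eventual fix does.

```java
// Hypothetical illustration; not the actual Sqoop code.
class InputSizeCheck {
    // Size assumed for a field that declares no explicit limit.
    static final int UNSPECIFIED_SIZE = -1;

    // Naive check: with maxSize == -1 no input can pass,
    // because a string length is never negative.
    static boolean fitsNaive(String value, int maxSize) {
        return value.length() <= maxSize;
    }

    // Lenient check: a negative maxSize is treated as "no limit".
    static boolean fitsLenient(String value, int maxSize) {
        return maxSize < 0 || value.length() <= maxSize;
    }

    public static void main(String[] args) {
        System.out.println(fitsNaive("0", UNSPECIFIED_SIZE));   // prints false
        System.out.println(fitsLenient("0", UNSPECIFIED_SIZE)); // prints true
    }
}
```

Under that assumption, even a one-character value such as 0 would be rejected, which matches the reported "Maximal allowed size is -1" message.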

Jarcec

Links:
1: https://github.com/apache/sqoop/blame/sqoop2/connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/configuration/IncrementalRead.java#L34
2: https://issues.apache.org/jira/browse/SQOOP/
