Posted to dev@sqoop.apache.org by "Fero Szabo (JIRA)" <ji...@apache.org> on 2018/05/17 17:06:00 UTC

[jira] [Comment Edited] (SQOOP-3082) Sqoop import fails after TCP connection reset if split by datetime column

    [ https://issues.apache.org/jira/browse/SQOOP-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479374#comment-16479374 ] 

Fero Szabo edited comment on SQOOP-3082 at 5/17/18 5:05 PM:
------------------------------------------------------------

Hi [~vaifer],

This came up again recently, so I had a look at your patch. I had to rebase it onto the current version; please find the rebased patch attached. [^SQOOP-3082-1.patch]

I've tested it manually with an Integer and a Date column in the split-by option: the former to ensure that the patch doesn't alter current behavior, the latter to check that the fix actually works.

*I can confirm that the current behavior of Sqoop is not altered and the patch fixes the issue.*

I also checked the relevant parts of the SQL Server documentation (1, 2) and found that data type precedence ensures that Sqoop behaves correctly. For example, if the lastRecordValue field contains a number, it will be "encoded" as a String because of the apostrophes in the resulting statement. However, since the column's type is still INT and INT has higher precedence than the character types, the string literal is implicitly converted to INT and the criterion is evaluated correctly.
{quote}When an operator combines two expressions of different data types, the rules for data type precedence specify that the data type with the lower precedence is converted to the data type with the higher precedence. If the conversion is not a supported implicit conversion, an error is returned. When both operand expressions have the same data type, the result of the operation has that data type.
{quote}
(1) SQL Server 2000: [https://www.microsoft.com/en-us/download/details.aspx?id=51958]
(2) current documentation: [https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-2017]
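
To illustrate the point about the apostrophes, here is a minimal sketch of how the resume condition looks with and without quoting. This is not the code from the patch or from Sqoop itself: the class and method names are hypothetical, as is the `id` column, and only the Date value is taken from the log in the issue description.

```java
// Hypothetical sketch, not the Sqoop source and not the attached patch.
final class ResumePredicateSketch {

    // Behavior seen in the reported log: the last read value is spliced
    // into the statement as-is, so a datetime breaks the SQL syntax.
    static String unquoted(String splitColumn, Object lastRecordValue) {
        return "( " + splitColumn + " > " + lastRecordValue + " )";
    }

    // Behavior described in this comment: the value is always wrapped in
    // apostrophes, whatever its type. For an INT column, SQL Server is
    // expected to convert the string literal to INT (data type precedence).
    static String quoted(String splitColumn, Object lastRecordValue) {
        return "( " + splitColumn + " > '" + lastRecordValue + "' )";
    }

    public static void main(String[] args) {
        // Prints: ( Date > 2015-09-18 00:00:00.0 )
        System.out.println(unquoted("Date", "2015-09-18 00:00:00.0"));
        // Prints: ( Date > '2015-09-18 00:00:00.0' )
        System.out.println(quoted("Date", "2015-09-18 00:00:00.0"));
        // Prints: ( id > '42' )
        System.out.println(quoted("id", 42));
    }
}
```

The first line of output matches the fragment that SQL Server rejected with "Incorrect syntax near '00'"; the other two show the quoted form for a Date and an Integer split-by column.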

I believe we should get this committed now, even without tests, since it adds real value for Sqoop users.

Since testing a connection reset is not trivial, I've opened SQOOP-3325 to track the implementation of the tests.

 

 



> Sqoop import fails after TCP connection reset if split by datetime column
> -------------------------------------------------------------------------
>
>                 Key: SQOOP-3082
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3082
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>            Reporter: Sergey Svynarchuk
>            Priority: Major
>         Attachments: SQOOP-3082-1.patch, SQOOP-3082.patch
>
>
> If the Sqoop-to-SQL Server connection is reset, the whole command fails with "Connection reset with com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near '00'". On reestablishing the connection, Sqoop tries to resume the import from the last record that was successfully read, using the following query:
> {code}
> 2016-12-10 15:18:54,523 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select * from test.dbo.test1 WITH (nolock) where Date >= '2015-01-10' and Date <= '2016-11-24' and ( Date > 2015-09-18 00:00:00.0 ) AND ( Date < '2015-09-23 11:48:00.0' ) 
> {code}
> The value 2015-09-18 00:00:00.0 is not quoted in the SQL.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)