You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by "Eugene Chipizubov (Jira)" <ji...@apache.org> on 2020/10/07 08:56:00 UTC

[jira] [Commented] (SQOOP-1946) DateSplitter relies on database string-to-date conversion when creating splits based on date columns

    [ https://issues.apache.org/jira/browse/SQOOP-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209402#comment-17209402 ] 

Eugene Chipizubov commented on SQOOP-1946:
------------------------------------------

Is Sqoop as a project dead?

> DateSplitter relies on database string-to-date conversion when creating splits based on date columns
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-1946
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1946
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/generic
>    Affects Versions: 1.4.5
>            Reporter: Andre Araujo
>            Priority: Major
>         Attachments: SQOOP-1946.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When running a sqoop import job with a split-by column of data type DATE, the DateSplitter relies on the database implicit string-to-date convertion when generating the lower and upper bound clauses for the splits. The splits are generated with clauses similar to the following:
> {code}
> date_col >= '2013-08-26 00:00:00.0'
> date_col <= '2013-08-26 00:00:00.0'
> {code}
> This forces the database to either implcitly convert the date_col to string or implicitly cast the literal string to a date type. In case the database default date format is not the expected, this could lead to either exceptions, as in the example below, or to unexpected behavior (string comparison of to dates in different string formats).
> For Oracle databases, for example, we may see the following errors when the database cannot implicitly convert the string above to date:
> {code}
> 2014-12-23 12:38:25,690 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: sales_date >= '2013-08-26 00:00:00.0' AND sales_date <= '2013-08-26 00:00:00.0'
> 2014-12-23 12:38:25,745 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Working on split: sales_date >= '2013-08-26 00:00:00.0' AND sales_date <= '2013-08-26 00:00:00.0'
> 2014-12-23 12:38:25,860 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select * from schema.TABLE where ( date_col >= '2013-08-26 00:00:00.0' ) AND ( date_col <= '2013-08-26 00:00:00.0' )
> 2014-12-23 12:38:25,909 ERROR [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Top level exception:  
> java.sql.SQLDataException: ORA-01861: literal does not match format string 
> {code}
> A workaround for that is to set the default date format for the database user used by Sqoop to the "expected" format using a logon trigger in Oracle, like the one below:
> {code}
> CREATE OR REPLACE TRIGGER tr_a_l_set_date_format
>    AFTER LOGON ON DATABASE WHEN (USER = 'SQOOP_USER')
>    BEGIN
>      execute immediate 'alter session set nls_date_format="yyyy-mm-dd hh24:mi:ss"';
>    END;
> {code}
> A better form, though, would be to explicitly convert the date string literal to a DATE using a specific format, in the exact same way that OracleManager.datetimeToQueryString() does.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)