Posted to dev@sqoop.apache.org by "Ze Jin (JIRA)" <ji...@apache.org> on 2015/12/26 09:36:49 UTC

[jira] [Commented] (SQOOP-2750) Support --fields-terminated-by value greater than 127 when using --hive-import

    [ https://issues.apache.org/jira/browse/SQOOP-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071835#comment-15071835 ] 

Ze Jin commented on SQOOP-2750:
-------------------------------

I think the Affects Version should be 1.4.6, not 1.99.6. There is no src/java/org/apache/sqoop/hive/TableDefWriter.java in 1.99.6.

> Support --fields-terminated-by value greater than 127 when using --hive-import
> ------------------------------------------------------------------------------
>
>                 Key: SQOOP-2750
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2750
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: hive-integration
>    Affects Versions: 1.99.6
>            Reporter: Marcus Truscello
>            Priority: Minor
>              Labels: easyfix, newbie
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> Using a {{fields-terminated-by}} value greater than 127 builds a file with the correct delimiter but causes an exception when combined with {{hive-import}}.  The relevant code is in {{src/java/org/apache/sqoop/hive/TableDefWriter.java}}:
> https://github.com/apache/sqoop/blob/f19e2a523579db8c28a96febfd3cf35a5d58adc6/src/java/org/apache/sqoop/hive/TableDefWriter.java#L278-L300
> The assumption is only half true.  Hive only supports delimiters up to 127 in *octal* form, but it also supports delimiters up to 255 in signed character form (two's complement).
> For example, a {{fields-terminated-by}} value {{'\0376'}} (ASCII 254) is valid for sqoop, but when used in a Hive table definition it should be converted to {{'-2'}} (with single quotes).
> I suggest rejecting delimiters over 255, converting delimiters over 127 to two's complement signed characters, and leaving delimiters at or below 127 as octal.
> (Work estimate inflated to account for the number of tests that may need to be modified.)
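
The conversion suggested in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in TableDefWriter.java: the class name HiveDelimiterSketch and the method toHiveDelimiter are hypothetical, and the three-digit backslash-octal form used for values at or below 127 is assumed.

```java
// Hypothetical sketch; names are illustrative and not part of Sqoop.
final class HiveDelimiterSketch {

    // Maps a delimiter character to the text placed in the Hive table definition.
    static String toHiveDelimiter(char delim) {
        int code = delim;
        if (code > 255) {
            // Delimiters above 255 are rejected, as the description suggests.
            throw new IllegalArgumentException(
                "Delimiter out of range for Hive: " + code);
        }
        if (code > 127) {
            // Reinterpret as a two's complement signed byte: 254 becomes -2.
            return Integer.toString((byte) code);
        }
        // Assumed format for values at or below 127: backslash plus three octal digits.
        return String.format("\\%03o", code);
    }

    public static void main(String[] args) {
        System.out.println(toHiveDelimiter((char) 0376)); // prints -2
        System.out.println(toHiveDelimiter((char) 1));    // prints \001
    }
}
```

The narrowing cast to byte does the two's complement reinterpretation, so no explicit subtraction of 256 is needed.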



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)