Posted to dev@sqoop.apache.org by "Ranjan Bagchi (JIRA)" <ji...@apache.org> on 2015/10/23 20:01:27 UTC

[jira] [Created] (SQOOP-2639) Unable to export utf-8 data to MySQL using --direct mode

Ranjan Bagchi created SQOOP-2639:
------------------------------------

             Summary: Unable to export utf-8 data to MySQL using --direct mode
                 Key: SQOOP-2639
                 URL: https://issues.apache.org/jira/browse/SQOOP-2639
             Project: Sqoop
          Issue Type: Bug
          Components: connectors/mysql
    Affects Versions: 1.4.6
            Reporter: Ranjan Bagchi


I am able to import UTF-8 (non-latin1) data successfully into HDFS via:

sqoop import --connect jdbc:mysql://host/db --username XX --password YY \
        --mysql-delimiters \
        --table MYSQL_SRC_TABLE --target-dir ${SQOOP_DIR_PREFIX}/mysql_table --direct 

However, the corresponding export:

sqoop export --connect  jdbc:mysql://host/db --username XX --password YY \
        --mysql-delimiters \
        --table MYSQL_DEST_TABLE --export-dir ${SQOOP_DIR_PREFIX}/mysql_table \
        --direct 

cuts off each field after the first non-latin1 character (e.g. a letter with an umlaut).
I also tried passing other options, such as `-- --default-character-set=utf8`, without success.

I was able to fix the problem with the following change to https://svn.apache.org/repos/asf/sqoop/trunk/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java. On line 322, change
`this.mysqlCharSet = MySQLUtils.MYSQL_DEFAULT_CHARSET;`
to
`this.mysqlCharSet = "utf-8";`
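
For context, here is a minimal sketch of why the charset choice matters. It assumes the mapper uses this field as a Java charset name when turning record text into bytes, and that the default names a Latin-1 charset (neither is quoted in this report). The class and method names below are hypothetical and not part of Sqoop. A character such as ü becomes the single byte 0xFC in ISO-8859-1, which is not a valid UTF-8 sequence on its own, while UTF-8 encodes it as 0xC3 0xBC:

```java
import java.nio.charset.Charset;

// Hypothetical demo class; it only illustrates Java charset encoding.
class CharsetDemo {

    /** Encodes text with the named Java charset and returns the bytes as lowercase hex. */
    static String toHex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00fc"; // the letter u with an umlaut

        // Latin-1 encoding: one byte.
        System.out.println(toHex(umlaut, "ISO-8859-1")); // prints "fc"

        // UTF-8 encoding: two bytes.
        System.out.println(toHex(umlaut, "utf-8")); // prints "c3bc"
    }
}
```

If the destination expects UTF-8, the first byte sequence would be malformed input, which would be consistent with the truncation described above, though I have not verified this against MySQL itself.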

Hope this helps.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)