Posted to dev@sqoop.apache.org by "Szabolcs Vasas (JIRA)" <ji...@apache.org> on 2019/01/21 14:39:00 UTC
[jira] [Commented] (SQOOP-3421) Importing data from Oracle to Parquet as incremental dataset name fails

    [ https://issues.apache.org/jira/browse/SQOOP-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16747986#comment-16747986 ] 

Szabolcs Vasas commented on SQOOP-3421:
---------------------------------------

Hi [~dmateusp],

You have encountered a Kite limitation here. Because the table name is specified in SOME_SCHEMA.SOME_TABLE_NAME form, Kite tries to create a dataset with that name, but '.' is not permitted in Kite dataset names. You get this error only with the Parquet file format because Sqoop used Kite only for Parquet reading/writing.
The Kite dependency was removed from Sqoop a couple of months ago, so this issue is resolved in the latest trunk, but unfortunately we do not have any releases yet which contain the fix.
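
To illustrate the naming rule, here is a minimal sketch based only on the wording of the error message ("alphanumeric (plus '_')"). It is not Kite's actual code: the class name, method name and regular expression are assumptions made up for this example.

```java
import java.util.regex.Pattern;

public class DatasetNameCheck {
    // Assumption: this pattern restates the rule from the error message
    // ("alphanumeric (plus '_')"); it is not copied from the Kite sources.
    private static final Pattern VALID_NAME = Pattern.compile("[A-Za-z0-9_]+");

    // Returns true if the name contains only letters, digits and underscores.
    static boolean isValidName(String name) {
        return name != null && VALID_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "SOME_TABLE_NAME",
            // Name taken from the reported error: the '.' between schema and table breaks the rule.
            "47a2cf963b82475d8eba78c822403204_SOME_SCHEMA.SOME_TABLE_NAME"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isValidName(name) ? "accepted" : "rejected"));
        }
    }
}
```

Under this rule the plain table name passes, while the schema-qualified name from your log is rejected because of the '.' separator.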

Btw, the s3n file system is now deprecated, so you might want to use s3a in the future.

Regards,
Szabolcs

> Importing data from Oracle to Parquet as incremental dataset name fails
> -----------------------------------------------------------------------
>
>                 Key: SQOOP-3421
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3421
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.7
>            Reporter: Daniel Mateus Pires
>            Priority: Minor
>
> Hi there, I'm trying to run the following to import an Oracle table into S3 as Parquet:
> {code:bash}
> sqoop import --connect jdbc:oracle:thin:@//some.host:1521/ORCL --where="rownum < 100" --table SOME_SCHEMA.SOME_TABLE_NAME --password some_password --username some_username --num-mappers 4 --split-by PRD_ID --target-dir s3n://bucket/destination --temporary-rootdir s3n://bucket/temp/destination --compress --check-column PRD_MODIFY_DT --incremental lastmodified --map-column-java PRD_ATTR_TEXT=String --append
> {code}
> Version of Kite is: kite-data-s3-1.1.0.jar
> Version of Sqoop is: 1.4.7
> And I'm getting the following error:
> {code:text}
> 19/01/21 13:20:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM SOME_SCHEMA.SOME_TABLE_NAME t WHERE 1=0
> 19/01/21 13:20:34 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.dist/hive-site.xml
> 19/01/21 13:20:35 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.ValidationException: Dataset name 47a2cf963b82475d8eba78c822403204_SOME_SCHEMA.SOME_TABLE_NAME is not alphanumeric (plus '_')
> org.kitesdk.data.ValidationException: Dataset name 47a2cf963b82475d8eba78c822403204_SOME_SCHEMA.SOME_TABLE_NAME is not alphanumeric (plus '_')
> 	at org.kitesdk.data.ValidationException.check(ValidationException.java:55)
> 	at org.kitesdk.data.spi.Compatibility.checkDatasetName(Compatibility.java:105)
> 	at org.kitesdk.data.spi.Compatibility.check(Compatibility.java:68)
> 	at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:209)
> 	at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:137)
> 	at org.kitesdk.data.Datasets.create(Datasets.java:239)
> 	at org.kitesdk.data.Datasets.create(Datasets.java:307)
> 	at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:156)
> 	at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:130)
> 	at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:132)
> 	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:264)
> 	at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
> 	at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:454)
> 	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:520)
> 	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
> 	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> 	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
> 	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
> 	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
> 	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
> {code}
> Importing as a text file instead solves the issue.


