Posted to dev@sqoop.apache.org by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org> on 2014/11/22 23:46:13 UTC
[jira] [Updated] (SQOOP-1780) Avro/Parquet schemas can't handle Sqoop-generated non-alphanumeric column names
[ https://issues.apache.org/jira/browse/SQOOP-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jarek Jarcec Cecho updated SQOOP-1780:
--------------------------------------
Description:
I was importing a MySQL table that had columns whose names start with a number (1QP, 2QP, etc.). It looks like Sqoop prepends an underscore to those names to make them compatible with Hive, but Parquet/Avro schemas can't handle the non-alphanumeric character in the name of a field (or at least, at the start of it), so the import throws the following exception:
{code}
java.lang.IllegalStateException: Deprecated: field names are not alphanumeric (plus '_'): sqoop_import_team._1QP, sqoop_import_team._2QP, sqoop_import_team._3QP, sqoop_import_team._4QP
at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
at org.kitesdk.data.spi.Compatibility.checkSchema(Compatibility.java:119)
at org.kitesdk.data.spi.Compatibility.checkDescriptor(Compatibility.java:133)
at org.kitesdk.data.spi.hive.HiveManagedMetadataProvider.create(HiveManagedMetadataProvider.java:40)
at org.kitesdk.data.spi.hive.HiveManagedDatasetRepository.create(HiveManagedDatasetRepository.java:76)
at org.kitesdk.data.Datasets.create(Datasets.java:200)
at org.kitesdk.data.Datasets.create(Datasets.java:240)
at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:81)
at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:70)
{code}
was:
I was importing a MySQL table that had columns whose names start with a number (1QP, 2QP, etc.). It looks like Sqoop prepends an underscore to those names to make them compatible with Hive, but Parquet/Avro schemas can't handle the non-alphanumeric character in the name of a field (or at least, at the start of it), so the import throws the following exception:
java.lang.IllegalStateException: Deprecated: field names are not alphanumeric (plus '_'): sqoop_import_team._1QP, sqoop_import_team._2QP, sqoop_import_team._3QP, sqoop_import_team._4QP
at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
at org.kitesdk.data.spi.Compatibility.checkSchema(Compatibility.java:119)
at org.kitesdk.data.spi.Compatibility.checkDescriptor(Compatibility.java:133)
at org.kitesdk.data.spi.hive.HiveManagedMetadataProvider.create(HiveManagedMetadataProvider.java:40)
at org.kitesdk.data.spi.hive.HiveManagedDatasetRepository.create(HiveManagedDatasetRepository.java:76)
at org.kitesdk.data.Datasets.create(Datasets.java:200)
at org.kitesdk.data.Datasets.create(Datasets.java:240)
at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:81)
at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:70)
> Avro/Parquet schemas can't handle Sqoop-generated non-alphanumeric column names
> -------------------------------------------------------------------------------
>
> Key: SQOOP-1780
> URL: https://issues.apache.org/jira/browse/SQOOP-1780
> Project: Sqoop
> Issue Type: Bug
> Affects Versions: 1.4.5
> Reporter: Josh Wills
>
> I was importing a MySQL table that had columns whose names start with a number (1QP, 2QP, etc.). It looks like Sqoop prepends an underscore to those names to make them compatible with Hive, but Parquet/Avro schemas can't handle the non-alphanumeric character in the name of a field (or at least, at the start of it), so the import throws the following exception:
> {code}
> java.lang.IllegalStateException: Deprecated: field names are not alphanumeric (plus '_'): sqoop_import_team._1QP, sqoop_import_team._2QP, sqoop_import_team._3QP, sqoop_import_team._4QP
> at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
> at org.kitesdk.data.spi.Compatibility.checkSchema(Compatibility.java:119)
> at org.kitesdk.data.spi.Compatibility.checkDescriptor(Compatibility.java:133)
> at org.kitesdk.data.spi.hive.HiveManagedMetadataProvider.create(HiveManagedMetadataProvider.java:40)
> at org.kitesdk.data.spi.hive.HiveManagedDatasetRepository.create(HiveManagedDatasetRepository.java:76)
> at org.kitesdk.data.Datasets.create(Datasets.java:200)
> at org.kitesdk.data.Datasets.create(Datasets.java:240)
> at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:81)
> at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:70)
> {code}
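For illustration, the failure described above can be reproduced in isolation with a small sketch. It is not the Sqoop or Kite code: the class FieldNameCheck, its methods, and the regular expression are all hypothetical, and the pattern (first character must be a letter or digit, the rest may include '_') is an assumption inferred from the exception message, not taken from the Compatibility class in the stack trace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names exist in Sqoop or Kite.
class FieldNameCheck {
    // ASSUMPTION: a valid field name starts with a letter or digit and
    // continues with letters, digits, or '_'. The real pattern used by the
    // schema check is not shown in this report.
    private static final Pattern ASSUMED_VALID_NAME =
            Pattern.compile("[A-Za-z0-9][A-Za-z0-9_]*");

    // Stand-in for the renaming the report attributes to Sqoop:
    // prepend '_' when the column name starts with a digit.
    public static String prefixIfLeadingDigit(String column) {
        if (!column.isEmpty() && Character.isDigit(column.charAt(0))) {
            return "_" + column;
        }
        return column;
    }

    // True when the whole name matches the assumed pattern.
    public static boolean isAcceptable(String fieldName) {
        return ASSUMED_VALID_NAME.matcher(fieldName).matches();
    }

    // Renames each column, then collects the renamed names that fail the check.
    public static List<String> rejectedNames(List<String> columns) {
        List<String> rejected = new ArrayList<>();
        for (String column : columns) {
            String renamed = prefixIfLeadingDigit(column);
            if (!isAcceptable(renamed)) {
                rejected.add(renamed);
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // Prints [_1QP, _2QP]
        System.out.println(rejectedNames(Arrays.asList("1QP", "2QP", "team_id")));
    }
}
```

Under that assumption, the renaming step itself is what produces the rejected names, so a fix would likely need either a different prefix or a relaxed name check.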
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)