Posted to issues@spark.apache.org by "Reynold Xin (JIRA)" <ji...@apache.org> on 2016/07/07 18:08:11 UTC
[jira] [Resolved] (SPARK-16415) [Spark][SQL] - Failed to create table due to catalog string error
[ https://issues.apache.org/jira/browse/SPARK-16415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Reynold Xin resolved SPARK-16415.
---------------------------------
Resolution: Fixed
Assignee: Adrian Wang
Fix Version/s: 2.0.0
> [Spark][SQL] - Failed to create table due to catalog string error
> -----------------------------------------------------------------
>
> Key: SPARK-16415
> URL: https://issues.apache.org/jira/browse/SPARK-16415
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: Yi Zhou
> Assignee: Adrian Wang
> Priority: Critical
> Fix For: 2.0.0
>
>
> When creating tables with the schema below, Spark SQL errors out on the struct type.
> SQL:
> {code}
> CREATE EXTERNAL TABLE date_dim_temporary
> ( d_date_sk bigint --not null
> , d_date_id string --not null
> , d_date string
> , d_month_seq int
> , d_week_seq int
> , d_quarter_seq int
> , d_year int
> , d_dow int
> , d_moy int
> , d_dom int
> , d_qoy int
> , d_fy_year int
> , d_fy_quarter_seq int
> , d_fy_week_seq int
> , d_day_name string
> , d_quarter_name string
> , d_holiday string
> , d_weekend string
> , d_following_holiday string
> , d_first_dom int
> , d_last_dom int
> , d_same_day_ly int
> , d_same_day_lq int
> , d_current_day string
> , d_current_week string
> , d_current_month string
> , d_current_quarter string
> , d_current_year string
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
> STORED AS TEXTFILE LOCATION '/user/root/benchmarks/test/data/date_dim';
> CREATE TABLE date_dim
> STORED AS ORC
> AS
> SELECT * FROM date_dim_temporary;
> {code}
> Error Message:
> {code}
> 16/07/05 23:38:43 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 198.0 (TID 677, hw-node5): java.lang.IllegalArgumentException: Error: : expected at the position 400 of 'struct<d_date_sk:bigint,d_date_id:string,d_date:string,d_month_seq:int,d_week_seq:int,d_quarter_seq:int,d_year:int,d_dow:int,d_moy:int,d_dom:int,d_qoy:int,d_fy_year:int,d_fy_quarter_seq:int,d_fy_week_seq:int,d_day_name:string,d_quarter_name:string,d_holiday:string,d_weekend:string,d_following_holiday:string,d_first_dom:int,d_last_dom:int,d_same_day_ly:int,d_same_day_lq:int,d_current_day:string,... 4 more fields>' but ' ' is found.
> at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:360)
> at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:331)
> at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseType(TypeInfoUtils.java:483)
> at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseTypeInfos(TypeInfoUtils.java:305)
> at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils.getTypeInfoFromTypeString(TypeInfoUtils.java:770)
> at org.apache.spark.sql.hive.orc.OrcSerializer.<init>(OrcFileFormat.scala:184)
> at org.apache.spark.sql.hive.orc.OrcOutputWriter.<init>(OrcFileFormat.scala:220)
> at org.apache.spark.sql.hive.orc.OrcFileFormat$$anon$1.newInstance(OrcFileFormat.scala:93)
> at org.apache.spark.sql.execution.datasources.BaseWriterContainer.newOutputWriter(WriterContainer.scala:130)
> at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:246)
> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
> at org.apache.spark.scheduler.Task.run(Task.scala:85)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
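
The position reported in the error can be checked by hand. The sketch below rebuilds the type string from the 28 columns in the CREATE TABLE statement, abbreviates it the way the string in the log is abbreviated, and runs a toy check over both forms. Everything in it is illustrative: `first_parse_error` is a hypothetical stand-in and not Hive's TypeInfoParser, and the 24-field cutoff is read off the error message above rather than taken from Spark's source. It treats `.` as a name character, which is an assumption chosen so that `...` is read as a field name, matching the "':' expected" wording in the log.

```python
import re

# Columns of date_dim, copied from the CREATE TABLE statement in the report.
COLUMNS = [
    ("d_date_sk", "bigint"), ("d_date_id", "string"), ("d_date", "string"),
    ("d_month_seq", "int"), ("d_week_seq", "int"), ("d_quarter_seq", "int"),
    ("d_year", "int"), ("d_dow", "int"), ("d_moy", "int"), ("d_dom", "int"),
    ("d_qoy", "int"), ("d_fy_year", "int"), ("d_fy_quarter_seq", "int"),
    ("d_fy_week_seq", "int"), ("d_day_name", "string"),
    ("d_quarter_name", "string"), ("d_holiday", "string"),
    ("d_weekend", "string"), ("d_following_holiday", "string"),
    ("d_first_dom", "int"), ("d_last_dom", "int"), ("d_same_day_ly", "int"),
    ("d_same_day_lq", "int"), ("d_current_day", "string"),
    ("d_current_week", "string"), ("d_current_month", "string"),
    ("d_current_quarter", "string"), ("d_current_year", "string"),
]

# Assumption: 24 fields are kept before abbreviation, as seen in the log.
MAX_FIELDS = 24

# Assumption: names may contain letters, digits, '_' and '.'.
NAME = re.compile(r"[A-Za-z0-9_.]+")


def full_type_string(cols):
    """Complete struct<name:type,...> string with every field spelled out."""
    return "struct<" + ",".join(f"{n}:{t}" for n, t in cols) + ">"


def truncated_type_string(cols, max_fields=MAX_FIELDS):
    """Abbreviated form, shaped like the string in the error message."""
    if len(cols) <= max_fields:
        return full_type_string(cols)
    shown = ",".join(f"{n}:{t}" for n, t in cols[:max_fields])
    return f"struct<{shown},... {len(cols) - max_fields} more fields>"


def first_parse_error(s):
    """Toy check for a flat struct string.

    Returns None if s parses, else the index of the first unexpected character.
    """
    prefix = "struct<"
    if not s.startswith(prefix):
        return 0
    pos = len(prefix)
    while True:
        m = NAME.match(s, pos)          # field name
        if not m:
            return pos
        pos = m.end()
        if s[pos:pos + 1] != ":":       # ':' must follow the field name
            return pos
        pos += 1
        m = NAME.match(s, pos)          # type name
        if not m:
            return pos
        pos = m.end()
        nxt = s[pos:pos + 1]
        if nxt == ">":                  # closing bracket must end the string
            return None if pos == len(s) - 1 else pos + 1
        if nxt != ",":
            return pos
        pos += 1


if __name__ == "__main__":
    print(first_parse_error(full_type_string(COLUMNS)))
    print(first_parse_error(truncated_type_string(COLUMNS)))
```

Under these assumptions the full string passes the check, while the abbreviated one fails at index 400, the space after `...`. That agrees with the position in the stack trace and suggests the string handed to the ORC serializer was the abbreviated display form rather than the complete schema.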