Posted to issues@carbondata.apache.org by "cen yuhai (JIRA)" <ji...@apache.org> on 2017/07/30 12:42:00 UTC
[jira] [Updated] (CARBONDATA-1338) Spark can not query data when 'spark.carbon.hive.schema.store' is true
[ https://issues.apache.org/jira/browse/CARBONDATA-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
cen yuhai updated CARBONDATA-1338:
----------------------------------
Summary: Spark can not query data when 'spark.carbon.hive.schema.store' is true (was: Can not query data when 'spark.carbon.hive.schema.store' is true)
> Spark can not query data when 'spark.carbon.hive.schema.store' is true
> ----------------------------------------------------------------------
>
> Key: CARBONDATA-1338
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1338
> Project: CarbonData
> Issue Type: Bug
> Reporter: cen yuhai
> Assignee: cen yuhai
> Fix For: 1.2.0
>
> Time Spent: 4h
> Remaining Estimate: 0h
>
> My steps are as below:
> {code}
> set spark.carbon.hive.schema.store=true in spark-defaults.conf
> spark-shell --jars carbonlib/carbondata_2.11-1.2.0-SNAPSHOT-shade-hadoop2.7.2.jar,carbonlib/carbondata-hive-1.2.0-SNAPSHOT.jar
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.CarbonSession._
> val rootPath = "hdfs://mycluster/user/master/carbon"
> val storeLocation = s"$rootPath/store"
> val warehouse = s"$rootPath/warehouse"
> val metastoredb = s"$rootPath/metastore_db"
> val carbon = SparkSession.builder().enableHiveSupport().getOrCreateCarbonSession(storeLocation, metastoredb)
> carbon.sql("create table temp.yuhai_carbon(id short, name string, scale decimal, country string, salary double) STORED BY 'carbondata'")
> carbon.sql("LOAD DATA INPATH 'hdfs://mycluster/user/master/sample.csv' INTO TABLE temp.yuhai_carbon")
> carbon.sql("select * from temp.yuhai_carbon").show
> {code}
> Exception:
> {code}
> Caused by: java.io.IOException: File does not exist: hdfs://mycluster/user/master/carbon/store/temp/yuhai_carbon/Metadata/schema
> at org.apache.carbondata.hadoop.util.SchemaReader.readCarbonTableFromStore(SchemaReader.java:70)
> at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getOrCreateCarbonTable(CarbonTableInputFormat.java:142)
> at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getQueryModel(CarbonTableInputFormat.java:441)
> at org.apache.carbondata.spark.rdd.CarbonScanRDD.internalCompute(CarbonScanRDD.scala:191)
> at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:50)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:331)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:295)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:331)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:295)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:331)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:295)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:88)
> at org.apache.spark.scheduler.Task.run(Task.scala:104)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:351)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
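> The stack trace suggests the reader still resolves a file-based schema at Metadata/schema under the store location. As a quick sanity check (a hypothetical snippet, not part of the original report), the path from the exception above can be probed from the same spark-shell with the Hadoop FileSystem API:
> {code}
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.{FileSystem, Path}
>
> // Path copied from the exception message; adjust for your cluster
> val schemaPath = new Path("hdfs://mycluster/user/master/carbon/store/temp/yuhai_carbon/Metadata/schema")
> val fs = FileSystem.get(schemaPath.toUri, new Configuration())
>
> // If this prints false, the query fails with the IOException shown above
> println(fs.exists(schemaPath))
> {code}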
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)