Posted to user@spark.apache.org by Jeetendra Gangele <ga...@gmail.com> on 2015/08/20 07:53:30 UTC

creating data warehouse with Spark and running query with Hive

HI All,

I have data in HDFS partitioned as year/month/date/event_type. The data is in
JSON, so I am creating Hive tables over it using the JSON SerDe.
Below is the code:

  val jsonFile = hiveContext.read.json(
    "hdfs://localhost:9000/housing/events_real/category=Impressions/date=1007465766/*")
  jsonFile.toDF().printSchema()
  jsonFile.write.saveAsTable("JsonFileTable")
  jsonFile.toDF().printSchema()
  val events = hiveContext.sql("SELECT category, uid FROM JsonFileTable")
  events.map(e => "Event: " + e).collect().foreach(println)

saveAsTable is failing with an error saying MKDir failed to create the
directory. Does anybody have any idea?
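[Editor's note: a minimal sketch of one likely fix, assuming (as the error
suggests) that the failure happens while Spark tries to create the table's
directory under the Hive warehouse path on HDFS. The paths and the use of
SaveMode.Overwrite are illustrative assumptions, not from the original post.]

```scala
import org.apache.spark.sql.SaveMode

// Assumes an existing HiveContext named hiveContext (Spark 1.4+).
val jsonFile = hiveContext.read.json(
  "hdfs://localhost:9000/housing/events_real/category=Impressions/date=1007465766/*")

// saveAsTable writes under hive.metastore.warehouse.dir (commonly
// /user/hive/warehouse). Before running, make sure that directory exists
// on HDFS and is writable by the user running Spark, e.g.:
//   hdfs dfs -mkdir -p /user/hive/warehouse
//   hdfs dfs -chmod g+w /user/hive/warehouse

// Overwrite mode avoids a failure when the table directory already exists
// from an earlier partial run.
jsonFile.write.mode(SaveMode.Overwrite).saveAsTable("JsonFileTable")
```

This requires a running HDFS and Hive metastore, so it cannot be verified
outside such a deployment.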

Re: creating data warehouse with Spark and running query with Hive

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Can you paste the stack trace? Is it complaining that the directory already
exists?
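[Editor's note: a hedged workaround sketch, not from the thread. If the
problem is directory creation in the Hive warehouse, registering the JSON
as a temporary table sidesteps it entirely, since no table directory is
materialized; the table name below is hypothetical.]

```scala
// Assumes an existing HiveContext named hiveContext (Spark 1.x).
val jsonFile = hiveContext.read.json(
  "hdfs://localhost:9000/housing/events_real/category=Impressions/date=1007465766/*")

// A temp table lives only in this context's catalog; nothing is written
// to HDFS, so no MKDir is attempted.
jsonFile.registerTempTable("json_events")

val events = hiveContext.sql("SELECT category, uid FROM json_events")
events.map(e => "Event: " + e).collect().foreach(println)
```

This also requires a live Spark deployment with the data in place, so it is
offered only as a sketch.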

Thanks
Best Regards

On Thu, Aug 20, 2015 at 11:23 AM, Jeetendra Gangele <ga...@gmail.com>
wrote:

> HI All,
>
> I have data in HDFS partitioned as year/month/date/event_type. The data is
> in JSON, so I am creating Hive tables over it using the JSON SerDe.
>  below is the code
>   val jsonFile =
> hiveContext.read.json("hdfs://localhost:9000/housing/events_real/category=Impressions/date=1007465766/*")
>     jsonFile.toDF().printSchema()
>     jsonFile.write.saveAsTable("JsonFileTable")
>     jsonFile.toDF().printSchema()
>     val events = hiveContext.sql("SELECT category, uid FROM JsonFileTable")
>     events.map(e => "Event: " + e).collect().foreach(println)
>
> saveAsTable is failing with an error saying MKDir failed to create the
> directory. Does anybody have any idea?