Posted to user@spark.apache.org by sam smith <qu...@gmail.com> on 2022/12/10 12:07:13 UTC

Can we upload a csv dataset into Hive using SparkSQL?

Hello,

I want to create a table in Hive and then load the contents of a CSV file
into it, all by means of Spark SQL.
I saw the example with the .txt file in the docs, but can we instead do
something like the following to accomplish what I want?

String warehouseLocation = new File("spark-warehouse").getAbsolutePath();
SparkSession spark = SparkSession
  .builder()
  .appName("Java Spark Hive Example")
  .config("spark.sql.warehouse.dir", warehouseLocation)
  .enableHiveSupport()
  .getOrCreate();
spark.sql("CREATE TABLE IF NOT EXISTS csvFile USING hive");
spark.sql("LOAD DATA LOCAL INPATH 'C:/Users/Me/Documents/examples/src/main/resources/data.csv' INTO TABLE csvFile");

Re: Can we upload a csv dataset into Hive using SparkSQL?

Posted by Artemis User <ar...@dtechspace.com>.
Your DDL statement doesn't look right.  You may want to check the Spark 
SQL Reference online for how to create a table in Hive format 
(https://spark.apache.org/docs/latest/sql-ref-syntax-ddl-create-table-hiveformat.html). 
You should be able to populate the table directly with the CREATE 
statement by providing a location parameter.
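
As a rough illustration of the Hive-format syntax described on that page, 
here is a minimal sketch that builds a CREATE TABLE statement with an 
explicit column list and a comma delimiter, followed by the LOAD DATA 
statement from the original post. The class HiveCsvDdl, its two helper 
methods, and the columns id and name are hypothetical and only stand in 
for whatever data.csv actually contains; they are not part of Spark.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper that only builds SQL strings; it is not part of Spark. */
public class HiveCsvDdl {

    // Builds a Hive-format CREATE TABLE statement for comma-delimited text files.
    public static String createTableSql(String table, Map<String, String> columns) {
        String cols = columns.entrySet().stream()
            .map(e -> e.getKey() + " " + e.getValue())
            .collect(Collectors.joining(", "));
        return "CREATE TABLE IF NOT EXISTS " + table + " (" + cols + ") "
            + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
            + "STORED AS TEXTFILE";
    }

    // Builds the LOAD DATA statement that copies a local file into the table.
    public static String loadDataSql(String localPath, String table) {
        return "LOAD DATA LOCAL INPATH '" + localPath + "' INTO TABLE " + table;
    }

    public static void main(String[] args) {
        // Assumed schema: replace with the real columns of the CSV file.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "INT");
        columns.put("name", "STRING");

        String create = createTableSql("csvFile", columns);
        String load = loadDataSql(
            "C:/Users/Me/Documents/examples/src/main/resources/data.csv", "csvFile");

        System.out.println(create);
        System.out.println(load);

        // With the Hive-enabled SparkSession from the original post,
        // the statements would then be run as:
        //   spark.sql(create);
        //   spark.sql(load);
    }
}
```

The main difference from the statement in the original post is the 
explicit column list and row format, which tell Hive how to split each 
line of the file into fields.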

If creating a table from CSV is your only objective, you may want to 
consider using beeline since it is more efficient and probably supports 
more standard SQL functions...

On 12/10/22 7:07 AM, sam smith wrote:
> Hello,
>
> I want to create a table in Hive and then load a CSV file content into 
> it all by means of Spark SQL.
> I saw in the docs the example with the .txt file BUT can we do instead 
> something like the following to accomplish what i want? :
>
> String warehouseLocation = new File("spark-warehouse").getAbsolutePath();
> SparkSession spark = SparkSession
>   .builder()
>   .appName("Java Spark Hive Example")
>   .config("spark.sql.warehouse.dir", warehouseLocation)
>   .enableHiveSupport()
>   .getOrCreate();
> spark.sql("CREATE TABLE IF NOT EXISTS csvFile USING hive");
> spark.sql("LOAD DATA LOCAL INPATH 'C:/Users/Me/Documents/examples/src/main/resources/data.csv' INTO TABLE csvFile");