Posted to user@spark.apache.org by sam smith <qu...@gmail.com> on 2022/12/10 12:07:13 UTC
Can we upload a csv dataset into Hive using SparkSQL?
Hello,
I want to create a table in Hive and then load the contents of a CSV file
into it, all by means of Spark SQL.
I saw the example in the docs that uses a .txt file, but can we instead do
something like the following to accomplish what I want?
String warehouseLocation = new File("spark-warehouse").getAbsolutePath();
SparkSession spark = SparkSession
    .builder()
    .appName("Java Spark Hive Example")
    .config("spark.sql.warehouse.dir", warehouseLocation)
    .enableHiveSupport()
    .getOrCreate();
spark.sql("CREATE TABLE IF NOT EXISTS csvFile USING hive");
spark.sql("LOAD DATA LOCAL INPATH 'C:/Users/Me/Documents/examples/src/main/resources/data.csv' INTO TABLE csvFile");
Re: Can we upload a csv dataset into Hive using SparkSQL?
Posted by Artemis User <ar...@dtechspace.com>.
Your DDL statement doesn't look right. You may want to check the Spark
SQL Reference online for how to create a table in Hive format
(https://spark.apache.org/docs/latest/sql-ref-syntax-ddl-create-table-hiveformat.html).
You should be able to populate the table directly in the CREATE TABLE
statement by providing a LOCATION clause that points at the data.
If creating a table from CSV is your only objective, you may want to
consider using Beeline, since it is more efficient and probably supports
more standard SQL functions.
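
As a rough illustration of the Hive-format DDL the reply points to, here is a minimal sketch that only builds the SQL strings, so it runs without a cluster. The class name CsvHiveDdl, the column list "id INT, name STRING", and the directory /data/csv_input are hypothetical placeholders, not taken from the thread. It assumes plain comma-separated fields without quoting. Each string would then be passed to spark.sql(...) on the Hive-enabled session shown in the question.

```java
// Sketch only: builds Spark SQL statements as strings; nothing is executed here.
class CsvHiveDdl {

    // Hive-format CREATE TABLE over existing comma-delimited text files.
    // LOCATION must name a directory that holds the CSV file(s), not the file itself.
    static String createExternalTable(String table, String columns, String locationDir) {
        return "CREATE EXTERNAL TABLE IF NOT EXISTS " + table
                + " (" + columns + ")"
                + " ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"
                + " STORED AS TEXTFILE"
                + " LOCATION '" + locationDir + "'";
    }

    // Same table definition without LOCATION, for use together with LOAD DATA.
    static String createManagedTable(String table, String columns) {
        return "CREATE TABLE IF NOT EXISTS " + table
                + " (" + columns + ")"
                + " ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"
                + " STORED AS TEXTFILE";
    }

    // LOAD DATA statement in the form used in the original question.
    static String loadLocalFile(String localPath, String table) {
        return "LOAD DATA LOCAL INPATH '" + localPath + "' INTO TABLE " + table;
    }

    public static void main(String[] args) {
        // Hypothetical schema and directory, for illustration only.
        String columns = "id INT, name STRING";
        System.out.println(createExternalTable("csvFile", columns, "/data/csv_input"));
        System.out.println(createManagedTable("csvFile", columns));
        System.out.println(loadLocalFile(
                "C:/Users/Me/Documents/examples/src/main/resources/data.csv", "csvFile"));
        // With the SparkSession from the question, each statement would be run as:
        //   spark.sql(statement);
    }
}
```

Either variant gives the table an explicit column list and delimiter, which the statement in the question lacks. The first reads the files in place from the given directory; the second creates an empty table that the LOAD DATA statement then fills.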