Posted to dev@hive.apache.org by Anubhav Tarar <an...@knoldus.in> on 2018/03/01 07:31:00 UTC
How to Load Data From a CSV to a parquet table
Hi, I am trying to load data from a CSV file into a Parquet table in Hive, but I got
this exception:
hive> create table if not exists REGION( R_NAME string, R_REGIONKEY string,
R_COMMENT string ) stored as parquet;
OK
Time taken: 0.414 seconds
hive> load data local inpath
'file:///home/anubhav/Downloads/dbgen/region.tbl' into table region;
Loading data to table default.region
OK
Time taken: 1.011 seconds
hive> select * from region;
OK
Failed with exception java.io.IOException:java.lang.RuntimeException:
hdfs://localhost:54311/user/hive/warehouse/region/region.tbl is not a
Parquet file. expected magic number at tail [80, 65, 82, 49] but found
[115, 108, 124, 10]
Time taken: 0.108 seconds
Can anyone help? The Hive version is 2.1.
--
Thanks and Regards
* Anubhav Tarar *
* Software Consultant*
*Knoldus Software LLP <http://www.knoldus.com/home.knol> *
LinkedIn <http://in.linkedin.com/in/rahulforallp> Twitter
<https://twitter.com/RahulKu71223673> fb <ra...@facebook.com>
mob : 8588915184
Re: How to Load Data From a CSV to a parquet table
Posted by Jörn Franke <jo...@gmail.com>.
You have defined a Parquet-only table, so Hive interprets your CSV file as Parquet. LOAD DATA only moves the file into the table's directory; it does not convert it. You can, for instance, define two tables:
* one external table for the CSV file
* one table for the Parquet data
Afterwards, you select from the first table and insert into the second table.
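A minimal sketch of the two-table approach might look like the following. The table names region_csv and region_parquet and the staging LOCATION are hypothetical, and the '|' field delimiter is an assumption based on the trailing bytes in the error message ([115, 108, 124, 10] decode to "sl|" plus a newline), which suggests a pipe-delimited text file. Columns are kept in the order of the original post.

```sql
-- Staging table over the raw text file.
-- Assumption: fields are separated by '|'; the LOCATION path is hypothetical.
CREATE EXTERNAL TABLE IF NOT EXISTS region_csv (
  R_NAME      STRING,
  R_REGIONKEY STRING,
  R_COMMENT   STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION '/user/hive/staging/region_csv';

-- LOAD DATA copies the local file as-is, which is fine for a text table.
LOAD DATA LOCAL INPATH 'file:///home/anubhav/Downloads/dbgen/region.tbl'
INTO TABLE region_csv;

-- Target table stored as Parquet.
CREATE TABLE IF NOT EXISTS region_parquet (
  R_NAME      STRING,
  R_REGIONKEY STRING,
  R_COMMENT   STRING
)
STORED AS PARQUET;

-- The INSERT ... SELECT runs a job that rewrites the rows as Parquet files.
INSERT INTO TABLE region_parquet
SELECT R_NAME, R_REGIONKEY, R_COMMENT
FROM region_csv;
```

The original region table still holds region.tbl in its warehouse directory, so it would need to be dropped and recreated before it can be queried as Parquet.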