Posted to user@hive.apache.org by Sarath <sa...@algofusiontech.com> on 2012/09/28 07:57:31 UTC

Problem loading a CSV file

Hi,

I created a new external table that references a file on HDFS -
create external table table1 (field1 STRING, field2 STRING, field3 
STRING, field3 STRING, field4 STRING, field5 FLOAT, field6 FLOAT, field7 
FLOAT, field8 STRING, field9 STRING) row format delimited fields 
terminated by ',' location '/user/hduser/dumps/table_dump.csv';

The table was created successfully, but when I try to retrieve rows from 
it, the query returns nothing.
hive> select * from table1;
OK
Time taken: 0.156 seconds

I also tried creating the table first and then loading the HDFS file 
data into it -
hive> create table table1 (field1 STRING, field2 STRING, field3 STRING, 
field3 STRING, field4 STRING, field5 FLOAT, field6 FLOAT, field7 FLOAT, 
field8 STRING, field9 STRING) row format delimited fields terminated by ',';
OK
Time taken: 0.088 seconds

But when I try to load data into this table, I get the error below -
hive> load data inpath '/user/hduser/dumps/table_dump.csv' overwrite 
into table table1;
FAILED: Error in semantic analysis: Line 1:17 Invalid path 
''/user/hduser/dumps/table_dump.csv'': No files matching path 
hdfs://master:54310/user/hduser/dumps/table_dump.csv

What is going wrong? Is there a different way to load a CSV file using Hive?

Regards,
Sarath.

RE: Problem loading a CSV file

Posted by "Savant, Keshav" <Ke...@fisglobal.com>.
Hi Sarath,

Considering your two step approach...

By default, the load command looks for the file in HDFS, which is what the following command does:

hive> load data inpath '/user/hduser/dumps/table_dump.csv' overwrite into table table1;

Instead, you can use 'local' to tell Hive that the CSV file is on the local file system and not on HDFS, as below:

hive> load data local inpath '/user/hduser/dumps/table_dump.csv' overwrite into table table1;
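
To make the difference concrete, here is a minimal sketch of the two forms. The local path /home/hduser/table_dump.csv is a hypothetical example and does not come from this thread. Without 'local', Hive treats the path as an HDFS path and moves the file into the table's directory; with 'local', it copies the file from the local file system of the machine running the Hive client. Use one form or the other, since both statements overwrite the table's contents.

```sql
-- File already in HDFS: the path is resolved against HDFS, and the file
-- is moved (not copied) into the table's directory.
LOAD DATA INPATH '/user/hduser/dumps/table_dump.csv'
OVERWRITE INTO TABLE table1;

-- File on the local file system (hypothetical path): LOCAL makes Hive
-- read from local disk and copy the file into the table's directory.
LOAD DATA LOCAL INPATH '/home/hduser/table_dump.csv'
OVERWRITE INTO TABLE table1;
```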

Hope that helps

Kind regards,
Keshav C Savant



Re: Problem loading a CSV file

Posted by Sarath <sa...@algofusiontech.com>.
Thanks Miao and Savant for your responses.

My file is already in HDFS, but I'm still getting the error. In fact, I 
can browse to this file at the same path and see its contents using 
the HDFS web interface. So I don't think it's an issue with the path.

Did some trial and error and now it is working. Here's what I did -
-> Converted the CSV file into a plain text file and replaced all "," 
(commas) with ";" (semi-colons)
-> Used the same create query with external option as given in my 
earlier mail. But this time removed the row format specification.

create external table table1 (field1 STRING, field2 STRING, field3 STRING,
field3 STRING, field4 STRING, field5 FLOAT, field6 FLOAT, field7 FLOAT,
field8 STRING, field9 STRING) location '/user/hduser/dumps/table_dump.txt';


Now I'm able to query, get the row count and do all other operations on 
this table.
So it looks like I'm not loading CSV files the right way.

Can anyone shed some light on this?

Regards,
Sarath.
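
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: when no row format is given, Hive's default field delimiter is Ctrl-A ('\001'), not ';'. A table created this way can return rows and a correct row count while still placing each whole line in field1 and leaving the remaining columns NULL. A quick check could look like this:

```sql
-- If field1 holds the entire line and field2 comes back NULL, the rows
-- are not being split on ';' (the default field delimiter is '\001').
SELECT field1, field2 FROM table1 LIMIT 5;
```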

On Friday 28 September 2012 03:14 PM, MiaoMiao wrote:
> When creating an external table with a location clause, you need to put
> your CSV into HDFS.
> Otherwise, you can use load data local, as Savant said.

Re: Problem loading a CSV file

Posted by MiaoMiao <li...@gmail.com>.
When creating an external table with a location clause, you need to put
your CSV into HDFS.
Otherwise, you can use load data local, as Savant said.
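
Note also that the location clause of an external table is expected to name a directory, and Hive reads the files inside it. Pointing it at the CSV file itself may be why the select returned no rows, though that is an assumption about this particular setup. Below is a minimal sketch, run from the Hive CLI. The directory name table1_data is hypothetical, and only the first three columns are declared to keep the example short.

```sql
-- Create a directory for the table's data and move the CSV into it.
dfs -mkdir /user/hduser/dumps/table1_data;
dfs -mv /user/hduser/dumps/table_dump.csv /user/hduser/dumps/table1_data/;

-- Point LOCATION at the directory, not at the file itself.
CREATE EXTERNAL TABLE table1 (
  field1 STRING,
  field2 STRING,
  field3 STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hduser/dumps/table1_data';

-- Check that rows are now visible and split on commas.
SELECT * FROM table1 LIMIT 5;
```

Because the table is external, dropping it later removes only the table definition and leaves the files in the directory untouched.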
