Posted to user@hive.apache.org by ankit bhatnagar <ab...@gmail.com> on 2011/01/10 21:44:10 UTC

csv import with quotes

Hi All,

I am planning to join some csv files using hive.


According to the developer guide on the wiki, quotes are not supported:
-------------------------------------------------------

Hive currently uses these SerDe classes to serialize and deserialize data:

   - MetadataTypedColumnsetSerDe: This SerDe is used to read/write delimited
     records like CSV, tab-separated, or control-A separated records (sorry,
     quote is not supported yet.)



Has anybody written a Deserializer that handles quoted fields?


Thanks
Ankit

Re: csv import with quotes

Posted by Larry Ogrodnek <la...@bizo.com>.
Are you using the jar I built?  It should be:

create table my_table(a string, b string, c string)
  row format serde 'com.bizo.hive.serde.csv.CSVSerde'
  stored as textfile
;

That is, the class name needs to be "com.bizo.hive.serde.csv.CSVSerde",
unless you built your own jar.



On Fri, Jan 21, 2011 at 1:51 PM, ankit bhatnagar <ab...@gmail.com> wrote:
> Hi Larry,
>
> add jar -- successful
>
> However
>
>  create table my_table(a string, b string, c string)
>     >   row format serde 'com.test.CSVSerde'
>     >   stored as textfile;
>
> when I execute the query-I get error
>
> Error in metadata: Cannot validate serde: com.vantage.hive.CSVSerde
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.DDLTask
>
>
>
> Any thoughts
> Ankit
>
>

Re: csv import with quotes

Posted by ankit bhatnagar <ab...@gmail.com>.
Hi Larry,

The add jar step succeeded.

However, when I execute this query:

create table my_table(a string, b string, c string)
  row format serde 'com.test.CSVSerde'
  stored as textfile;

I get this error:

Error in metadata: Cannot validate serde: com.vantage.hive.CSVSerde
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask



Any thoughts?
Ankit

Re: csv import with quotes

Posted by Larry Ogrodnek <la...@bizo.com>.
I wrote a CSVSerde for hive that uses opencsv for csv parsing, so it
handles quotes and all other CSV weirdness.  I wrote it up here:

http://dev.bizo.com/2010/11/csv-and-hive.html
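
To illustrate why quote handling matters, here is a minimal sketch in Python.
It is not taken from the CSVSerde: Python's csv module stands in for opencsv,
and the sample row is made up. It compares naive splitting on commas with
quote-aware parsing of the same line.

```python
import csv
import io

# Hypothetical sample row: the second field contains a comma inside quotes.
line = 'acme,"Smith, John",42'

# Naive delimiter splitting: every comma is treated as a field boundary,
# so the quoted field is cut in two and the quote characters are kept.
naive = line.split(",")

# Quote-aware parsing with the standard csv module (a stand-in for opencsv).
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive), naive)
print(len(parsed), parsed)
```

The naive split yields four fields for a three-column row, so values would
land in the wrong columns. The quote-aware parser returns three fields with
the embedded comma preserved.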



On Mon, Jan 10, 2011 at 12:44 PM, ankit bhatnagar <ab...@gmail.com> wrote:
> Hi All,
>
> I am planning to join some csv files using hive.
>
>
> referring to the developer guide on wiki quotes are not supported.
> -------------------------------------------------------
>
> Hive currently use these SerDe classes to serialize and deserialize data:
>
> MetadataTypedColumnsetSerDe: This SerDe is used to read/write delimited
> records like CSV, tab-separated control-A separated records (sorry, quote is
> not supported yet.)
>
>
> Did anybody write some Deserializer for the same?
>
>
> Thanks
> Ankit
>