Posted to user@hive.apache.org by Vidya Sujeet <sj...@gmail.com> on 2014/07/24 09:05:12 UTC

create table / data type syntax for csv files with comma in the column

Hello,

I have a CSV file in which some column values contain commas inside a
double-quoted string. Example: column name: *'Issue'*, value: *"Other (phone,
health club, etc)"*

*Question:* What should the data type of 'Issue' be? Or how should I define
the table's row format (ROW FORMAT DELIMITED FIELDS TERMINATED BY) so that a
comma inside the 'Issue' column is handled correctly?

I defined the table as below, but this splits the quoted string (ex: *"Other
(phone, health club, etc)"*) across separate columns:

create table consumercomplaints (ComplaintID int,
                                  Product string,
                                  Subproduct string,
                                  Issue string,
                                  Subissue string,
                                  State string,
                                  ZIPcode int,
                                  Submittedvia string,
                                  Datereceived string,
                                  Datesenttocompany string,
                                  Company string,
                                  Companyresponse string,
                                  Timelyresponse string,
                                  Consumerdisputed string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
location '/user/hive/warehouse/mydb/consumer_complaints.csv';

Sample data --

Complaint ID,Product,Sub-product,Issue,Sub-issue,State,ZIP code,Submitted via,Date received,Date sent to company,Company,Company response,Timely response?,Consumer disputed?
943291,Debt collection,,Cont'd attempts collect debt not owed,Debt is not mine,MO,63123,Web,07/18/2014,07/18/2014,"Enhanced Recovery Company, LLC",Closed with non-monetary relief,Yes,
943698,Bank account or service,Checking account,Deposits and withdrawals,,CA,93030,Web,07/18/2014,07/18/2014,U.S. Bancorp,In progress,Yes,
943521,Debt collection,,Cont'd attempts collect debt not owed,Debt is not mine,OH,44116,Web,07/18/2014,07/18/2014,"Vital Solutions, Inc.",Closed with explanation,Yes,
943400,Debt collection,"Other (phone, health club, etc.)",Communication tactics,Frequent or repeated calls,MD,21133,Web,07/18/2014,07/18/2014,"The CBE Group, Inc.",Closed with explanation,Yes,



Thanks,

Vidya
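
A possible approach, sketched here under stated assumptions rather than as a
confirmed fix: the column data type is not what causes the split. ROW FORMAT
DELIMITED breaks each line on every comma and does not treat double quotes
specially, so a quoted value such as "Other (phone, health club, etc.)" is
cut into several fields. A quote-aware SerDe can be used instead. The sketch
below assumes Hive 0.14 or later, where OpenCSVSerde ships with Hive. The
table name consumercomplaints_csv and the directory path are hypothetical,
and it assumes the CSV file sits alone in that directory, since LOCATION
normally names a directory rather than a single file.

```sql
-- Sketch only: assumes Hive 0.14+ and a hypothetical directory that
-- contains just the CSV file.
CREATE EXTERNAL TABLE consumercomplaints_csv (
  ComplaintID       string,  -- OpenCSVSerde reads every column as string
  Product           string,
  Subproduct        string,
  Issue             string,
  Subissue          string,
  State             string,
  ZIPcode           string,
  Submittedvia      string,
  Datereceived      string,
  Datesenttocompany string,
  Company           string,
  Companyresponse   string,
  Timelyresponse    string,
  Consumerdisputed  string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",   -- field delimiter
  "quoteChar"     = "\"",  -- commas inside double quotes stay in one field
  "escapeChar"    = "\\"
)
STORED AS TEXTFILE
LOCATION '/user/hive/warehouse/mydb/consumer_complaints'  -- hypothetical directory
TBLPROPERTIES ("skip.header.line.count" = "1");           -- skip the header row

-- Cast to numeric types at query time, since the SerDe yields strings.
SELECT CAST(ComplaintID AS int) AS ComplaintID, Subproduct, Issue, Company
FROM consumercomplaints_csv
LIMIT 10;
```

With this definition all columns are declared as string because the SerDe
returns strings regardless of the declared type; numeric columns such as
ComplaintID can be cast in the query or in a view on top of the table.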

Re: create table / data type syntax for csv files with comma in the column

Posted by Vidya Sujeet <sj...@gmail.com>.