Posted to common-user@hadoop.apache.org by Sandeep Reddy P <sa...@gmail.com> on 2012/06/26 21:07:46 UTC

Hive error when loading csv data.

Hi all,
I have a CSV file with 46 columns, but I'm getting an error when I do some
analysis on that data. To simplify, I have taken 3 columns, and now my CSV
looks like this:
c,zxy,xyz
d,"abc,def",abcd

I created a table for this data using:
hive> create table test3(
    > f1 string,
    > f2 string,
    > f3 string)
    > row format delimited
    > fields terminated by ",";
OK
Time taken: 0.143 seconds
hive> load data local inpath '/home/training/a.csv'
    > into table test3;
Copying data from file:/home/training/a.csv
Copying file: file:/home/training/a.csv
Loading data to table default.test3
OK
Time taken: 0.276 seconds
hive> select * from test3;
OK
c       zxy     xyz
d       "abc    def"
Time taken: 0.156 seconds

When I run select f2 from test3;
my results are:
OK
zxy
"abc
but this should be abc,def.
When I open the same CSV file with Microsoft Excel, I get abc,def.
How should I solve this error?



-- 
Thanks,
sandeep

Re: Hive error when loading csv data.

Posted by Michel Segel <mi...@hotmail.com>.
What I am suggesting is to write a simple script, maybe using Python, that replaces the commas used as field delimiters.
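
A minimal sketch of such a script, assuming Python 3 and its standard csv module; the function name csv_to_pipe and the inline sample are illustrative only. It parses each record with a quote-aware reader, so the comma inside "abc,def" stays part of the field, and then writes the fields back out separated by |. The naive split at the top mirrors what a plain comma-delimited table does to the same line.

```python
import csv
import io

# Splitting on every comma ignores the quotes, which is how f2 ends up as "abc.
naive = 'd,"abc,def",abcd'.split(",")
print(naive[1])


def csv_to_pipe(src, dst, out_delim="|"):
    """Copy CSV records from src to dst, joining the fields with out_delim."""
    for row in csv.reader(src):  # csv.reader honours double-quoted fields
        for field in row:
            # A field holding the new delimiter or a line break would be
            # split or broken by a delimited reader, so stop instead.
            if out_delim in field or "\n" in field:
                raise ValueError("field cannot be written safely: %r" % field)
        dst.write(out_delim.join(row) + "\n")


# Sample taken from the original post.
sample = 'c,zxy,xyz\nd,"abc,def",abcd\n'
out = io.StringIO()
csv_to_pipe(io.StringIO(sample), out)
converted = out.getvalue()
print(converted, end="")
```

For a real file, the two StringIO objects would be replaced by files opened with newline="" (as the csv module documentation recommends), and the Hive table would be declared with fields terminated by "|" instead of ",".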

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P <sa...@gmail.com> wrote:

> If I do that, my data will be d|"abc|def"|abcd, so my problem is not solved.
> 
> On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel <mi...@hotmail.com>wrote:
> 
>> Yup. I just didn't add the quotes.
>> 
>> Sent from a remote device. Please excuse any typos...
>> 
>> Mike Segel
>> 
>> On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <sa...@gmail.com>
>> wrote:
>> 
>>> Thanks for the reply.
>>> I didn't get that, Michael. My f2 should be "abc,def"
>>> 
>>> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <michael_segel@hotmail.com> wrote:
>>> 
>>>> Alternatively, you could write a simple script to convert the CSV to a
>>>> pipe-delimited file so that "abc,def" will be abc,def.
>>>> 
>>>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
>>>> 
>>>>> Hive's delimited-fields record reader does not handle quoted text
>>>>> that contains the field delimiter. Excel supports such records, so
>>>>> it reads them fine.
>>>>> 
>>>>> You will need to create your table with a custom InputFormat class
>>>>> that can handle this (Try using OpenCSV readers, they support this),
>>>>> instead of relying on Hive to do this for you. If you're successful in
>>>>> your approach, please also consider contributing something back to
>>>>> Hive/Pig to help others.
>>>>> 
>>>>> --
>>>>> Harsh J
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> Thanks,
>>> sandeep
>> 
> 
> 
> 
> -- 
> Thanks,
> sandeep

Re: Hive error when loading csv data.

Posted by Thejas Nair <th...@hortonworks.com>.
More options:
Official Apache instructions for 1.0:
http://hadoop.apache.org/common/docs/r1.0.3/single_node_setup.html

If you want to try it out on a single node on Amazon EC2,
instructions for the HDP distro:
http://hortonworks.com/community/virtual-sandbox/

If you want a wizard-based guided install on a single node, you can use
HDP for that as well: http://hortonworks.com/download/

Thanks,
Thejas



On 6/27/12 8:38 AM, Ruslan Al-Fakikh wrote:
> Hi,
>
> You may try Cloudera's pseudo-distributed mode
> https://ccp.cloudera.com/display/CDHDOC/CDH3+Deployment+in+Pseudo-Distributed+Mode
> You may also try Cloudera's demo VM
> https://ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM
>
> Regards,
> Ruslan Al-Fakikh
>
> On Wed, Jun 27, 2012 at 4:39 PM, ramakanth reddy
> <ra...@gmail.com>  wrote:
>> Hi
>>
>> Can anyone help me get started with Hadoop in single-node and cluster
>> environments? Please send me some useful links.
>>
>> On Wed, Jun 27, 2012 at 4:50 PM, Subir S<su...@gmail.com>  wrote:
>>
>>> Pig has this CSVExcelStorage [1] and CSVLoader [2] as part of PiggyBank. It
>>> may help.
>>>
>>> [1]
>>>
>>> http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
>>> [2]
>>>
>>> http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVLoader.html
>>>
>>> CCed pig user-list also.
>>>
>>>
>>> On Wed, Jun 27, 2012 at 8:22 AM, Sandeep Reddy P <sandeepreddy.3647@gmail.com> wrote:
>>>
>>>> Thanks, Michael. Sorry I didn't get that sooner. I'll try that and reply
>>>> back.
>>>>
>>>> On Tue, Jun 26, 2012 at 10:13 PM, Michel Segel <michael_segel@hotmail.com> wrote:
>>>>
>>>>> Sorry,
>>>>> I was saying that you can write a Python script that replaces the
>>>>> delimiter with a | and ignores the commas within quotes.
>>>>>
>>>>>
>>>>> Sent from a remote device. Please excuse any typos...
>>>>>
>>>>> Mike Segel
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> sandeep
>>>>
>>>
>>
>>
>>
>> --
>> Thanks&Regards,
>> Ramakanth,
>> +91-8884035968.
>
>
>


Re: Hive error when loading csv data.

Posted by Ruslan Al-Fakikh <ru...@jalent.ru>.
Hi,

You may try Cloudera's pseudo-distributed mode
https://ccp.cloudera.com/display/CDHDOC/CDH3+Deployment+in+Pseudo-Distributed+Mode
You may also try Cloudera's demo VM
https://ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM

Regards,
Ruslan Al-Fakikh

-- 
Best Regards,
Ruslan Al-Fakikh

Re: Hive error when loading csv data.

Posted by ramakanth reddy <ra...@gmail.com>.
Hi

Can anyone help me get started with Hadoop in single-node and cluster
environments? Please send me some useful links.

-- 
Thanks&Regards,
Ramakanth,
+91-8884035968.

Re: Hive error when loading csv data.

Posted by Subir S <su...@gmail.com>.
Pig has this CSVExcelStorage [1] and CSVLoader [2] as part of PiggyBank. It
may help.

[1]
http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
[2]
http://pig.apache.org/docs/r0.9.2/api/org/apache/pig/piggybank/storage/CSVLoader.html

CCed pig user-list also.



Re: Hive error when loading csv data.

Posted by Sandeep Reddy P <sa...@gmail.com>.
Thanks, Michael. Sorry I didn't get that sooner. I'll try that and get back to
you.

On Tue, Jun 26, 2012 at 10:13 PM, Michel Segel <mi...@hotmail.com> wrote:

> Sorry,
> I was saying  that you can write a python script that replaces the
> delimiter with a | and ignore the commas within quotes.
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P <sa...@gmail.com>
> wrote:
>
> > If i do that my data will be d|"abc|def"|abcd my problem is not solved
> >
> > On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel <michael_segel@hotmail.com
> >wrote:
> >
> >> Yup. I just didnt add the quotes.
> >>
> >> Sent from a remote device. Please excuse any typos...
> >>
> >> Mike Segel
> >>
> >> On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <
> sandeepreddy.3647@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the reply.
> >>> I didnt get that Michael. My f2 should be "abc,def"
> >>>
> >>> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <
> >> michael_segel@hotmail.com>wrote:
> >>>
> >>>> Alternatively you could write a simple script to convert the csv to a
> >> pipe
> >>>> delimited file so that "abc,def" will be abc,def.
> >>>>
> >>>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
> >>>>
> >>>>> Hive's delimited-fields-format record reader does not handle quoted
> >>>>> text that carry the same delimiter within them. Excel supports such
> >>>>> records, so it reads it fine.
> >>>>>
> >>>>> You will need to create your table with a custom InputFormat class
> >>>>> that can handle this (Try using OpenCSV readers, they support this),
> >>>>> instead of relying on Hive to do this for you. If you're successful
> in
> >>>>> your approach, please also consider contributing something back to
> >>>>> Hive/Pig to help others.
> >>>>>
> >>>>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
> >>>>> <sa...@gmail.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>> Hi all,
> >>>>>> I have a csv file with 46 columns but i'm getting error when i do
> some
> >>>>>> analysis on that data type. For simplification i have taken 3
> columns
> >>>> and
> >>>>>> now my csv is like
> >>>>>> c,zxy,xyz
> >>>>>> d,"abc,def",abcd
> >>>>>>
> >>>>>> i have created table for this data using,
> >>>>>> hive> create table test3(
> >>>>>>> f1 string,
> >>>>>>> f2 string,
> >>>>>>> f3 string)
> >>>>>>> row format delimited
> >>>>>>> fields terminated by ",";
> >>>>>> OK
> >>>>>> Time taken: 0.143 seconds
> >>>>>> hive> load data local inpath '/home/training/a.csv'
> >>>>>>> into table test3;
> >>>>>> Copying data from file:/home/training/a.csv
> >>>>>> Copying file: file:/home/training/a.csv
> >>>>>> Loading data to table default.test3
> >>>>>> OK
> >>>>>> Time taken: 0.276 seconds
> >>>>>> hive> select * from test3;
> >>>>>> OK
> >>>>>> c       zxy     xyz
> >>>>>> d       "abc    def"
> >>>>>> Time taken: 0.156 seconds
> >>>>>>
> >>>>>> When i do select f2 from test3;
> >>>>>> my results are,
> >>>>>> OK
> >>>>>> zxy
> >>>>>> "abc
> >>>>>> but this should be abc,def
> >>>>>> When i open the same csv file with Microsoft Excel i got abc,def
> >>>>>> How should i solve this error??
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Thanks,
> >>>>>> sandeep
> >>>>>>
> >>>>>> --
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Harsh J
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> Thanks,
> >>> sandeep
> >>
> >
> >
> >
> > --
> > Thanks,
> > sandeep
>



-- 
Thanks,
sandeep

Re: Hive error when loading csv data.

Posted by Michel Segel <mi...@hotmail.com>.
Sorry,
I was saying that you can write a Python script that replaces the commas acting as field delimiters with a | and leaves the commas within quotes alone.
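
A minimal sketch of such a script, assuming Python 3 and the standard csv module. The function name csv_to_pipe and the in-memory sample are made up for illustration; a real run would open the actual files instead. csv.reader understands double-quoted fields, so the comma inside "abc,def" stays part of the field and the surrounding quotes are dropped.

```python
import csv
import io


def csv_to_pipe(src, dst, out_delim="|"):
    """Copy CSV rows from src to dst, joining the fields with out_delim.

    csv_to_pipe is a hypothetical helper name, not part of Hive or Hadoop.
    """
    for row in csv.reader(src):  # csv.reader honours double-quoted fields
        for field in row:
            if out_delim in field or "\n" in field:
                # A plain delimited table would split on these again,
                # so stop rather than write an ambiguous line.
                raise ValueError("field needs a different delimiter: %r" % field)
        dst.write(out_delim.join(row) + "\n")


if __name__ == "__main__":
    # The two sample rows from the original question.
    sample = io.StringIO('c,zxy,xyz\nd,"abc,def",abcd\n')
    out = io.StringIO()
    csv_to_pipe(sample, out)
    print(out.getvalue(), end="")
```

The converted file can then be loaded into a table declared with fields terminated by "|". The check for a pipe inside a field is a design choice in this sketch: if the real 46-column data contains pipes, another delimiter would have to be picked.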


Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P <sa...@gmail.com> wrote:

> If i do that my data will be d|"abc|def"|abcd my problem is not solved
> 
> On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel <mi...@hotmail.com>wrote:
> 
>> Yup. I just didnt add the quotes.
>> 
>> Sent from a remote device. Please excuse any typos...
>> 
>> Mike Segel
>> 
>> On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <sa...@gmail.com>
>> wrote:
>> 
>>> Thanks for the reply.
>>> I didnt get that Michael. My f2 should be "abc,def"
>>> 
>>> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <
>> michael_segel@hotmail.com>wrote:
>>> 
>>>> Alternatively you could write a simple script to convert the csv to a
>> pipe
>>>> delimited file so that "abc,def" will be abc,def.
>>>> 
>>>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
>>>> 
>>>>> Hive's delimited-fields-format record reader does not handle quoted
>>>>> text that carry the same delimiter within them. Excel supports such
>>>>> records, so it reads it fine.
>>>>> 
>>>>> You will need to create your table with a custom InputFormat class
>>>>> that can handle this (Try using OpenCSV readers, they support this),
>>>>> instead of relying on Hive to do this for you. If you're successful in
>>>>> your approach, please also consider contributing something back to
>>>>> Hive/Pig to help others.
>>>>> 
>>>>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
>>>>> <sa...@gmail.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> Hi all,
>>>>>> I have a csv file with 46 columns but i'm getting error when i do some
>>>>>> analysis on that data type. For simplification i have taken 3 columns
>>>> and
>>>>>> now my csv is like
>>>>>> c,zxy,xyz
>>>>>> d,"abc,def",abcd
>>>>>> 
>>>>>> i have created table for this data using,
>>>>>> hive> create table test3(
>>>>>>> f1 string,
>>>>>>> f2 string,
>>>>>>> f3 string)
>>>>>>> row format delimited
>>>>>>> fields terminated by ",";
>>>>>> OK
>>>>>> Time taken: 0.143 seconds
>>>>>> hive> load data local inpath '/home/training/a.csv'
>>>>>>> into table test3;
>>>>>> Copying data from file:/home/training/a.csv
>>>>>> Copying file: file:/home/training/a.csv
>>>>>> Loading data to table default.test3
>>>>>> OK
>>>>>> Time taken: 0.276 seconds
>>>>>> hive> select * from test3;
>>>>>> OK
>>>>>> c       zxy     xyz
>>>>>> d       "abc    def"
>>>>>> Time taken: 0.156 seconds
>>>>>> 
>>>>>> When i do select f2 from test3;
>>>>>> my results are,
>>>>>> OK
>>>>>> zxy
>>>>>> "abc
>>>>>> but this should be abc,def
>>>>>> When i open the same csv file with Microsoft Excel i got abc,def
>>>>>> How should i solve this error??
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Thanks,
>>>>>> sandeep
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Harsh J
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> Thanks,
>>> sandeep
>> 
> 
> 
> 
> -- 
> Thanks,
> sandeep

Re: Hive error when loading csv data.

Posted by Sandeep Reddy P <sa...@gmail.com>.
If I do that, my data will be d|"abc|def"|abcd, so my problem is not solved.

On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel <mi...@hotmail.com> wrote:

> Yup. I just didnt add the quotes.
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <sa...@gmail.com>
> wrote:
>
> > Thanks for the reply.
> > I didnt get that Michael. My f2 should be "abc,def"
> >
> > On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <
> michael_segel@hotmail.com>wrote:
> >
> >> Alternatively you could write a simple script to convert the csv to a
> pipe
> >> delimited file so that "abc,def" will be abc,def.
> >>
> >> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
> >>
> >>> Hive's delimited-fields-format record reader does not handle quoted
> >>> text that carry the same delimiter within them. Excel supports such
> >>> records, so it reads it fine.
> >>>
> >>> You will need to create your table with a custom InputFormat class
> >>> that can handle this (Try using OpenCSV readers, they support this),
> >>> instead of relying on Hive to do this for you. If you're successful in
> >>> your approach, please also consider contributing something back to
> >>> Hive/Pig to help others.
> >>>
> >>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
> >>> <sa...@gmail.com> wrote:
> >>>>
> >>>>
> >>>> Hi all,
> >>>> I have a csv file with 46 columns but i'm getting error when i do some
> >>>> analysis on that data type. For simplification i have taken 3 columns
> >> and
> >>>> now my csv is like
> >>>> c,zxy,xyz
> >>>> d,"abc,def",abcd
> >>>>
> >>>> i have created table for this data using,
> >>>> hive> create table test3(
> >>>>> f1 string,
> >>>>> f2 string,
> >>>>> f3 string)
> >>>>> row format delimited
> >>>>> fields terminated by ",";
> >>>> OK
> >>>> Time taken: 0.143 seconds
> >>>> hive> load data local inpath '/home/training/a.csv'
> >>>>> into table test3;
> >>>> Copying data from file:/home/training/a.csv
> >>>> Copying file: file:/home/training/a.csv
> >>>> Loading data to table default.test3
> >>>> OK
> >>>> Time taken: 0.276 seconds
> >>>> hive> select * from test3;
> >>>> OK
> >>>> c       zxy     xyz
> >>>> d       "abc    def"
> >>>> Time taken: 0.156 seconds
> >>>>
> >>>> When i do select f2 from test3;
> >>>> my results are,
> >>>> OK
> >>>> zxy
> >>>> "abc
> >>>> but this should be abc,def
> >>>> When i open the same csv file with Microsoft Excel i got abc,def
> >>>> How should i solve this error??
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Thanks,
> >>>> sandeep
> >>>>
> >>>> --
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Harsh J
> >>>
> >>
> >>
> >
> >
> > --
> > Thanks,
> > sandeep
>



-- 
Thanks,
sandeep

Re: Hive error when loading csv data.

Posted by Michel Segel <mi...@hotmail.com>.
Yup. I just didn't add the quotes.

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P <sa...@gmail.com> wrote:

> Thanks for the reply.
> I didnt get that Michael. My f2 should be "abc,def"
> 
> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <mi...@hotmail.com>wrote:
> 
>> Alternatively you could write a simple script to convert the csv to a pipe
>> delimited file so that "abc,def" will be abc,def.
>> 
>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
>> 
>>> Hive's delimited-fields-format record reader does not handle quoted
>>> text that carry the same delimiter within them. Excel supports such
>>> records, so it reads it fine.
>>> 
>>> You will need to create your table with a custom InputFormat class
>>> that can handle this (Try using OpenCSV readers, they support this),
>>> instead of relying on Hive to do this for you. If you're successful in
>>> your approach, please also consider contributing something back to
>>> Hive/Pig to help others.
>>> 
>>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
>>> <sa...@gmail.com> wrote:
>>>> 
>>>> 
>>>> Hi all,
>>>> I have a csv file with 46 columns but i'm getting error when i do some
>>>> analysis on that data type. For simplification i have taken 3 columns
>> and
>>>> now my csv is like
>>>> c,zxy,xyz
>>>> d,"abc,def",abcd
>>>> 
>>>> i have created table for this data using,
>>>> hive> create table test3(
>>>>> f1 string,
>>>>> f2 string,
>>>>> f3 string)
>>>>> row format delimited
>>>>> fields terminated by ",";
>>>> OK
>>>> Time taken: 0.143 seconds
>>>> hive> load data local inpath '/home/training/a.csv'
>>>>> into table test3;
>>>> Copying data from file:/home/training/a.csv
>>>> Copying file: file:/home/training/a.csv
>>>> Loading data to table default.test3
>>>> OK
>>>> Time taken: 0.276 seconds
>>>> hive> select * from test3;
>>>> OK
>>>> c       zxy     xyz
>>>> d       "abc    def"
>>>> Time taken: 0.156 seconds
>>>> 
>>>> When i do select f2 from test3;
>>>> my results are,
>>>> OK
>>>> zxy
>>>> "abc
>>>> but this should be abc,def
>>>> When i open the same csv file with Microsoft Excel i got abc,def
>>>> How should i solve this error??
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Thanks,
>>>> sandeep
>>>> 
>>>> --
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Harsh J
>>> 
>> 
>> 
> 
> 
> -- 
> Thanks,
> sandeep

Re: Hive error when loading csv data.

Posted by Hitesh Shah <hi...@hortonworks.com>.
Michael's suggestion was to change your data to:

c|zxy|xyz
d|abc,def|abcd

and then use "|" as the delimiter. 

-- Hitesh

On Jun 26, 2012, at 2:30 PM, Sandeep Reddy P wrote:

> Thanks for the reply.
> I didnt get that Michael. My f2 should be "abc,def"
> 
> On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <mi...@hotmail.com>wrote:
> 
>> Alternatively you could write a simple script to convert the csv to a pipe
>> delimited file so that "abc,def" will be abc,def.
>> 
>> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
>> 
>>> Hive's delimited-fields-format record reader does not handle quoted
>>> text that carry the same delimiter within them. Excel supports such
>>> records, so it reads it fine.
>>> 
>>> You will need to create your table with a custom InputFormat class
>>> that can handle this (Try using OpenCSV readers, they support this),
>>> instead of relying on Hive to do this for you. If you're successful in
>>> your approach, please also consider contributing something back to
>>> Hive/Pig to help others.
>>> 
>>> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
>>> <sa...@gmail.com> wrote:
>>>> 
>>>> 
>>>> Hi all,
>>>> I have a csv file with 46 columns but i'm getting error when i do some
>>>> analysis on that data type. For simplification i have taken 3 columns
>> and
>>>> now my csv is like
>>>> c,zxy,xyz
>>>> d,"abc,def",abcd
>>>> 
>>>> i have created table for this data using,
>>>> hive> create table test3(
>>>>> f1 string,
>>>>> f2 string,
>>>>> f3 string)
>>>>> row format delimited
>>>>> fields terminated by ",";
>>>> OK
>>>> Time taken: 0.143 seconds
>>>> hive> load data local inpath '/home/training/a.csv'
>>>>> into table test3;
>>>> Copying data from file:/home/training/a.csv
>>>> Copying file: file:/home/training/a.csv
>>>> Loading data to table default.test3
>>>> OK
>>>> Time taken: 0.276 seconds
>>>> hive> select * from test3;
>>>> OK
>>>> c       zxy     xyz
>>>> d       "abc    def"
>>>> Time taken: 0.156 seconds
>>>> 
>>>> When i do select f2 from test3;
>>>> my results are,
>>>> OK
>>>> zxy
>>>> "abc
>>>> but this should be abc,def
>>>> When i open the same csv file with Microsoft Excel i got abc,def
>>>> How should i solve this error??
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Thanks,
>>>> sandeep
>>>> 
>>>> --
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Harsh J
>>> 
>> 
>> 
> 
> 
> -- 
> Thanks,
> sandeep


Re: Hive error when loading csv data.

Posted by Sandeep Reddy P <sa...@gmail.com>.
Thanks for the reply.
I didn't follow that, Michael. My f2 should be "abc,def".

On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel <mi...@hotmail.com> wrote:

> Alternatively you could write a simple script to convert the csv to a pipe
> delimited file so that "abc,def" will be abc,def.
>
> On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
>
> > Hive's delimited-fields-format record reader does not handle quoted
> > text that carry the same delimiter within them. Excel supports such
> > records, so it reads it fine.
> >
> > You will need to create your table with a custom InputFormat class
> > that can handle this (Try using OpenCSV readers, they support this),
> > instead of relying on Hive to do this for you. If you're successful in
> > your approach, please also consider contributing something back to
> > Hive/Pig to help others.
> >
> > On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
> > <sa...@gmail.com> wrote:
> >>
> >>
> >> Hi all,
> >> I have a csv file with 46 columns but i'm getting error when i do some
> >> analysis on that data type. For simplification i have taken 3 columns
> and
> >> now my csv is like
> >> c,zxy,xyz
> >> d,"abc,def",abcd
> >>
> >> i have created table for this data using,
> >> hive> create table test3(
> >>     > f1 string,
> >>     > f2 string,
> >>     > f3 string)
> >>     > row format delimited
> >>     > fields terminated by ",";
> >> OK
> >> Time taken: 0.143 seconds
> >> hive> load data local inpath '/home/training/a.csv'
> >>     > into table test3;
> >> Copying data from file:/home/training/a.csv
> >> Copying file: file:/home/training/a.csv
> >> Loading data to table default.test3
> >> OK
> >> Time taken: 0.276 seconds
> >> hive> select * from test3;
> >> OK
> >> c       zxy     xyz
> >> d       "abc    def"
> >> Time taken: 0.156 seconds
> >>
> >> When i do select f2 from test3;
> >> my results are,
> >> OK
> >> zxy
> >> "abc
> >> but this should be abc,def
> >> When i open the same csv file with Microsoft Excel i got abc,def
> >> How should i solve this error??
> >>
> >>
> >>
> >> --
> >> Thanks,
> >> sandeep
> >>
> >> --
> >>
> >>
> >>
> >
> >
> >
> > --
> > Harsh J
> >
>
>


-- 
Thanks,
sandeep

Re: Hive error when loading csv data.

Posted by Michael Segel <mi...@hotmail.com>.
Alternatively, you could write a simple script to convert the CSV to a pipe-delimited file so that "abc,def" will be abc,def.

On Jun 26, 2012, at 2:51 PM, Harsh J wrote:

> Hive's delimited-fields-format record reader does not handle quoted
> text that carry the same delimiter within them. Excel supports such
> records, so it reads it fine.
> 
> You will need to create your table with a custom InputFormat class
> that can handle this (Try using OpenCSV readers, they support this),
> instead of relying on Hive to do this for you. If you're successful in
> your approach, please also consider contributing something back to
> Hive/Pig to help others.
> 
> On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
> <sa...@gmail.com> wrote:
>> 
>> 
>> Hi all,
>> I have a csv file with 46 columns but i'm getting error when i do some
>> analysis on that data type. For simplification i have taken 3 columns and
>> now my csv is like
>> c,zxy,xyz
>> d,"abc,def",abcd
>> 
>> i have created table for this data using,
>> hive> create table test3(
>>     > f1 string,
>>     > f2 string,
>>     > f3 string)
>>     > row format delimited
>>     > fields terminated by ",";
>> OK
>> Time taken: 0.143 seconds
>> hive> load data local inpath '/home/training/a.csv'
>>     > into table test3;
>> Copying data from file:/home/training/a.csv
>> Copying file: file:/home/training/a.csv
>> Loading data to table default.test3
>> OK
>> Time taken: 0.276 seconds
>> hive> select * from test3;
>> OK
>> c       zxy     xyz
>> d       "abc    def"
>> Time taken: 0.156 seconds
>> 
>> When i do select f2 from test3;
>> my results are,
>> OK
>> zxy
>> "abc
>> but this should be abc,def
>> When i open the same csv file with Microsoft Excel i got abc,def
>> How should i solve this error??
>> 
>> 
>> 
>> --
>> Thanks,
>> sandeep
>> 
>> --
>> 
>> 
>> 
> 
> 
> 
> -- 
> Harsh J
> 


Re: Hive error when loading csv data.

Posted by Harsh J <ha...@cloudera.com>.
Hive's delimited-fields-format record reader does not handle quoted
text that carries the same delimiter within it. Excel supports such
records, so it reads them fine.

You will need to create your table with a custom InputFormat class
that can handle this (try using OpenCSV readers; they support this),
instead of relying on Hive to do this for you. If you're successful in
your approach, please also consider contributing something back to
Hive/Pig to help others.
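
To make the difference concrete, here is a small Python 3 sketch. It is only an illustration of the two parsing behaviours, not Hive's or Excel's actual code: a plain split on the delimiter stands in for a reader that ignores quotes, and the standard csv module stands in for a quote-aware reader.

```python
import csv

line = 'd,"abc,def",abcd'

# Splitting on every comma ignores the quotes, so the quoted field breaks
# into two pieces and the row ends up with four fields instead of three.
naive = line.split(",")

# A quote-aware parser keeps the comma inside the quoted field and
# strips the surrounding quotes.
aware = next(csv.reader([line]))

print(naive)  # ['d', '"abc', 'def"', 'abcd']
print(aware)  # ['d', 'abc,def', 'abcd']
```

The second element of the naive result, "abc, matches what select f2 returned in the question, and with only three columns declared the fourth piece has nowhere to go.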

On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
<sa...@gmail.com> wrote:
>
>
> Hi all,
> I have a csv file with 46 columns but i'm getting error when i do some
> analysis on that data type. For simplification i have taken 3 columns and
> now my csv is like
> c,zxy,xyz
> d,"abc,def",abcd
>
> i have created table for this data using,
> hive> create table test3(
>     > f1 string,
>     > f2 string,
>     > f3 string)
>     > row format delimited
>     > fields terminated by ",";
> OK
> Time taken: 0.143 seconds
> hive> load data local inpath '/home/training/a.csv'
>     > into table test3;
> Copying data from file:/home/training/a.csv
> Copying file: file:/home/training/a.csv
> Loading data to table default.test3
> OK
> Time taken: 0.276 seconds
> hive> select * from test3;
> OK
> c       zxy     xyz
> d       "abc    def"
> Time taken: 0.156 seconds
>
> When i do select f2 from test3;
> my results are,
> OK
> zxy
> "abc
> but this should be abc,def
> When i open the same csv file with Microsoft Excel i got abc,def
> How should i solve this error??
>
>
>
> --
> Thanks,
> sandeep
>
> --
>
>
>



-- 
Harsh J