Posted to user@hive.apache.org by Vikas Srivastava <vi...@one97.net> on 2011/08/19 12:35:21 UTC

problem in reading data from table (fields terminated by '||')

Hi team,

I have created a table with fields terminated by '||' and lines terminated
by '\n', but when I fetch data from this table I don't get the desired
output.

The data is in the following format:

IN||2011-03-28 21:59:24||2011-03-28 22:00:09||919040573650||||122||0||USSD_LOCAL_SESSION_TIMEOUT||||||44319||0||||\N||||0
NW||2011-03-28 21:59:24||2011-03-28 22:00:09||918793839387||||122||0||USSD_LOCAL_SESSION_TIMEOUT||0000000020356575_1||||24268||0||||\N||||0
NW||2011-03-28 21:59:24||2011-03-28 22:00:09||918090245162||||122||0||USSD_LOCAL_SESSION_TIMEOUT||0000000020356576_1||||24269||0||||\N||||0
NW||2011-03-28 21:59:24||2011-03-28 22:00:09||918233329288||||122||0||USSD_LOCAL_SESSION_TIMEOUT||0000000020356580_1||||24271||0||||\N||||0
NW||2011-03-28 21:59:24||2011-03-28 22:00:09||918699490209||||121||0||USSD_LOCAL_SESSION_TIMEOUT||0000000019795974_2||||24281||0||||\N||||0


The actual problem is that Hive treats a single '|' as the field separator,
so what should be 2 columns is split into 3.


Does anybody have a solution for this?




-- 
With Regards
Vikas Srivastava

DWH & Analytics Team
Mob:+91 9560885900
One97 | Let's get talking !

Re: problem in reading data from table (fields terminated by '||')

Posted by Edward Capriolo <ed...@gmail.com>.
You are going to have to write your own SerDe. AFAIK, when you create a
standard table, LazySimpleSerDe is the default, and it only accepts
single-character delimiters. If you want to 'cheat' LazySimpleSerDe, you can
use fields terminated by '|' and introduce a dummy column after each real
one. Or reformat your input data to make it more Hive-friendly.

Edward
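
To make the dummy-column idea concrete, here is a small Python sketch (not
Hive code) of what a single-character '|' split does to one record from the
original post. The function names are made up for illustration, and it
assumes no real field ever contains a '|' itself. Every '||' leaves one empty
piece between two real fields, which is the piece a dummy column would absorb.

```python
# One record from the original post (a raw string, so \N stays literal).
RECORD = (
    r"IN||2011-03-28 21:59:24||2011-03-28 22:00:09||919040573650||||122||0||"
    r"USSD_LOCAL_SESSION_TIMEOUT||||||44319||0||||\N||||0"
)


def split_single_pipe(line):
    """Split the way a single-character '|' delimiter would (hypothetical helper)."""
    return line.split("|")


def real_fields(line):
    """Keep the even-numbered pieces; the odd-numbered ones are the empty dummies."""
    return split_single_pipe(line)[0::2]


if __name__ == "__main__":
    pieces = split_single_pipe(RECORD)
    print(len(pieces), "pieces when splitting on '|'")
    print(len(real_fields(RECORD)), "real fields after dropping the dummies")
    # Same result as splitting on the two-character delimiter directly.
    print(real_fields(RECORD) == RECORD.split("||"))
```

In a table definition this would correspond to declaring an extra throwaway
column after every real column and simply never selecting it.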

On Fri, Aug 19, 2011 at 9:11 AM, Siddharth Tiwari <siddharth.tiwari@live.com
> wrote:

>  You will have to parse this data accordingly
>
> **------------------------**
> *Cheers !!!*
> *Siddharth Tiwari*
> Have a refreshing day !!!
>
>
> ------------------------------
> Date: Fri, 19 Aug 2011 16:05:21 +0530
> Subject: problem in reading data from table (fields terminated by '||')
> From: vikas.srivastava@one97.net
> To: user@hive.apache.org
> CC: nitin2.kumar@one97.net

RE: problem in reading data from table (fields terminated by '||')

Posted by Siddharth Tiwari <si...@live.com>.
You can write a map/reduce job to do the parsing for you, which takes advantage of parallel processing.
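
As a rough sketch of the per-record work such a job might do, the Python filter below reads records on stdin and rewrites the '||' delimiter to a single character. It uses Ctrl-A (\x01), which is Hive's default field delimiter for text tables. The names convert_line and main are made up for illustration, how the script is wired into an actual job is left out, and it assumes no field contains '||' or Ctrl-A.

```python
import sys

SRC_DELIM = "||"    # delimiter used in the existing files
DST_DELIM = "\x01"  # Ctrl-A, Hive's default field delimiter for text tables


def convert_line(line):
    """Rewrite one record; assumes no field contains '||' or Ctrl-A."""
    return line.rstrip("\n").replace(SRC_DELIM, DST_DELIM)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Records arrive one per line on stdin and leave one per line on stdout.
    for line in stdin:
        stdout.write(convert_line(line) + "\n")


if __name__ == "__main__":
    main()
```

Because each line is converted independently, the work can be split across as many tasks as there are input splits, and the converted files can then be read by a table that uses a single-character delimiter.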

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!


Date: Fri, 19 Aug 2011 18:46:22 +0530
Subject: Re: problem in reading data from table (fields terminated by '||')
From: vikas.srivastava@one97.net
To: user@hive.apache.org

Hey Sid!

Thanks, bro.

But I can't parse the files: I actually have 3 TB of data in that format, so I need to find another solution. One more thing: parsing it would take a lot of time.


regards
Vikas Srivastava


Re: problem in reading data from table (fields terminated by '||')

Posted by Vikas Srivastava <vi...@one97.net>.
Hey Sid!

Thanks, bro.

But I can't parse the files: I actually have 3 TB of data in that format, so
I need to find another solution. One more thing: parsing it would take a lot
of time.

regards
Vikas Srivastava


RE: problem in reading data from table (fields terminated by '||')

Posted by Siddharth Tiwari <si...@live.com>.
You will have to parse this data accordingly 

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!

