Posted to user@hive.apache.org by Daniel Joanes <dj...@gmail.com> on 2010/02/25 17:44:33 UTC
Hive question.
If I have a line like the following:
<2010-02-09 18:00:16.123 UTC>:[48394803]:<MDS-CS_MDS1>:<DEBUG>:<LAYER = EP2P, EVENT = Receiving, DEVICEPIN = 2032acb14, GMETAG = -1966209606, TYPE = 22, METHOD = onInEp2p, DESTINATION = 24a69edf, CONFIRM = true, EXP_TIMEOUT(S) = 3600, SIZE = 7312>
What would be the best way to store it into a table like this:
ts          STRING  "2010-02-09 18:00:16.123 UTC"
epochtime   INT     "345093824"  // <-- I'm not sure how to do this column either
requestId   INT     48394803
component   STRING  "MDS-CS_MDS1"
log_level   STRING  "DEBUG"
properties  STRING  "LAYER = EP2P, EVENT = Receiving, DEVICEPIN = 2032acb14, GMETAG = -1966209606, TYPE = 22, METHOD = onInEp2p, DESTINATION = 24a69edf, CONFIRM = true, EXP_TIMEOUT(S) = 3600, SIZE = 7312"
Thanks,
Daniel
Re: Hive question.
Posted by Carl Steinbach <ca...@cloudera.com>.
You can do a type conversion using the CAST UDF (while streaming the data
from one table to another). See the documentation here:
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Type_Conversion_Functions
Carl
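As a rough illustration of the conversions involved (not Hive code), the sketch below turns string fields into the typed values from the question, including the epochtime column. It is written in Python only so the arithmetic can be checked quickly; inside Hive the analogous step would be a CAST in the query that copies rows from the string-only table into a typed one. The function names and the assumption that every timestamp ends in a literal "UTC" are illustrative, not taken from the thread.

```python
from datetime import datetime, timezone


def to_epoch_seconds(ts):
    """Convert 'YYYY-MM-DD HH:MM:SS.mmm UTC' to whole seconds since the epoch."""
    # Assumption: the timestamp always ends with the literal text 'UTC',
    # as in the sample line, so it is matched literally in the format.
    naive = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f UTC")
    # Attach the UTC zone explicitly so the result does not depend on
    # the local time zone of the machine running the script.
    return int(naive.replace(tzinfo=timezone.utc).timestamp())


def typed_row(ts, request_id, component, log_level, properties):
    """Turn all-string fields into the typed row described in the question."""
    return (
        ts,                    # ts STRING
        to_epoch_seconds(ts),  # epochtime INT
        int(request_id),       # requestId INT
        component,             # component STRING
        log_level,             # log_level STRING
        properties,            # properties STRING
    )


row = typed_row(
    "2010-02-09 18:00:16.123 UTC", "48394803", "MDS-CS_MDS1", "DEBUG",
    "LAYER = EP2P, EVENT = Receiving",
)
```

For the sample timestamp the epoch value is 1265738416, which still fits in a 32-bit INT; the milliseconds are dropped by the integer conversion.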
Re: Hive question.
Posted by Daniel Joanes <dj...@gmail.com>.
Awesome, that worked. From what I can tell, the columns in my table have to
be strings. How would I use other data types?
Re: Hive question.
Posted by Carl Steinbach <ca...@cloudera.com>.
Hi Daniel,
You can use the RegexSerDe to extract the fields embedded in the text. Try
looking at the examples in
contrib/src/test/queries/clientpositive/serde_regex.q
Carl
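Before putting a pattern into the SerDe's input.regex property, it can help to check it against the sample line. The sketch below does that in Python; the pattern is a hypothetical one written for this example (it is not from serde_regex.q), with one capture group per target column. This particular pattern uses only constructs that Python and Java regular expressions treat the same way, but inside a HiveQL string literal the backslashes would need to be doubled.

```python
import re

# Hypothetical pattern for the sample line. The five groups map to
# ts, requestId, component, log_level and properties, in that order.
LOG_PATTERN = re.compile(
    r"<([^>]*)>:\[(\d+)\]:<([^>]*)>:<([^>]*)>:<(.*)>"
)


def parse_line(line):
    """Return the five captured fields as strings, or None if no match."""
    match = LOG_PATTERN.fullmatch(line.strip())
    return match.groups() if match else None


# The sample from the question, joined back into a single line.
sample = (
    "<2010-02-09 18:00:16.123 UTC>:[48394803]:<MDS-CS_MDS1>:<DEBUG>:"
    "<LAYER = EP2P, EVENT = Receiving, DEVICEPIN = 2032acb14, "
    "GMETAG = -1966209606, TYPE = 22, METHOD = onInEp2p, "
    "DESTINATION = 24a69edf, CONFIRM = true, EXP_TIMEOUT(S) = 3600, "
    "SIZE = 7312>"
)

ts, request_id, component, log_level, properties = parse_line(sample)
```

Every captured field comes back as a string, which matches the observation later in the thread that the table columns have to be STRING at this stage.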