Posted to user@hive.apache.org by bichonfrise74 <bi...@gmail.com> on 2011/05/07 00:47:48 UTC
Apache Log Date Format
Hi,
I am using this to load the apache log into Hadoop via Hive (my version is
0.4.1).
CREATE TABLE apache_log (
...
logdate STRING,
...
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "([^ ]*) ([^ ]*) ([^ ]*)
\\[(\\w+\/\\w+\/\\w+)\:(\\d+:\\d+:\\d+) ...
...
The date is coming in this format: dd/mmm/yyyy.
I would like to be able to load the data using this date format:
yyyy-mmm-dd.
1. Has anyone loaded the date in a different format before?
2. Also, how do you specify in the create table statement above that the
partition is the logdate?
3. And when I tried to convert the old date into unixtime format via this
sql, hive complains.
hive> select from_unixtime( unix_timestamp( logdate, 'dd/MMM/yyyy')) from
apache_log;
FAILED: Error in semantic analysis: line 1:7 Function Argument Type Mismatch
from_unixtime: Looking for UDF "from_unixtime" with parameters [class
org.apache.hadoop.io.LongWritable]
Has anyone encountered these issues before?
Thanks.
Re: Apache Log Date Format
Posted by Jov <zh...@gmail.com>.
On 2011-5-7 at 6:48 AM, "bichonfrise74" <bi...@gmail.com> wrote:
>
> Hi,
>
> I am using this to load the apache log into Hadoop via Hive (my version is
> 0.4.1).
>
> CREATE TABLE apache_log (
> ...
> logdate STRING,
> ...
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> WITH SERDEPROPERTIES (
> "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*)
> \\[(\\w+\/\\w+\/\\w+)\:(\\d+:\\d+:\\d+) ...
> ...
>
> The date is coming in this format: dd/mmm/yyyy.
> I would like to be able to load the data using this date format:
> yyyy-mmm-dd.
>
> 1. Has anyone loaded the date in a different format before?
> 2. Also, how do you specify in the create table statement above that the
> partition is the logdate?
> 3. And when I tried to convert the old date into unixtime format via this
> sql, hive complains.
>
> hive> select from_unixtime( unix_timestamp( logdate, 'dd/MMM/yyyy')) from
> apache_log;
> FAILED: Error in semantic analysis: line 1:7 Function Argument Type
> Mismatch from_unixtime: Looking for UDF "from_unixtime" with parameters
> [class org.apache.hadoop.io.LongWritable]
The unix_timestamp function returns bigint, while the from_unixtime function
only accepts int as its parameter, so you should use a cast:
from_unixtime(cast( unix_timestamp( logdate, 'dd/MMM/yyyy') as int))
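Building on the cast above, here is a minimal sketch of how the converted
value might be rendered in a different date layout. It assumes the
two-argument form of from_unixtime (timestamp plus an output pattern) is
available in the Hive version in use; that has not been checked against 0.4.1.
The patterns follow Java's SimpleDateFormat, where uppercase MM is the numeric
month, MMM is the abbreviated month name, and lowercase mm means minutes, so
the output pattern is written here as 'yyyy-MM-dd' rather than 'yyyy-mmm-dd'.

```sql
-- Sketch only: reuses the table and column names from the original question.
-- Assumption: from_unixtime(int, string) exists in the Hive version in use.
SELECT
  from_unixtime(
    CAST(unix_timestamp(logdate, 'dd/MMM/yyyy') AS INT),  -- parse the Apache-style date
    'yyyy-MM-dd'                                          -- format as year-month-day
  )
FROM apache_log;
```

If the abbreviated month name should be kept in the output, 'yyyy-MMM-dd'
could be used as the second pattern instead.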
> Has anyone encountered these issues before?
>
> Thanks.