Posted to user@flume.apache.org by Mungeol Heo <mu...@gmail.com> on 2015/01/30 10:34:52 UTC
Flume gives "java.lang.IllegalArgumentException" when using
regex_extractor for extracting timestamp from apache access log
Case 1:
The settings I used are listed below.
----------
agent01.sources.source01.interceptors.interceptor02.type = regex_extractor
agent01.sources.source01.interceptors.interceptor02.regex = ^\\d+\\.\\d+.\\d+.\\d+\\s\\S+\\s\\S+\\s\\[(\\d+\\/[a-zA-z]{3}\\/\\d{4}:\\d{2}:\\d{2}:\\d{2})\\s\\+0900\\]\\s
agent01.sources.source01.interceptors.interceptor02.serializers = s01
agent01.sources.source01.interceptors.interceptor02.serializers.s01.type = org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
agent01.sources.source01.interceptors.interceptor02.serializers.s01.pattern = dd/MMM/yyyy:HH:mm:ss
agent01.sources.source01.interceptors.interceptor02.serializers.s01.name = timestamp
----------
It gives me a 'java.lang.IllegalArgumentException: Invalid format:
"30/Jan/2015:15:01:03" is malformed at "Jan/2015:15:01:03"' error.
Case 2:
The settings I used are listed below (only the regex and pattern differ from case 1).
----------
regex = ^\\d+\\.\\d+.\\d+.\\d+\\s\\S+\\s\\S+\\s\\[\\d+\\/([a-zA-z]{3})\\/\\d{4}:\\d{2}:\\d{2}:\\d{2}\\s\\+0900\\]\\s
pattern = MMM
----------
It gives me a 'java.lang.IllegalArgumentException: Invalid format:
"Jan"' error.
Case 3:
The settings I used are listed below.
----------
regex = ^\\d+\\.\\d+.\\d+.\\d+\\s\\S+\\s\\S+\\s\\[\\d+\\/[a-zA-z]{3}(\\/\\d{4}:\\d{2}:\\d{2}:\\d{2})\\s\\+0900\\]\\s
pattern = /yyyy:HH:mm:ss
----------
and
----------
regex = ^\\d+\\.\\d+.\\d+.\\d+\\s\\S+\\s\\S+\\s\\[(\\d+\\/)[a-zA-z]{3}\\/\\d{4}:\\d{2}:\\d{2}:\\d{2}\\s\\+0900\\]\\s
pattern = dd/
----------
Both of these work without errors.
So, as far as I can see, Flume throws the 'java.lang.IllegalArgumentException'
because it fails to match "Jan" against the "MMM" pattern.
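One way to test this hypothesis outside Flume is to parse the same sample with the same pattern under different locales. The sketch below is only an illustration: it uses the JDK's java.time classes (Java 8 or later), not whatever date library the Flume serializer uses internally, and the class name MonthLocaleCheck and the method parses are made up for this example. The assumption being tested is that "MMM" matches month abbreviations of the formatter's locale only.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper class, not part of Flume.
public class MonthLocaleCheck {
    // Same pattern as the serializer setting in case 1.
    static final String PATTERN = "dd/MMM/yyyy:HH:mm:ss";

    // Returns true if the text parses with PATTERN under the given locale.
    static boolean parses(String text, Locale locale) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(PATTERN, locale);
        try {
            LocalDateTime.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            // The month text did not match this locale's abbreviations.
            return false;
        }
    }

    public static void main(String[] args) {
        String sample = "30/Jan/2015:15:01:03";
        System.out.println("ENGLISH: " + parses(sample, Locale.ENGLISH));
        System.out.println("KOREAN:  " + parses(sample, Locale.KOREAN));
        System.out.println("default (" + Locale.getDefault() + "): "
                + parses(sample, Locale.getDefault()));
    }
}
```

If the English locale parses the sample and the server's default locale does not, the month name mapping depends on the locale rather than on the regex.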
BTW, I am using Cloudera Express 5.3.1.
Also, the case 1 settings work fine on another server, which runs
Java 1.6.0_29.
Is it true that a different Java version could be the reason "Jan"
fails to match the "MMM" pattern?
Is there anything I missed?
Any help would be appreciated.
Thank you
- mungeol
Re: Flume gives "java.lang.IllegalArgumentException" when using
regex_extractor for extracting timestamp from apache access log
Posted by Mungeol Heo <mu...@gmail.com>.
I found the cause of the error described above.
It is the system's LANG setting.
Everything works fine after changing LANG to "en_US.UTF-8".
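For anyone hitting the same error, a quick way to see which locale a JVM picked up from the environment is to print the default locale and its short month names. This is a minimal sketch using only JDK classes; the class name DefaultLocaleCheck and the method shortMonth are hypothetical, and it assumes the agent's date parsing follows the JVM default locale, which is what the LANG fix suggests but which I have not confirmed in the Flume source.

```java
import java.text.DateFormatSymbols;
import java.util.Locale;

// Hypothetical helper class, not part of Flume.
public class DefaultLocaleCheck {
    // Returns the abbreviated month name (0 = January) for the given locale.
    static String shortMonth(Locale locale, int monthIndex) {
        return DateFormatSymbols.getInstance(locale).getShortMonths()[monthIndex];
    }

    public static void main(String[] args) {
        Locale current = Locale.getDefault();
        // On Unix-like systems the JVM normally derives this from the environment (e.g. LANG).
        System.out.println("Default locale: " + current);
        System.out.println("Short name for January here:   " + shortMonth(current, 0));
        System.out.println("Short name for January (en_US): " + shortMonth(Locale.US, 0));
    }
}
```

Run it as the same user and with the same environment as the Flume agent. If the two month names differ, an Apache access log with English month abbreviations will probably not match "MMM" until the locale is changed.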