flume-user mailing list archives

From Mungeol Heo <mungeol....@gmail.com>
Subject Flume gives "java.lang.IllegalArgumentException" when using regex_extractor for extracting timestamp from apache access log
Date Fri, 30 Jan 2015 09:34:52 GMT
case 1:

The settings I used are listed below.

agent01.sources.source01.interceptors.interceptor02.type = regex_extractor
agent01.sources.source01.interceptors.interceptor02.regex =
agent01.sources.source01.interceptors.interceptor02.serializers = s01
= org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
= dd/MMM/yyyy:HH:mm:ss
= timestamp

It gives me a 'java.lang.IllegalArgumentException: Invalid format:
"30/Jan/2015:15:01:03" is malformed at "Jan/2015:15:01:03"' error.

case 2:

The settings I used are listed below.

regex = ^\\d+\\.\\d+.\\d+.\\d+\\s\\S+\\s\\S+\\s\\[\\d+\\/([a-zA-z]{3})\\/\\d{4}:\\d{2}:\\d{2}:\\d{2}\\s\\+0900\\]\\s
pattern = MMM

It gives me a 'java.lang.IllegalArgumentException: Invalid format:
"Jan"' error.

case 3:

The settings I used are listed below.

regex  = ^\\d+\\.\\d+.\\d+.\\d+\\s\\S+\\s\\S+\\s\\[\\d+\\/[a-zA-z]{3}(\\/\\d{4}:\\d{2}:\\d{2}:\\d{2})\\s\\+0900\\]\\s
pattern  = /yyyy:HH:mm:ss


regex = ^\\d+\\.\\d+.\\d+.\\d+\\s\\S+\\s\\S+\\s\\[(\\d+\\/)[a-zA-z]{3}\\/\\d{4}:\\d{2}:\\d{2}:\\d{2}\\s\\+0900\\]\\s
pattern = dd/

It works OK.

So, as far as I can see, Flume throws the 'java.lang.IllegalArgumentException'
because it fails to map "Jan" using the "MMM" pattern.
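
One way to probe this outside Flume is the minimal sketch below. It checks
whether parsing "Jan" with "MMM" depends on the locale. Note the assumptions:
it uses java.text.SimpleDateFormat from the JDK, not the formatter Flume uses
internally (the error text looks like it comes from Joda-Time), so it only
illustrates the general locale dependence of "MMM". The class name
MonthLocaleCheck, the helper parses(), and the choice of Locale.KOREA as the
non-English locale are mine, for illustration only.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical helper class, not part of Flume.
public class MonthLocaleCheck {

    // Returns true if 'text' can be parsed with 'pattern' under 'locale'.
    static boolean parses(String text, String pattern, Locale locale) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, locale);
        fmt.setLenient(false);
        try {
            fmt.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Timestamp and pattern taken from case 1.
        String ts = "30/Jan/2015:15:01:03";
        String pattern = "dd/MMM/yyyy:HH:mm:ss";

        // The JVM default locale is what a formatter falls back to
        // when no locale is given explicitly.
        System.out.println("default locale: " + Locale.getDefault());
        System.out.println("ENGLISH: " + parses(ts, pattern, Locale.ENGLISH));
        // Locale.KOREA is an assumed example of a non-English locale.
        System.out.println("KOREA:   " + parses(ts, pattern, Locale.KOREA));
    }
}
```

If the two servers print different default locales, that difference, rather
than the Java version itself, might explain why case 1 works on only one of
them.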

BTW, I am using Cloudera Express 5.3.1.
Also, the case 1 settings work fine on another server, which is running
Java 1.6.0_29.

Is it true that a different Java version is the reason that mapping
"Jan" with the "MMM" pattern fails?
Is there anything that I missed?
Any help would be great.

Thank you

- mungeol
