Posted to user@flume.apache.org by larryzhang <zh...@gmail.com> on 2013/03/11 11:49:28 UTC

JVM error while collecting from hourly rotated logs with flume-ng

Hi,
     I want to collect and analyse user logs every 5 minutes. We have an
origin log file which is generated by nginx and rotated hourly, with about
30,000,000 log lines per hour. The log format is like this:
            60.222.199.118 - - [11/Mar/2013:16:00:00 +0800] "GET ....
     Because I first want to collect the logs into local files, I wrote a
FileEventSink, which is a small modification of
org.apache.flume.sink.hdfs.BucketWriter.java and
org.apache.flume.sink.hdfs.HDFSEventSink.java (see the sketches after the
config below). Following is my flume config file:
==================
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# I need to fetch the time and other info into the event headers, so I
# added that logic on top of ExecSource (see the first sketch below).
a1.sources.r1.type = cn.larry.flume.source.MyExecSource
a1.sources.r1.command = tail -n +0 -F /data2/log/log_2013031117.log
a1.sources.r1.channels = c1
# I set this to 1 because otherwise it loses data at the end of the log
# file. I applied the patch from
# https://issues.apache.org/jira/browse/FLUME-1819 but it did not seem to help...
a1.sources.r1.batchSize = 1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 10000

a1.sinks.k1.type = cn.larry.flume.sink.FileEventSink
a1.sinks.k1.channel = c1
a1.sinks.k1.file.path = /opt/livedata/%Y%m%d/%H
a1.sinks.k1.file.filePrefix = log-%Y%m%d%H%M
a1.sinks.k1.file.round = true
a1.sinks.k1.file.roundValue = 5
a1.sinks.k1.file.roundUnit = minute
a1.sinks.k1.file.rollInterval = 300
a1.sinks.k1.file.rollSize = 0
a1.sinks.k1.file.rollCount = 0
a1.sinks.k1.file.batchSize = 100
==================
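
    Roughly, the extra logic in MyExecSource looks like the following (a
simplified sketch, not my actual class; the names here are illustrative).
The "timestamp" header is what the BucketWriter-style escaping uses to
resolve %Y%m%d/%H/%M in file.path and file.filePrefix, so the buckets
follow the event's own time rather than the delivery time:
==================
// Simplified sketch of the header injection MyExecSource performs on each
// line read from the tailed file (illustrative names, not the real class).
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;

public class HeaderInjectionSketch {

    // Build a Flume event from one nginx log line, attaching the headers
    // the sink needs for time-based bucketing.
    public static Event toEvent(String logLine) {
        Map<String, String> headers = new HashMap<String, String>();
        // My real source parses the [11/Mar/2013:16:00:00 +0800] field out
        // of the line; the current time is used here only for brevity.
        headers.put("timestamp", String.valueOf(System.currentTimeMillis()));
        return EventBuilder.withBody(
                logLine.getBytes(Charset.forName("UTF-8")), headers);
    }
}
==================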
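
    And the roll behaviour is the usual write-to-.tmp-then-rename pattern
kept from BucketWriter (again a simplified sketch, not my actual
FileBucketWriter; the rename step is the renameBucket line visible in the
crash log further down):
==================
// Simplified sketch of the roll pattern borrowed from
// org.apache.flume.sink.hdfs.BucketWriter: events are appended to a ".tmp"
// file, and when the file rolls (every rollInterval seconds) it is renamed
// to its final name, so downstream jobs only ever see completed files.
import java.io.File;
import java.io.IOException;

public class RollSketch {

    public static void renameBucket(File tmpFile) throws IOException {
        String tmpName = tmpFile.getPath();
        if (!tmpName.endsWith(".tmp")) {
            return; // already rolled, nothing to do
        }
        File finalFile =
                new File(tmpName.substring(0, tmpName.length() - ".tmp".length()));
        if (!tmpFile.renameTo(finalFile)) {
            throw new IOException(
                    "Failed to rename " + tmpFile + " to " + finalFile);
        }
    }
}
==================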

    And because I need to change the source log file name each hour, I
wrote a script which does 3 things:
      1. At the 1st minute of each hour:
          -> copy a new config file which only changes the source log
file name (a1.sources.r1.command = tail -n +0 -F /data2/log/log_<new
time>.log)
          -> start a new flume process using the new config. (I do this so
that if a flume process dies, it won't affect the next hour.)
      2. At the 30th minute of each hour:
          -> kill the flume process of the previous hour.
   This setup has been running for more than 10 days. Most of the time it
works well, but sometimes the flume process crashes due to a JVM error,
about once every 2 days! I used JVM version 1.6.0_27-ea. Here's the log of
the error that happened on 2013-03-11:

2013-03-11 05:45:11,302 (file-k1-roll-timer-0) [INFO - 
cn.larry.flume.sink.FileBucketWriter.renameBucket(FileBucketWriter.java:408)] 
Renaming /opt/livedata/20130311/05/log-201303110540.1362951611255.tmp to 
/opt/livedata/20130311/05/log-201303110540.1362951611255
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00002b4247be034e, pid=1463, tid=1098979648
#
# JRE version: 6.0_18-b07
# Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode 
linux-amd64 )
# Problematic frame:
# V  [libjvm.so+0x2de34e]
#
# An error report file with more information is saved as:
# /opt/scripts/tvhadoop/flume/flume-1.3.0/bin/hs_err_pid1463.log
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
#
+ exec /usr/local/jdk/bin/java -Xmx2048m 
-Dflume.root.logger=INFO,console -cp 
'/opt/tvhadoop/apache-flume-1.3.1-bin/conf:/opt/tvhadoop/apache-flume-1.3.1-bin/lib/*' 
-Djava.library.path= org.apache.flume.node.Application -f 
/opt/tvhadoop/apache-flume-1.3.1-bin/conf/flume_2013031106.conf -n a1

    The jvm dump file is in the attachments.
    I wonder how to handle this problem.
    My other question is about ExecSource: I don't know why it loses data
when batchSize > 1. If I use a file channel, I need a large batchSize to
achieve the required throughput...
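
    My guess at what happens: ExecSource only delivers events to the
channel once a full batch has accumulated, so whatever is still buffered
when the process is killed or the file ends can be dropped. As I understand
it, the FLUME-1819 patch adds a time-bounded flush, roughly like this (a
sketch with illustrative names, not the actual patch):
==================
// Sketch of a time-bounded batch flush for an ExecSource-style reader
// (illustrative names; the idea behind FLUME-1819, not the patch itself).
import java.util.ArrayList;
import java.util.List;

import org.apache.flume.Event;
import org.apache.flume.channel.ChannelProcessor;
import org.apache.flume.event.EventBuilder;

public class BatchFlushSketch {
    private final ChannelProcessor channelProcessor;
    private final int batchSize;
    private final long batchTimeoutMillis;
    private final List<Event> batch = new ArrayList<Event>();
    private long lastFlushTime = System.currentTimeMillis();

    public BatchFlushSketch(ChannelProcessor channelProcessor,
                            int batchSize, long batchTimeoutMillis) {
        this.channelProcessor = channelProcessor;
        this.batchSize = batchSize;
        this.batchTimeoutMillis = batchTimeoutMillis;
    }

    // Called for every line read from the tailed process's stdout.
    public void onLine(String line) {
        batch.add(EventBuilder.withBody(line.getBytes()));
        // Flush on a full batch OR when a partial batch has waited too long.
        // The second condition keeps the tail of an hourly file from sitting
        // in memory indefinitely. (A real source would also run this check
        // on a timer thread, so a quiet stream still flushes.)
        if (batch.size() >= batchSize
                || System.currentTimeMillis() - lastFlushTime >= batchTimeoutMillis) {
            flush();
        }
    }

    // Must also be called from the source's stop() so a partial batch is
    // not dropped when the process is killed.
    public void flush() {
        if (!batch.isEmpty()) {
            channelProcessor.processEventBatch(batch);
            batch.clear();
        }
        lastFlushTime = System.currentTimeMillis();
    }
}
==================
If something like this worked reliably I could keep batchSize = 100 with a
file channel; for now batchSize = 1 is the only setting that doesn't drop
the tail for me.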

Thanks & Best regards,
larry






Re: JVM error while collecting from hourly rotated logs with flume-ng

Posted by larryzhang <zh...@gmail.com>.
Great. I updated the JVM to 1.6.0_31 yesterday, and it has been working
well so far. Thanks a lot.
On 03/12/2013 12:07 AM, Brock Noland wrote:
> You are using a known bad jvm version. I would upgrade: 
> http://wiki.apache.org/hadoop/HadoopJavaVersions


Re: JVM error while collecting from hourly rotated logs with flume-ng

Posted by Brock Noland <br...@cloudera.com>.
You are using a known bad jvm version. I would upgrade:
http://wiki.apache.org/hadoop/HadoopJavaVersions



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/