Posted to dev@flume.apache.org by David Gates <da...@gmail.com> on 2014/06/18 21:57:30 UTC

Getting java.lang.OutOfMemoryError: GC overhead limit exceeded with Morphline Interceptor

While testing whether I can get the Morphline Interceptor to convert some
timestamps in log files, I am getting a Java "GC overhead limit exceeded"
error.

Command:

flume-ng agent --conf ./conf -f testflume.conf -Dflume.root.logger=DEBUG,console -n agent

testflume.conf:

agent.channels.memory-channel.type = memory
agent.sources.spool-source.type = spooldir
agent.sources.spool-source.spoolDir = /home/impala/spool/
agent.sources.spool-source.channels = memory-channel
agent.sources.spool-source.interceptors = morphlineinterceptor
agent.sources.spool-source.interceptors.morphlineinterceptor.type = org.apache.flume.sink.solr.morphline.MorphlineInterceptor$Builder
agent.sources.spool-source.interceptors.morphlineinterceptor.morphlineFile = /root/morphline.conf
agent.sources.spool-source.interceptors.morphlineinterceptor.morphlineId = morphline1
agent.sinks.hdfs-sink.channel = memory-channel
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = /user/impala/
agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.channels = memory-channel
agent.sources = spool-source
agent.sinks = hdfs-sink

morphline.conf:

morphlines : [
        {
                id : morphline1
                importCommands : ["com.cloudera.**"]

                commands : [
                        {
                                readCSV {
                                        separator: ";"
                                        trim: true
                                        columns:
[Header1,Header2,Header3,ConnectionType,SessionID,ReleaseCause,StartTime,AnswerTime,ReleaseTime,MinutesWest,ReleaseCauseProto,ReleaseCauseNum,FirstReleaseDialogue,TrunkIDOrig,VOIPProtoOrig,SourceNumOrig,SourceHostOrig,DestNumOrig,DestHostOrig,OrigCallID,OrigRemotePayloadIP,OrigRemotePayloadUDP,OrigLocalPayloadIP,OrigLocalPayloadUDP,OrigCodecList,OrigIngressPackets,OrigEgressPackets,OrigIngressOctets,OrigEgressOctets,OrigIngressPacketLoss,OrigIngressDelay,OrigIngressJitter,TrunkIDTerm,VOIPProtoTerm,SourceNumTerm,SourceHostTerm,DestNumTerm,DestHostTerm,TermCallID,TermRemotePayloadIP,TermRemotePayloadUDP,TermLocalPayloadIP,TermLocalPayloadUDP,TermCodecList,TermIngressPackets,TermEgressPackets,TermIngressOctets,TermEgressOctets,TermIngpressPacketLoss,TermIngressDelay,TermIngressJitter,FinalRouteIndication,RoutingDigits,CallDurSec,PostDialDelaySec,RingTimeSec,DurMiniSec,ConfID,RPIDANI,RouteEntryIndex,RouteTable,LNPDip,IngressLRN,EgressLRN,CNAMDip,DNCDip,OrigTrunkAlias,TermTrunkAlias,ERSDip,OLIDigits]
                                }
                        }
                ]
        }
]


I originally also had a convertTimestamp command in there but removed it
to troubleshoot.

The error I get when running is as follows:

14/05/21 08:37:23 INFO api.MorphlineContext: Importing commands
14/05/21 08:37:31 ERROR node.PollingPropertiesFileConfigurationProvider: Unhandled error
java.lang.OutOfMemoryError: GC overhead limit exceeded


I've tried googling, but I can't find anything specific to Flume/Morphline
with "GC overhead limit exceeded". Any help or ideas would be appreciated.

Re: Getting java.lang.OutOfMemoryError: GC overhead limit exceeded with Morphline Interceptor

Posted by Wolfgang Hoschek <wh...@cloudera.com>.
The default memory settings for Flume are extremely low. Try giving the agent more Java heap memory.
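
One way to do that, as a minimal sketch: the flume-ng launcher sources flume-env.sh from the directory passed via --conf (./conf in the command above) if that file exists, and JAVA_OPTS set there is passed to the agent's JVM. The heap sizes below are illustrative assumptions, not tuned values; pick numbers that fit the machine and the data volume.

```shell
# conf/flume-env.sh
# Sourced by the flume-ng launcher when --conf points at this directory.
# Heap sizes are assumed example values: 256 MB initial, 1 GB maximum.
export JAVA_OPTS="-Xms256m -Xmx1g"
```

After adding this, rerun the same flume-ng command with --conf ./conf. If the error persists, raise -Xmx further before looking elsewhere.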
