You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flume.apache.org by "Wan Yi (武汉_技术部_搜索与精准化_万毅)" <wa...@yhd.com> on 2014/11/27 10:05:22 UTC
答复: Why failover sink processor does not work
I have investigated the HDFSEventSink source code, found if the exception was IOException , the exception would not throw to the upper layer,
So FailOverSinkProcessor would not mark this sink as dead.
} catch (IOException eIO) {
transaction.rollback();
LOG.warn("HDFS IO error", eIO);
return Status.BACKOFF;
} catch (Throwable th) {
transaction.rollback();
LOG.error("process failed", th);
if (th instanceof Error) {
throw (Error) th;
} else {
throw new EventDeliveryException(th);
}
}
发件人: Wan Yi(武汉_技术部_搜索与精准化_万毅) [mailto:wanyi@yhd.com]
发送时间: 2014年11月27日 16:02
收件人: user@flume.apache.org
主题: re: Why failover sink processor does not work
By the way, I use flume-1.4.0
Wayne Wan
发件人: Wan Yi(武汉_技术部_搜索与精准化_万毅) [mailto:wanyi@yhd.com]
发送时间: 2014年11月27日 15:57
收件人: user@flume.apache.org<ma...@flume.apache.org>
主题: Why failover sink processor does not work
I use hdfs to store our logs, but the failover processor seems does not work when I killed the hdfs cluster that used by the high priority sink(sinks1).
Below is my config
#### define agent
a1.sources = src1
a1.sinks = sinks1 sinks5
a1.channels = ch1
a1.sinkgroups = g1
#### defined the sink group
a1.sinkgroups.g1.sinks = sinks1 sinks5
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.sinks1 = 5
a1.sinkgroups.g1.processor.priority.sinks5 = 1
a1.sinkgroups.g1.processor.maxpenalty = 1000
#### define http source
a1.sources.src1.type = **
a1.sources.src1.port = 8081
a1.sources.src1.contextPath = /
a1.sources.src1.urlPattern = /t
a1.sources.src1.handler = **
a1.sources.src1.channels = ch1
#### define hdfs sink
a1.sinks.sinks1.type = hdfs
a1.sinks.sinks1.channel = ch1
a1.sinks.sinks1.hdfs.path = hdfs://host1:9000/user/hadoop/flume/ds=%y-%m-%d
a1.sinks.sinks1.hdfs.filePrefix = %{host}
a1.sinks.sinks1.hdfs.batchSize = 1000
a1.sinks.sinks1.hdfs.rollCount = 0
a1.sinks.sinks1.hdfs.rollSize = 0
a1.sinks.sinks1.hdfs.rollInterval = 300
a1.sinks.sinks1.hdfs.idleTimeout = 1800000
a1.sinks.sinks1.hdfs.callTimeout = 20000
a1.sinks.sinks1.hdfs.threadsPoolSize = 250
a1.sinks.sinks1.hdfs.writeFormat = Text
a1.sinks.sinks1.hdfs.fileType = DataStream
#### define hdfs sink
a1.sinks.sinks5.type = hdfs
a1.sinks.sinks5.channel = ch1
a1.sinks.sinks5.hdfs.path = hdfs://host2:8020/user/hadoop/flume/ds=%y-%m-%d
a1.sinks.sinks5.hdfs.filePrefix = %{host}
a1.sinks.sinks5.hdfs.batchSize = 1000
a1.sinks.sinks5.hdfs.rollCount = 0
a1.sinks.sinks5.hdfs.rollSize = 0
a1.sinks.sinks5.hdfs.rollInterval = 300
a1.sinks.sinks5.hdfs.idleTimeout = 1800000
a1.sinks.sinks5.hdfs.callTimeout = 20000
a1.sinks.sinks5.hdfs.threadsPoolSize = 250
a1.sinks.sinks5.hdfs.writeFormat = Text
a1.sinks.sinks5.hdfs.fileType = DataStream
#### define memory channel1
a1.channels.ch1.type = memory
a1.channels.ch1.capacity = 10000
a1.channels.ch1.transactionCapacity = 1000
Wayne Wan
Re: 答复: Why failover sink processor does not work
Posted by Arvind Prabhakar <ar...@streamsets.com>.
Hi Wayne,
Thanks for doing the investigation, this seems like a legitimate problem. I
have created an issue to track this (FLUME-2564
<https://issues.apache.org/jira/browse/FLUME-2564>). In case you already
have a patch to address this problem, please post that on the Jira.
Regards,
Arvind Prabhakar
On Thu, Nov 27, 2014 at 1:05 AM, Wan Yi(武汉_技术部_搜索与精准化_万毅) <wa...@yhd.com>
wrote:
> I have investigated the HDFSEventSink source code, found if the
> exception was IOException , the exception would not throw to the upper
> layer,
>
> So FailOverSinkProcessor would not mark this sink as dead.
>
>
>
> } *catch* (IOException eIO) {
>
> transaction.rollback();
>
> *LOG*.warn("HDFS IO error", eIO);
>
> *return* Status.*BACKOFF*;
>
> } *catch* (Throwable th) {
>
> transaction.rollback();
>
> *LOG*.error("process failed", th);
>
> *if* (th *instanceof* Error) {
>
> *throw* (Error) th;
>
> } *else* {
>
> *throw* *new* EventDeliveryException(th);
>
> }
>
> }
>
>
>
>
>
>
>
> *发件人:* Wan Yi(武汉_技术部_搜索与精准化_万毅) [mailto:wanyi@yhd.com]
> *发送时间:* 2014年11月27日 16:02
> *收件人:* user@flume.apache.org
> *主题:* re: Why failover sink processor does not work
>
>
>
> By the way, I use flume-1.4.0
>
>
>
>
>
>
>
> Wayne Wan
>
> *发件人:* Wan Yi(武汉_技术部_搜索与精准化_万毅) [mailto:wanyi@yhd.com <wa...@yhd.com>]
> *发送时间:* 2014年11月27日 15:57
> *收件人:* user@flume.apache.org
> *主题:* Why failover sink processor does not work
>
>
>
> I use hdfs to store our logs, but the failover processor seems does not
> work when I killed the hdfs cluster that used by the high priority
> sink(sinks1).
>
>
>
> *Below is my config*
>
>
>
> #### define agent
>
> a1.sources = src1
>
> a1.sinks = sinks1 sinks5
>
> a1.channels = ch1
>
> a1.sinkgroups = g1
>
> #### defined the sink group
>
> a1.sinkgroups.g1.sinks = sinks1 sinks5
>
> a1.sinkgroups.g1.processor.type = failover
>
> a1.sinkgroups.g1.processor.priority.sinks1 = 5
>
> a1.sinkgroups.g1.processor.priority.sinks5 = 1
>
> a1.sinkgroups.g1.processor.maxpenalty = 1000
>
>
>
> #### define http source
>
> a1.sources.src1.type = **
>
> a1.sources.src1.port = 8081
>
> a1.sources.src1.contextPath = /
>
> a1.sources.src1.urlPattern = /t
>
> a1.sources.src1.handler = **
>
> a1.sources.src1.channels = ch1
>
>
>
> #### define hdfs sink
>
> a1.sinks.sinks1.type = hdfs
>
> a1.sinks.sinks1.channel = ch1
>
> a1.sinks.sinks1.hdfs.path = hdfs://host1:9000/user/hadoop/flume/ds=%y-%m-%d
>
> a1.sinks.sinks1.hdfs.filePrefix = %{host}
>
> a1.sinks.sinks1.hdfs.batchSize = 1000
>
> a1.sinks.sinks1.hdfs.rollCount = 0
>
> a1.sinks.sinks1.hdfs.rollSize = 0
>
> a1.sinks.sinks1.hdfs.rollInterval = 300
>
> a1.sinks.sinks1.hdfs.idleTimeout = 1800000
>
> a1.sinks.sinks1.hdfs.callTimeout = 20000
>
> a1.sinks.sinks1.hdfs.threadsPoolSize = 250
>
> a1.sinks.sinks1.hdfs.writeFormat = Text
>
> a1.sinks.sinks1.hdfs.fileType = DataStream
>
>
>
> #### define hdfs sink
>
> a1.sinks.sinks5.type = hdfs
>
> a1.sinks.sinks5.channel = ch1
>
> a1.sinks.sinks5.hdfs.path = hdfs://host2:8020/user/hadoop/flume/ds=%y-%m-%d
>
> a1.sinks.sinks5.hdfs.filePrefix = %{host}
>
> a1.sinks.sinks5.hdfs.batchSize = 1000
>
> a1.sinks.sinks5.hdfs.rollCount = 0
>
> a1.sinks.sinks5.hdfs.rollSize = 0
>
> a1.sinks.sinks5.hdfs.rollInterval = 300
>
> a1.sinks.sinks5.hdfs.idleTimeout = 1800000
>
> a1.sinks.sinks5.hdfs.callTimeout = 20000
>
> a1.sinks.sinks5.hdfs.threadsPoolSize = 250
>
> a1.sinks.sinks5.hdfs.writeFormat = Text
>
> a1.sinks.sinks5.hdfs.fileType = DataStream
>
>
>
>
>
> #### define memory channel1
>
> a1.channels.ch1.type = memory
>
> a1.channels.ch1.capacity = 10000
>
> a1.channels.ch1.transactionCapacity = 1000
>
>
>
>
>
>
>
>
>
>
>
> Wayne Wan
>
>
>