Posted to dev@flume.apache.org by "Mike Percy (JIRA)" <ji...@apache.org> on 2014/07/03 07:33:51 UTC

[jira] [Commented] (FLUME-2410) Support HCFS Semantics for sinks/sources (agent.sinks.hdfs-sink.hcfs.path)

    [ https://issues.apache.org/jira/browse/FLUME-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051084#comment-14051084 ] 

Mike Percy commented on FLUME-2410:
-----------------------------------

Agreed, it would be better to remove the .hdfs. segment (by default) from the HDFS sink configuration parameters. It can be done; however, backwards compatibility should be maintained, and we should support both forms for at least one release cycle.
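
One way to support both forms during a transition is to look up the new, un-prefixed parameter name first and fall back to the legacy hdfs.-prefixed name. The sketch below is only an illustration under that assumption: LegacyKeyResolver, its resolve method, and the warning text are hypothetical and are not part of Flume.

```java
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical helper (not a Flume class): resolves a sink parameter by
 * trying the new un-prefixed key first, then the legacy "hdfs." key.
 */
final class LegacyKeyResolver {
    // Assumed legacy prefix, taken from keys such as "hdfs.path".
    static final String LEGACY_PREFIX = "hdfs.";

    private LegacyKeyResolver() {}

    /**
     * @param props sink-level properties, e.g. {"hdfs.path" -> "glusterfs:///tmp/flume-test"}
     * @param name  parameter name without the legacy prefix, e.g. "path"
     * @return the configured value, preferring the new key over the legacy one
     */
    static Optional<String> resolve(Map<String, String> props, String name) {
        String value = props.get(name);
        if (value != null) {
            return Optional.of(value);
        }
        String legacyKey = LEGACY_PREFIX + name;
        String legacyValue = props.get(legacyKey);
        if (legacyValue != null) {
            // Warn so users can migrate before the legacy form is dropped.
            System.err.println("Deprecated key '" + legacyKey + "'; use '" + name + "' instead");
            return Optional.of(legacyValue);
        }
        return Optional.empty();
    }
}
```

With this approach an existing configuration that sets only hdfs.path keeps working, while a configuration that sets both forms uses the new one.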

> Support HCFS Semantics for sinks/sources (agent.sinks.hdfs-sink.hcfs.path)
> --------------------------------------------------------------------------
>
>                 Key: FLUME-2410
>                 URL: https://issues.apache.org/jira/browse/FLUME-2410
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>            Reporter: jay vyas
>            Priority: Minor
>
> Flume has been designed to support any Hadoop Compatible File System (HCFS), but it hard-codes semantics for HDFS.
> For example, we can build sinks that work well with different Hadoop-compatible file systems. The following configuration will write to GlusterFS if the GlusterFS Hadoop plugin is enabled:
> {noformat}
>         agent.channels.memory-channel.type = memory
>         agent.channels.memory-channel.capacity = 2000
>         agent.channels.memory-channel.transactionCapacity = 100
>         agent.sources.tail-source.type = exec
>         agent.sources.tail-source.command = tail -F /tmp/flume-smoke
>         agent.sources.tail-source.channels = memory-channel
>         agent.sinks.log-sink.channel = memory-channel
>         agent.sinks.log-sink.type = logger
>         # Define a sink that outputs to the DFS
>         agent.sinks.hdfs-sink.channel = memory-channel
>         agent.sinks.hdfs-sink.type = hdfs
>         agent.sinks.hdfs-sink.hdfs.path = glusterfs:///tmp/flume-test
>         agent.sinks.hdfs-sink.hdfs.fileType = DataStream
>         # activate the channels/sinks/sources
>         agent.channels = memory-channel
>         agent.sources = tail-source
>         agent.sinks = log-sink hdfs-sink
> {noformat}
> Similar streams would exist for S3 and so on, as long as the file system is configured properly in Hadoop (core-site.xml).
> Since these are examples of Hadoop-compatible file systems, and Flume clearly supports them, Flume should support {{agent.sinks.HCFS-sink.hcfs.path}} as the way of defining these streams, and possibly deprecate the {{agent.sinks.hdfs-sink.hdfs.path}} semantics, because the current naming misleadingly suggests that HDFS is the only Hadoop-compatible file system that Flume supports.
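
The reason the path value alone decides which file system is used can be sketched briefly. Hadoop commonly selects a file system implementation from the scheme of the path URI, and one lookup it supports is a configuration key of the form fs.<scheme>.impl in core-site.xml. The class and method names below are hypothetical; the sketch uses only java.net.URI and does not call the Hadoop API.

```java
import java.net.URI;

/**
 * Hypothetical illustration (not Hadoop or Flume code): shows how the
 * scheme of a sink path maps to a per-scheme configuration key.
 */
final class SchemeLookup {
    private SchemeLookup() {}

    /** Returns the URI scheme of a sink path, e.g. "glusterfs". */
    static String scheme(String path) {
        String scheme = URI.create(path).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("Path has no scheme: " + path);
        }
        return scheme;
    }

    /** Builds the per-scheme key of the form fs.<scheme>.impl. */
    static String implKey(String path) {
        return "fs." + scheme(path) + ".impl";
    }
}
```

For the path in the configuration above, scheme("glusterfs:///tmp/flume-test") yields "glusterfs", so nothing about the lookup is specific to HDFS even though the parameter is named hdfs.path.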



--
This message was sent by Atlassian JIRA
(v6.2#6252)