Posted to dev@flume.apache.org by "Thiruvalluvan M. G." <th...@yahoo.com> on 2013/04/26 12:30:06 UTC

Review Request: Additional configuration parameters for HDFSSink

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10606/
-----------------------------------------------------------

Review request for Flume.


Description
-------

This patch adds configuration parameters to the HDFS Sink for choosing the HDFS block size, the HDFS replication factor, and the buffer size. These can now be set on a per-sink basis.


This addresses bug FLUME-2003.
    https://issues.apache.org/jira/browse/FLUME-2003


Diffs
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 693c0d7 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java ff4f223 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 0c618b5 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java c87fafe 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 1a401d6 

Diff: https://reviews.apache.org/r/10606/diff/


Testing
-------

I've tested this as follows:

(1) When the new parameters are not specified, things work as before.
(2) When the new parameters are specified, the new HDFS files have the specified block size, replication, or both.


Thanks,

Thiruvalluvan M. G.


Re: Review Request: Additional configuration parameters for HDFSSink

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10606/#review20293
-----------------------------------------------------------


Hi Thiru, sorry I took so long to get back to this.

This looks great except for a couple of things:
1. This change will break on a federated filesystem; see https://issues.apache.org/jira/browse/HADOOP-8014. It's a pain, but to support this we need to use reflection to check whether the underlying implementation provides getDefaultReplication(Path) and getDefaultBlockSize(Path), and if it does, call those instead of the no-arg forms.
2. Needs user doc for "hdfs.hdfsDfsBlockSize"

For #1, check out https://issues.apache.org/jira/browse/FLUME-2027 for how we did the runtime check for getDefaultReplication()
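
To illustrate the kind of runtime check meant in #1, here is a minimal sketch. It is not the FLUME-2027 code: Path, LegacyFs, FederatedFs, ReplicationProbe and defaultFor are hypothetical stand-ins so the example runs without Hadoop on the classpath. Only the method names getDefaultReplication and getDefaultBlockSize come from the discussion above.

```java
import java.lang.reflect.Method;

class ReplicationProbe {
    // Hypothetical stand-in for org.apache.hadoop.fs.Path.
    public static final class Path {
        final String value;
        public Path(String value) { this.value = value; }
    }

    // Hypothetical stand-in for an older FileSystem that only has the no-arg forms.
    public static class LegacyFs {
        public short getDefaultReplication() { return 3; }
        public long getDefaultBlockSize() { return 64L * 1024 * 1024; }
    }

    // Hypothetical stand-in for a FileSystem that also has the Path-aware forms.
    public static class FederatedFs extends LegacyFs {
        public short getDefaultReplication(Path p) { return 2; }
        public long getDefaultBlockSize(Path p) { return 128L * 1024 * 1024; }
    }

    // Prefer methodName(Path) when the implementation has it; otherwise
    // fall back to the no-arg methodName().
    static long defaultFor(Object fs, String methodName, Path path) {
        try {
            Object value;
            try {
                Method withPath = fs.getClass().getMethod(methodName, Path.class);
                value = withPath.invoke(fs, path);
            } catch (NoSuchMethodException e) {
                Method noArg = fs.getClass().getMethod(methodName);
                value = noArg.invoke(fs);
            }
            return ((Number) value).longValue();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot resolve " + methodName, e);
        }
    }

    public static void main(String[] args) {
        Path p = new Path("/flume/events");
        System.out.println(defaultFor(new FederatedFs(), "getDefaultReplication", p));
        System.out.println(defaultFor(new LegacyFs(), "getDefaultReplication", p));
    }
}
```

In the real sink the lookup would run against the FileSystem instance the writer opens, and the Method handles could be resolved once and cached rather than looked up on every file open.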


- Mike Percy

