You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flume.apache.org by Hari Shreedharan <hs...@cloudera.com> on 2014/04/25 03:17:21 UTC

Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/
-----------------------------------------------------------

Review request for Flume.


Bugs: FLUME-2357
    https://issues.apache.org/jira/browse/FLUME-2357


Repository: flume-git


Description
-------

Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.


Diffs
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 
  pom.xml 2aa0ad1 

Diff: https://reviews.apache.org/r/20698/diff/


Testing
-------

Added new unit test. All current tests pass.


Thanks,

Hari Shreedharan


Re: Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

Posted by Hari Shreedharan <hs...@cloudera.com>.

> On April 28, 2014, 6:34 a.m., Mike Percy wrote:
> > flume-ng-doc/sphinx/FlumeUserGuide.rst, line 1600
> > <https://reviews.apache.org/r/20698/diff/2/?file=568296#file568296line1600>
> >
> >     AFAICT, "Unlimited" is not actually a valid value to have in a config file, so I think people may get confused when trying to specify it. How about using 0 as the default? Of course, we'd need to indicate in the docs that 0 means it will retry forever. I'd also humbly suggest a rename of this property to "maxCloseAttempts" if we do the above, since it should be intuitive that 0 would not make sense for max # of close attempts.
> >     
> >     I do agree that conceptually, unlimited is the right default.
> >     
> >     Minor nit: end-of-line whitespace on this line

The reason I used hdfs.closeTries is that the parameter existed previously (it was parsed in the AbstractHDFSWriter class). Since that parameter did the same thing and has been removed in this patch - I want to make sure the old parameter works as it used to.


> On April 28, 2014, 6:34 a.m., Mike Percy wrote:
> > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java, line 636
> > <https://reviews.apache.org/r/20698/diff/2/?file=568298#file568298line636>
> >
> >     Worth a comment that the BucketWriter may be kept alive, and that this method may be called by the scheduledClose Callable on the timedRollerPool even after the BucketWriter is closed.
> >     
> >     However, since there is an implicit ref to the BucketWriter from the Callable, I don't see the point of copying in all these args when we could get them from the BucketWriter object.

The reason is that when the BucketWriter#close method is called due to count or size based rotation, we re-use the same bucket writer (we close and then immediately open), which means that the bucketPath and targetPath fields of the bucketwriter instance now changes to the path and final path of the new file. So if an old file failed to close/rename, we need to make sure we can rename that even later. I will note that in a comment.


- Hari


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/#review41576
-----------------------------------------------------------


On April 28, 2014, 9:36 p.m., Hari Shreedharan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20698/
> -----------------------------------------------------------
> 
> (Updated April 28, 2014, 9:36 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2357
>     https://issues.apache.org/jira/browse/FLUME-2357
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 87918d1 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 
>   pom.xml 2aa0ad1 
> 
> Diff: https://reviews.apache.org/r/20698/diff/
> 
> 
> Testing
> -------
> 
> Added new unit test. All current tests pass.
> 
> 
> Thanks,
> 
> Hari Shreedharan
> 
>


Re: Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/#review41576
-----------------------------------------------------------



flume-ng-doc/sphinx/FlumeUserGuide.rst
<https://reviews.apache.org/r/20698/#comment75097>

    AFAICT, "Unlimited" is not actually a valid value to have in a config file, so I think people may get confused when trying to specify it. How about using 0 as the default? Of course, we'd need to indicate in the docs that 0 means it will retry forever. I'd also humbly suggest a rename of this property to "maxCloseAttempts" if we do the above, since it should be intuitive that 0 would not make sense for max # of close attempts.
    
    I do agree that conceptually, unlimited is the right default.
    
    Minor nit: end-of-line whitespace on this line



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/20698/#comment75101>

    I don't think a single static instance is going to work with different FileSystem impls, say in a federated case, or in a situation where we have some sinks writing to HDFS and some sinks writing to S3 or a local FS.
    
    Style note: All this detection / reflection stuff should be factored into a subroutine (this method is already long enough).
    
    Because we are not able to use a JVM static for the Method, let's also avoid doing the reflection under the static lock.



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/20698/#comment75102>

    Please factor creation of the callable into a subroutine as well, so we don't make close() greater than a screenful of code.



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/20698/#comment75104>

    Hmm, how about "== false"?
    
    Also, this would read more easily if isClosedMethod.invoke(...) was factored into a private method that could be called more naturally... directly invoking on the reflection stuff is additionally awkward.



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/20698/#comment75103>

    The logic is a little wonky here... if there is no close failure but there is a rename failure, this will potentially retry forever. Either the failure scenarios need to be separated out, or this should respect the max close tries config param.



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/20698/#comment75107>

    Do we need to null this out? There should not be a circular ref from FS to this Callable, right?



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/20698/#comment75113>

    s/minutes/seconds/



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/20698/#comment75116>

    Why are we nulling-out fileSystem all over the place? I may actually be guilty of doing this sillyness in the past.. but we should get away from this.
    
    I think we should consider this instance of FileSystem to be owned by BucketWriter, and be aggressive about getting rid of refs to BucketWriter from stuff like the HDFSSink when they are no longer needed.



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/20698/#comment75118>

    Worth a comment that the BucketWriter may be kept alive, and that this method may be called by the scheduledClose Callable on the timedRollerPool even after the BucketWriter is closed.
    
    However, since there is an implicit ref to the BucketWriter from the Callable, I don't see the point of copying in all these args when we could get them from the BucketWriter object.



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/20698/#comment75115>

    Nit: This could be <Void> typed and be forced to return null



flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java
<https://reviews.apache.org/r/20698/#comment75119>

    These tests have a lot of copy/paste, can this be reduced?


- Mike Percy


On April 25, 2014, 5:11 a.m., Hari Shreedharan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20698/
> -----------------------------------------------------------
> 
> (Updated April 25, 2014, 5:11 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2357
>     https://issues.apache.org/jira/browse/FLUME-2357
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 87918d1 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 
>   pom.xml 2aa0ad1 
> 
> Diff: https://reviews.apache.org/r/20698/diff/
> 
> 
> Testing
> -------
> 
> Added new unit test. All current tests pass.
> 
> 
> Thanks,
> 
> Hari Shreedharan
> 
>


Re: Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

Posted by Hari Shreedharan <hs...@cloudera.com>.

> On April 28, 2014, 10:20 p.m., Mike Percy wrote:
> > flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java, line 1367
> > <https://reviews.apache.org/r/20698/diff/2-3/?file=568309#file568309line1367>
> >
> >     What is this testing? Did you mean to assert on retry interval here?

This actually tests that a negative retry interval just disables retries, which means the total number of tries would be 1.


> On April 28, 2014, 10:20 p.m., Mike Percy wrote:
> > pom.xml, line 52
> > <https://reviews.apache.org/r/20698/diff/3/?file=569688#file569688line52>
> >
> >     How about a separate patch for this?

Will do.


- Hari


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/#review41642
-----------------------------------------------------------


On April 28, 2014, 9:36 p.m., Hari Shreedharan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20698/
> -----------------------------------------------------------
> 
> (Updated April 28, 2014, 9:36 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2357
>     https://issues.apache.org/jira/browse/FLUME-2357
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 87918d1 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 
>   pom.xml 2aa0ad1 
> 
> Diff: https://reviews.apache.org/r/20698/diff/
> 
> 
> Testing
> -------
> 
> Added new unit test. All current tests pass.
> 
> 
> Thanks,
> 
> Hari Shreedharan
> 
>


Re: Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/#review41642
-----------------------------------------------------------



flume-ng-doc/sphinx/FlumeUserGuide.rst
<https://reviews.apache.org/r/20698/#comment75196>

    How about: "If set to 1, this sink will not re-try a failed close (due to, for example, NameNode or DataNode failure), and may leave the file in an open state with a .tmp extension"
    
    Negative numbers don't really make sense, and I believe the impl treats them the same as 0 so probably documenting that doesn't really add clarity



flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java
<https://reviews.apache.org/r/20698/#comment75195>

    What is this testing? Did you mean to assert on retry interval here?



pom.xml
<https://reviews.apache.org/r/20698/#comment75197>

    How about a separate patch for this?


- Mike Percy


On April 28, 2014, 9:36 p.m., Hari Shreedharan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20698/
> -----------------------------------------------------------
> 
> (Updated April 28, 2014, 9:36 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2357
>     https://issues.apache.org/jira/browse/FLUME-2357
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 87918d1 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 
>   pom.xml 2aa0ad1 
> 
> Diff: https://reviews.apache.org/r/20698/diff/
> 
> 
> Testing
> -------
> 
> Added new unit test. All current tests pass.
> 
> 
> Thanks,
> 
> Hari Shreedharan
> 
>


Re: Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/#review41651
-----------------------------------------------------------

Ship it!


LGTM

- Mike Percy


On April 28, 2014, 10:55 p.m., Hari Shreedharan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20698/
> -----------------------------------------------------------
> 
> (Updated April 28, 2014, 10:55 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2357
>     https://issues.apache.org/jira/browse/FLUME-2357
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 87918d1 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 
> 
> Diff: https://reviews.apache.org/r/20698/diff/
> 
> 
> Testing
> -------
> 
> Added new unit test. All current tests pass.
> 
> 
> Thanks,
> 
> Hari Shreedharan
> 
>


Re: Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

Posted by Hari Shreedharan <hs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/
-----------------------------------------------------------

(Updated April 28, 2014, 10:55 p.m.)


Review request for Flume.


Bugs: FLUME-2357
    https://issues.apache.org/jira/browse/FLUME-2357


Repository: flume-git


Description
-------

Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.


Diffs (updated)
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 87918d1 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 

Diff: https://reviews.apache.org/r/20698/diff/


Testing
-------

Added new unit test. All current tests pass.


Thanks,

Hari Shreedharan


Re: Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

Posted by Hari Shreedharan <hs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/
-----------------------------------------------------------

(Updated April 28, 2014, 10:33 p.m.)


Review request for Flume.


Bugs: FLUME-2357
    https://issues.apache.org/jira/browse/FLUME-2357


Repository: flume-git


Description
-------

Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.


Diffs (updated)
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 87918d1 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 

Diff: https://reviews.apache.org/r/20698/diff/


Testing
-------

Added new unit test. All current tests pass.


Thanks,

Hari Shreedharan


Re: Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

Posted by Hari Shreedharan <hs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/
-----------------------------------------------------------

(Updated April 28, 2014, 9:36 p.m.)


Review request for Flume.


Bugs: FLUME-2357
    https://issues.apache.org/jira/browse/FLUME-2357


Repository: flume-git


Description
-------

Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.


Diffs (updated)
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 87918d1 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 
  pom.xml 2aa0ad1 

Diff: https://reviews.apache.org/r/20698/diff/


Testing
-------

Added new unit test. All current tests pass.


Thanks,

Hari Shreedharan


Re: Review Request 20698: FLUME-2357. HDFS sink should retry closing files that previously had close errors

Posted by Hari Shreedharan <hs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20698/
-----------------------------------------------------------

(Updated April 25, 2014, 5:11 a.m.)


Review request for Flume.


Changes
-------

Added more tests. Removed some unused imports


Bugs: FLUME-2357
    https://issues.apache.org/jira/browse/FLUME-2357


Repository: flume-git


Description
-------

Much of the size of the patch is due to a couple of file renames. Otherwise the patch itself is pretty simple. In the Bucketwriter, if a close fails, we simply reschedule the close to happen sometime later until it finally succeeds or till we hit a maximum count. Added a test case too. This depends on the presence of the isFileClosed method in the HDFS client API. If the method is absent, reattempts are not done.


Diffs (updated)
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 7b918ed 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java da0466d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java e82d13d 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 5518547 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java e20d1ee 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 4ea78c1 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 5fe9f1b 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockDataStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystem.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFileSystemCloseRetryWrapper.java b5d89e6 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStream.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockFsDataOutputStreamCloseRetryWrapper.java 1d8c140 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java b7cc586 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 87918d1 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestUseRawLocalFileSystem.java 4476530 
  pom.xml 2aa0ad1 

Diff: https://reviews.apache.org/r/20698/diff/


Testing
-------

Added new unit test. All current tests pass.


Thanks,

Hari Shreedharan