You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flume.apache.org by Mike Percy <mp...@apache.org> on 2013/02/28 07:52:11 UTC

Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/
-----------------------------------------------------------

Review request for Flume.


Description
-------

This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.


This addresses bug FLUME-1916.
    https://issues.apache.org/jira/browse/FLUME-1916


Diffs
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 0786857 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java ebe277c 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 

Diff: https://reviews.apache.org/r/9675/diff/


Testing
-------

Added unit test.


Thanks,

Mike Percy


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Mike Percy <mp...@apache.org>.

> On March 1, 2013, 4:47 a.m., Hari Shreedharan wrote:
> > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java, line 430
> > <https://reviews.apache.org/r/9675/diff/2/?file=263764#file263764line430>
> >
> >     will not continue rolling files in this directory?

Fixed


> On March 1, 2013, 4:47 a.m., Hari Shreedharan wrote:
> > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java, line 440
> > <https://reviews.apache.org/r/9675/diff/2/?file=263764#file263764line440>
> >
> >     Sorry, I think I missed this one earlier. At this point doRotate is always going to be true. Was this logic meant to check if shouldRotate() returned true because of under replication? If it did, don't rotate if it is under replicated? If it is, then you need to set doRotate to false in the if block above.

Wow, good catch! Fixed.


> On March 1, 2013, 4:47 a.m., Hari Shreedharan wrote:
> > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java, line 484
> > <https://reviews.apache.org/r/9675/diff/2/?file=263764#file263764line484>
> >
> >     One more thing I missed here. If this a previous file owned by this bucket writer was under replicated, then isUnderReplicated is never reset to false, causing every following file to be rotated prematurely. It should either be done here or in the if(shouldRotate()) {..} block.

Added to open() call


- Mike


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/#review17224
-----------------------------------------------------------


On March 1, 2013, 5:44 a.m., Mike Percy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9675/
> -----------------------------------------------------------
> 
> (Updated March 1, 2013, 5:44 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.
> 
> 
> This addresses bug FLUME-1916.
>     https://issues.apache.org/jira/browse/FLUME-1916
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java cdc37f6 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 99b6150 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 
> 
> Diff: https://reviews.apache.org/r/9675/diff/
> 
> 
> Testing
> -------
> 
> Added unit test.
> 
> 
> Thanks,
> 
> Mike Percy
> 
>


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Hari Shreedharan <hs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/#review17224
-----------------------------------------------------------



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/9675/#comment36594>

    will not continue rolling files in this directory?



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/9675/#comment36592>

    Sorry, I think I missed this one earlier. At this point doRotate is always going to be true. Was this logic meant to check if shouldRotate() returned true because of under replication? If it did, don't rotate if it is under replicated? If it is, then you need to set doRotate to false in the if block above.



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/9675/#comment36593>

    One more thing I missed here. If this a previous file owned by this bucket writer was under replicated, then isUnderReplicated is never reset to false, causing every following file to be rotated prematurely. It should either be done here or in the if(shouldRotate()) {..} block.


- Hari Shreedharan


On March 1, 2013, 1:53 a.m., Mike Percy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9675/
> -----------------------------------------------------------
> 
> (Updated March 1, 2013, 1:53 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.
> 
> 
> This addresses bug FLUME-1916.
>     https://issues.apache.org/jira/browse/FLUME-1916
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java cdc37f6 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 99b6150 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 
> 
> Diff: https://reviews.apache.org/r/9675/diff/
> 
> 
> Testing
> -------
> 
> Added unit test.
> 
> 
> Thanks,
> 
> Mike Percy
> 
>


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Hari Shreedharan <hs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/#review17236
-----------------------------------------------------------

Ship it!


Ship It!

- Hari Shreedharan


On March 1, 2013, 6:38 a.m., Mike Percy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9675/
> -----------------------------------------------------------
> 
> (Updated March 1, 2013, 6:38 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.
> 
> 
> This addresses bug FLUME-1916.
>     https://issues.apache.org/jira/browse/FLUME-1916
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java cdc37f6 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 99b6150 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 
> 
> Diff: https://reviews.apache.org/r/9675/diff/
> 
> 
> Testing
> -------
> 
> Added unit test.
> 
> 
> Thanks,
> 
> Mike Percy
> 
>


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/
-----------------------------------------------------------

(Updated March 1, 2013, 6:38 a.m.)


Review request for Flume.


Description
-------

This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.


This addresses bug FLUME-1916.
    https://issues.apache.org/jira/browse/FLUME-1916


Diffs (updated)
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java cdc37f6 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 99b6150 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 

Diff: https://reviews.apache.org/r/9675/diff/


Testing
-------

Added unit test.


Thanks,

Mike Percy


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Mike Percy <mp...@apache.org>.

> On March 1, 2013, 5:58 a.m., Hari Shreedharan wrote:
> > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java, lines 429-446
> > <https://reviews.apache.org/r/9675/diff/3/?file=263788#file263788line429>
> >
> >     I get that since consecutiveUnderReplCount is reset the counter gets reset on open, but you need to also set isUnderReplicated to false, otherwise once that is set to true, it is never reset to false and the consecutiveUnderReplCount can get incremented for every append following one rotation due to under replication.

Ugh, you are right. I tried to add a unit test for bringing back the DN but I couldn't get the NN to recognize it, so here is a fix without a unit test. I will try to add a working unit test for this condition in a follow up patch.


- Mike


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/#review17229
-----------------------------------------------------------


On March 1, 2013, 6:38 a.m., Mike Percy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9675/
> -----------------------------------------------------------
> 
> (Updated March 1, 2013, 6:38 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.
> 
> 
> This addresses bug FLUME-1916.
>     https://issues.apache.org/jira/browse/FLUME-1916
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java cdc37f6 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 99b6150 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 
> 
> Diff: https://reviews.apache.org/r/9675/diff/
> 
> 
> Testing
> -------
> 
> Added unit test.
> 
> 
> Thanks,
> 
> Mike Percy
> 
>


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Hari Shreedharan <hs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/#review17229
-----------------------------------------------------------



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
<https://reviews.apache.org/r/9675/#comment36607>

    I get that since consecutiveUnderReplCount is reset the counter gets reset on open, but you need to also set isUnderReplicated to false, otherwise once that is set to true, it is never reset to false and the consecutiveUnderReplCount can get incremented for every append following one rotation due to under replication.


- Hari Shreedharan


On March 1, 2013, 5:44 a.m., Mike Percy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9675/
> -----------------------------------------------------------
> 
> (Updated March 1, 2013, 5:44 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.
> 
> 
> This addresses bug FLUME-1916.
>     https://issues.apache.org/jira/browse/FLUME-1916
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java cdc37f6 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 99b6150 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 
> 
> Diff: https://reviews.apache.org/r/9675/diff/
> 
> 
> Testing
> -------
> 
> Added unit test.
> 
> 
> Thanks,
> 
> Mike Percy
> 
>


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/
-----------------------------------------------------------

(Updated March 1, 2013, 5:44 a.m.)


Review request for Flume.


Changes
-------

Updated patch per review


Description
-------

This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.


This addresses bug FLUME-1916.
    https://issues.apache.org/jira/browse/FLUME-1916


Diffs (updated)
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java cdc37f6 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 99b6150 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 

Diff: https://reviews.apache.org/r/9675/diff/


Testing
-------

Added unit test.


Thanks,

Mike Percy


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/
-----------------------------------------------------------

(Updated March 1, 2013, 1:53 a.m.)


Review request for Flume.


Changes
-------

Rebase & feedback from Hari


Description
-------

This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.


This addresses bug FLUME-1916.
    https://issues.apache.org/jira/browse/FLUME-1916


Diffs (updated)
-----

  flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java cdc37f6 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
  flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 99b6150 
  flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 

Diff: https://reviews.apache.org/r/9675/diff/


Testing
-------

Added unit test.


Thanks,

Mike Percy


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Mike Percy <mp...@apache.org>.

> On Feb. 28, 2013, 11:55 p.m., Hari Shreedharan wrote:
> > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java, lines 69-72
> > <https://reviews.apache.org/r/9675/diff/1/?file=263460#file263460line69>
> >
> >     nit: This getFsDesiredReplication call can be avoided if configuredMinReplicas != null by moving that call to an else block.

Done


> On Feb. 28, 2013, 11:55 p.m., Hari Shreedharan wrote:
> > flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java, lines 100-105
> > <https://reviews.apache.org/r/9675/diff/1/?file=263460#file263460line100>
> >
> >     Is this method really required? If fs is null it means the registerCurrentStream was not called (or the unregister method was called). Can fs become null in any other case?

I left it in for now... we might want to expose this to the BucketWriter for the future, I don't think it hurts anything


- Mike


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/#review17213
-----------------------------------------------------------


On March 1, 2013, 1:53 a.m., Mike Percy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9675/
> -----------------------------------------------------------
> 
> (Updated March 1, 2013, 1:53 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.
> 
> 
> This addresses bug FLUME-1916.
>     https://issues.apache.org/jira/browse/FLUME-1916
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java cdc37f6 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java 99b6150 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 
> 
> Diff: https://reviews.apache.org/r/9675/diff/
> 
> 
> Testing
> -------
> 
> Added unit test.
> 
> 
> Thanks,
> 
> Mike Percy
> 
>


Re: Review Request: FLUME-1916: HDFS sink should poll for # of active replicas. If less than required, roll the file

Posted by Hari Shreedharan <hs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9675/#review17213
-----------------------------------------------------------


+1. Looks good. I have a couple of comments which are nitpicks. If you think they need not be fixed, I will commit this patch.


flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/9675/#comment36504>

    nit: This getFsDesiredReplication call can be avoided if configuredMinReplicas != null by moving that call to an else block.



flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java
<https://reviews.apache.org/r/9675/#comment36507>

    Is this method really required? If fs is null it means the registerCurrentStream was not called (or the unregister method was called). Can fs become null in any other case?


- Hari Shreedharan


On Feb. 28, 2013, 6:52 a.m., Mike Percy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9675/
> -----------------------------------------------------------
> 
> (Updated Feb. 28, 2013, 6:52 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> This is based on https://issues.apache.org/jira/browse/HBASE-2234 ... essentially if datanodes are dying en masse then we want to close our files as we are writing so that we get new pipeline assignments from the namenode.
> 
> 
> This addresses bug FLUME-1916.
>     https://issues.apache.org/jira/browse/FLUME-1916
> 
> 
> Diffs
> -----
> 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5ac903e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/AbstractHDFSWriter.java PRE-CREATION 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 0786857 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSCompressedDataStream.java 1f3521e 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSDataStream.java 4ea4327 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 3bd25f4 
>   flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSWriter.java abca21f 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/MockHDFSWriter.java 5e8628b 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestBucketWriter.java ebe277c 
>   flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSinkOnMiniCluster.java 2e71069 
> 
> Diff: https://reviews.apache.org/r/9675/diff/
> 
> 
> Testing
> -------
> 
> Added unit test.
> 
> 
> Thanks,
> 
> Mike Percy
> 
>