Posted to dev@flume.apache.org by "Brock Noland (Created) (JIRA)" <ji...@apache.org> on 2012/02/21 19:41:48 UTC
[jira] [Created] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
All HDFS Operations in HDFSEventSink should have a timeout
----------------------------------------------------------
Key: FLUME-985
URL: https://issues.apache.org/jira/browse/FLUME-985
Project: Flume
Issue Type: Improvement
Components: Sinks+Sources
Affects Versions: v1.0.0
Reporter: Brock Noland
Assignee: Brock Noland
In FLUME-871, appends were made asynchronous so that we could time them out. All HDFS operations should be done the same way.
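As a rough illustration of the idea in the description, a blocking call can be handed to a worker thread and abandoned if it does not finish in time. The sketch below is only an assumption about the general shape, built on java.util.concurrent; the names TimedCall and TimedCallDemo are hypothetical and are not taken from the Flume patch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not Flume's actual code: runs a blocking call on a
// worker thread and stops waiting for it after a fixed timeout.
final class TimedCall {
  private final ExecutorService pool;
  private final long timeoutMillis;

  TimedCall(ExecutorService pool, long timeoutMillis) {
    this.pool = pool;
    this.timeoutMillis = timeoutMillis;
  }

  <T> T call(Callable<T> op) throws Exception {
    Future<T> future = pool.submit(op);
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // Interrupt the worker; a truly hung call may ignore the interrupt.
      future.cancel(true);
      throw e;
    } catch (ExecutionException e) {
      // Surface the operation's own failure rather than the wrapper.
      Throwable cause = e.getCause();
      if (cause instanceof Exception) {
        throw (Exception) cause;
      }
      throw e;
    }
  }
}

final class TimedCallDemo {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newCachedThreadPool();
    TimedCall timed = new TimedCall(pool, 200);
    try {
      String fast = timed.call(() -> "fast op finished");
      System.out.println(fast);
      // Stand-in for an HDFS call that hangs.
      timed.call(() -> { Thread.sleep(5_000); return "never"; });
    } catch (TimeoutException e) {
      System.out.println("slow op timed out");
    } finally {
      pool.shutdownNow();
    }
  }
}
```

A thread pool is used here rather than a single worker thread so that one stuck call does not block the calls that follow it.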
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "Brock Noland (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brock Noland updated FLUME-985:
-------------------------------
Status: Patch Available (was: Open)
Marking "Patch Available"
[jira] [Commented] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237206#comment-13237206 ]
Hudson commented on FLUME-985:
------------------------------
Integrated in flume-trunk #143 (See [https://builds.apache.org/job/flume-trunk/143/])
FLUME-985. All HDFS Operations should have a timeout.
(Brock Noland via Arvind Prabhakar) (Revision 1304600)
Result = SUCCESS
arvind : http://svn.apache.org/viewvc/?view=rev&rev=1304600
Files :
* /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/pom.xml
* /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java
* /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java
* /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java
* /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSBadSeqWriter.java
* /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSBadWriterFactory.java
* /incubator/flume/trunk/flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java
[jira] [Updated] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "Brock Noland (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brock Noland updated FLUME-985:
-------------------------------
Attachment: FLUME-985-1.patch
Rebased patch is attached.
[jira] [Commented] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237125#comment-13237125 ]
jiraposter@reviews.apache.org commented on FLUME-985:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3988/#review6311
-----------------------------------------------------------
Ship it!
+1
- Arvind
[jira] [Updated] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "Arvind Prabhakar (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arvind Prabhakar updated FLUME-985:
-----------------------------------
Resolution: Fixed
Fix Version/s: v1.2.0
Status: Resolved (was: Patch Available)
Patch committed. Thanks Brock!
[jira] [Commented] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212997#comment-13212997 ]
jiraposter@reviews.apache.org commented on FLUME-985:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3988/
-----------------------------------------------------------
Review request for Flume.
Summary
-------
1) All HDFS actions are now done in async mode.
2) If an HDFS append times out, the file is closed and reopened.
3) Batching is now handled by BucketWriter, which was always aware of the batch size.
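To make point 2 concrete, here is a minimal sketch of a close-and-reopen reaction to a timed-out append. It is not the patch itself: SketchWriter, ReopeningWriter and StallOnceWriter are hypothetical names, the numbered file naming is invented for the example, and rethrowing the timeout so the caller can decide what to do with the event is an assumption rather than something this thread confirms.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for an HDFS file writer; not Flume's real interface.
interface SketchWriter {
  void open(String path) throws IOException;
  void append(String event) throws IOException;
  void close() throws IOException;
}

// Hypothetical wrapper: each writer call runs under a timeout, and a
// timed-out append closes the current file and opens a new one.
final class ReopeningWriter {
  private final SketchWriter writer;
  private final ExecutorService pool;
  private final long timeoutMillis;
  private final String basePath;
  private int fileIndex = 0;

  ReopeningWriter(SketchWriter writer, ExecutorService pool,
                  long timeoutMillis, String basePath) throws Exception {
    this.writer = writer;
    this.pool = pool;
    this.timeoutMillis = timeoutMillis;
    this.basePath = basePath;
    openNext();
  }

  void append(String event) throws Exception {
    try {
      timed(() -> { writer.append(event); return null; });
    } catch (TimeoutException e) {
      try {
        timed(() -> { writer.close(); return null; });
      } catch (Exception closeFailure) {
        // Best effort: the stuck file may not close cleanly.
      }
      openNext();
      throw e; // assumption: the caller decides whether to retry the event
    }
  }

  private void openNext() throws Exception {
    String path = basePath + "." + fileIndex++;
    timed(() -> { writer.open(path); return null; });
  }

  private <T> T timed(Callable<T> op) throws Exception {
    Future<T> future = pool.submit(op);
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the worker running the stuck call
      throw e;
    }
  }
}

// Test double whose first append stalls, in the spirit of a "bad writer".
final class StallOnceWriter implements SketchWriter {
  final List<String> opened = new CopyOnWriteArrayList<>();
  final List<String> written = new CopyOnWriteArrayList<>();
  private final AtomicBoolean stalled = new AtomicBoolean(false);

  public void open(String path) { opened.add(path); }
  public void close() { }
  public void append(String event) throws IOException {
    if (stalled.compareAndSet(false, true)) {
      try {
        Thread.sleep(5_000); // simulate a hung datanode
      } catch (InterruptedException e) {
        throw new IOException("append interrupted", e);
      }
    }
    written.add(event);
  }
}
```

Driving a ReopeningWriter with a StallOnceWriter and a cached thread pool should show the first append failing with a TimeoutException and the next append landing in a second file.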
This addresses bug FLUME-985.
https://issues.apache.org/jira/browse/FLUME-985
Diffs
-----
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 19b2559
flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSBadSeqWriter.java 8a6740f
flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 7d8ee8a
flume-ng-sinks/flume-hdfs-sink/pom.xml f27851e
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 45769f6
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 3da90a5
Diff: https://reviews.apache.org/r/3988/diff
Testing
-------
1) Unit tests were added for the close/reopen scenario.
2) All unit tests pass.
3) I manually verified that this patch improves FlumeNG's behavior when the datanode it is writing to is restarted. In the past FlumeNG had to be restarted; now Flume moves on and starts writing to a new file.
Thanks,
Brock
[jira] [Commented] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237112#comment-13237112 ]
jiraposter@reviews.apache.org commented on FLUME-985:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3988/
-----------------------------------------------------------
(Updated 2012-03-23 20:55:21.762184)
Review request for Flume.
Changes
-------
Rebased patch attached. Attaching to JIRA for commit.
Summary
-------
1) All HDFS actions are now done in async mode.
2) If an HDFS append times out, the file is closed and reopened.
3) Batching is now handled by BucketWriter, which was always aware of the batch size.
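Point 3 can be pictured as the writer itself counting events and flushing when a batch is full, instead of the sink tracking that from outside. The sketch below assumes that reading of the summary; BatchingWriter and BatchingDemo are hypothetical names, and the in-memory list stands in for a real sync to HDFS.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Flume's BucketWriter: the writer counts appended
// events and flushes once batchSize events have accumulated.
final class BatchingWriter {
  private final int batchSize;
  private final List<String> pending = new ArrayList<>();
  private final List<List<String>> flushed = new ArrayList<>();

  BatchingWriter(int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    this.batchSize = batchSize;
  }

  void append(String event) {
    pending.add(event);
    if (pending.size() >= batchSize) {
      flush();
    }
  }

  void flush() {
    if (pending.isEmpty()) {
      return;
    }
    flushed.add(new ArrayList<>(pending)); // stand-in for a sync to HDFS
    pending.clear();
  }

  List<List<String>> flushedBatches() {
    return flushed;
  }
}

final class BatchingDemo {
  public static void main(String[] args) {
    BatchingWriter writer = new BatchingWriter(2);
    for (String event : new String[] {"a", "b", "c"}) {
      writer.append(event);
    }
    writer.flush(); // push out the partial final batch
    System.out.println(writer.flushedBatches()); // prints [[a, b], [c]]
  }
}
```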
This addresses bug FLUME-985.
https://issues.apache.org/jira/browse/FLUME-985
Diffs (updated)
-----
flume-ng-sinks/flume-hdfs-sink/pom.xml bef2ca7
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/BucketWriter.java 45769f6
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSEventSink.java 1fdaddd
flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs/HDFSSequenceFile.java 19b2559
flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSBadSeqWriter.java 8a6740f
flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/HDFSBadWriterFactory.java b067c00
flume-ng-sinks/flume-hdfs-sink/src/test/java/org/apache/flume/sink/hdfs/TestHDFSEventSink.java 8fa72a1
Diff: https://reviews.apache.org/r/3988/diff
Testing
-------
1) Unit tests were added for the close/reopen scenario.
2) All unit tests pass.
3) I manually verified that this patch improves FlumeNG's behavior when the datanode it is writing to is restarted. In the past FlumeNG had to be restarted; now Flume moves on and starts writing to a new file.
Thanks,
Brock
[jira] [Commented] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235465#comment-13235465 ]
jiraposter@reviews.apache.org commented on FLUME-985:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3988/#review6220
-----------------------------------------------------------
Ship it!
Sorry I didn't look at this earlier.
Looks fine to me. Please see if the code needs to be rebased.
- Prasad
[jira] [Updated] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "Brock Noland (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brock Noland updated FLUME-985:
-------------------------------
Attachment: FLUME-985-0.patch
Attaching current patch.
[jira] [Commented] (FLUME-985) All HDFS Operations in HDFSEventSink should have a timeout
Posted by "Brock Noland (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/FLUME-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237128#comment-13237128 ]
Brock Noland commented on FLUME-985:
------------------------------------
Thanks!