Posted to dev@chukwa.apache.org by "Sourygna Luangsay (JIRA)" <ji...@apache.org> on 2012/09/15 11:47:07 UTC

[jira] [Created] (CHUKWA-664) network compression between agent and collector

Sourygna Luangsay created CHUKWA-664:
----------------------------------------

             Summary: network compression between agent and collector
                 Key: CHUKWA-664
                 URL: https://issues.apache.org/jira/browse/CHUKWA-664
             Project: Chukwa
          Issue Type: New Feature
          Components: Data Collection
    Affects Versions: 0.5.0, 0.6.0
            Reporter: Sourygna Luangsay
            Priority: Trivial
             Fix For: 0.6.0


As suggested in http://mail-archives.apache.org/mod_mbox/incubator-chukwa-user/201207.mbox/%3C001b01cd69b4$13d9c100$3b8d4300$@com%3E , Chukwa should be able to compress network communications between agent and collector.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CHUKWA-664) network compression between agent and collector

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated CHUKWA-664:
-----------------------------

    Status: Patch Available  (was: Open)
    
> network compression between agent and collector
> -----------------------------------------------
>
>                 Key: CHUKWA-664
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-664
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>    Affects Versions: 0.5.0, 0.6.0
>            Reporter: Sourygna Luangsay
>            Assignee: Sourygna Luangsay
>            Priority: Trivial
>             Fix For: 0.6.0
>
>         Attachments: chukwa-664-2.patch, chukwa-664.patch
>
>
> As suggested in http://mail-archives.apache.org/mod_mbox/incubator-chukwa-user/201207.mbox/%3C001b01cd69b4$13d9c100$3b8d4300$@com%3E , Chukwa should be able to compress network communications between agent and collector.


[jira] [Updated] (CHUKWA-664) network compression between agent and collector

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated CHUKWA-664:
-----------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this, thanks Sourygna.
                

[jira] [Updated] (CHUKWA-664) network compression between agent and collector

Posted by "Sourygna Luangsay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sourygna Luangsay updated CHUKWA-664:
-------------------------------------

    Attachment: chukwa-664-2.patch
    

[jira] [Assigned] (CHUKWA-664) network compression between agent and collector

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang reassigned CHUKWA-664:
--------------------------------

    Assignee: Sourygna Luangsay
    

[jira] [Commented] (CHUKWA-664) network compression between agent and collector

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456536#comment-13456536 ] 

Eric Yang commented on CHUKWA-664:
----------------------------------

When compression is enabled, flush is called on every chunk; when it is disabled, flush is not called.  Flush is the cause of the increased number of TCP fragments.  When the agent-to-collector subscription ratio is too high, the extra TCP fragments can cause excessive retransmission under high load and lead to the TCP incast problem.  A chunk is typically very small, so we don't need to flush immediately.  Skipping the flush reduces the number of TCP headers sent.  The collector returns an HTTP response code to the agent if the last set of chunks needs to be retransmitted.  Therefore, it is best to let the TCP buffer fill up and then send the data.  This will improve the throughput of the compressed data stream in the current patch.
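
To illustrate the point about flushing, here is a minimal sketch that uses only java.util.zip rather than the Hadoop codecs from the patch. It counts how often the layer below the compressor (standing in for the socket) is written to, with and without a flush after every chunk. FlushPerChunkDemo, CountingSink and countWrites are hypothetical names chosen for this example, not Chukwa code, and the count of writes is only a rough proxy for TCP segments.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Hypothetical demo class; none of these names come from the Chukwa patch. */
final class FlushPerChunkDemo {

    /** Stands in for the socket: counts write calls and bytes it receives. */
    static final class CountingSink extends OutputStream {
        int writes;
        long bytes;

        @Override public void write(int b) {
            writes++;
            bytes++;
        }

        @Override public void write(byte[] b, int off, int len) {
            writes++;
            bytes += len;
        }
    }

    /** Compresses 200 small chunks, optionally flushing after each one. */
    static CountingSink send(boolean flushEveryChunk) {
        CountingSink sink = new CountingSink();
        // syncFlush=true makes flush() push pending compressed data downstream.
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink, true)) {
            for (int i = 0; i < 200; i++) {
                byte[] chunk = ("chunk " + i + " host=agent01 level=INFO msg=heartbeat\n")
                        .getBytes(StandardCharsets.UTF_8);
                gzip.write(chunk);
                if (flushEveryChunk) {
                    gzip.flush();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Leaving the try block closed the stream, which finished the gzip data.
        return sink;
    }

    static int countWrites(boolean flushEveryChunk) {
        return send(flushEveryChunk).writes;
    }

    public static void main(String[] args) {
        CountingSink flushed = send(true);
        CountingSink batched = send(false);
        System.out.println("flush per chunk: " + flushed.writes + " writes, " + flushed.bytes + " bytes");
        System.out.println("no flush:        " + batched.writes + " writes, " + batched.bytes + " bytes");
    }
}
```

Flushing after every chunk forces the compressor to emit a small block each time, so the sink sees many small writes and slightly more bytes overall; without the flush, the compressor hands data down only when its own buffers fill or the stream is finished.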
                

[jira] [Commented] (CHUKWA-664) network compression between agent and collector

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501458#comment-13501458 ] 

Hudson commented on CHUKWA-664:
-------------------------------

Integrated in Chukwa-trunk #460 (See [https://builds.apache.org/job/Chukwa-trunk/460/])
    CHUKWA-664. Added network compression between agent and collector. (Sourygna Luangsay via Eric Yang) (Revision 1411817)

     Result = SUCCESS
eyang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1411817
Files : 
* /incubator/chukwa/trunk/CHANGES.txt
* /incubator/chukwa/trunk/conf/chukwa-common.xml
* /incubator/chukwa/trunk/src/main/java/org/apache/hadoop/chukwa/datacollection/agent/ChukwaAgent.java
* /incubator/chukwa/trunk/src/main/java/org/apache/hadoop/chukwa/datacollection/collector/servlet/ServletCollector.java
* /incubator/chukwa/trunk/src/main/java/org/apache/hadoop/chukwa/datacollection/sender/ChukwaHttpSender.java

                

[jira] [Updated] (CHUKWA-664) network compression between agent and collector

Posted by "Sourygna Luangsay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sourygna Luangsay updated CHUKWA-664:
-------------------------------------

    Attachment: chukwa-664.patch
    

[jira] [Commented] (CHUKWA-664) network compression between agent and collector

Posted by "Sourygna Luangsay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457624#comment-13457624 ] 

Sourygna Luangsay commented on CHUKWA-664:
------------------------------------------

OK for the POST: I had to modify getContentLength() in the BuffersRequestEntity class to make it work. Tonight I have to refactor my code a bit and test it, and then I'll submit the new patch.
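
The comment does not say how getContentLength() was changed. One plausible reason such a method needs attention, stated here as an assumption rather than as a description of the patch: once the body is compressed, the number of bytes on the wire no longer matches the size of the uncompressed buffers, so a length computed from the raw buffers would no longer describe the request body. The sketch below uses only java.util.zip; CompressedLengthDemo, sampleBody and gzip are hypothetical names, not Chukwa code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Hypothetical demo class; it does not reproduce BuffersRequestEntity. */
final class CompressedLengthDemo {

    /** Builds a repetitive, log-like body similar in spirit to a batch of chunks. */
    static byte[] sampleBody() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("2012-09-15 11:47:07 INFO agent01 heartbeat seq=").append(i).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Compresses the whole body in memory so its final length is known before sending. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = sampleBody();
        byte[] packed = gzip(raw);
        // A length header for the compressed request would have to use packed.length.
        System.out.println("uncompressed length: " + raw.length);
        System.out.println("compressed length:   " + packed.length);
    }
}
```

Compressing into memory first is only one way to learn the length; a sender could instead declare the length as unknown and stream the body, if the HTTP client in use supports that.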
                

[jira] [Commented] (CHUKWA-664) network compression between agent and collector

Posted by "Sourygna Luangsay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456365#comment-13456365 ] 

Sourygna Luangsay commented on CHUKWA-664:
------------------------------------------

Here is a first patch that adds the compression feature. I have tried it with DefaultCodec, GzipCodec, BZip2Codec and it seems OK.
If it looks good, I'll submit another patch with the current changes in the documentation.

Though everything seems to work, there is something that I don't really understand. I have tcpdumped the network traffic, and when compression is enabled I can no longer see the HTTP POST protocol in Wireshark. I just see various TCP segments that hold my Chukwa chunk, but no protocol higher than TCP. What is more, using the same file (19 KB) to compare the compressed (compressed size: 7.6 KB) and uncompressed tcpdumps, I noticed that I get more (and smaller) TCP segments with compressed communication than with uncompressed communication. Could someone enlighten me?
                

[jira] [Commented] (CHUKWA-664) network compression between agent and collector

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500902#comment-13500902 ] 

Eric Yang commented on CHUKWA-664:
----------------------------------

Any update on the junit test case?
                

[jira] [Commented] (CHUKWA-664) network compression between agent and collector

Posted by "Eric Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457185#comment-13457185 ] 

Eric Yang commented on CHUKWA-664:
----------------------------------

The HTTP POST header should be visible.  I don't see any reason in the implementation that would cause it to be omitted.  A test case to validate that the data is sent correctly would be nice.
                

[jira] [Commented] (CHUKWA-664) network compression between agent and collector

Posted by "Sourygna Luangsay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13458403#comment-13458403 ] 

Sourygna Luangsay commented on CHUKWA-664:
------------------------------------------

Submitted the patch that fixes HTTP POST.

So, the actions that remain are (I am going abroad soon for 3 weeks, so I won't be able to work on this Jira until I come back on the 15th of October):
- checking if we can use the "io.file.buffer.size" parameter if the native-hadoop library is not loaded
- writing some junit tests
- updating the documentation
                

[jira] [Commented] (CHUKWA-664) network compression between agent and collector

Posted by "Sourygna Luangsay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CHUKWA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456595#comment-13456595 ] 

Sourygna Luangsay commented on CHUKWA-664:
------------------------------------------

You were right: too many flushes due to compression.
Nonetheless, I don't think that compression is called for every chunk. I only call the compressionOutputStream.finish() method once all the chunks have been written to the stream.

I have played a bit with the Hadoop "io.file.buffer.size" parameter, and I managed to get bigger (and fewer) TCP fragments when I increase this buffer size. So I guess that would fix the TCP incast problem (I have also tried changing the "chukwaAgent.fileTailingAdaptor.maxReadSize" parameter and got more interesting results).
The only trouble is that the "io.file.buffer.size" parameter currently only takes effect if you load the native-hadoop library for compression. I have to take a better look at my code and the Hadoop compression package to see if I can enable it when native-hadoop is not loaded.
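
The effect of a larger buffer can be sketched with plain java.io, without Hadoop. This is an analogy only: it does not show how "io.file.buffer.size" is applied inside the Hadoop codecs, and BufferSizeDemo, CountingSink and countWrites are hypothetical names invented for the example. The sink stands in for the socket, and the number of writes it receives is a rough proxy for the number of TCP segments.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo class; it does not use Hadoop or Chukwa code. */
final class BufferSizeDemo {

    /** Stands in for the socket: counts how many writes reach it. */
    static final class CountingSink extends OutputStream {
        int writes;

        @Override public void write(int b) {
            writes++;
        }

        @Override public void write(byte[] b, int off, int len) {
            writes++;
        }
    }

    /** Sends 100 small records through a buffer of the given size and counts downstream writes. */
    static int countWrites(int bufferSize) {
        CountingSink sink = new CountingSink();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < 100; i++) {
                out.write(("record " + i + " of a small chunk\n").getBytes(StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the buffered stream flushed whatever was still pending.
        return sink.writes;
    }

    public static void main(String[] args) {
        System.out.println("32-byte buffer:   " + countWrites(32) + " writes");
        System.out.println("4096-byte buffer: " + countWrites(4096) + " writes");
    }
}
```

With the small buffer almost every record forces a write downstream, while the large buffer accumulates many records before anything is sent, which matches the observation of bigger and fewer fragments.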

And do you have any idea why HTTP POST can't be seen when compression is enabled?


                