Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/04/21 11:18:00 UTC

[GitHub] [ozone] elek opened a new pull request #2170: HDDS-4687. Disable compression for closed-container replication

elek opened a new pull request #2170:
URL: https://github.com/apache/ozone/pull/2170


   ## What changes were proposed in this pull request?
   
   While measuring closed container replication I found that the biggest bottleneck is the read side. A 5 GB container is replicated in ~3 minutes, but ~2:30 of that is the download.
   
   Closed containers are replicated via GRPC. The source side creates an `OutputStream` on the fly (`OnDemandContainerReplicationSource.java`) and streams all the container content as a "tar.gz" archive to the client.
   
   It turned out that the compression (the .gz part) is quite expensive.
   
   I created a CLI tool to export containers to tar files (same logic as the replication, but without streaming via GRPC, just saving to a file).
   
   With compression enabled, creating the archive took about 2:30:
   
   ```
   2021-01-13 05:51:25,302 [main] INFO debug.ExportContainer: Preparation is done
   2021-01-13 05:53:53,472 [main] INFO debug.ExportContainer: Container is exported to /tmp/container-3.tar.gz
   ```
   
   But when I removed the compression in `TarContainerPacker.java`, the speed was significantly better (25 sec instead of 150 sec):
   
   ```
   2021-01-13 06:11:46,254 [main] INFO debug.ExportContainer: Preparation is done
   2021-01-13 06:12:11,512 [main] INFO debug.ExportContainer: Container is exported to /tmp/container-3.tar
   ```
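   
   As a quick check of the two durations, the log timestamps above can be subtracted directly. This is only an illustrative sketch: `ExportTiming` and `elapsedMillis` are names made up for this example and are not part of Ozone.
   
   ```java
   import java.time.Duration;
   import java.time.LocalDateTime;
   import java.time.format.DateTimeFormatter;

   class ExportTiming {
       // Matches the timestamp prefix of the log lines quoted above.
       private static final DateTimeFormatter LOG_TIME =
           DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

       /** Milliseconds between two log timestamps. */
       static long elapsedMillis(String start, String end) {
           return Duration.between(
               LocalDateTime.parse(start, LOG_TIME),
               LocalDateTime.parse(end, LOG_TIME)).toMillis();
       }

       public static void main(String[] args) {
           long gz = elapsedMillis("2021-01-13 05:51:25,302", "2021-01-13 05:53:53,472");
           long tar = elapsedMillis("2021-01-13 06:11:46,254", "2021-01-13 06:12:11,512");
           System.out.println("tar.gz: " + gz + " ms, tar: " + tar + " ms");
       }
   }
   ```
   
   This gives 148170 ms for the tar.gz export and 25258 ms for the plain tar export, so the uncompressed run is a bit less than 6 times faster.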
   
   As a result I suggest turning off the compression for closed container replication.
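   
   To get a rough feel for the cost of the gzip layer alone, a sketch like the following may help. It is not the CLI tool mentioned above and does not use `TarContainerPacker`; it only pushes the same bytes into a discarding sink, once directly and once through the JDK's `GZIPOutputStream`. `CompressionCost`, `CountingSink` and `writeAll` are hypothetical names, and the random payload is an assumption (real container data may compress very differently), so the numbers are indicative at best.
   
   ```java
   import java.io.IOException;
   import java.io.OutputStream;
   import java.util.Random;
   import java.util.zip.GZIPOutputStream;

   class CompressionCost {
       /** Discards everything and only counts the bytes it receives. */
       static final class CountingSink extends OutputStream {
           long count;
           @Override public void write(int b) { count++; }
           @Override public void write(byte[] b, int off, int len) { count += len; }
       }

       /** Writes `chunk` `chunks` times, optionally through gzip; returns bytes reaching the sink. */
       static long writeAll(byte[] chunk, int chunks, boolean gzip) throws IOException {
           CountingSink sink = new CountingSink();
           try (OutputStream out = gzip ? new GZIPOutputStream(sink) : sink) {
               for (int i = 0; i < chunks; i++) {
                   out.write(chunk);
               }
           }
           return sink.count;
       }

       public static void main(String[] args) throws IOException {
           byte[] chunk = new byte[1 << 20];   // 1 MiB of pseudo-random data (assumption)
           new Random(42).nextBytes(chunk);
           for (boolean gzip : new boolean[] {false, true}) {
               long start = System.nanoTime();
               long bytes = writeAll(chunk, 64, gzip);
               long ms = (System.nanoTime() - start) / 1_000_000;
               System.out.println((gzip ? "gzip : " : "plain: ") + bytes + " bytes in " + ms + " ms");
           }
       }
   }
   ```
   
   The absolute timings depend on the machine and the data; the point is only that the gzip path does CPU work per byte that the plain path skips.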
   
   More details: https://github.com/elek/ozone-notes/tree/master/20210113-closed-container-replication
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4687
   
   ## How was this patch tested?
   
   Tested in a real Kubernetes cluster:
   
    * data is generated with the freon data generator 
    * containers were replicated with the freon container replicator (time is checked from the log)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] elek commented on pull request #2170: HDDS-4687. Disable compression for closed-container replication

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #2170:
URL: https://github.com/apache/ozone/pull/2170#issuecomment-831995963


   Full details are here: https://issues.apache.org/jira/browse/HDDS-5188 (cc @arp7 @swagle)




[GitHub] [ozone] elek commented on pull request #2170: HDDS-4687. Disable compression for closed-container replication

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #2170:
URL: https://github.com/apache/ozone/pull/2170#issuecomment-831148556


   Thanks for the review @arp7. I haven't merged it yet because I am not sure whether we need more testing. After I turned off the compression I noticed other OOM-related exceptions, which may or may not be related.
   
   There is a chance that removing this obstacle also removes a natural throttle, and that the faster streaming causes additional problems.
   
   I am wondering if it should be tested with more realistic replication scenarios. 
   
   What do you think @arp7?




[GitHub] [ozone] arp7 commented on pull request #2170: HDDS-4687. Disable compression for closed-container replication

Posted by GitBox <gi...@apache.org>.
arp7 commented on pull request #2170:
URL: https://github.com/apache/ozone/pull/2170#issuecomment-831352521


   Yes, we do need to introduce throttling separately; HDFS also has this for re-replication. Were the OOM exceptions seen in large scale testing or also on a smaller setup?
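   
   A minimal sketch of what such throttling could look like at the stream level, assuming a simple fixed bytes-per-second budget. This is neither the HDFS throttler nor anything that exists in Ozone; `ThrottledOutputStream` and its `bytesPerSecond` parameter are hypothetical names used only for illustration.
   
   ```java
   import java.io.FilterOutputStream;
   import java.io.IOException;
   import java.io.InterruptedIOException;
   import java.io.OutputStream;

   class ThrottledOutputStream extends FilterOutputStream {
       private final long bytesPerSecond;
       private final long startNanos = System.nanoTime();
       private long bytesWritten;

       ThrottledOutputStream(OutputStream out, long bytesPerSecond) {
           super(out);
           if (bytesPerSecond <= 0) {
               throw new IllegalArgumentException("bytesPerSecond must be positive");
           }
           this.bytesPerSecond = bytesPerSecond;
       }

       @Override
       public void write(int b) throws IOException {
           out.write(b);
           bytesWritten++;
           throttle();
       }

       @Override
       public void write(byte[] buf, int off, int len) throws IOException {
           // Delegate the whole range; FilterOutputStream would write byte by byte.
           out.write(buf, off, len);
           bytesWritten += len;
           throttle();
       }

       /** Sleeps until the average rate since construction is back under the limit. */
       private void throttle() throws IOException {
           long expectedNanos = (long) (bytesWritten * 1e9 / bytesPerSecond);
           long sleepNanos = expectedNanos - (System.nanoTime() - startNanos);
           if (sleepNanos > 0) {
               try {
                   Thread.sleep(sleepNanos / 1_000_000L, (int) (sleepNanos % 1_000_000L));
               } catch (InterruptedException e) {
                   Thread.currentThread().interrupt();
                   throw new InterruptedIOException("interrupted while throttling");
               }
           }
       }
   }
   ```
   
   Wrapping a replication stream would then look like `new ThrottledOutputStream(out, 50L * 1024 * 1024)`, where the 50 MiB/s limit is an arbitrary example value. A fixed per-stream rate does not bound memory use by itself, so it would not replace proper backpressure on the GRPC side.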




[GitHub] [ozone] elek closed pull request #2170: HDDS-4687. Disable compression for closed-container replication

Posted by GitBox <gi...@apache.org>.
elek closed pull request #2170:
URL: https://github.com/apache/ozone/pull/2170


   




[GitHub] [ozone] elek commented on pull request #2170: HDDS-4687. Disable compression for closed-container replication

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #2170:
URL: https://github.com/apache/ozone/pull/2170#issuecomment-831797240


   > large scale testing or also on a smaller setup
   
   Only in large scale testing. On a local, smaller setup (where I replicated only a few containers) it worked well.




[GitHub] [ozone] elek commented on pull request #2170: HDDS-4687. Disable compression for closed-container replication

Posted by GitBox <gi...@apache.org>.
elek commented on pull request #2170:
URL: https://github.com/apache/ozone/pull/2170#issuecomment-831892466


   Closing for now as it seems to require further testing. It increases the chance that Netty buffers become full. I will open a separate issue about the buffer allocation.

