Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/07/01 15:10:44 UTC

[GitHub] [pinot] gortiz commented on issue #8718: Reduce docker image size

gortiz commented on issue #8718:
URL: https://github.com/apache/pinot/issues/8718#issuecomment-1172448787

   I've just analyzed the Docker image.
   
   It seems that the three biggest layers are:
   - 344MB of base image (jdk-slim), which is highly reusable. Two images will probably share the same base, so its storage and download cost is paid only once.
   - 617MB of apt-update and apt-install, which is not reusable at all. We can improve this by creating a specific base image and reusing it.
   - 716MB of apache-pinot, which is copied in a single layer. Of which:
      - 100MB are examples, which are very static (they almost never change)
      - 454MB are plugins, which are mostly shaded dependencies. They could be optimized well with Docker layers, but we cannot do that because they are shaded into the plugin jars
      - 150MB are Pinot itself and its dependencies. I guess most of it is the dependencies, which again could be layered, but they are shaded.
   
   This means that each time we change a single character and create a new Docker image, we store and download to our pods (617 + 716)MB, about 1.3GB, of data. I think that roughly 1.2GB of that data (the apt layer plus the examples and plugins) is static information we could just reuse if we used Docker layers correctly.
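
   As a quick check of that arithmetic, here is a minimal sketch that only re-adds the sizes quoted above. The variable names are made up for illustration, and the split between "static" and "changing" data is an assumption taken from the descriptions in the list, not something measured separately.

```python
# Layer sizes in MB, copied from the analysis above.
base_mb = 344       # jdk-slim base image, shared between images
apt_mb = 617        # apt-update / apt-install layer
pinot_mb = 716      # apache-pinot, copied as a single layer

# Reported breakdown of the apache-pinot layer.
examples_mb = 100
plugins_mb = 454
core_mb = 150

# Data stored and downloaded for every new image today (base excluded).
per_release_mb = apt_mb + pinot_mb

# Assumption: apt packages, examples and plugins rarely change,
# while Pinot itself changes on every release.
static_mb = apt_mb + examples_mb + plugins_mb
changing_mb = core_mb

# The three sub-items are rounded, so they do not add up exactly to 716.
unaccounted_mb = pinot_mb - (examples_mb + plugins_mb + core_mb)

print(f"per release today: {per_release_mb} MB")
print(f"static: {static_mb} MB, changing: {changing_mb} MB")
print(f"not covered by the breakdown: {unaccounted_mb} MB")
```

   With these figures, about 1.3GB is shipped per release, of which only around 150MB actually needs to change.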
   
   What @xiangfu0 suggested, having different images with more or fewer plugins, or images that can download plugins at start time, can be a solution, but I think it would have the side effect of making things considerably harder for customers to understand. If instead we just use the layers correctly, a user downloading an image for the first time will need to download 1.3GB of data, but when they download a second version, most of the layers will very probably be the same, so they would only need to download around 150MB of data. The same applies to our own pods, which would only need to download these 150MB instead of 1.3GB on most upgrades.
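
   The effect of layer reuse on downloads can be sketched with a toy model. This is a simplification, not how any registry client is implemented: each layer is identified by a hypothetical digest string, a pull fetches only layers whose digest is not already cached locally, and the base image is assumed to be present already. The split of apache-pinot into separate examples, plugins and core layers is the proposal, not the current image.

```python
from typing import NamedTuple


class Layer(NamedTuple):
    digest: str   # hypothetical content identifier, not a real digest
    size_mb: int


def pull(image: list[Layer], cache: set[str]) -> int:
    """Return the MB downloaded for `image`, adding fetched layers to `cache`."""
    downloaded = 0
    for layer in image:
        if layer.digest not in cache:
            downloaded += layer.size_mb
            cache.add(layer.digest)
    return downloaded


# Sizes taken from the analysis above; layer names are illustrative.
base = Layer("jdk-slim", 344)
apt = Layer("apt-packages", 617)
examples = Layer("examples", 100)
plugins = Layer("plugins", 454)

# Two releases that differ only in the Pinot core layer.
v1 = [base, apt, examples, plugins, Layer("pinot-core-v1", 150)]
v2 = [base, apt, examples, plugins, Layer("pinot-core-v2", 150)]

cache = {base.digest}             # assumption: base image already on the node
first_pull_mb = pull(v1, cache)   # everything except the base
second_pull_mb = pull(v2, cache)  # only the changed core layer

print(first_pull_mb, second_pull_mb)
```

   Under these assumptions the first pull costs about 1.3GB and an upgrade costs about 150MB, which matches the estimate above.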


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

