Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/02/12 07:25:19 UTC

[GitHub] [pulsar] lhotari opened a new issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files

lhotari opened a new issue #9572:
URL: https://github.com/apache/pulsar/issues/9572


   **Is your enhancement request related to a problem? Please describe.**
   
   The Pulsar IO .nar files are currently very large: their total size is 1952MB!
   Breakdown: https://gist.github.com/lhotari/810a543524e25457b521ac666913ad3c
   
   **Describe the solution you'd like**
   
   Exclude all Pulsar Functions Worker dependencies from Pulsar IO .nar files.
   
   For example, these are the bundled dependencies of pulsar-io-data-generator:
   
   ```
   $ unzip -l ~/.m2/repository/org/apache/pulsar/pulsar-io-data-generator/2.8.0-SNAPSHOT/pulsar-io-data-generator-2.8.0-SNAPSHOT.nar |grep META-INF/bundled-dependencies | sort -k 4,4
           0  02-12-2021 07:04   META-INF/bundled-dependencies/
      183117  02-12-2021 07:04   META-INF/bundled-dependencies/aircompressor-0.16.jar
        4467  02-12-2021 07:04   META-INF/bundled-dependencies/aopalliance-1.0.jar
      449146  02-12-2021 07:04   META-INF/bundled-dependencies/async-http-client-2.12.1.jar
        9909  02-12-2021 07:04   META-INF/bundled-dependencies/async-http-client-netty-utils-2.12.1.jar
      566992  02-12-2021 07:04   META-INF/bundled-dependencies/avro-1.9.1.jar
       25683  02-12-2021 07:04   META-INF/bundled-dependencies/avro-protobuf-1.9.1.jar
      887800  02-12-2021 07:04   META-INF/bundled-dependencies/bcpkix-jdk15on-1.68.jar
     6031548  02-12-2021 07:04   META-INF/bundled-dependencies/bcprov-ext-jdk15on-1.68.jar
     5961178  02-12-2021 07:04   META-INF/bundled-dependencies/bcprov-jdk15on-1.68.jar
      146056  02-12-2021 07:04   META-INF/bundled-dependencies/bookkeeper-common-4.12.1.jar
       16852  02-12-2021 07:04   META-INF/bundled-dependencies/bookkeeper-common-allocator-4.12.1.jar
       19351  02-12-2021 07:04   META-INF/bundled-dependencies/bookkeeper-stats-api-4.12.1.jar
    11082557  02-12-2021 07:04   META-INF/bundled-dependencies/bouncy-castle-bc-2.8.0-SNAPSHOT-pkg.jar
      214381  02-12-2021 07:04   META-INF/bundled-dependencies/checker-qual-3.5.0.jar
       65366  02-12-2021 07:04   META-INF/bundled-dependencies/circe-checksum-4.12.1.jar
      284184  02-12-2021 07:04   META-INF/bundled-dependencies/commons-codec-1.10.jar
      615064  02-12-2021 07:04   META-INF/bundled-dependencies/commons-compress-1.19.jar
      362679  02-12-2021 07:04   META-INF/bundled-dependencies/commons-configuration-1.10.jar
      208700  02-12-2021 07:04   META-INF/bundled-dependencies/commons-io-2.5.jar
      284220  02-12-2021 07:04   META-INF/bundled-dependencies/commons-lang-2.6.jar
      494856  02-12-2021 07:04   META-INF/bundled-dependencies/commons-lang3-3.6.jar
       61829  02-12-2021 07:04   META-INF/bundled-dependencies/commons-logging-1.2.jar
     2213560  02-12-2021 07:04   META-INF/bundled-dependencies/commons-math3-3.6.1.jar
       23508  02-12-2021 07:04   META-INF/bundled-dependencies/cpu-affinity-4.12.1.jar
       13879  02-12-2021 07:04   META-INF/bundled-dependencies/error_prone_annotations-2.3.4.jar
        4617  02-12-2021 07:04   META-INF/bundled-dependencies/failureaccess-1.0.1.jar
      240255  02-12-2021 07:04   META-INF/bundled-dependencies/gson-2.8.6.jar
     2862361  02-12-2021 07:04   META-INF/bundled-dependencies/guava-30.1-jre.jar
      674028  02-12-2021 07:04   META-INF/bundled-dependencies/guice-4.1.0.jar
       42873  02-12-2021 07:04   META-INF/bundled-dependencies/guice-assistedinject-4.1.0.jar
       45012  02-12-2021 07:04   META-INF/bundled-dependencies/iban4j-3.2.1.jar
        8781  02-12-2021 07:04   META-INF/bundled-dependencies/j2objc-annotations-1.3.jar
       68167  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-annotations-2.11.1.jar
      351575  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-core-2.11.1.jar
     1419800  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-databind-2.11.1.jar
       46983  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-dataformat-yaml-2.11.1.jar
       79295  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-module-jsonSchema-2.11.1.jar
      780265  02-12-2021 07:04   META-INF/bundled-dependencies/javassist-3.25.0-GA.jar
       78030  02-12-2021 07:04   META-INF/bundled-dependencies/javax.activation-1.2.0.jar
        2497  02-12-2021 07:04   META-INF/bundled-dependencies/javax.inject-1.jar
      127509  02-12-2021 07:04   META-INF/bundled-dependencies/javax.ws.rs-api-2.1.jar
        2254  02-12-2021 07:04   META-INF/bundled-dependencies/jcip-annotations-1.0.jar
      252020  02-12-2021 07:04   META-INF/bundled-dependencies/jctools-core-2.1.2.jar
      566323  02-12-2021 07:04   META-INF/bundled-dependencies/jetty-util-9.4.35.v20201120.jar
      273528  02-12-2021 07:04   META-INF/bundled-dependencies/jfairy-0.5.9.jar
      640724  02-12-2021 07:04   META-INF/bundled-dependencies/joda-time-2.10.1.jar
       19936  02-12-2021 07:04   META-INF/bundled-dependencies/jsr305-3.0.2.jar
        2199  02-12-2021 07:04   META-INF/bundled-dependencies/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
       24995  02-12-2021 07:04   META-INF/bundled-dependencies/memory-0.8.3.jar
      289921  02-12-2021 07:04   META-INF/bundled-dependencies/netty-buffer-4.1.51.Final.jar
      320174  02-12-2021 07:04   META-INF/bundled-dependencies/netty-codec-4.1.51.Final.jar
       61345  02-12-2021 07:04   META-INF/bundled-dependencies/netty-codec-dns-4.1.51.Final.jar
       36193  02-12-2021 07:04   META-INF/bundled-dependencies/netty-codec-haproxy-4.1.51.Final.jar
      617948  02-12-2021 07:04   META-INF/bundled-dependencies/netty-codec-http-4.1.51.Final.jar
      625057  02-12-2021 07:04   META-INF/bundled-dependencies/netty-common-4.1.51.Final.jar
      456702  02-12-2021 07:04   META-INF/bundled-dependencies/netty-handler-4.1.51.Final.jar
       21842  02-12-2021 07:04   META-INF/bundled-dependencies/netty-reactive-streams-2.0.4.jar
       33158  02-12-2021 07:04   META-INF/bundled-dependencies/netty-resolver-4.1.51.Final.jar
      151765  02-12-2021 07:04   META-INF/bundled-dependencies/netty-resolver-dns-4.1.51.Final.jar
     4017922  02-12-2021 07:04   META-INF/bundled-dependencies/netty-tcnative-boringssl-static-2.0.33.Final.jar
      473222  02-12-2021 07:04   META-INF/bundled-dependencies/netty-transport-4.1.51.Final.jar
      152317  02-12-2021 07:04   META-INF/bundled-dependencies/netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar
       33062  02-12-2021 07:04   META-INF/bundled-dependencies/netty-transport-native-unix-common-4.1.51.Final.jar
       56446  02-12-2021 07:04   META-INF/bundled-dependencies/netty-transport-native-unix-common-4.1.51.Final-linux-x86_64.jar
     1660960  02-12-2021 07:04   META-INF/bundled-dependencies/protobuf-java-3.11.4.jar
       73874  02-12-2021 07:04   META-INF/bundled-dependencies/protobuf-java-util-3.11.4.jar
       47021  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-client-admin-api-2.8.0-SNAPSHOT.jar
      141344  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-client-api-2.8.0-SNAPSHOT.jar
      657161  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-client-original-2.8.0-SNAPSHOT.jar
      877274  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-common-2.8.0-SNAPSHOT.jar
       38477  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-config-validation-2.8.0-SNAPSHOT.jar
       21681  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-functions-api-2.8.0-SNAPSHOT.jar
       23202  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-io-core-2.8.0-SNAPSHOT.jar
       28200  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-package-core-2.8.0-SNAPSHOT.jar
        9037  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-transaction-common-2.8.0-SNAPSHOT.jar
       11369  02-12-2021 07:04   META-INF/bundled-dependencies/reactive-streams-1.0.3.jar
      130999  02-12-2021 07:04   META-INF/bundled-dependencies/reflections-0.9.11.jar
      421509  02-12-2021 07:04   META-INF/bundled-dependencies/sketches-core-0.8.3.jar
       41203  02-12-2021 07:04   META-INF/bundled-dependencies/slf4j-api-1.7.25.jar
      284338  02-12-2021 07:04   META-INF/bundled-dependencies/snakeyaml-1.18.jar
       21782  02-12-2021 07:04   META-INF/bundled-dependencies/swagger-annotations-1.6.2.jar
       63777  02-12-2021 07:04   META-INF/bundled-dependencies/validation-api-1.1.0.Final.jar
   ``` 
   
   pulsar-io-data-generator has a single unique dependency, jfairy. This means that about 45MB of the dependencies in each pulsar-io .nar file are redundant.
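   
   As a rough way to check a figure like this, the sizes in the first column of the `unzip -l` listing can be summed. This is a minimal sketch: `bundled_deps_bytes` is a hypothetical helper name, not part of Pulsar, and it assumes the listing format shown above (size in the first column, entry path at the end of the line). Note that `unzip -l` reports uncompressed sizes.
   
   ```shell
   # Sum the sizes (first column of `unzip -l`) of all jars under
   # META-INF/bundled-dependencies/. Reads the listing from stdin.
   # The directory entry itself is skipped because it does not end in .jar.
   bundled_deps_bytes() {
     awk '/META-INF\/bundled-dependencies\/.+\.jar$/ { total += $1 }
          END { printf "%d\n", total }'
   }
   ```
   
   It would be used as `unzip -l some-connector.nar | bundled_deps_bytes`, which prints the total in bytes.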
   
   These files are never used for classloading. It is safe to remove every dependency that is already part of the Pulsar Functions Worker's system classloader, because classloaders use parent-first lookups (by default, and also in Pulsar Functions Worker).
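   
   One way to find candidates for removal is to intersect the jar names bundled in a .nar with the jar names available to the worker. The sketch below is an illustration only: `redundant_jars` is a hypothetical helper, both inputs are assumed to be plain text files with one jar file name per line, and how to obtain the worker's list depends on the deployment. It matches on exact file names, so a jar bundled in a different version than the worker's copy is not reported.
   
   ```shell
   # Print the jar names that appear in both lists, sorted.
   #   $1: file listing jar names bundled in the .nar, one per line
   #   $2: file listing jar names on the worker's classpath, one per line
   redundant_jars() {
     # -F: fixed strings, -x: match whole lines, -f: read patterns from file
     grep -Fx -f "$2" "$1" | sort
   }
   ```
   
   Jars that are reported here would be shadowed by the worker's copies under parent-first lookup; anything not reported (such as jfairy in the example above) still has to stay in the .nar.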
   
   **Additional context**
   
   Reducing the size of Pulsar IO .nar files would also help reduce the pulsar-all Docker image size. The Pulsar (core) build would benefit as well, although PIP-62 covers moving Pulsar IO connectors from the apache/pulsar repository to apache/pulsar-connectors.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] freeznet commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files

Posted by GitBox <gi...@apache.org>.
freeznet commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-780954731


   @lhotari thanks for your detailed description.
   @sijie sure, I will create a PR to fix this issue.





[GitHub] [pulsar] lhotari commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files

Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-779127621


   The regression seems to have been caused by #9246.





[GitHub] [pulsar] codelipenghui commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-1058893240


   The issue had no activity for 30 days; marking it with the Stale label.





[GitHub] [pulsar] lhotari commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files

Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-782194168


   @freeznet @sijie I have created a separate issue about the excessive `pulsar-client-admin-api` dependencies: #9640.





[GitHub] [pulsar] lhotari commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files

Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-779122230


   There seems to be a regression in the master branch.
   
   In the master branch, the size of pulsar-io-data-generator is about 45MB.
   In branch-2.7, the size of pulsar-io-data-generator is about 11MB.
   
   Commands to use for comparison:
   ```
   mvn -am -pl org.apache.pulsar:pulsar-io-data-generator clean install -DskipTests
   du -ms pulsar-io/data-generator/target/pulsar-io-data-generator-*.nar
   ```





[GitHub] [pulsar] sijie commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-779462197


   @lhotari Thank you for catching this! I think we want an admin API interface module to be used by the Pulsar Functions API, because we want to expose PulsarAdmin through the function context. So the problem seems to be that we didn't cleanly break the interface out into pulsar-admin-api, and it unnecessarily pulls in other dependencies. @freeznet Can you take a look at this issue?

