Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/02/12 07:25:19 UTC
[GitHub] [pulsar] lhotari opened a new issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files
lhotari opened a new issue #9572:
URL: https://github.com/apache/pulsar/issues/9572
**Is your enhancement request related to a problem? Please describe.**
Currently the Pulsar IO .nar files are very large. The total size of the Pulsar IO files is 1952MB!
Break down: https://gist.github.com/lhotari/810a543524e25457b521ac666913ad3c
**Describe the solution you'd like**
Exclude all Pulsar Functions Worker dependencies from Pulsar IO .nar files.
For example, these are the dependencies currently bundled in pulsar-io-data-generator:
```
$ unzip -l ~/.m2/repository/org/apache/pulsar/pulsar-io-data-generator/2.8.0-SNAPSHOT/pulsar-io-data-generator-2.8.0-SNAPSHOT.nar |grep META-INF/bundled-dependencies | sort -k 4,4
0 02-12-2021 07:04 META-INF/bundled-dependencies/
183117 02-12-2021 07:04 META-INF/bundled-dependencies/aircompressor-0.16.jar
4467 02-12-2021 07:04 META-INF/bundled-dependencies/aopalliance-1.0.jar
449146 02-12-2021 07:04 META-INF/bundled-dependencies/async-http-client-2.12.1.jar
9909 02-12-2021 07:04 META-INF/bundled-dependencies/async-http-client-netty-utils-2.12.1.jar
566992 02-12-2021 07:04 META-INF/bundled-dependencies/avro-1.9.1.jar
25683 02-12-2021 07:04 META-INF/bundled-dependencies/avro-protobuf-1.9.1.jar
887800 02-12-2021 07:04 META-INF/bundled-dependencies/bcpkix-jdk15on-1.68.jar
6031548 02-12-2021 07:04 META-INF/bundled-dependencies/bcprov-ext-jdk15on-1.68.jar
5961178 02-12-2021 07:04 META-INF/bundled-dependencies/bcprov-jdk15on-1.68.jar
146056 02-12-2021 07:04 META-INF/bundled-dependencies/bookkeeper-common-4.12.1.jar
16852 02-12-2021 07:04 META-INF/bundled-dependencies/bookkeeper-common-allocator-4.12.1.jar
19351 02-12-2021 07:04 META-INF/bundled-dependencies/bookkeeper-stats-api-4.12.1.jar
11082557 02-12-2021 07:04 META-INF/bundled-dependencies/bouncy-castle-bc-2.8.0-SNAPSHOT-pkg.jar
214381 02-12-2021 07:04 META-INF/bundled-dependencies/checker-qual-3.5.0.jar
65366 02-12-2021 07:04 META-INF/bundled-dependencies/circe-checksum-4.12.1.jar
284184 02-12-2021 07:04 META-INF/bundled-dependencies/commons-codec-1.10.jar
615064 02-12-2021 07:04 META-INF/bundled-dependencies/commons-compress-1.19.jar
362679 02-12-2021 07:04 META-INF/bundled-dependencies/commons-configuration-1.10.jar
208700 02-12-2021 07:04 META-INF/bundled-dependencies/commons-io-2.5.jar
284220 02-12-2021 07:04 META-INF/bundled-dependencies/commons-lang-2.6.jar
494856 02-12-2021 07:04 META-INF/bundled-dependencies/commons-lang3-3.6.jar
61829 02-12-2021 07:04 META-INF/bundled-dependencies/commons-logging-1.2.jar
2213560 02-12-2021 07:04 META-INF/bundled-dependencies/commons-math3-3.6.1.jar
23508 02-12-2021 07:04 META-INF/bundled-dependencies/cpu-affinity-4.12.1.jar
13879 02-12-2021 07:04 META-INF/bundled-dependencies/error_prone_annotations-2.3.4.jar
4617 02-12-2021 07:04 META-INF/bundled-dependencies/failureaccess-1.0.1.jar
240255 02-12-2021 07:04 META-INF/bundled-dependencies/gson-2.8.6.jar
2862361 02-12-2021 07:04 META-INF/bundled-dependencies/guava-30.1-jre.jar
674028 02-12-2021 07:04 META-INF/bundled-dependencies/guice-4.1.0.jar
42873 02-12-2021 07:04 META-INF/bundled-dependencies/guice-assistedinject-4.1.0.jar
45012 02-12-2021 07:04 META-INF/bundled-dependencies/iban4j-3.2.1.jar
8781 02-12-2021 07:04 META-INF/bundled-dependencies/j2objc-annotations-1.3.jar
68167 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-annotations-2.11.1.jar
351575 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-core-2.11.1.jar
1419800 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-databind-2.11.1.jar
46983 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-dataformat-yaml-2.11.1.jar
79295 02-12-2021 07:04 META-INF/bundled-dependencies/jackson-module-jsonSchema-2.11.1.jar
780265 02-12-2021 07:04 META-INF/bundled-dependencies/javassist-3.25.0-GA.jar
78030 02-12-2021 07:04 META-INF/bundled-dependencies/javax.activation-1.2.0.jar
2497 02-12-2021 07:04 META-INF/bundled-dependencies/javax.inject-1.jar
127509 02-12-2021 07:04 META-INF/bundled-dependencies/javax.ws.rs-api-2.1.jar
2254 02-12-2021 07:04 META-INF/bundled-dependencies/jcip-annotations-1.0.jar
252020 02-12-2021 07:04 META-INF/bundled-dependencies/jctools-core-2.1.2.jar
566323 02-12-2021 07:04 META-INF/bundled-dependencies/jetty-util-9.4.35.v20201120.jar
273528 02-12-2021 07:04 META-INF/bundled-dependencies/jfairy-0.5.9.jar
640724 02-12-2021 07:04 META-INF/bundled-dependencies/joda-time-2.10.1.jar
19936 02-12-2021 07:04 META-INF/bundled-dependencies/jsr305-3.0.2.jar
2199 02-12-2021 07:04 META-INF/bundled-dependencies/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
24995 02-12-2021 07:04 META-INF/bundled-dependencies/memory-0.8.3.jar
289921 02-12-2021 07:04 META-INF/bundled-dependencies/netty-buffer-4.1.51.Final.jar
320174 02-12-2021 07:04 META-INF/bundled-dependencies/netty-codec-4.1.51.Final.jar
61345 02-12-2021 07:04 META-INF/bundled-dependencies/netty-codec-dns-4.1.51.Final.jar
36193 02-12-2021 07:04 META-INF/bundled-dependencies/netty-codec-haproxy-4.1.51.Final.jar
617948 02-12-2021 07:04 META-INF/bundled-dependencies/netty-codec-http-4.1.51.Final.jar
625057 02-12-2021 07:04 META-INF/bundled-dependencies/netty-common-4.1.51.Final.jar
456702 02-12-2021 07:04 META-INF/bundled-dependencies/netty-handler-4.1.51.Final.jar
21842 02-12-2021 07:04 META-INF/bundled-dependencies/netty-reactive-streams-2.0.4.jar
33158 02-12-2021 07:04 META-INF/bundled-dependencies/netty-resolver-4.1.51.Final.jar
151765 02-12-2021 07:04 META-INF/bundled-dependencies/netty-resolver-dns-4.1.51.Final.jar
4017922 02-12-2021 07:04 META-INF/bundled-dependencies/netty-tcnative-boringssl-static-2.0.33.Final.jar
473222 02-12-2021 07:04 META-INF/bundled-dependencies/netty-transport-4.1.51.Final.jar
152317 02-12-2021 07:04 META-INF/bundled-dependencies/netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar
33062 02-12-2021 07:04 META-INF/bundled-dependencies/netty-transport-native-unix-common-4.1.51.Final.jar
56446 02-12-2021 07:04 META-INF/bundled-dependencies/netty-transport-native-unix-common-4.1.51.Final-linux-x86_64.jar
1660960 02-12-2021 07:04 META-INF/bundled-dependencies/protobuf-java-3.11.4.jar
73874 02-12-2021 07:04 META-INF/bundled-dependencies/protobuf-java-util-3.11.4.jar
47021 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-client-admin-api-2.8.0-SNAPSHOT.jar
141344 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-client-api-2.8.0-SNAPSHOT.jar
657161 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-client-original-2.8.0-SNAPSHOT.jar
877274 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-common-2.8.0-SNAPSHOT.jar
38477 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-config-validation-2.8.0-SNAPSHOT.jar
21681 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-functions-api-2.8.0-SNAPSHOT.jar
23202 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-io-core-2.8.0-SNAPSHOT.jar
28200 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-package-core-2.8.0-SNAPSHOT.jar
9037 02-12-2021 07:04 META-INF/bundled-dependencies/pulsar-transaction-common-2.8.0-SNAPSHOT.jar
11369 02-12-2021 07:04 META-INF/bundled-dependencies/reactive-streams-1.0.3.jar
130999 02-12-2021 07:04 META-INF/bundled-dependencies/reflections-0.9.11.jar
421509 02-12-2021 07:04 META-INF/bundled-dependencies/sketches-core-0.8.3.jar
41203 02-12-2021 07:04 META-INF/bundled-dependencies/slf4j-api-1.7.25.jar
284338 02-12-2021 07:04 META-INF/bundled-dependencies/snakeyaml-1.18.jar
21782 02-12-2021 07:04 META-INF/bundled-dependencies/swagger-annotations-1.6.2.jar
63777 02-12-2021 07:04 META-INF/bundled-dependencies/validation-api-1.1.0.Final.jar
```
pulsar-io-data-generator has a single unique dependency, jfairy. This means that about 45MB of redundant dependencies are bundled in each pulsar-io .nar file.
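One way to quantify the redundancy is to list the jars under `META-INF/bundled-dependencies/` in a .nar and split them by whether the worker already provides them. The Python sketch below is only an illustration: `bundled_jars`, `split_redundant`, and the `provided_by_worker` set are hypothetical names, and matching by jar file name is a simplifying assumption that ignores version differences.

```python
import zipfile

# Directory inside a .nar where bundled jars live (taken from the listing above).
BUNDLED_PREFIX = "META-INF/bundled-dependencies/"


def bundled_jars(nar):
    """Return {jar file name: uncompressed size in bytes} for jars bundled in a .nar.

    `nar` may be a path or a binary file-like object; a .nar is a zip archive.
    """
    with zipfile.ZipFile(nar) as archive:
        return {
            info.filename[len(BUNDLED_PREFIX):]: info.file_size
            for info in archive.infolist()
            if info.filename.startswith(BUNDLED_PREFIX)
            and info.filename.endswith(".jar")
        }


def split_redundant(jars, provided_by_worker):
    """Partition jars into (redundant, unique).

    `provided_by_worker` is a set of jar file names assumed to be on the
    worker's classpath already; how that set is obtained is not covered here.
    """
    redundant = {n: s for n, s in jars.items() if n in provided_by_worker}
    unique = {n: s for n, s in jars.items() if n not in provided_by_worker}
    return redundant, unique
```

Summing `redundant.values()` then gives the number of bytes that could be dropped from that .nar under these assumptions.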
These redundant jars are never used for classloading, so it is safe to remove every dependency that is already part of the Pulsar Functions Worker's system classloader. The reason is that classloaders use parent-first lookups (by default, and also in Pulsar Functions Worker): a class that the parent can provide is always loaded from the parent.
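The effect of parent-first lookup can be illustrated with a toy model. This is a simulation in Python, not the JVM's or Pulsar's actual classloader code; the `Loader` class and the class names `example.Shared` and `example.OnlyInNar` are hypothetical.

```python
class Loader:
    """Toy model of a parent-first class loader (illustration only)."""

    def __init__(self, name, classes, parent=None):
        self.name = name
        self.classes = set(classes)
        self.parent = parent

    def load(self, class_name):
        """Return the name of the loader that supplies class_name.

        The parent is always asked first; this loader's own copy is used
        only if no ancestor can supply the class.
        """
        if self.parent is not None:
            try:
                return self.parent.load(class_name)
            except LookupError:
                pass  # no ancestor has it; fall back to this loader
        if class_name in self.classes:
            return self.name
        raise LookupError(class_name)


# The worker's system loader already has example.Shared.
worker = Loader("worker-system", {"example.Shared"})
# The .nar bundles its own copy of example.Shared plus one unique class.
nar = Loader("nar", {"example.Shared", "example.OnlyInNar"}, parent=worker)
```

In this model `nar.load("example.Shared")` resolves to the worker's loader, so the copy bundled in the .nar is dead weight, while `nar.load("example.OnlyInNar")` still resolves to the .nar itself.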
**Additional context**
Reducing the size of Pulsar IO .nar files would also help reduce the pulsar-all Docker image size. The Pulsar (core) build would benefit as well, although PIP-62 covers moving the Pulsar IO connectors from the apache/pulsar repository to apache/pulsar-connectors.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [pulsar] freeznet commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files
Posted by GitBox <gi...@apache.org>.
freeznet commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-780954731
@lhotari Thanks for your detailed description.
@sijie Sure, I will create a PR to fix this issue.
[GitHub] [pulsar] lhotari commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files
Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-779127621
The regression seems to have been caused by #9246.
[GitHub] [pulsar] codelipenghui commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files
Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-1058893240
The issue has had no activity for 30 days; marking it with the Stale label.
[GitHub] [pulsar] lhotari commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files
Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-782194168
@freeznet @sijie I have created a separate issue about the excessive `pulsar-client-admin-api` dependencies: #9640.
[GitHub] [pulsar] lhotari commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files
Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-779122230
There seems to be a regression in the master branch.
In the master branch, the size of pulsar-io-data-generator is about 45MB.
In branch-2.7, the size of pulsar-io-data-generator is about 11MB.
Commands to use for the comparison:
```
mvn -am -pl org.apache.pulsar:pulsar-io-data-generator clean install -DskipTests
du -ms pulsar-io/data-generator/target/pulsar-io-data-generator-*.nar
```
[GitHub] [pulsar] sijie commented on issue #9572: Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files
Posted by GitBox <gi...@apache.org>.
sijie commented on issue #9572:
URL: https://github.com/apache/pulsar/issues/9572#issuecomment-779462197
@lhotari Thank you for catching this! I think we want an admin API interface module that the Pulsar Functions API can use, because we want to expose PulsarAdmin through the function context. So the problem seems to be that we didn't break the interface down cleanly into pulsar-admin-api, and as a result it unnecessarily pulls in other dependencies. @freeznet Can you take a look at this issue?