Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/07/24 01:57:04 UTC

[GitHub] [iceberg] kbendick commented on a change in pull request #1145: Implement the flink stream writer to accept the row data and emit the complete data files event to downstream

kbendick commented on a change in pull request #1145:
URL: https://github.com/apache/iceberg/pull/1145#discussion_r459818338



##########
File path: core/src/main/java/org/apache/iceberg/BaseFile.java
##########
@@ -360,7 +360,7 @@ public ByteBuffer keyMetadata() {
     if (list != null) {
       List<E> copy = Lists.newArrayListWithExpectedSize(list.size());
       copy.addAll(list);
-      return Collections.unmodifiableList(copy);

Review comment:
       Additionally, I've not contributed meaningfully to the code base (I fixed some whitespace issues), but I have been following along for quite some time and I'm very interested in attempting to use Flink to write Iceberg files at my work. If there's any way I could be of assistance with this, please let me know. Perhaps I could help fill in gaps in Flink knowledge.
   
   Please let me know where a better place to discuss new contributor opportunities would be! But I thought it would be prudent to share my experience with users of `de.javakaffee:kryo-serializers` in their Flink programs.
   
   I'd also be open to testing out the Flink + Iceberg integration if there's a need for that.

##########
File path: core/src/main/java/org/apache/iceberg/BaseFile.java
##########
@@ -360,7 +360,7 @@ public ByteBuffer keyMetadata() {
     if (list != null) {
       List<E> copy = Lists.newArrayListWithExpectedSize(list.size());
       copy.addAll(list);
-      return Collections.unmodifiableList(copy);

Review comment:
       Sorry if it's inappropriate for me to butt in here, as I'm not a committer or part of the project.
   
   But I maintain and help developers at my company with dozens of Flink programs, and I can attest that we have had issues using `de.javakaffee:kryo-serializers`, specifically with `UnmodifiableCollectionsSerializer`. As one example, Flink allows users to override class loading to be either parent-first or child-first. This, combined with loading a package-private JDK class via `Class.forName` in order to register the `UnmodifiableCollectionsSerializer`, has led to developers hitting `ClassNotFoundException`s depending on how they compile and package their programs.
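   
   To make the failure mode concrete, here's a minimal sketch of the registration pattern involved (illustrative only, not code from this PR). The `Class.forName` lookup is unavoidable because the JDK wrapper class is package-private, and which classloader resolves it depends on the configured `classloader.resolve-order`:
   
   ```java
   import de.javakaffee.kryoserializers.UnmodifiableCollectionsSerializer;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
   
   public class KryoRegistrationExample {
     public static void main(String[] args) throws ClassNotFoundException {
       StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
   
       // java.util.Collections$UnmodifiableList is package-private, so the only
       // way to name it is reflectively. Whether this lookup succeeds at run time
       // depends on which classloader handles it (parent-first vs child-first),
       // which is exactly where the ClassNotFoundExceptions come from.
       Class<?> unmodifiableList = Class.forName("java.util.Collections$UnmodifiableList");
   
       env.getConfig().addDefaultKryoSerializer(
           unmodifiableList, UnmodifiableCollectionsSerializer.class);
     }
   }
   ```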
   
   Additionally, in newer versions of Flink (I believe >1.9; the current release is 1.11), certain internal classes use their own class-loading order, ignoring the parent-first vs. child-first order the user configured. This has led to some unexpected issues at run time, namely `ClassNotFoundException`s, because the class is not on the class path that Flink's Kryo dependency actually searches.
   
   In general, Kryo is avoided where possible in Flink-land in favor of custom serializers that implement Flink's own serializer/deserializer interfaces.
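   
   For illustration (the type and names here are hypothetical, not from this PR): a Flink-native serializer is just a `TypeSerializer` subclass. A sketch for a simple `List<String>` payload, assuming Flink 1.8+ for `SimpleTypeSerializerSnapshot`, might look like:
   
   ```java
   import java.io.IOException;
   import java.util.ArrayList;
   import java.util.List;
   import org.apache.flink.api.common.typeutils.SimpleTypeSerializerSnapshot;
   import org.apache.flink.api.common.typeutils.TypeSerializer;
   import org.apache.flink.api.common.typeutils.TypeSerializerSnapshot;
   import org.apache.flink.core.memory.DataInputView;
   import org.apache.flink.core.memory.DataOutputView;
   
   // Hypothetical Flink-native serializer: a size header followed by elements.
   // No Kryo, no Class.forName, no dependence on classloading order.
   public class StringListSerializer extends TypeSerializer<List<String>> {
   
     public static final StringListSerializer INSTANCE = new StringListSerializer();
     private static final long serialVersionUID = 1L;
   
     @Override public boolean isImmutableType() { return false; }
     @Override public TypeSerializer<List<String>> duplicate() { return this; } // stateless, safe to share
     @Override public List<String> createInstance() { return new ArrayList<>(); }
     @Override public List<String> copy(List<String> from) { return new ArrayList<>(from); }
     @Override public List<String> copy(List<String> from, List<String> reuse) { return copy(from); }
     @Override public int getLength() { return -1; } // variable-length records
   
     @Override
     public void serialize(List<String> record, DataOutputView target) throws IOException {
       target.writeInt(record.size());
       for (String element : record) {
         target.writeUTF(element);
       }
     }
   
     @Override
     public List<String> deserialize(DataInputView source) throws IOException {
       int size = source.readInt();
       List<String> result = new ArrayList<>(size);
       for (int i = 0; i < size; i++) {
         result.add(source.readUTF());
       }
       return result;
     }
   
     @Override
     public List<String> deserialize(List<String> reuse, DataInputView source) throws IOException {
       return deserialize(source);
     }
   
     @Override
     public void copy(DataInputView source, DataOutputView target) throws IOException {
       serialize(deserialize(source), target);
     }
   
     @Override
     public TypeSerializerSnapshot<List<String>> snapshotConfiguration() {
       return new StringListSerializerSnapshot();
     }
   
     @Override public boolean equals(Object obj) { return obj instanceof StringListSerializer; }
     @Override public int hashCode() { return StringListSerializer.class.hashCode(); }
   
     // Snapshot used by Flink for state-compatibility checks across restores.
     public static class StringListSerializerSnapshot
         extends SimpleTypeSerializerSnapshot<List<String>> {
       public StringListSerializerSnapshot() {
         super(() -> INSTANCE);
       }
     }
   }
   ```
   
   Wiring this into a job also takes a matching `TypeInformation`, which I've left out to keep the sketch short; the point is just that nothing above touches Kryo or reflective class lookups.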
   
   So I'd +1 @JingsongLi's suggestion to register a custom Flink serializer for use in the flink-iceberg subproject.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org