Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/24 04:35:52 UTC

[GitHub] [iceberg] coolderli opened a new pull request #2859: Spark: Set properties for deletewriter

coolderli opened a new pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859


   The delete writers should call `setAll(properties)` so that the properties are passed to the Parquet and Avro writers.
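
   The effect of a missing `setAll(properties)` call can be sketched with a toy builder. This is only an illustration under stated assumptions: `SketchWriteBuilder`, the key `sketch.compression-codec`, and the value `default-codec` are hypothetical names invented for this example and are not Iceberg's API. It shows that a builder which never receives the properties silently falls back to its defaults.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; none of these names come from Iceberg.
class DeleteWriterPropsSketch {
  // Hypothetical property key and default value, chosen for this example.
  static final String CODEC_KEY = "sketch.compression-codec";
  static final String DEFAULT_CODEC = "default-codec";

  // Hypothetical stand-in for a file-format write builder.
  static class SketchWriteBuilder {
    private final Map<String, String> config = new HashMap<>();

    // Copies every supplied property into the writer configuration.
    SketchWriteBuilder setAll(Map<String, String> properties) {
      config.putAll(properties);
      return this;
    }

    // Returns the configured value, or the default when the key was never set.
    String effective(String key, String defaultValue) {
      return config.getOrDefault(key, defaultValue);
    }
  }

  // Builder chain that forgets setAll: the properties are ignored.
  static String codecWithoutSetAll(Map<String, String> properties) {
    return new SketchWriteBuilder().effective(CODEC_KEY, DEFAULT_CODEC);
  }

  // Builder chain that calls setAll: the properties reach the writer.
  static String codecWithSetAll(Map<String, String> properties) {
    return new SketchWriteBuilder().setAll(properties).effective(CODEC_KEY, DEFAULT_CODEC);
  }

  public static void main(String[] args) {
    Map<String, String> properties = new HashMap<>();
    properties.put(CODEC_KEY, "zstd");
    System.out.println("without setAll: " + codecWithoutSetAll(properties));
    System.out.println("with setAll:    " + codecWithSetAll(properties));
  }
}
```

   In this sketch the first chain reports the default and the second reports the configured value, which is the kind of difference the added `.setAll(properties)` line is meant to remove.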


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] coolderli closed pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
coolderli closed pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859


   




[GitHub] [iceberg] aokolnychyi commented on pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#issuecomment-919476134


   Would appreciate your review on the new writers, @coolderli!




[GitHub] [iceberg] coolderli commented on pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
coolderli commented on pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#issuecomment-918883387


   > @coolderli, what went wrong without this?
   > 
   > Also, you should definitely coordinate with @aokolnychyi on the Spark implementations. I believe that he has them fully working at this point.
   
   Yes, I found that @aokolnychyi has implemented new writers in #2873, so we can use the new writer.




[GitHub] [iceberg] rdblue commented on a change in pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#discussion_r706865601



##########
File path: spark/src/main/java/org/apache/iceberg/spark/source/SparkAppenderFactory.java
##########
@@ -205,6 +205,7 @@ private StructType lazyPosDeleteSparkType() {
         case PARQUET:
           return Parquet.writeDeletes(file.encryptingOutputFile())
               .createWriterFunc(msgType -> SparkParquetWriters.buildWriter(lazyEqDeleteSparkType(), msgType))
+              .setAll(properties)

Review comment:
       @coolderli, could you add this update?






[GitHub] [iceberg] coolderli commented on a change in pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
coolderli commented on a change in pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#discussion_r706976250



##########
File path: spark/src/main/java/org/apache/iceberg/spark/source/SparkAppenderFactory.java
##########
@@ -205,6 +205,7 @@ private StructType lazyPosDeleteSparkType() {
         case PARQUET:
           return Parquet.writeDeletes(file.encryptingOutputFile())
               .createWriterFunc(msgType -> SparkParquetWriters.buildWriter(lazyEqDeleteSparkType(), msgType))
+              .setAll(properties)

Review comment:
       @zhong-yj xxx






[GitHub] [iceberg] aokolnychyi commented on pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#issuecomment-919477057


   I think we got this covered in the new `WriterFactory` implementations.




[GitHub] [iceberg] aokolnychyi edited a comment on pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
aokolnychyi edited a comment on pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#issuecomment-919477057


   I think we got this covered in the new `WriterFactory` implementations where we call `setAll`.
   Is that why we closed this PR, @coolderli?




[GitHub] [iceberg] rdblue commented on pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#issuecomment-917673708


   @coolderli, if you have a use case where you found this problem, could you add it to the description? In what cases is this needed?




[GitHub] [iceberg] coolderli commented on pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
coolderli commented on pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#issuecomment-917802695


   > @coolderli, if you have a use case where you found this problem, could you add it to the description? In what cases is this needed?
   
   @rdblue, I was trying to use this writer to implement merge-on-read mode in Spark for `merge into`, `update`, and `delete from`.




[GitHub] [iceberg] coolderli commented on pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
coolderli commented on pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#issuecomment-919756373


   > I think we got this covered in the new `WriterFactory` implementations where we call `setAll`.
   > Is that why we closed this PR, @coolderli?
   
   Yes, I think the new `WriterFactory` has covered this case.




[GitHub] [iceberg] openinx commented on a change in pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
openinx commented on a change in pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#discussion_r676266247



##########
File path: spark/src/main/java/org/apache/iceberg/spark/source/SparkAppenderFactory.java
##########
@@ -205,6 +205,7 @@ private StructType lazyPosDeleteSparkType() {
         case PARQUET:
           return Parquet.writeDeletes(file.encryptingOutputFile())
               .createWriterFunc(msgType -> SparkParquetWriters.buildWriter(lazyEqDeleteSparkType(), msgType))
+              .setAll(properties)

Review comment:
       Thanks for the contribution, @coolderli! I also think `newPosDeleteWriter` needs the properties set ...






[GitHub] [iceberg] rdblue commented on pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#issuecomment-918373757


   @coolderli, what went wrong without this?
   
   Also, you should definitely coordinate with @aokolnychyi on the Spark implementations. I believe that he has them fully working at this point.




[GitHub] [iceberg] coolderli commented on a change in pull request #2859: Spark: Set properties for deletewriter

Posted by GitBox <gi...@apache.org>.
coolderli commented on a change in pull request #2859:
URL: https://github.com/apache/iceberg/pull/2859#discussion_r676297709



##########
File path: spark/src/main/java/org/apache/iceberg/spark/source/SparkAppenderFactory.java
##########
@@ -205,6 +205,7 @@ private StructType lazyPosDeleteSparkType() {
         case PARQUET:
           return Parquet.writeDeletes(file.encryptingOutputFile())
               .createWriterFunc(msgType -> SparkParquetWriters.buildWriter(lazyEqDeleteSparkType(), msgType))
+              .setAll(properties)

Review comment:
       Yes, I have already added it: https://github.com/apache/iceberg/pull/2859/files#diff-906095312ad35f773d13f65e57921be14d10ba569ce7c3ff9a03dbedf423b666R248



