Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/09/24 07:25:10 UTC

[GitHub] [iceberg] FounderHy opened a new issue #3172: WHY column in the filter from the delete clause must be identity partition

FounderHy opened a new issue #3172:
URL: https://github.com/apache/iceberg/issues/3172


   Iceberg version: 0.12.0
   
   I use a `DELETE FROM` SQL statement to delete data from an Iceberg table. The statement looks like this:
   
   ```sql
   DELETE FROM order WHERE pay_date between '2021-09-12' and '2021-09-13'
   ```
   
   I executed this statement through Spark and found that `canDeleteWhere` returns false.
   
   After checking the code below, I found that every field referenced in the filter must be an identity partition source column. How should I use the `DELETE` statement in Iceberg?
   
   ```java
     @Override
     public boolean canDeleteWhere(Filter[] filters) {
       if (table().specs().size() > 1) {
         // cannot guarantee a metadata delete will be successful if we have multiple specs
         return false;
       }
   
       Set<Integer> identitySourceIds = table().spec().identitySourceIds();
       Schema schema = table().schema();
   
       for (Filter filter : filters) {
         // return false if the filter requires rewrite or if we cannot translate the filter
         if (requiresRewrite(filter, schema, identitySourceIds) || SparkFilters.convert(filter) == null) {
           return false;
         }
       }
   
       return true;
     }
   
     private boolean requiresRewrite(Filter filter, Schema schema, Set<Integer> identitySourceIds) {
       // TODO: handle dots correctly via v2references
       // TODO: detect more cases that don't require rewrites
       Set<String> filterRefs = Sets.newHashSet(filter.references());
       return filterRefs.stream().anyMatch(ref -> {
         Types.NestedField field = schema.findField(ref);
         ValidationException.check(field != null, "Cannot find field %s in schema", ref);
         return !identitySourceIds.contains(field.fieldId());
       });
     }
   ```
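   
   The check quoted above can be modelled without Iceberg or Spark on the classpath. The sketch below is a simplified illustration, not Iceberg's implementation: `MetadataDeleteSketch`, its method signatures, and the column names `region` and `pay_date` are hypothetical, columns are tracked by name rather than by field ID, and the multiple-spec and `SparkFilters.convert` checks are left out. It assumes a table whose only identity partition source is `region`, so a filter on `pay_date` cannot be served by a metadata-only delete.
   
   ```java
   import java.util.List;
   import java.util.Set;
   
   public class MetadataDeleteSketch {
   
     // True when any referenced column is NOT an identity partition source,
     // mirroring the anyMatch(...) logic in requiresRewrite above.
     static boolean requiresRewrite(Set<String> filterRefs, Set<String> identitySourceColumns) {
       return filterRefs.stream().anyMatch(ref -> !identitySourceColumns.contains(ref));
     }
   
     // A metadata-only delete is possible only if no filter requires a rewrite.
     static boolean canDeleteWhere(List<Set<String>> filters, Set<String> identitySourceColumns) {
       for (Set<String> filterRefs : filters) {
         if (requiresRewrite(filterRefs, identitySourceColumns)) {
           return false;
         }
       }
       return true;
     }
   
     public static void main(String[] args) {
       // Hypothetical table: identity-partitioned by "region" only.
       Set<String> identityColumns = Set.of("region");
   
       // Filter on the identity partition column: prints true.
       System.out.println(canDeleteWhere(List.of(Set.of("region")), identityColumns));
   
       // Filter on a non-partition column: prints false.
       System.out.println(canDeleteWhere(List.of(Set.of("pay_date")), identityColumns));
     }
   }
   ```
   
   Under this model, a single filter that references any column outside the identity partition set is enough to rule out the metadata-only path.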


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] FounderHy closed issue #3172: WHY column in the filter from the delete clause must be identity partition

Posted by GitBox <gi...@apache.org>.
FounderHy closed issue #3172:
URL: https://github.com/apache/iceberg/issues/3172


   




[GitHub] [iceberg] FounderHy commented on issue #3172: WHY column in the filter from the delete clause must be identity partition

Posted by GitBox <gi...@apache.org>.
FounderHy commented on issue #3172:
URL: https://github.com/apache/iceberg/issues/3172#issuecomment-932790122


   > Row-level deletes require the Iceberg Spark extensions; have you enabled them? As I remember, this check is used for Spark's default full-partition deletion, which does not need a copy-on-write rewrite.
   
   Thank you for your response; I will try that.




[GitHub] [iceberg] jackye1995 commented on issue #3172: WHY column in the filter from the delete clause must be identity partition

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on issue #3172:
URL: https://github.com/apache/iceberg/issues/3172#issuecomment-926777079


   Row-level deletes require the Iceberg Spark extensions; have you enabled them? As I remember, this check is used for Spark's default full-partition deletion, which does not need a copy-on-write rewrite.
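   
   For reference, the Iceberg Spark extensions are enabled through the `spark.sql.extensions` property. The sketch below only assembles that property as command-line flags for `spark-sql`; the class and method names are hypothetical, and catalog settings, which depend on the deployment, are not shown.
   
   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;
   import java.util.stream.Collectors;
   
   public class IcebergExtensionsConf {
   
     // Spark property that registers Iceberg's SQL extensions.
     static Map<String, String> extensionConf() {
       Map<String, String> conf = new LinkedHashMap<>();
       conf.put("spark.sql.extensions",
           "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions");
       return conf;
     }
   
     // Renders the properties as --conf flags for spark-sql or spark-submit.
     static String toConfFlags(Map<String, String> conf) {
       return conf.entrySet().stream()
           .map(e -> "--conf " + e.getKey() + "=" + e.getValue())
           .collect(Collectors.joining(" "));
     }
   
     public static void main(String[] args) {
       // Prints the spark-sql invocation with the extensions enabled.
       System.out.println("spark-sql " + toConfFlags(extensionConf()));
     }
   }
   ```
   
   The same key and value can instead be set in `spark-defaults.conf` or on the session builder.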



