Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/08/14 07:59:57 UTC

[GitHub] [iceberg] zhangdove opened a new issue #1341: Use overwrite API and throw ValidationException

zhangdove opened a new issue #1341:
URL: https://github.com/apache/iceberg/issues/1341


   1. My environment
   ```
   spark version: 3.0.0
   iceberg version: 0.9.0
   ```
   2. My use case
   ```
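     import java.sql.Timestamp
     import java.util.{ArrayList, List}
     import org.apache.iceberg.{PartitionSpec, Schema}
     import org.apache.iceberg.catalog.TableIdentifier
     import org.apache.iceberg.hadoop.HadoopCatalog
     import org.apache.iceberg.types.Types
     import org.apache.spark.sql.SparkSession
   
     // Row type. Its definition is not shown in the report; the fields here are assumed from usage.
     case class StructedDb(id: Int, name: String, time: Timestamp)
   
     // Creates a table partitioned by day("time").
     def createPartitionTable(catalog: HadoopCatalog, tableIdentifier: TableIdentifier): Unit = {
       val columns: List[Types.NestedField] = new ArrayList[Types.NestedField]
       columns.add(Types.NestedField.of(1, true, "id", Types.IntegerType.get, "id doc"))
       columns.add(Types.NestedField.of(2, true, "name", Types.StringType.get, "name doc"))
       columns.add(Types.NestedField.of(3, true, "time", Types.TimestampType.withZone(), "create time doc"))
   
       val schema: Schema = new Schema(columns)
       val partition = PartitionSpec.builderFor(schema).day("time", "day").build()
   
       catalog.createTable(tableIdentifier, schema, partition)
     }
   
     // Writes one row per day, each with an overwrite filter on "time".
     // Timestamp.valueOf interprets its argument in the JVM default time zone.
     def writeData(spark: SparkSession): Unit = {
       import spark.implicits._
       val seq = Seq(StructedDb(1, "v1", Timestamp.valueOf("2020-01-01 12:00:00")))
       seq.toDF.writeTo(s"hadoop_prod.${schemaName}.${tableName}").overwrite($"time" >= Timestamp.valueOf("2020-01-01 00:00:00"))
   
       val seq2 = Seq(StructedDb(2, "v2", Timestamp.valueOf("2020-01-02 13:00:00")))
       seq2.toDF.writeTo(s"hadoop_prod.${schemaName}.${tableName}").overwrite($"time" >= Timestamp.valueOf("2020-01-02 00:00:00"))
     }
   
     // catalog, tableIdentifier, spark, schemaName and tableName are defined elsewhere (not shown in the report).
     createPartitionTable(catalog, tableIdentifier)
     writeData(spark)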
   ```
   3. Exception thrown
   ```
   Caused by: org.apache.iceberg.exceptions.ValidationException: Cannot delete file where some, but not all, rows match filter ref(name="time") >= 1577894400000000: file:/Users/dovezhang/iceberg/warehouse/testDb/testTb/data/day=2020-01-01/00000-0-c8dc7903-fba7-41ed-8c8b-916b7c066ffc-00001.parquet
   	at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:42)
   	at org.apache.iceberg.ManifestFilterManager.manifestHasDeletedFiles(ManifestFilterManager.java:355)
   ```
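   
   The literal in the filter can be decoded to see why the second overwrite reaches into the first day's partition. The sketch below is an illustration and not part of the original report. It assumes the reporter's JVM default time zone was UTC+8, which the issue does not state; `DecodeFilterLiteral` and its field names are hypothetical. Iceberg stores timestamps with zone as microseconds from the epoch in UTC and computes the `day` partition in UTC.
   
   ```scala
   import java.time.{Instant, LocalDate, LocalDateTime, ZoneOffset}
   import java.time.temporal.ChronoUnit
   
   object DecodeFilterLiteral {
     // Literal copied from the exception message: microseconds since the epoch.
     val filterMicros: Long = 1577894400000000L
   
     // The same point in time as an Instant (always expressed in UTC).
     val filterInstant: Instant = Instant.EPOCH.plus(filterMicros, ChronoUnit.MICROS)
   
     // ASSUMPTION: the reporter's JVM default zone was UTC+8; the issue does not say so.
     val assumedOffset: ZoneOffset = ZoneOffset.ofHours(8)
   
     // What Timestamp.valueOf("2020-01-02 00:00:00") would denote under that assumption.
     val localBound: Instant =
       LocalDateTime.of(2020, 1, 2, 0, 0).toInstant(assumedOffset)
   
     // UTC calendar day that contains the filter bound.
     val utcDay: LocalDate = filterInstant.atOffset(ZoneOffset.UTC).toLocalDate
   
     def main(args: Array[String]): Unit = {
       println(s"filter bound in UTC: $filterInstant")
       println(s"equals local midnight at $assumedOffset: ${filterInstant == localBound}")
       println(s"UTC day containing the bound: $utcDay")
     }
   }
   ```
   
   Under that assumption, the bound of the second overwrite is 2020-01-01T16:00:00Z, which lies inside the UTC day 2020-01-01. The filter then covers only part of the `day=2020-01-01` partition that already holds the first row's file, which appears consistent with the "some, but not all, rows match" error.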
   
   Why does Iceberg try to delete the file in the partition that holds the first record when the second record is written?
   
   I'm not sure whether I'm using the API incorrectly. I would appreciate it if someone could show me the right way.
   
   https://github.com/apache/iceberg/blob/master/site/docs/spark.md#overwriting-data


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] zhangdove closed issue #1341: Use overwrite API and throw ValidationException

Posted by GitBox <gi...@apache.org>.
zhangdove closed issue #1341:
URL: https://github.com/apache/iceberg/issues/1341


   




[GitHub] [iceberg] zhangdove edited a comment on issue #1341: Use overwrite API and throw ValidationException

Posted by GitBox <gi...@apache.org>.
zhangdove edited a comment on issue #1341:
URL: https://github.com/apache/iceberg/issues/1341#issuecomment-678955648


   I've done some more local testing and debugging, and this behavior is related to time zones.
   
   > Somehow it seems to be working now. I'll finish this question.
   
   I closed the issue earlier because, at the time, I was trying to reproduce the problem with my PR applied.
   




[GitHub] [iceberg] zhangdove commented on issue #1341: Use overwrite API and throw ValidationException

Posted by GitBox <gi...@apache.org>.
zhangdove commented on issue #1341:
URL: https://github.com/apache/iceberg/issues/1341#issuecomment-676902652


   Somehow it seems to be working now. I'll finish this question.




[GitHub] [iceberg] zhangdove commented on issue #1341: Use overwrite API and throw ValidationException

Posted by GitBox <gi...@apache.org>.
zhangdove commented on issue #1341:
URL: https://github.com/apache/iceberg/issues/1341#issuecomment-678955648


   I've done some more local testing and debugging, and this behavior is related to time zones.
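   
   One way to avoid depending on the JVM default zone, sketched here as an assumption and not as a fix confirmed in this thread, is to build the overwrite bound from an explicit UTC day boundary so that it lines up with Iceberg's UTC-based `day` partitions. The names `UtcBounds` and `utcDayStart` are hypothetical.
   
   ```scala
   import java.sql.Timestamp
   import java.time.{LocalDate, ZoneOffset}
   
   object UtcBounds {
     // Hypothetical helper: midnight UTC at the start of the given calendar day,
     // independent of the JVM default time zone.
     def utcDayStart(day: LocalDate): Timestamp =
       Timestamp.from(day.atStartOfDay(ZoneOffset.UTC).toInstant)
   
     def main(args: Array[String]): Unit = {
       val bound = utcDayStart(LocalDate.of(2020, 1, 2))
       // Print as an Instant so the output does not depend on the local zone.
       println(bound.toInstant)
     }
   }
   ```
   
   With such a bound, the overwrite condition from the report could be written as `$"time" >= utcDayStart(LocalDate.of(2020, 1, 2))`. Whether this avoids the exception on Iceberg 0.9.0 has not been verified here.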




[GitHub] [iceberg] zhangdove edited a comment on issue #1341: Use overwrite API and throw ValidationException

Posted by GitBox <gi...@apache.org>.
zhangdove edited a comment on issue #1341:
URL: https://github.com/apache/iceberg/issues/1341#issuecomment-678955648


   I've done some more local testing and debugging, and this behavior is related to time zones.
   
   > Somehow it seems to be working now. I'll finish this question.
   
   I closed the issue earlier because, at the time, I was trying to reproduce the problem with PR #1355 applied.
   



