Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/02/17 06:49:26 UTC

[GitHub] [iceberg] yaooqinn opened a new pull request #4149: [Spark] Delete dataFile on iteration level in SparkParquetWritersFlatDataBenchmark

yaooqinn opened a new pull request #4149:
URL: https://github.com/apache/iceberg/pull/4149


   This PR fixes a minor issue in a JMH benchmark, `SparkParquetWritersFlatDataBenchmark`.
   
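   The `dataFile` that holds the mocked output is currently cleaned up at `Level.Trial`, which runs only after the whole set of benchmark iterations has finished. As a result, the file written by one iteration still exists when the next iteration starts, which causes errors like the one below.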
   
   ```text
   # Warmup Iteration   1: 7.251 s/op
   # Warmup Iteration   2: <failure>
   
   org.apache.iceberg.exceptions.AlreadyExistsException: File already exists: /Users/kentyao/iceberg/spark/v3.2/spark/build/tmp/jmh/parquet-flat-data-benchmark3385433263675547096.parquet
   	at org.apache.iceberg.Files$LocalOutputFile.create(Files.java:58)
   	at org.apache.iceberg.parquet.ParquetIO$ParquetOutputFile.create(ParquetIO.java:148)
   ```
   
   In this PR, I set the `TearDown` level to `Level.Iteration` so that the `dataFile` is deleted after each iteration.
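
The effect of the tear-down level can be illustrated without JMH. The following is a minimal sketch in plain Java, not the benchmark's actual code: the class `TearDownLevelSketch` and the methods `writeOnce` and `run` are hypothetical names, `Files.createFile` stands in for an output-file `create()` that refuses to overwrite, and the sketch assumes one write per iteration. Cleaning up only at the end mimics `Level.Trial`; cleaning up inside the loop mimics `Level.Iteration`.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TearDownLevelSketch {

  // Hypothetical stand-in for the benchmark's write step.
  // Files.createFile throws FileAlreadyExistsException if the path exists,
  // similar in spirit to a create() call that refuses to overwrite.
  static void writeOnce(Path dataFile) throws IOException {
    Files.createFile(dataFile);
  }

  // Runs the given number of iterations against a single data file path and
  // returns how many iterations succeeded.
  // cleanUpEachIteration = true  mimics a tear-down at iteration level.
  // cleanUpEachIteration = false mimics a tear-down at trial level.
  static int run(int iterations, boolean cleanUpEachIteration) throws IOException {
    Path dir = Files.createTempDirectory("teardown-sketch");
    Path dataFile = dir.resolve("data.parquet");
    int succeeded = 0;
    try {
      for (int i = 0; i < iterations; i++) {
        try {
          writeOnce(dataFile);
          succeeded++;
        } catch (FileAlreadyExistsException e) {
          // The file left behind by an earlier iteration blocks this one.
        }
        if (cleanUpEachIteration) {
          Files.deleteIfExists(dataFile);
        }
      }
    } finally {
      // Trial-level cleanup: runs once, after all iterations.
      Files.deleteIfExists(dataFile);
      Files.deleteIfExists(dir);
    }
    return succeeded;
  }

  public static void main(String[] args) throws IOException {
    System.out.println("cleanup after all iterations: " + run(3, false) + " of 3 succeeded");
    System.out.println("cleanup after each iteration: " + run(3, true) + " of 3 succeeded");
  }
}
```

With cleanup deferred to the end, only the first iteration can create the file, which matches the pattern in the log above where the first warmup iteration passes and the second fails.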


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #4149: [Spark] Delete dataFile on iteration level in SparkParquetWritersFlatDataBenchmark

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #4149:
URL: https://github.com/apache/iceberg/pull/4149#issuecomment-1047268252


   Thanks, @yaooqinn!




[GitHub] [iceberg] nastra commented on pull request #4149: [Spark] Delete dataFile on iteration level in SparkParquetWritersFlatDataBenchmark

Posted by GitBox <gi...@apache.org>.
nastra commented on pull request #4149:
URL: https://github.com/apache/iceberg/pull/4149#issuecomment-1046640345


   @rdblue or @RussellSpitzer when you merge this one, please also merge https://github.com/apache/iceberg/pull/3910 as it fixes the same issue for Spark 3.2 benchmarks




[GitHub] [iceberg] rdblue commented on pull request #4149: [Spark] Delete dataFile on iteration level in SparkParquetWritersFlatDataBenchmark

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #4149:
URL: https://github.com/apache/iceberg/pull/4149#issuecomment-1046283292


   @nastra or @RussellSpitzer, can you review this one? Thank you!




[GitHub] [iceberg] rdblue merged pull request #4149: [Spark] Delete dataFile on iteration level in SparkParquetWritersFlatDataBenchmark

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #4149:
URL: https://github.com/apache/iceberg/pull/4149


   

