Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/06/16 15:33:20 UTC

[GitHub] [iceberg] bartosz25 opened a new issue, #5065: The check-ordering purpose

bartosz25 opened a new issue, #5065:
URL: https://github.com/apache/iceberg/issues/5065

   Hi, 
   
   I'm learning Iceberg and am struggling with the `check-ordering` option. As far as I understood, it 
   > Checks if input schema and table schema are same
   https://iceberg.apache.org/docs/latest/spark-configuration/#write-options 
   
   I got the test case from the repo...
   https://github.com/apache/iceberg/blob/2531545e3cd3b97494c9e3c137cfe04f4459a9fb/spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/source/TestPartitionValues.java
   
   ... and used it to play a bit with the option:
   
   ```
       sparkSession.sql(
         """
           |CREATE OR REPLACE  TABLE local.db.letters (
           |  id STRING NOT NULL,
           |  letter1 STRING NOT NULL,
           |  letter2 STRING NOT NULL
           |) USING iceberg
           |""".stripMargin)
   
       sparkSession.sql("SELECT '2' AS id, 'a' AS letter1, 'A' AS letter2")
         .select("letter2", "id" , "letter1") // This is not necessary but I left it for simpler columns ordering and to follow the example from the unit test
         .write
         .option("check-ordering", "true").insertInto("local.db.letters")
   ```
   I expected the code to fail with `check-ordering` enabled and the columns reordered, but it succeeded and performed a position-based insert:
   
   ```
   +---+-------+-------+
   |id |letter1|letter2|
   +---+-------+-------+
   |A  |2      |a      |
   +---+-------+-------+
   ```
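
   To make the observed result concrete, here is a minimal plain-Java sketch of position-based column matching, together with the kind of name-order comparison I expected `check-ordering` to perform. It is only an illustration under my own assumptions: it does not use Iceberg or Spark, and `OrderingSketch`, `insertByPosition` and `sameOrdering` are hypothetical names, not part of either project.

   ```java
   import java.util.LinkedHashMap;
   import java.util.List;
   import java.util.Map;

   public class OrderingSketch {

       // Position-based matching: the i-th input value lands in the i-th table
       // column, regardless of the name the input column carries.
       static Map<String, String> insertByPosition(List<String> tableColumns,
                                                   List<String> inputValues) {
           Map<String, String> row = new LinkedHashMap<>();
           for (int i = 0; i < tableColumns.size(); i++) {
               row.put(tableColumns.get(i), inputValues.get(i));
           }
           return row;
       }

       // Hypothetical ordering check (an assumption about what such a check
       // could look like): true only when the input column names match the
       // table column names in the same order.
       static boolean sameOrdering(List<String> tableColumns,
                                   List<String> inputColumns) {
           return tableColumns.equals(inputColumns);
       }

       public static void main(String[] args) {
           List<String> table = List.of("id", "letter1", "letter2");
           // Column names and values after select("letter2", "id", "letter1")
           List<String> inputNames = List.of("letter2", "id", "letter1");
           List<String> inputValues = List.of("A", "2", "a");

           System.out.println(insertByPosition(table, inputValues)); // {id=A, letter1=2, letter2=a}
           System.out.println(sameOrdering(table, inputNames));      // false
       }
   }
   ```

   The first line reproduces the row shown above; the second is the mismatch I assumed would make the write fail.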
   
   Thinking it might be an issue with my local setup (Iceberg 0.13.1, Spark 3.2.0), I cloned the project repo and:
   
   * extended the aforementioned unit test with
   ```
       spark.read()
               .format("iceberg")
               .option(SparkReadOptions.VECTORIZATION_ENABLED, String.valueOf(vectorized))
               .load(location.toString()).show();
   ```
   
   * changed the check-ordering flag to true:
   ```
       df.select("data", "id").write()
               .format("iceberg")
               .mode(SaveMode.Append)
               .option(SparkWriteOptions.CHECK_ORDERING, "true")
               .save(location.toString());
   ```
   
   Surprisingly, the printed output was correct, but the `check-ordering` flag seems to have no effect on the writer: the write succeeds whether the flag is enabled or disabled.
   
   I'm surely missing something. Can you shed some light on it? Is my understanding of `check-ordering` correct? If it is, do you have a minimal reproducible example in which the insert fails because of the reordered `select(...)` with `check-ordering` enabled?
   
   Thank you.
   Best,
   Bartosz.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] github-actions[bot] closed issue #5065: The check-ordering purpose

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #5065: The check-ordering purpose
URL: https://github.com/apache/iceberg/issues/5065




[GitHub] [iceberg] github-actions[bot] commented on issue #5065: The check-ordering purpose

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #5065:
URL: https://github.com/apache/iceberg/issues/5065#issuecomment-1366992248

   This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'




[GitHub] [iceberg] github-actions[bot] commented on issue #5065: The check-ordering purpose

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #5065:
URL: https://github.com/apache/iceberg/issues/5065#issuecomment-1350134025

   This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

